# Statistics for genomic data science in health

## Project

### Due date:

Send one report per group as a .pdf file to cyril.dalmasso@univ-evry.fr by Sunday, January 12, 2020 (deadline extended to January 20). Groups are limited to 3 students, and mixing students from GENIOMHE and DATA SCIENCES is strongly recommended.

### Datasets:

Choose a dataset from the table at the following link. You may also propose another dataset by adding a row to the table.

### Language:

English (M2 GENIOMHE/DATA SCIENCES) or French (M2 DATA SCIENCES only).

### Length:

Approximately 10 pages of text, not counting figures, tables, R commands, R outputs and appendices. Include all R commands, but print only the useful outputs (in particular, please avoid pages of unnecessary tables).

### Content:

The report should be divided into sections: a title page, an introduction (description of the dataset, design, objectives, …), a 'methods' section (description and justification of the methods), the results, a discussion (difficulties encountered, comparison with published results, …), a conclusion and, if needed, appendices.