This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Cmd+Shift+Enter.

Packages & path

library(tidyverse)  # provides glimpse(), mutate() and select() used below
setwd("~/Dropbox/Evry/M1GENIOMHE/TP/TP1/")
The working directory was changed to /Users/agatheguilloux/Dropbox/Evry/M1GENIOMHE/TP/TP1 inside a notebook chunk. The working directory will be reset when the chunk is finished running. Use the knitr root.dir option in the setup chunk to change the working directory for notebook chunks.
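As the message suggests, the working directory can be set once for every chunk instead of calling `setwd()` inside a chunk. A minimal sketch, assuming the `knitr` package is installed and that this line is placed in the notebook's setup chunk; the path is simply the one passed to `setwd()` above and should be adapted to your machine:

```r
# Setup-chunk sketch: set the root directory for all notebook chunks.
# The path below is the one used by setwd() in this notebook (adapt as needed).
knitr::opts_knit$set(root.dir = "~/Dropbox/Evry/M1GENIOMHE/TP/TP1/")

# The option can be read back to check that it was registered.
knitr::opts_knit$get("root.dir")
```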

Cheat sheets:
https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf

Data

Download the data from https://archive.ics.uci.edu/ml/datasets/Student+Performance

### Loading the data
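The notebook's source reads `student/student-mat.csv` and `student/student-por.csv` with `read.table(..., sep=";", header=TRUE)`, joins them with `dplyr::left_join()` on the shared descriptive columns, and keeps only complete rows with `complete.cases()`. The sketch below reproduces that pattern on two tiny made-up tables (the values and the reduced key set are assumptions for illustration, not the real data) to show where the `.x` and `.y` suffixes in the `glimpse()` output come from:

```r
library(dplyr)

# Toy stand-ins for the two course tables (hypothetical values, not the real data).
d1 <- data.frame(school = c("GP", "GP", "MS"), sex = c("F", "M", "F"),
                 age = c(18L, 16L, 17L), G3 = c(6L, 11L, 9L))
d2 <- data.frame(school = c("GP", "GP"), sex = c("F", "M"),
                 age = c(18L, 16L), G3 = c(11L, 13L))

# Join on the shared keys; a non-key column present in both tables (here G3)
# is duplicated with the suffixes .x (from d1) and .y (from d2).
joined <- left_join(d1, d2, by = c("school", "sex", "age"))

# Rows of d1 with no match in d2 have NA in the .y columns; drop them.
joined <- joined[complete.cases(joined), ]

names(joined)
nrow(joined)
```

With the real files, the same steps leave only the students found in both tables, which is consistent with the 201 observations shown below.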

glimpse(data)
Observations: 201
Variables: 38
$ school     <fctr> GP, GP, GP, GP, GP, GP,...
$ sex        <fctr> F, F, M, F, F, M, F, F,...
$ age        <int> 18, 17, 16, 17, 15, 15, ...
$ address    <fctr> U, U, U, U, U, U, U, U,...
$ famsize    <fctr> GT3, GT3, LE3, GT3, GT3...
$ Pstatus    <fctr> A, T, T, A, T, A, T, T,...
$ Medu       <int> 4, 1, 2, 4, 2, 2, 4, 3, ...
$ Fedu       <int> 4, 1, 2, 4, 1, 2, 4, 3, ...
$ Mjob       <fctr> at_home, at_home, other...
$ Fjob       <fctr> teacher, other, other, ...
$ reason     <fctr> course, course, home, h...
$ guardian   <fctr> mother, father, mother,...
$ traveltime <int> 2, 1, 1, 2, 3, 1, 1, 3, ...
$ studytime  <int> 2, 2, 2, 2, 3, 3, 1, 2, ...
$ failures.x <int> 0, 0, 0, 0, 0, 0, 0, 0, ...
$ schoolsup  <fctr> yes, no, no, yes, no, n...
$ famsup     <fctr> no, yes, no, yes, yes, ...
$ paid       <fctr> no, no, no, no, no, no,...
$ activities <fctr> no, no, no, no, yes, no...
$ nursery    <fctr> yes, no, yes, yes, yes,...
$ higher     <fctr> yes, yes, yes, yes, yes...
$ internet   <fctr> no, yes, yes, no, yes, ...
$ romantic   <fctr> no, no, no, no, no, yes...
$ famrel     <int> 4, 5, 4, 4, 5, 4, 4, 5, ...
$ freetime   <int> 3, 3, 4, 1, 2, 5, 4, 3, ...
$ goout      <int> 4, 3, 4, 4, 2, 2, 4, 2, ...
$ Dalc       <int> 1, 1, 1, 1, 1, 1, 1, 1, ...
$ Walc       <int> 1, 1, 1, 1, 1, 1, 2, 1, ...
$ health     <int> 3, 3, 3, 1, 4, 3, 2, 4, ...
$ absences.x <int> 6, 4, 0, 6, 4, 0, 4, 4, ...
$ G1.x       <int> 5, 5, 12, 6, 10, 14, 14,...
$ G2.x       <int> 6, 5, 12, 5, 12, 16, 14,...
$ G3.x       <int> 6, 6, 11, 6, 12, 16, 14,...
$ failures.y <int> 0, 0, 0, 0, 0, 0, 0, 0, ...
$ absences.y <int> 4, 2, 0, 2, 0, 0, 6, 2, ...
$ G1.y       <int> 0, 9, 13, 10, 10, 14, 17...
$ G2.y       <int> 11, 11, 12, 13, 12, 14, ...
$ G3.y       <int> 11, 11, 13, 13, 13, 15, ...
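# Y is the mean of the six grades (G1 to G3 from each of the two joined tables)
data = mutate(data, Y = (G1.x+G2.x+G3.x+G1.y+G2.y+G3.y)/6)
# Drop the individual grade columns; each column needs its own comma-separated -col term
data = select(data, -G1.x, -G2.x, -G3.x, -G1.y, -G2.y, -G3.y)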

Build a model using the following explanatory variables:

sex + age + address + famsize + Pstatus + Medu + Fjob + traveltime + failures.x + schoolsup + famsup + nursery + internet + romantic + freetime + goout + failures.y
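One possible starting point is an ordinary linear model fitted with `lm()`, with `Y` as the response. The sketch below writes the full formula from the list above, then fits a reduced version on simulated data so that it runs on its own; the `toy` data frame and its coefficients are made up for illustration and say nothing about the real data set.

```r
# Full model formula for the exercise (Y is the mean grade computed earlier).
full_formula <- Y ~ sex + age + address + famsize + Pstatus + Medu + Fjob +
  traveltime + failures.x + schoolsup + famsup + nursery + internet +
  romantic + freetime + goout + failures.y

# With the real data loaded, the fit would be:
# fit_full <- lm(full_formula, data = data)

# Self-contained illustration on simulated data (hypothetical values).
set.seed(1)
n <- 60
toy <- data.frame(
  sex        = factor(rep(c("F", "M"), length.out = n)),
  age        = sample(15:19, n, replace = TRUE),
  failures.x = sample(0:3, n, replace = TRUE, prob = c(0.7, 0.15, 0.1, 0.05)),
  goout      = sample(1:5, n, replace = TRUE)
)
toy$Y <- 12 - 1.5 * toy$failures.x - 0.3 * toy$goout + rnorm(n, sd = 2)

# Factors such as sex are expanded into dummy variables automatically.
fit <- lm(Y ~ sex + age + failures.x + goout, data = toy)
summary(fit)
```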

Run all the diagnostics needed to obtain a good model.
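The exercise does not say which diagnostics are expected, so the following is only a sketch of common checks available in base R: residual plots, a normality test on the residuals, Cook's distance for influential points, and backward selection by AIC. It runs on simulated data (hypothetical values), and the `4 / n` cutoff is a rule of thumb rather than a fixed standard.

```r
# Simulated data and a fitted model to run the diagnostics on (hypothetical values).
set.seed(42)
n <- 80
toy <- data.frame(
  age        = sample(15:19, n, replace = TRUE),
  failures.x = sample(0:3, n, replace = TRUE),
  goout      = sample(1:5, n, replace = TRUE)
)
toy$Y <- 12 - 1.5 * toy$failures.x - 0.3 * toy$goout + rnorm(n, sd = 2)
fit <- lm(Y ~ age + failures.x + goout, data = toy)

# 1. Standard plots: residuals vs fitted, normal Q-Q, scale-location, leverage.
par(mfrow = c(2, 2))
plot(fit)

# 2. Normality of the residuals (Shapiro-Wilk test).
sw <- shapiro.test(residuals(fit))

# 3. Influential observations: Cook's distance above the 4/n rule of thumb.
cook <- cooks.distance(fit)
influential <- which(cook > 4 / n)

# 4. Backward variable selection by AIC.
reduced <- step(fit, direction = "backward", trace = 0)
formula(reduced)
```

On the real data, the same calls would be applied to the model fitted on `data`, and flagged observations or dropped variables should be examined before being removed.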
