Knowledge extraction from structured and unstructured data

class: center, middle, inverse, title-slide

.title[
# Knowledge extraction from structured and unstructured data
]
.subtitle[
## Session 6: Supervised modelling via regression
]

---

## Introduction

- Linear model
  - Multiple regression

- General linear model
  - Limits of the linear model
  - Adopted solution

- Logistic regression
  - Why?

---
class: center, middle, section
## Linear regression

---
### Purpose of regression

- Correlation: a *symmetric* link between two variables

- The roles are not necessarily symmetric (between age and weight, for example)
  - Weight: **dependent** variable (often denoted `$Y$`)
  - Age: **explanatory** variable (often denoted `$X$`)
  - Weight is a function of age (and not the other way round)

- What is the dependence relationship of `$Y$` on `$X$`?

- Look for a **regression** of `$Y$` on `$X$`

---
### Example

---
### Description

- Three essential functions of regression
  - Describe how `$Y$` is related to `$X$`
  - Test whether this relationship exists
  - Estimate a value of `$Y$` for a given value of `$X$`

- Obtain the values in the equation `$Y = bX + a$`
  - `$b$`: slope of the line
  - `$a$`: intercept

- Minimise the sum of squared vertical distances (residuals) between the points and the line

---
### Computation and test

- *Least-squares line*, also called the regression line

- Slope `$b$` given by the ratio `$\frac{cov(X,Y)}{var(X)} = \frac{\sum xy}{\sum x^2}$` (with `$x$` and `$y$` the centred values)
- Intercept `$a$` obtained as `$a = \mu_y - b \mu_x$`
- Passes through the point `$(\mu_x, \mu_y)$` (mean point)

- If `$X$` and `$Y$` are unrelated, the line is horizontal with zero slope: `$b = 0$`

- Under `$H_0$` (`$b = 0$`), the ratio `$\frac{b - 0}{\sigma_b}$` follows a Student `$T$` distribution with `$df = n-2$`

- Test that the slope `$b$` and `$R^2$` are zero

- Estimate a value of `$Y$` with the formula `$bx + a$` (with a confidence interval)
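
The formulas on this slide can be checked by hand on a tiny sample. The sketch below is only an illustration: the data points are invented, and the helper name `fit_line` is an assumption, not something taken from the course material. It computes the slope, the intercept, the `$t$` statistic for the slope (to be compared with a Student distribution with `$n-2$` degrees of freedom) and `$R^2$`.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = b*x + a; returns (a, b, t_stat, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products of the centred values
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope = cov(X, Y) / var(X)
    a = mean_y - b * mean_x       # the line passes through the mean point
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    t_stat = b / se_b             # compare with Student t, n - 2 df
    r2 = 1.0 - rss / syy
    return a, b, t_stat, r2

# Invented sample, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, t_stat, r2 = fit_line(xs, ys)
print(a, b, t_stat, r2)
```

A large absolute value of `t_stat` leads to rejecting the hypothesis of a zero slope; the critical value itself has to be read from a Student table, which this sketch does not include.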

---
### Example

---
### Example

<div id="htmlwidget-7e6a19bd8b68c3b4411a" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-7e6a19bd8b68c3b4411a">{"x":{"filter":"none","vertical":false,"data":[["y1 = x","y2 = x","y3 = x"],[-2.02202476717662,1.22550489334776,-0.0249242459473261],[3.02580524103324,-2.31468548454934,-0.0514335618182818],[0.782265242160913,0.691349056376858,0.000342043823914556]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>a<\/th>\n      <th>b<\/th>\n      <th>r2<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"paging":false,"seaching":false,"ordering":false,"columnDefs":[{"targets":1,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 2, 3, \",\", \".\", null);\n  }"},{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 2, 3, \",\", \".\", null);\n  }"},{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 2, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[1,2,3]},{"orderable":false,"targets":0}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render"],"jsHooks":[]}</script>

---
### With more than two variables

- Obtain the function `$f$` such that `$Y = f(X_1, X_2, X_3, \ldots, X_p)$`

- Several *explanatory* variables

- Example with `$p=2$`: `$Y = b_0 + b_1 X_1 + b_2 X_2$`

- Look for the hyperplane that minimises the distance between the points and itself

- Assess the strength of the link between `$Y$` and each `$X_i$`

- Establish a *hierarchy* among these links

---
### Multiple regression

- Uses of multiple regression
  - Remove the bias that other variables may introduce
  - Improve the estimation of new values by narrowing the confidence intervals

- Least-squares fit

- Test that the coefficient of each explanatory variable is zero

- Test that the overall coefficient of determination `$R^2$` is zero
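
The least-squares fit with several explanatory variables can be sketched by solving the normal equations `$(X^T X) b = X^T Y$`. The code below is a minimal illustration in plain Python, assuming a small invented data set; the helper names `solve` and `fit_multiple` are hypothetical, and a real analysis would rely on a statistics library instead.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Bring the largest remaining pivot into place
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_multiple(rows, ys):
    """Return [b0, b1, ..., bp] minimising the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]   # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Invented data generated exactly from Y = 1 + 2*X1 - 0.5*X2
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0)]
ys = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in rows]
coefs = fit_multiple(rows, ys)
print(coefs)
```

Because the sample is noise-free, the recovered coefficients match the generating ones up to rounding error; with real data the residuals would be non-zero and each coefficient would come with a standard error.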

---
### Example - `wine` data

<div id="htmlwidget-277579f0c154d777e3a4" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-277579f0c154d777e3a4">{"x":{"filter":"none","vertical":false,"data":[[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3],[14.23,13.2,13.16,14.37,13.24,14.2,14.39,14.06,14.83,13.86,14.1,14.12,13.75,14.75,14.38,13.63,14.3,13.83,14.19,13.64,14.06,12.93,13.71,12.85,13.5,13.05,13.39,13.3,13.87,14.02,13.73,13.58,13.68,13.76,13.51,13.48,13.28,13.05,13.07,14.22,13.56,13.41,13.88,13.24,13.05,14.21,14.38,13.9,14.1,13.94,13.05,13.83,13.82,13.77,13.74,13.56,14.22,13.29,13.72,12.37,12.33,12.64,13.67,12.37,12.17,12.37,13.11,12.37,13.34,12.21,12.29,13.86,13.49,12.99,11.96,11.66,13.03,11.84,12.33,12.7,12,12.72,12.08,13.05,11.84,12.67,12.16,11.65,11.64,12.08,12.08,12,12.69,12.29,11.62,12.47,11.81,12.29,12.37,12.29,12.08,12.6,12.34,11.82,12.51,12.42,12.25,12.72,12.22,11.61,11.46,12.52,11.76,11.41,12.08,11.03,11.82,12.42,12.77,12,11.45,11.56,12.42,13.05,11.87,12.07,12.43,11.79,12.37,12.04,12.86,12.88,12.81,12.7,12.51,12.6,12.25,12.53,13.49,12.84,12.93,13.36,13.52,13.62,12.25,13.16,13.88,12.87,13.32,13.08,13.5,12.79,13.11,13.23,12.58,13.17,13.84,12.45,14.34,13.48,12.36,13.69,12.85,12.96,13.78,13.73,13.45,12.82,13.58,13.4,12.2,12.77,14.16,13.71,13.4,13.27,13.17,14.13],[1.71,1.78,2.36,1.95,2.59,1.76,1.87,2.15,1.64,1.35,2.16,1.48,1.73,1.73,1.87,1.81,1.92,1.57,1.59,3.1,1.63,3.8,1.86,1.6,1.81,2.05,1.77,1.72,1.9,1.68,1.5,1.66,1.83,1.53,1.8,1.81,1.64,1.65,1.5,3.99,1.71,3.84,1.89,3.98,1.77,4.04,3.59,1.68,2.02,1.73,1.73,1.65,1.75,1.9,1.67,1.73,1.7,1.97,1.43,0.94,1.1,1.36,1.25,1.13,1.45,1.21,1.01,1.17,0.94,1.19,1.61,1.51,1.66,1.67,1.09,1.88,0.9,2.89,0.99,3.87,0.92,1.81,1.13,3.86,0.89,0.98,1.61,1.67,2.06,1.33,1.83,1.51,1.53,2.83,1.99,1.52,2.12,
1.41,1.07,3.17,2.08,1.34,2.45,1.72,1.73,2.55,1.73,1.75,1.29,1.35,3.74,2.43,2.68,0.74,1.39,1.51,1.47,1.61,3.43,3.43,2.4,2.05,4.43,5.8,4.31,2.16,1.53,2.13,1.63,4.3,1.35,2.99,2.31,3.55,1.24,2.46,4.72,5.51,3.59,2.96,2.81,2.56,3.17,4.95,3.88,3.57,5.04,4.61,3.24,3.9,3.12,2.67,1.9,3.3,1.29,5.19,4.12,3.03,1.68,1.67,3.83,3.26,3.27,3.45,2.76,4.36,3.7,3.37,2.58,4.6,3.03,2.39,2.51,5.65,3.91,4.28,2.59,4.1],[2.43,2.14,2.67,2.5,2.87,2.45,2.45,2.61,2.17,2.27,2.3,2.32,2.41,2.39,2.38,2.7,2.72,2.62,2.48,2.56,2.28,2.65,2.36,2.52,2.61,3.22,2.62,2.14,2.8,2.21,2.7,2.36,2.36,2.7,2.65,2.41,2.84,2.55,2.1,2.51,2.31,2.12,2.59,2.29,2.1,2.44,2.28,2.12,2.4,2.27,2.04,2.6,2.42,2.68,2.25,2.46,2.3,2.68,2.5,1.36,2.28,2.02,1.92,2.16,2.53,2.56,1.7,1.92,2.36,1.75,2.21,2.67,2.24,2.6,2.3,1.92,1.71,2.23,1.95,2.4,2,2.2,2.51,2.32,2.58,2.24,2.31,2.62,2.46,2.3,2.32,2.42,2.26,2.22,2.28,2.2,2.74,1.98,2.1,2.21,1.7,1.9,2.46,1.88,1.98,2.27,2.12,2.28,1.94,2.7,1.82,2.17,2.92,2.5,2.5,2.2,1.99,2.19,1.98,2,2.42,3.23,2.73,2.13,2.39,2.17,2.29,2.78,2.3,2.38,2.32,2.4,2.4,2.36,2.25,2.2,2.54,2.64,2.19,2.61,2.7,2.35,2.72,2.35,2.2,2.15,2.23,2.48,2.38,2.36,2.62,2.48,2.75,2.28,2.1,2.32,2.38,2.64,2.7,2.64,2.38,2.54,2.58,2.35,2.3,2.26,2.6,2.3,2.69,2.86,2.32,2.28,2.48,2.45,2.48,2.26,2.37,2.74],[15.6,11.2,18.6,16.8,21,15.2,14.6,17.6,14,16,18,16.8,16,11.4,12,17.2,20,20,16.5,15.2,16,18.6,16.6,17.8,20,25,16.1,17,19.4,16,22.5,19.1,17.2,19.5,19,20.5,15.5,18,15.5,13.2,16.2,18.8,15,17.5,17,18.9,16,16,18.8,17.4,12.4,17.2,14,17.1,16.4,20.5,16.3,16.8,16.7,10.6,16,16.8,18,19,19,18.1,15,19.6,17,16.8,20.4,25,24,30,21,16,16,18,14.8,23,19,18.8,24,22.5,18,18,22.8,26,21.6,23.6,18.5,22,20.7,18,18,19,21.5,16,18.5,18,17.5,18.5,21,19.5,20.5,22,19,22.5,19,20,19.5,21,20,21,22.5,21.5,20.8,22.5,16,19,20,28.5,26.5,21.5,21,21,21.5,28.5,24.5,22,18,20,24,21.5,17.5,18.5,21,25,19.5,24,21,20,23.5,20,18.5,21,20,21.5,21.5,21.5,24,22,25.5,18.5,20,22,19.5,27,25,22.5,21,20,22,18.5,22,22.5,23,19.5,24.5,25,19,19.5,20,20.5,23,20,20,24.5],[127,100,101,113,118,112,96,121,97,9
8,105,95,89,91,102,112,120,115,108,116,126,102,101,95,96,124,93,94,107,96,101,106,104,132,110,100,110,98,98,128,117,90,101,103,107,111,102,101,103,108,92,94,111,115,118,116,118,102,108,88,101,100,94,87,104,98,78,78,110,151,103,86,87,139,101,97,86,112,136,101,86,86,78,85,94,99,90,88,84,70,81,86,80,88,98,162,134,85,88,88,97,88,98,86,85,90,80,84,92,94,107,88,103,88,84,85,86,108,80,87,96,119,102,86,82,85,86,92,88,80,122,104,98,106,85,94,89,96,88,101,96,89,97,92,112,102,80,86,92,113,123,112,116,98,103,93,89,97,98,89,88,107,106,106,90,88,111,88,105,112,96,86,91,95,102,120,120,96],[2.8,2.65,2.8,3.85,2.8,3.27,2.5,2.6,2.8,2.98,2.95,2.2,2.6,3.1,3.3,2.85,2.8,2.95,3.3,2.7,3,2.41,2.61,2.48,2.53,2.63,2.85,2.4,2.95,2.65,3,2.86,2.42,2.95,2.35,2.7,2.6,2.45,2.4,3,3.15,2.45,3.25,2.64,3,2.85,3.25,3.1,2.75,2.88,2.72,2.45,3.88,3,2.6,2.96,3.2,3,3.4,1.98,2.05,2.02,2.1,3.5,1.89,2.42,2.98,2.11,2.53,1.85,1.1,2.95,1.88,3.3,3.38,1.61,1.95,1.72,1.9,2.83,2.42,2.2,2,1.65,2.2,2.2,1.78,1.92,1.95,2.2,1.6,1.45,1.38,2.45,3.02,2.5,1.6,2.55,3.52,2.85,2.23,1.45,2.56,2.5,2.2,1.68,1.65,1.38,2.36,2.74,3.18,2.55,1.75,2.48,2.56,2.46,1.98,2,1.63,2,2.9,3.18,2.2,2.62,2.86,2.6,2.74,2.13,2.22,2.1,1.51,1.3,1.15,1.7,2,1.62,1.38,1.79,1.62,2.32,1.54,1.4,1.55,2,1.38,1.5,0.98,1.7,1.93,1.41,1.4,1.48,2.2,1.8,1.48,1.74,1.8,1.9,2.8,2.6,2.3,1.83,1.65,1.39,1.35,1.28,1.7,1.48,1.55,1.98,1.25,1.39,1.68,1.68,1.8,1.59,1.65,2.05],[3.06,2.76,3.24,3.49,2.69,3.39,2.52,2.51,2.98,3.15,3.32,2.43,2.76,3.69,3.64,2.91,3.14,3.4,3.93,3.03,3.17,2.41,2.88,2.37,2.61,2.68,2.94,2.19,2.97,2.33,3.25,3.19,2.69,2.74,2.53,2.98,2.68,2.43,2.64,3.04,3.29,2.68,3.56,2.63,3,2.65,3.17,3.39,2.92,3.54,3.27,2.99,3.74,2.79,2.9,2.78,3,3.23,3.67,0.57,1.09,1.41,1.79,3.1,1.75,2.65,3.18,2,1.3,1.28,1.02,2.86,1.84,2.89,2.14,1.57,2.03,1.32,1.85,2.55,2.26,2.53,1.58,1.59,2.21,1.94,1.69,1.61,1.69,1.59,1.5,1.25,1.46,2.25,2.26,2.27,0.99,2.5,3.75,2.99,2.17,1.36,2.11,1.64,1.92,1.84,2.03,1.76,2.04,2.92,2.58,2.27,2.03,2.01,2.29,2.17,1.6,2.09,1.25,1.64,2.79,5.08,2.13,2.65,3.03,2.65
,3.15,2.24,2.45,1.75,1.25,1.22,1.09,1.2,0.58,0.66,0.47,0.6,0.48,0.6,0.5,0.5,0.52,0.8,0.78,0.55,0.34,0.65,0.76,1.39,1.57,1.36,1.28,0.83,0.58,0.63,0.83,0.58,1.31,1.1,0.92,0.56,0.6,0.7,0.68,0.47,0.92,0.66,0.84,0.96,0.49,0.51,0.7,0.61,0.75,0.69,0.68,0.76],[0.28,0.26,0.3,0.24,0.39,0.34,0.3,0.31,0.29,0.22,0.22,0.26,0.29,0.43,0.29,0.3,0.33,0.4,0.32,0.17,0.24,0.25,0.27,0.26,0.28,0.47,0.34,0.27,0.37,0.26,0.29,0.22,0.42,0.5,0.29,0.26,0.34,0.29,0.28,0.2,0.34,0.27,0.17,0.32,0.28,0.3,0.27,0.21,0.32,0.32,0.17,0.22,0.32,0.39,0.21,0.2,0.26,0.31,0.19,0.28,0.63,0.53,0.32,0.19,0.45,0.37,0.26,0.27,0.55,0.14,0.37,0.21,0.27,0.21,0.13,0.34,0.24,0.43,0.35,0.43,0.3,0.26,0.4,0.61,0.22,0.3,0.43,0.4,0.48,0.42,0.52,0.5,0.58,0.25,0.17,0.32,0.14,0.29,0.24,0.45,0.26,0.29,0.34,0.37,0.32,0.66,0.37,0.48,0.39,0.29,0.24,0.26,0.6,0.42,0.43,0.52,0.3,0.34,0.43,0.37,0.32,0.47,0.43,0.3,0.21,0.37,0.39,0.58,0.4,0.42,0.21,0.24,0.27,0.17,0.6,0.63,0.53,0.63,0.58,0.53,0.53,0.37,0.5,0.47,0.29,0.43,0.4,0.47,0.45,0.34,0.22,0.24,0.26,0.61,0.53,0.61,0.48,0.63,0.53,0.52,0.5,0.5,0.6,0.4,0.41,0.52,0.43,0.4,0.39,0.27,0.4,0.48,0.44,0.52,0.43,0.43,0.53,0.56],[2.29,1.28,2.81,2.18,1.82,1.97,1.98,1.25,1.98,1.85,2.38,1.57,1.81,2.81,2.96,1.46,1.97,1.72,1.86,1.66,2.1,1.98,1.69,1.46,1.66,1.92,1.45,1.35,1.76,1.98,2.38,1.95,1.97,1.35,1.54,1.86,1.36,1.44,1.37,2.08,2.34,1.48,1.7,1.66,2.03,1.25,2.19,2.14,2.38,2.08,2.91,2.29,1.87,1.68,1.62,2.45,2.03,1.66,2.04,0.42,0.41,0.62,0.73,1.87,1.03,2.08,2.28,1.04,0.42,2.5,1.46,1.87,1.03,1.96,1.65,1.15,1.46,0.95,2.76,1.95,1.43,1.77,1.4,1.62,2.35,1.46,1.56,1.34,1.35,1.38,1.64,1.63,1.62,1.99,1.35,3.28,1.56,1.77,1.95,2.81,1.4,1.35,1.31,1.42,1.48,1.42,1.63,1.63,2.08,2.49,3.58,1.22,1.05,1.44,1.04,2.01,1.53,1.61,0.83,1.87,1.83,1.87,1.71,2.01,2.91,1.35,1.77,1.76,1.9,1.35,0.94,0.83,0.83,0.84,1.25,0.94,0.8,1.1,0.88,0.81,0.75,0.64,0.55,1.02,1.14,1.3,0.68,0.86,1.25,1.14,1.25,1.26,1.56,1.87,1.4,1.55,1.56,1.14,2.7,2.29,1.04,0.8,0.96,0.94,1.03,1.15,1.46,0.97,1.54,1.11,0.73,0.64,1.24,1.06,1.41,1.35,1.46,1.35],[5
.64,4.38,5.68,7.8,4.32,6.75,5.25,5.05,5.2,7.22,5.75,5,5.6,5.4,7.5,7.3,6.2,6.6,8.7,5.1,5.65,4.5,3.8,3.93,3.52,3.58,4.8,3.95,4.5,4.7,5.7,6.9,3.84,5.4,4.2,5.1,4.6,4.25,3.7,5.1,6.13,4.28,5.43,4.36,5.04,5.24,4.9,6.1,6.2,8.9,7.2,5.6,7.05,6.3,5.85,6.25,6.38,6,6.8,1.95,3.27,5.75,3.8,4.45,2.95,4.6,5.3,4.68,3.17,2.85,3.05,3.38,3.74,3.35,3.21,3.8,4.6,2.65,3.4,2.57,2.5,3.9,2.2,4.8,3.05,2.62,2.45,2.6,2.8,1.74,2.4,3.6,3.05,2.15,3.25,2.6,2.5,2.9,4.5,2.3,3.3,2.45,2.8,2.06,2.94,2.7,3.4,3.3,2.7,2.65,2.9,2,3.8,3.08,2.9,1.9,1.95,2.06,3.4,1.28,3.25,6,2.08,2.6,2.8,2.76,3.94,3,2.12,2.6,4.1,5.4,5.7,5,5.45,7.1,3.85,5,5.7,4.92,4.6,5.6,4.35,4.4,8.21,4,4.9,7.65,8.42,9.4,8.6,10.8,7.1,10.52,7.6,7.9,9.01,7.5,13,11.75,7.65,5.88,5.58,5.28,9.58,6.62,10.68,10.26,8.66,8.5,5.5,9.899999,9.7,7.7,7.3,10.2,9.3,9.2],[1.04,1.05,1.03,0.86,1.04,1.05,1.02,1.06,1.08,1.01,1.25,1.17,1.15,1.25,1.2,1.28,1.07,1.13,1.23,0.96,1.09,1.03,1.11,1.09,1.12,1.13,0.92,1.02,1.25,1.04,1.19,1.09,1.23,1.25,1.1,1.04,1.09,1.12,1.18,0.89,0.95,0.91,0.88,0.82,0.88,0.87,1.04,0.91,1.07,1.12,1.12,1.24,1.01,1.13,0.92,0.98,0.94,1.07,0.89,1.05,1.25,0.98,1.23,1.22,1.45,1.19,1.12,1.12,1.02,1.28,0.906,1.36,0.98,1.31,0.99,1.23,1.19,0.96,1.06,1.19,1.38,1.16,1.31,0.84,0.79,1.23,1.33,1.36,1,1.07,1.08,1.05,0.96,1.15,1.16,1.16,0.95,1.23,1.04,1.42,1.27,1.04,0.8,0.94,1.04,0.86,1,0.88,0.86,0.96,0.75,0.9,1.23,1.1,0.93,1.71,0.95,1.06,0.7,0.93,0.8,0.93,0.92,0.73,0.75,0.86,0.69,0.97,0.89,0.79,0.76,0.74,0.66,0.78,0.75,0.73,0.75,0.82,0.81,0.89,0.77,0.7,0.89,0.91,0.65,0.6,0.58,0.54,0.55,0.57,0.59,0.48,0.61,0.56,0.58,0.6,0.57,0.67,0.57,0.57,0.56,0.96,0.87,0.68,0.7,0.78,0.85,0.72,0.74,0.67,0.66,0.57,0.62,0.64,0.7,0.59,0.6,0.61],[3.92,3.4,3.17,3.45,2.93,2.85,3.58,3.58,2.85,3.55,3.17,2.82,2.9,2.73,3,2.88,2.65,2.57,2.82,3.36,3.71,3.52,4,3.63,3.82,3.2,3.22,2.77,3.4,3.59,2.71,2.88,2.87,3,2.87,3.47,2.78,2.51,2.69,3.53,3.38,3,3.56,3,3.35,3.33,3.44,3.33,2.75,3.1,2.91,3.37,3.26,2.93,3.2,3.03,3.31,2.84,2.87,1.82,1.67,1.59,2.46,2.87,2.23,2.3,3.18,3.48,1.93,3.07,1.82,3.16,2
.78,3.5,3.13,2.14,2.48,2.52,2.31,3.13,3.12,3.14,2.72,2.01,3.08,3.16,2.26,3.21,2.75,3.21,2.27,2.65,2.06,3.3,2.96,2.63,2.26,2.74,2.77,2.83,2.96,2.77,3.38,2.44,3.57,3.3,3.17,2.42,3.02,3.26,2.81,2.78,2.5,2.31,3.19,2.87,3.33,2.96,2.12,3.05,3.39,3.69,3.12,3.1,3.64,3.28,2.84,2.44,2.78,2.57,1.29,1.42,1.36,1.29,1.51,1.58,1.27,1.69,1.82,2.15,2.31,2.47,2.06,2.05,2,1.68,1.33,1.86,1.62,1.33,1.3,1.47,1.33,1.51,1.55,1.48,1.64,1.73,1.96,1.78,1.58,1.82,2.11,1.75,1.68,1.75,1.56,1.75,1.8,1.92,1.83,1.63,1.71,1.74,1.56,1.56,1.62,1.6],[1065,1050,1185,1480,735,1450,1290,1295,1045,1045,1510,1280,1320,1150,1547,1310,1280,1130,1680,845,780,770,1035,1015,845,830,1195,1285,915,1035,1285,1515,990,1235,1095,920,880,1105,1020,760,795,1035,1095,680,885,1080,1065,985,1060,1260,1150,1265,1190,1375,1060,1120,970,1270,1285,520,680,450,630,420,355,678,502,510,750,718,870,410,472,985,886,428,392,500,750,463,278,714,630,515,520,450,495,562,680,625,480,450,495,290,345,937,625,428,660,406,710,562,438,415,672,315,510,488,312,680,562,325,607,434,385,407,495,345,372,564,625,465,365,380,380,378,352,466,342,580,630,530,560,600,650,695,720,515,580,590,600,780,520,550,855,830,415,625,650,550,500,480,425,675,640,725,480,880,660,620,520,680,570,675,615,520,695,685,750,630,510,470,660,740,750,835,840,560],["> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","<= 13","> 13","<= 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","> 13","<= 13","<= 13","<= 13","> 13","<= 13","<= 13","<= 13","> 13","<= 13","> 13","<= 13","<= 13","> 13","> 13","<= 13","<= 13","<= 13","> 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","> 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 
13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","> 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","<= 13","> 13","<= 13","<= 13","> 13","> 13","> 13","<= 13","> 13","> 13","<= 13","> 13","> 13","> 13","<= 13","> 13","> 13","<= 13","> 13","> 13","<= 13","> 13","> 13","<= 13","> 13","<= 13","<= 13","> 13","> 13","> 13","<= 13","> 13","> 13","<= 13","<= 13","> 13","> 13","> 13","> 13","> 13","> 13"]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th>class<\/th>\n      <th>Alcohol<\/th>\n      <th>Malic acid<\/th>\n      <th>Ash<\/th>\n      <th>Alcalinity of ash<\/th>\n      <th>Magnesium<\/th>\n      <th>Total phenols<\/th>\n      <th>Flavanoids<\/th>\n      <th>Nonflavanoid phenols<\/th>\n      <th>Proanthocyanins<\/th>\n      <th>Color intensity<\/th>\n      <th>Hue<\/th>\n      <th>OD280/OD315 of diluted wines<\/th>\n      <th>Proline<\/th>\n      <th>Alcohol_bin<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"pageLength":5,"ordering":false,"scrollX":true,"searching":false,"columnDefs":[{"className":"dt-right","targets":[0,1,2,3,4,5,6,7,8,9,10,11,12,13]}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[5,10,25,50,100]}},"evals":[],"jsHooks":[]}</script>

---
### Example - `wine` data

- Target model:
$$ Alcohol = a + b_1 \times MalicAcid + b_2 \times ColorIntensity + b_3 \times Magnesium $$

<div id="htmlwidget-b47274118e08b9b6fe56" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-b47274118e08b9b6fe56">{"x":{"filter":"none","vertical":false,"data":[["(Intercept)","Malic acid","Color intensity","Magnesium"],[11.186085337627,-0.018882211927907,0.182008398798448,0.00940463635150786],[0.377734193038012,0.0469724297055676,0.0230673733507332,0.00363171704652947],[29.6136424602188,-0.401984995161298,7.89029578838734,2.58958399870251],[7.40220830188961e-70,0.688188409583848,3.21624654743482e-13,0.0104230275688765]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>Estimate<\/th>\n      <th>Std. Error<\/th>\n      <th>t value<\/th>\n      <th>Pr(>|t|)<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"paging":false,"searching":false,"ordering":false,"columnDefs":[{"targets":1,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[1,2,3,4]},{"orderable":false,"targets":0}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render"],"jsHooks":[]}</script>

- Model quality: `$R^2$` = 0.3263256
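
To see how the fitted equation is used, the sketch below plugs the estimates from the coefficient table into the model and predicts `Alcohol` for the first row of the `wine` table (Malic acid 1.71, Color intensity 5.64, Magnesium 127, observed Alcohol 14.23). The function name `predict_alcohol` is a hypothetical helper introduced here for illustration.

```python
# Estimates copied from the coefficient table of the fitted model
COEFS = {
    "intercept": 11.186085337627,
    "malic_acid": -0.018882211927907,
    "color_intensity": 0.182008398798448,
    "magnesium": 0.00940463635150786,
}

def predict_alcohol(malic_acid, color_intensity, magnesium):
    """Apply Alcohol = a + b1*MalicAcid + b2*ColorIntensity + b3*Magnesium."""
    return (COEFS["intercept"]
            + COEFS["malic_acid"] * malic_acid
            + COEFS["color_intensity"] * color_intensity
            + COEFS["magnesium"] * magnesium)

# First row of the wine data shown earlier
observed = 14.23
predicted = predict_alcohol(1.71, 5.64, 127)
residual = observed - predicted
print(round(predicted, 2), round(residual, 2))
```

The prediction comes out at roughly 13.37, below the observed 14.23, which is in line with a modest `$R^2$`: the three variables explain only part of the variation in alcohol content.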

---
### Example - `wine` data

---
### Removing factors

- The factors differ in importance

- If a coefficient `$b_i$` is considered to be zero, remove `$X_i$`

- Also bring in domain knowledge

- Omitting exogenous variables introduces bias
  - Random sampling

- Example: removing the variables whose coefficient `$b_i$` is considered zero

---
### Choosing the *right* model

- Obtain the model closest to reality
  - `$R^2$` as close to 1 as possible
  - Simple to interpret (not too many variables)

- Model selection criterion (`$AIC$` for example; others exist), where `$L(m)$` is the log-likelihood of model `$m$` and `$\nu(m)$` its number of parameters

$$ AIC(m) = -2L(m) + 2 \nu(m) $$

- Several ways to optimise the criterion at each step
  - **Forward**: add a variable
  - **Backward**: remove a variable
  - **Stepwise**: add and remove variables until the selection stabilises
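
The criterion itself is a one-line computation. The sketch below compares two candidate models by AIC; the log-likelihood values and parameter counts are invented for illustration and do not come from the `wine` fits.

```python
def aic(log_likelihood, n_params):
    """AIC(m) = -2 L(m) + 2 nu(m); lower is better."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical candidates: (log-likelihood, number of parameters)
candidates = {
    "full": (-120.0, 13),
    "reduced": (-121.5, 7),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Here the reduced model fits slightly worse but uses far fewer parameters, so its AIC is lower. A forward, backward or stepwise search repeats this comparison after each candidate addition or removal of a variable.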

---
### Example - `wine` data

- Full model (with the `class` variable removed, of course)

<div id="htmlwidget-68a6d7847e0b693a8f01" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-68a6d7847e0b693a8f01">{"x":{"filter":"none","vertical":false,"data":[["(Intercept)","Malic acid","Ash","Alcalinity of ash","Magnesium","Total phenols","Flavanoids","Nonflavanoid phenols","Proanthocyanins","Color intensity","Hue","Proline","OD280/OD315 of diluted wines"],[11.071849541592,0.131636222537816,0.137853611780376,-0.03778771014026,4.17911053903323e-06,0.0520835243405853,0.0091251451307688,-0.207795701016036,-0.152497193288194,0.163034870628244,0.216879740358394,0.00101585935208045,0.16079631859654],[0.596313824847121,0.045276978125069,0.21685252968791,0.0178109822629756,0.00335917832269795,0.133974324073532,0.106946082279654,0.433622132582677,0.0982341648134287,0.0274419929801743,0.281066985577168,0.000199937348970262,0.109703228442002],[18.5671521944514,2.90735442136168,0.635702115067656,-2.12159607944874,0.00124408713606986,0.388757507834107,0.085324725658556,-0.479209167157574,-1.55238448433724,5.94107252873471,0.771630079260409,5.08088837484561,1.46573916629582],[2.90034456800483e-42,0.004146584230572,0.525851306236738,0.0353650572756023,0.999008865187894,0.69795673306111,0.932106679434771,0.632424333614864,0.122486224454931,1.63117127643367e-08,0.441437242219743,1.00794226467427e-06,0.144622425191393]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>Estimate<\/th>\n      <th>Std. Error<\/th>\n      <th>t value<\/th>\n      <th>Pr(>|t|)<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"pageLength":5,"searching":false,"ordering":false,"columnDefs":[{"targets":1,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? 
data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[1,2,3,4]},{"orderable":false,"targets":0}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[5,10,25,50,100]}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render"],"jsHooks":[]}</script>

---
### Example - `wine` data

- Model obtained after stepwise selection

<div id="htmlwidget-38847cad5950dd6b0a95" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-38847cad5950dd6b0a95">{"x":{"filter":"none","vertical":false,"data":[["(Intercept)","Malic acid","Alcalinity of ash","Proanthocyanins","Color intensity","Proline","OD280/OD315 of diluted wines"],[11.3332831155648,0.114312669577041,-0.0324404734887689,-0.12963622642088,0.158520050620385,0.00113577599113237,0.225452840030956],[0.394362340068626,0.0397877920299683,0.0137533047261324,0.0846087555748549,0.0231626812834099,0.00017075805632754,0.0834109449267732],[28.7382489757835,2.87305888929298,-2.35874025441532,-1.53218453031363,6.84376945314721,6.65137572750171,2.70291674826225],[2.30193677580508e-67,0.0045806531611797,0.0194664503833548,0.127324865580686,1.32046776304561e-10,3.76057118826811e-10,0.00756701072609274]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>Estimate<\/th>\n      <th>Std. Error<\/th>\n      <th>t value<\/th>\n      <th>Pr(>|t|)<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"paging":false,"searching":false,"ordering":false,"columnDefs":[{"targets":1,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[1,2,3,4]},{"orderable":false,"targets":0}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render"],"jsHooks":[]}</script>

---
### Linear model

- Multiple regression

- Interpreting the coefficients `$b_i$`
  - Change in `$Y$` = `$b_i$` `$\times$` change in `$X_i$`
  - *All other things being equal*

- Collinearity
  - (Very) strong correlation between two explanatory variables
  - The coefficients of the variables involved affect one another
  - Watch out for it and avoid it as far as possible

- Interaction
  - Accounts for joint effects
  - Not easy to interpret, but potentially a very large effect

---
### General linear model

- Limitations of the linear model
  - Qualitative variables are not handled
  - Possible if ordinal: treated as quantitative
    - but this can lead to wrong conclusions

- Adopted solution
  - Introduce dummy variables (0-1)
  - Included in the computation
  - For `$X_i=1$`, `$Y$` shifts by `$b_i$`

---
### Dummy variable

- Dummy variable `$D$` = binary variable (taking the value 0 or 1)

- The regression equation keeps the same form
$$ Y = b_0 + b_1 X + b_2 D $$

- Shift of `$b_2$` when `$D=1$`
  - The values of `$Y$` for objects with `$D=1$` differ overall by a level `$b_2$` from those of objects with `$D=0$`
  - Two (parallel) regression lines

- For the group `$D=0$`: `$Y = b_0 + b_1 X$`
- For the group `$D=1$`: `$Y = (b_0 + b_2) + b_1 X$`

---
### Dummy variable with several categories

- Binary: treatment vs no treatment
- Ternary: treatment 1, treatment 2 and no treatment

- Create two dummy variables `$D_1$` and `$D_2$`
  - `$D_1=1$` if Treatment 1 and `$D_1=0$` otherwise
  - `$D_2=1$` if Treatment 2 and `$D_2=0$` otherwise

- Partial disjunctive coding

- Equation: `$Y = b_0 + b_1 X + b_2 D_1 + b_3 D_2$`

- For the no-treatment group (`$D_1=0$`, `$D_2=0$`): `$Y = b_0 + b_1 X$`
- For the Treatment 1 group (`$D_1=1$`): `$Y = (b_0 + b_2) + b_1 X$`
- For the Treatment 2 group (`$D_2=1$`): `$Y = (b_0 + b_3) + b_1 X$`
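
Partial disjunctive coding can be sketched as follows: one 0/1 column is created per category, except for a reference category that is represented by all zeros. The function name `dummy_code` and the category labels are assumptions made for this illustration.

```python
def dummy_code(values, reference):
    """Partial disjunctive coding: one 0/1 column per level except the reference."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows

# Hypothetical labels for the three groups of the slide
groups = ["none", "treatment 1", "treatment 2", "treatment 1"]
levels, rows = dummy_code(groups, reference="none")
print(levels)
print(rows)
```

Leaving out the reference category avoids a redundant column: with an intercept in the model, a third dummy would be an exact linear combination of the other columns.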

---
### Different slopes

- Extension to the case of lines with different slopes

- A dummy variable alone is not enough (non-parallel lines)

- Introduce the variable `$D \times X$` (interaction)

- Regression equation
$$ Y = b_0 + b_1 X + b_2 D + b_3 D X $$

- For the group `$D=0$`: `$Y = b_0 + b_1 X$`
- For the group `$D=1$`: `$Y = (b_0 + b_2) + (b_1 + b_3) X$`
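
The effect of the interaction term can be checked numerically. In the sketch below the coefficient values are invented; the point is only that the slope within each group, measured as the change in `$Y$` for a one-unit change in `$X$`, is `$b_1$` when `$D=0$` and `$b_1 + b_3$` when `$D=1$`.

```python
def predict(x, d, b0, b1, b2, b3=0.0):
    """Y = b0 + b1*X + b2*D + b3*D*X (b3 = 0 gives parallel lines)."""
    return b0 + b1 * x + b2 * d + b3 * d * x

# Hypothetical coefficients, for illustration only
b0, b1, b2, b3 = 1.0, 2.0, 0.5, -0.75

slope_d0 = predict(1.0, 0, b0, b1, b2, b3) - predict(0.0, 0, b0, b1, b2, b3)
slope_d1 = predict(1.0, 1, b0, b1, b2, b3) - predict(0.0, 1, b0, b1, b2, b3)
intercept_d1 = predict(0.0, 1, b0, b1, b2, b3)
print(slope_d0, slope_d1, intercept_d1)
```

Setting `b3` to zero brings back the dummy-variable model with two parallel lines separated by `b2`.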

---
### Non-linearity and logarithms

- Why?
  - The linear model is valid if it is linear in `$b$`

- Multiplicative model

- Example of a model
$$ Y = b_0 {X_1}^{b_1} {X_2}^{b_2}$$

- Transform the equation to make it linear in `$b$`
$$ \log(Y) = \log(b_0) + b_1 \log(X_1) + b_2 \log(X_2) $$
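
The log transformation can be illustrated on a simplified version of the model with a single explanatory variable, `$Y = b_0 X^{b_1}$` (a simplification assumed here to keep the sketch short). Fitting a straight line to `$\log(Y)$` against `$\log(X)$` recovers `$b_1$` as the slope and `$\log(b_0)$` as the intercept. The data and the name `fit_power_law` are invented for illustration.

```python
import math

def fit_power_law(xs, ys):
    """Fit Y = b0 * X**b1 by least squares on log(Y) = log(b0) + b1*log(X)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    b1 = sxy / sxx
    b0 = math.exp(mean_y - b1 * mean_x)   # back-transform the intercept
    return b0, b1

# Invented noise-free data generated from Y = 2 * X**1.5
xs = [1.0, 2.0, 4.0, 8.0]
ys = [2.0 * x ** 1.5 for x in xs]
b0, b1 = fit_power_law(xs, ys)
print(b0, b1)
```

The transformation requires strictly positive values of `$Y$` and `$X$`, since the logarithm is undefined otherwise.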

---
class: center, middle, section
## Logistic regression

---
### Logistic regression

- Why?
  - Adaptation to qualitative response variables

- Case of a binary variable `$Y$`

- Look for the probability `$\pi$` that `$Y=1$` (between 0 and 1)

- The linear model cannot be used in this case (its values can fall outside `$[0, 1]$`)

- Most widely used transformation: the logit function
$$ logit(\pi) = \log \left( \frac{\pi}{1-\pi} \right) = b_0 + b_1 X_1 + \ldots + b_p X_p $$
$$ \pi(X_1,\ldots,X_p) = \frac{\exp(b_0 + b_1 X_1 + \ldots + b_p X_p)}{1 + \exp(b_0 + b_1 X_1 + \ldots + b_p X_p)}$$
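
The two formulas are inverses of each other, which the sketch below checks numerically. The function names `logit` and `inv_logit` are chosen here for illustration; the second one is written in two branches only to avoid floating-point overflow for large negative inputs.

```python
import math

def logit(p):
    """log(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """exp(z) / (1 + exp(z)), computed in an overflow-safe way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

z = 1.3                    # any value of the linear predictor
p = inv_logit(z)           # always a valid probability
print(p, logit(p))
```

Whatever the value of the linear predictor `$b_0 + b_1 X_1 + \ldots + b_p X_p$`, the resulting probability stays between 0 and 1.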

---
### Logistic regression

- While `$logit(\pi)$` can take any real value, `$\pi$` always stays in `$[0, 1]$`

- From the coefficient `$b_i$`, compute the odds ratio `$\exp(b_i)$`
  - Odds ratio: factor by which the odds of `$Y=1$` are multiplied when `$X_i$` increases by 1
  - *All other things being equal*

- Estimation by maximum likelihood
$$ L = \prod_j P(Y(j) = 1 \mid X(j))^{Y(j)} \times (1 - P(Y(j) = 1 \mid X(j)))^{1 - Y(j)} $$

- Assign the value 1 if `$P(Y=1 \mid X) > 0.5$`
- The threshold can be adjusted according to the prior probabilities of the categories
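
The quantities on this slide can be sketched for a model with a single explanatory variable. The coefficients `b0` and `b1` below are invented, as are the helper names; the code computes the log of the likelihood `$L$`, the odds ratio `$\exp(b_1)$`, and the class assigned with a 0.5 threshold.

```python
import math

def probability(x, b0, b1):
    """P(Y = 1 | X = x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def log_likelihood(xs, ys, b0, b1):
    """Logarithm of the likelihood L (a sum instead of a product)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = probability(x, b0, b1)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def classify(x, b0, b1, threshold=0.5):
    """Assign 1 when the estimated probability exceeds the threshold."""
    return 1 if probability(x, b0, b1) > threshold else 0

def odds(x, b0, b1):
    p = probability(x, b0, b1)
    return p / (1.0 - p)

# Hypothetical coefficients, for illustration only
b0, b1 = -4.0, 0.1
odds_ratio = math.exp(b1)
ratio_check = odds(41.0, b0, b1) / odds(40.0, b0, b1)
labels = [classify(x, b0, b1) for x in (20.0, 60.0)]
print(odds_ratio, ratio_check, labels)
```

Maximum likelihood estimation searches for the coefficients that make `log_likelihood` as large as possible on the observed data; the search itself (for example by Newton-type iterations) is not shown here.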

---
### Supervised learning

- General framework that includes regression (among other methods)

- Several examples
- Learn the rules for estimating the value of a variable (qualitative or quantitative)
- A *teacher* points out the errors, so that the learner can improve

- Goal: extend these rules to new data
  - Do not learn by heart
  - Check the results on other data

- Quantitative variables: errors (`$R^2$`)
- Qualitative variables: confusion matrix (error rate)
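
Both evaluation measures are simple to compute once predictions are available on data that was not used for learning. The sketch below uses invented labels and values; the function names are assumptions made for this illustration.

```python
def confusion_matrix(actual, predicted):
    """Count each (actual, predicted) pair."""
    counts = {}
    for a, p in zip(actual, predicted):
        counts[(a, p)] = counts.get((a, p), 0) + 1
    return counts

def error_rate(actual, predicted):
    """Share of observations whose predicted class is wrong."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

def r_squared(actual, predicted):
    """1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    tss = sum((a - mean) ** 2 for a in actual)
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - rss / tss

# Invented qualitative example (binary classes)
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
matrix = confusion_matrix(y_true, y_pred)
rate = error_rate(y_true, y_pred)

# Invented quantitative example
r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(matrix, rate, r2)
```

Computing these measures on held-out data rather than on the training sample is what checks that the rules were not simply learned by heart.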

---
### Example - `adult` data

- Salary level (above or below $50K) as a function of the number of hours worked per week

<div id="htmlwidget-7c4453cb3f64eaf59e94" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-7c4453cb3f64eaf59e94">{"x":{"filter":"none","vertical":false,"data":[[39,50,38,53,28,37,49,52,31,42,37,30,23,32,40,34,25,32,38,43,40,54,35,43,59,56,19,54,39,49,23,20,45,30,22,48,21,19,31,48,31,53,24,49,25,57,53,44,41,29,25,18,47,50,47,43,46,35,41,30,30,32,48,42,29,36,28,53,49,25,19,31,29,23,79,27,40,67,18,31,18,52,46,59,44,53,49,33,30,43,57,37,28,30,34,29,48,37,48,32],["State-gov","Self-emp-not-inc","Private","Private","Private","Private","Private","Self-emp-not-inc","Private","Private","Private","State-gov","Private","Private","Private","Private","Self-emp-not-inc","Private","Private","Self-emp-not-inc","Private","Private","Federal-gov","Private","Private","Local-gov","Private","?","Private","Private","Local-gov","Private","Private","Federal-gov","State-gov","Private","Private","Private","Private","Self-emp-not-inc","Private","Self-emp-not-inc","Private","Private","Private","Federal-gov","Private","Private","State-gov","Private","Private","Private","Private","Federal-gov","Self-emp-inc","Private","Private","Private","Private","Private","Private","?","Private","Private","Private","Private","Private","Private","Self-emp-inc","?","Private","Private","Self-emp-not-inc","Private","Private","Private","Private","?","Private","Local-gov","Private","Private","Private","Private","Private","Private","Local-gov","Private","Private","Federal-gov","Private","Private","Private","Private","Local-gov","Local-gov","Self-emp-not-inc","Private","Private","Federal-gov"],[77516,83311,215646,234721,338409,284582,160187,209642,45781,159449,280464,141297,122272,205019,121772,245487,176756,186824,28887,292175,193524,302146,76845,117037,109015,216851,168294,180211,367260,193366,190709,266015,386940,59951,311512,242406,197200,544091,84154,265477,507875,88506,172987,94638,289980,337895,144361,128354,101603,271466,32275,226956,51835,251585,109832,237993,216666,56352,147372,188146,59496,293936,149640,116632,105598,155537,183175,169846,191681,2
00681,101509,309974,162298,211678,124744,213921,32214,212759,309634,125927,446839,276515,51618,159937,343591,346253,268234,202051,54334,410867,249977,286730,212563,117747,226296,115585,191277,202683,171095,249409],["Bachelors","Bachelors","HS-grad","11th","Bachelors","Masters","9th","HS-grad","Masters","Bachelors","Some-college","Bachelors","Bachelors","Assoc-acdm","Assoc-voc","7th-8th","HS-grad","HS-grad","11th","Masters","Doctorate","HS-grad","9th","11th","HS-grad","Bachelors","HS-grad","Some-college","HS-grad","HS-grad","Assoc-acdm","Some-college","Bachelors","Some-college","Some-college","11th","Some-college","HS-grad","Some-college","Assoc-acdm","9th","Bachelors","Bachelors","HS-grad","HS-grad","Bachelors","HS-grad","Masters","Assoc-voc","Assoc-voc","Some-college","HS-grad","Prof-school","Bachelors","HS-grad","Some-college","5th-6th","Assoc-voc","HS-grad","HS-grad","Bachelors","7th-8th","HS-grad","Doctorate","Some-college","HS-grad","Some-college","HS-grad","Some-college","Some-college","Some-college","Bachelors","Bachelors","Some-college","Some-college","HS-grad","Assoc-acdm","10th","11th","7th-8th","HS-grad","Bachelors","HS-grad","HS-grad","HS-grad","HS-grad","HS-grad","Masters","9th","Doctorate","Assoc-voc","Some-college","Some-college","HS-grad","Bachelors","Some-college","Doctorate","Some-college","Assoc-acdm","HS-grad"],[13,13,9,7,13,14,5,9,14,13,10,13,13,12,11,4,9,9,7,14,16,9,5,7,9,13,9,10,9,9,12,10,13,10,10,7,10,9,10,12,5,13,13,9,9,13,9,14,11,11,10,9,15,13,9,10,3,11,9,9,13,4,9,16,10,9,10,9,10,10,10,13,13,10,10,9,12,6,7,4,9,13,9,9,9,9,9,14,5,16,11,10,10,9,13,10,16,10,12,9],["Never-married","Married-civ-spouse","Divorced","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Married-spouse-absent","Married-civ-spouse","Never-married","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Never-married","Never-married","Married-civ-spouse","Married-civ-spouse","Never-married","Never-married","Married-civ-spouse","Divorced","Married-civ-s
pouse","Separated","Married-civ-spouse","Married-civ-spouse","Divorced","Married-civ-spouse","Never-married","Married-civ-spouse","Divorced","Married-civ-spouse","Never-married","Never-married","Divorced","Married-civ-spouse","Married-civ-spouse","Never-married","Never-married","Married-AF-spouse","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Separated","Never-married","Married-civ-spouse","Married-civ-spouse","Divorced","Married-civ-spouse","Never-married","Married-civ-spouse","Never-married","Married-civ-spouse","Divorced","Divorced","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Married-spouse-absent","Married-civ-spouse","Married-civ-spouse","Divorced","Married-civ-spouse","Divorced","Married-civ-spouse","Married-civ-spouse","Never-married","Never-married","Separated","Married-civ-spouse","Never-married","Married-civ-spouse","Never-married","Married-civ-spouse","Married-civ-spouse","Never-married","Married-civ-spouse","Never-married","Married-civ-spouse","Married-civ-spouse","Married-civ-spouse","Divorced","Divorced","Married-civ-spouse","Married-civ-spouse","Never-married","Never-married","Married-civ-spouse","Divorced","Divorced","Married-civ-spouse","Married-civ-spouse","Never-married","Married-civ-spouse","Married-civ-spouse","Divorced","Never-married"],["Adm-clerical","Exec-managerial","Handlers-cleaners","Handlers-cleaners","Prof-specialty","Exec-managerial","Other-service","Exec-managerial","Prof-specialty","Exec-managerial","Exec-managerial","Prof-specialty","Adm-clerical","Sales","Craft-repair","Transport-moving","Farming-fishing","Machine-op-inspct","Sales","Exec-managerial","Prof-specialty","Other-service","Farming-fishing","Transport-moving","Tech-support","Tech-support","Craft-repair","?","Exec-managerial","Craft-repair","Protective-serv","Sales","Exec-managerial","Adm-clerical","Other-service","Machine-op-inspc
t","Machine-op-inspct","Adm-clerical","Sales","Prof-specialty","Machine-op-inspct","Prof-specialty","Tech-support","Adm-clerical","Handlers-cleaners","Prof-specialty","Machine-op-inspct","Exec-managerial","Craft-repair","Prof-specialty","Exec-managerial","Other-service","Prof-specialty","Exec-managerial","Exec-managerial","Tech-support","Machine-op-inspct","Other-service","Adm-clerical","Machine-op-inspct","Sales","?","Transport-moving","Prof-specialty","Tech-support","Craft-repair","Adm-clerical","Adm-clerical","Exec-managerial","?","Prof-specialty","Sales","Sales","Machine-op-inspct","Prof-specialty","Other-service","Adm-clerical","?","Other-service","Farming-fishing","Sales","Other-service","Other-service","Sales","Craft-repair","Sales","Protective-serv","Prof-specialty","Sales","Prof-specialty","Prof-specialty","Craft-repair","Machine-op-inspct","Sales","Protective-serv","Handlers-cleaners","Prof-specialty","Sales","Exec-managerial","Other-service"],["Not-in-family","Husband","Not-in-family","Husband","Wife","Wife","Not-in-family","Husband","Not-in-family","Husband","Husband","Husband","Own-child","Not-in-family","Husband","Husband","Own-child","Unmarried","Husband","Unmarried","Husband","Unmarried","Husband","Husband","Unmarried","Husband","Own-child","Husband","Not-in-family","Husband","Not-in-family","Own-child","Own-child","Own-child","Husband","Unmarried","Own-child","Wife","Husband","Husband","Husband","Husband","Husband","Unmarried","Not-in-family","Husband","Husband","Unmarried","Husband","Not-in-family","Wife","Own-child","Wife","Not-in-family","Not-in-family","Husband","Husband","Husband","Husband","Husband","Husband","Not-in-family","Husband","Husband","Not-in-family","Husband","Not-in-family","Wife","Husband","Own-child","Own-child","Own-child","Husband","Not-in-family","Other-relative","Own-child","Husband","Husband","Own-child","Husband","Not-in-family","Husband","Wife","Husband","Not-in-family","Own-child","Husband","Husband","Not-in-family","Not-
in-family","Husband","Unmarried","Unmarried","Wife","Husband","Not-in-family","Husband","Husband","Unmarried","Own-child"],["White","White","White","Black","Black","White","Black","White","White","White","Black","Asian-Pac-Islander","White","Black","Asian-Pac-Islander","Amer-Indian-Eskimo","White","White","White","White","White","Black","Black","White","White","White","White","Asian-Pac-Islander","White","White","White","Black","White","White","Black","White","White","White","White","White","White","White","White","White","White","Black","White","White","White","White","Other","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","Black","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","White","Black","Asian-Pac-Islander","White","White","White","White","White","Black"],["Male","Male","Male","Male","Female","Female","Female","Male","Female","Male","Male","Male","Female","Male","Male","Male","Male","Male","Male","Female","Male","Female","Male","Male","Female","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Female","Male","Male","Male","Male","Male","Female","Male","Male","Male","Female","Male","Male","Female","Female","Female","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Female","Female","Male","Male","Male","Female","Male","Male","Male","Male","Male","Male","Female","Male","Male","Male","Female","Male","Female","Female","Male","Male","Male","Female","Male","Female","Female","Female","Male","Male","Male","Male","Female","Male"],[2174,0,0,0,0,0,0,0,14084,5178,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5013,2407,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,14344,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,2042,0,0,0,0,0,0,0,0,1408,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1902,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1573,0,0,1902,0,0,0],[40,13,40,40,40,40,16,45,50,40,80,40,30,50,40,45,35,40,50,45,60,20,40,40,40,40,40,60,80,40,52,44,40,40,15,40,40,25,38,40,43,40,50,40,35,40,38,40,40,43,40,30,60,55,60,40,40,40,48,40,40,40,40,45,58,40,40,40,50,40,32,40,70,40,20,40,40,2,22,40,30,40,40,48,40,35,40,50,40,50,40,40,25,35,40,50,60,48,40,40],["United-States","United-States","United-States","United-States","Cuba","United-States","Jamaica","United-States","United-States","United-States","United-States","India","United-States","United-States","?","Mexico","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","South","United-States","United-States","United-States","United-States","United-States","United-States","United-States","Puerto-Rico","United-States","United-States","?","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","?","Honduras","United-States","United-States","United-States","Mexico","Puerto-Rico","United-States","United-States","United-States","?","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","Mexico","United-States","United-States","United-States","United-States","United-States","Cuba","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","United-States","?","United-States","United-States","United-States","United-States","England","United-States"],["<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","
<=50K",">50K",">50K",">50K",">50K",">50K","<=50K","<=50K",">50K","<=50K","<=50K","<=50K","<=50K",">50K",">50K","<=50K","<=50K","<=50K","<=50K",">50K","<=50K",">50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K",">50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K",">50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K",">50K",">50K","<=50K",">50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K",">50K","<=50K","<=50K","<=50K",">50K",">50K","<=50K","<=50K","<=50K",">50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K","<=50K",">50K","<=50K",">50K","<=50K","<=50K",">50K","<=50K","<=50K","<=50K","<=50K",">50K","<=50K",">50K",">50K","<=50K","<=50K"]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th>age<\/th>\n      <th>workclass<\/th>\n      <th>fnlwgt<\/th>\n      <th>education<\/th>\n      <th>education_num<\/th>\n      <th>marital_status<\/th>\n      <th>occupation<\/th>\n      <th>relationship<\/th>\n      <th>race<\/th>\n      <th>sex<\/th>\n      <th>capital_gain<\/th>\n      <th>capital_loss<\/th>\n      <th>hours_per_week<\/th>\n      <th>native_country<\/th>\n      <th>class<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"pageLength":5,"ordering":false,"scrollX":true,"searching":false,"columnDefs":[{"className":"dt-right","targets":[0,2,4,10,11,12]}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[5,10,25,50,100]}},"evals":[],"jsHooks":[]}</script>

---
### Example - `adult` data

- Income level (above or below $50K) as a function of the number of hours worked per week

<div id="htmlwidget-ed458772dc52b5f6a35b" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-ed458772dc52b5f6a35b">{"x":{"filter":"none","vertical":false,"data":[["(Intercept)","hours_per_week"],[-3.10007434590566,0.0464501272366479],[0.0528976169908537,0.0011844486131249],[-58.605179632982,39.2166673352758],[0,0]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>Estimate<\/th>\n      <th>Std. Error<\/th>\n      <th>z value<\/th>\n      <th>Pr(>|z|)<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"paging":false,"searching":false,"ordering":false,"columnDefs":[{"targets":1,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[1,2,3,4]},{"orderable":false,"targets":0}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render"],"jsHooks":[]}</script>
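
As a minimal sketch, assuming the table above comes from a logistic regression (logit link) of `class` on `hours_per_week`, the two coefficients can be turned into a predicted probability of earning more than 50K. The function name `prob_over_50k` is purely illustrative; the coefficients are copied from the table.

```python
import math

# Coefficients copied from the table above.
INTERCEPT = -3.10007434590566
SLOPE = 0.0464501272366479  # change in log-odds per extra hour worked per week


def prob_over_50k(hours_per_week):
    """Inverse logit of the linear predictor (assumes a logit link)."""
    linear_predictor = INTERCEPT + SLOPE * hours_per_week
    return 1.0 / (1.0 + math.exp(-linear_predictor))


p40 = prob_over_50k(40)              # predicted probability at 40 hours/week
odds_ratio = math.exp(SLOPE)         # multiplicative change in odds per extra hour
hours_at_half = -INTERCEPT / SLOPE   # hours at which the probability reaches 0.5

print(round(p40, 3), round(odds_ratio, 3), round(hours_at_half, 1))
```

Under this assumption, the predicted probability only reaches 0.5 at roughly 67 hours per week, which would explain why very few observations are predicted `>50K` with a 0.5 decision threshold.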

---
### Example - `adult` data

- Confusion matrix (decision threshold at 0.5)
  - predicted values in rows, observed values in columns

<div id="htmlwidget-7e158c182da098f80442" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-7e158c182da098f80442">{"x":{"filter":"none","vertical":false,"data":[["1","2"],["FALSE","TRUE"],[24199,521],[7557,284]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>Predicted<\/th>\n      <th>Observed: <=50K<\/th>\n      <th>Observed: >50K<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"paging":false,"searching":false,"ordering":false,"columnDefs":[{"className":"dt-right","targets":[2,3]},{"orderable":false,"targets":0}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":[],"jsHooks":[]}</script>
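
The counts in this matrix can be compared with a trivial baseline. The sketch below assumes that the `TRUE` row means "predicted `>50K`", and computes the share of correct predictions next to the share obtained by always predicting the majority class.

```python
# Counts copied from the confusion matrix above (rows: predicted, columns: observed).
vn, fn = 24199, 7557   # predicted FALSE: observed <=50K, observed >50K
fp, vp = 521, 284      # predicted TRUE:  observed <=50K, observed >50K

total = vn + fn + fp + vp
accuracy = (vn + vp) / total

# Baseline: always predict the majority class (<=50K).
baseline = (vn + fp) / total

print(f"accuracy={accuracy:.4f}  baseline={baseline:.4f}")
```

With these counts, the single-predictor model is correct slightly less often than the majority-class baseline, which suggests that `hours_per_week` alone is a weak classifier at this threshold.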

---
### Example - `adult` data

- And with the full model

<div id="htmlwidget-7745002c4c63bd12e1d2" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-7745002c4c63bd12e1d2">{"x":{"filter":"none","vertical":false,"data":[["(Intercept)","age","workclassFederal-gov","workclassLocal-gov","workclassNever-worked","workclassPrivate","workclassSelf-emp-inc","workclassSelf-emp-not-inc","workclassState-gov","workclassWithout-pay","education11th","education12th","education1st-4th","education5th-6th","education7th-8th","education9th","educationAssoc-acdm","educationAssoc-voc","educationBachelors","educationDoctorate","educationHS-grad","educationMasters","educationPreschool","educationProf-school","educationSome-college","marital_statusMarried-AF-spouse","marital_statusMarried-civ-spouse","marital_statusMarried-spouse-absent","marital_statusNever-married","marital_statusSeparated","marital_statusWidowed","occupationAdm-clerical","occupationArmed-Forces","occupationCraft-repair","occupationExec-managerial","occupationFarming-fishing","occupationHandlers-cleaners","occupationMachine-op-inspct","occupationOther-service","occupationPriv-house-serv","occupationProf-specialty","occupationProtective-serv","occupationSales","occupationTech-support","relationshipNot-in-family","relationshipOther-relative","relationshipOwn-child","relationshipUnmarried","relationshipWife","raceAsian-Pac-Islander","raceBlack","raceOther","raceWhite","sexMale","capital_gain","capital_loss","hours_per_week","native_countryCambodia","native_countryCanada","native_countryChina","native_countryColumbia","native_countryCuba","native_countryDominican-Republic","native_countryEcuador","native_countryEl-Salvador","native_countryEngland","native_countryFrance","native_countryGermany","native_countryGreece","native_countryGuatemala","native_countryHaiti","native_countryHoland-Netherlands","native_countryHonduras","native_countryHong","native_countryHungary","native_countryIndia","native_countryIran","native_countryIreland","native_countryItaly","native_countryJamaica","native_countryJapan","native_countryLaos","
native_countryMexico","native_countryNicaragua","native_countryOutlying-US(Guam-USVI-etc)","native_countryPeru","native_countryPhilippines","native_countryPoland","native_countryPortugal","native_countryPuerto-Rico","native_countryScotland","native_countrySouth","native_countryTaiwan","native_countryThailand","native_countryTrinadad&Tobago","native_countryUnited-States","native_countryVietnam","native_countryYugoslavia"],[-8.95409825347388,0.0251422800593388,1.08829759287928,0.406676311834005,-10.4902196214901,0.590958229212508,0.759120718774562,0.0971073113516552,0.276640728190165,-12.2195596416041,0.0772767741459271,0.480744134737299,-0.530566847180439,-0.255341814433433,-0.485299954943753,-0.200728799528897,1.33077237761274,1.33977932850282,1.92887820234961,2.98428439533344,0.806100767646745,2.27803343266143,-21.1801848319912,2.78319042568135,1.15215934948514,2.6961747412877,2.20861043215906,-0.0212117122786685,-0.480669020138153,-0.121574942375215,0.125139465200911,0.112399206484333,-1.00209463293258,0.181010637194225,0.895453311090866,-0.90190010013942,-0.570215116064834,-0.17320512729191,-0.72204197908438,-4.07190210075286,0.626484092377281,0.69068500778709,0.390668049457681,0.765660352167178,0.576274592465143,-0.369118169489141,-0.660407258589679,0.444002685434088,1.36191368283626,0.679191682533001,0.475821226269797,0.20710777465432,0.616034453280358,0.86939769670813,0.000319579244488628,0.000645690092260324,0.0295427130143668,1.50998506779991,0.523084957744357,-0.483521722702281,-1.90747764158997,0.573219911871632,-1.64470907591369,-0.0871742990610897,-0.400180229367273,0.474348427891537,0.774473333057582,0.621759423464464,-0.833975205999529,-0.0176387354778078,0.116037189690777,-10.358137374178,-1.09973655272851,0.135834107149152,0.0575077565892348,-0.190608284163996,0.224900235654593,0.684919431831544,0.994394180273598,0.201483039995601,0.585018429782762,-0.36112761596668,-0.295038084657693,-0.492560517719128,-12.0375016429431,-0.586508104070784,0.62731308
0089558,0.176847098829024,0.122105943920492,-0.143296359472755,0.182832285096167,-0.876272951211621,0.239221880082107,-0.357798348433788,-0.208264553605801,0.37356115199908,-0.963416987347002,0.893609273819361],[0.439436133613106,0.00164692553086316,0.153668807024539,0.140247132392119,271.735340309332,0.125153404478292,0.149538089794495,0.137019729322991,0.151663815130027,198.412803389545,0.210688598032042,0.264158251101189,0.489926102112257,0.324714227352274,0.232038142646095,0.261134466013896,0.176326794780089,0.1693102834257,0.157379617912654,0.214157541141319,0.153334061811756,0.167805129960545,369.603384791383,0.199965671808724,0.155537584746559,0.553964483654186,0.265525990218869,0.230116936183163,0.0874692030984812,0.163951241262785,0.153749843917801,0.0991310391650363,1.512711592776,0.0848178465076346,0.0871860722984007,0.142052694924456,0.145825819051786,0.106093840253293,0.124437187289207,1.66796893332848,0.093584741037498,0.130297210681938,0.0900870705180966,0.11931301907154,0.262746657809412,0.242712114274355,0.260127407344085,0.278716546303305,0.102534194914424,0.269528982331106,0.232189641987187,0.35406373263017,0.221376129943367,0.0791256323389603,1.03062681851175e-05,3.70901884384021e-05,0.00162061935624955,0.63399882251885,0.294866998364284,0.393537741718269,0.824975580850415,0.336627860051171,1.04982715465855,0.725797969099216,0.497050578316134,0.333385606321529,0.526982144049752,0.283984736527829,0.567158612604805,0.758264765737947,0.685872832601185,882.743429685186,2.42883444636279,0.67857133342596,0.772937421361708,0.328467551680408,0.450798046596235,0.644524533668993,0.345466437853425,0.462981644142213,0.419119647811737,0.857590575350491,0.254461210846,0.797769129080691,211.165079211797,0.859125492436499,0.28052098880195,0.420506970595978,0.633924450058448,0.403501104660112,0.788964631014276,0.442184108191042,0.471906603878573,0.835141243864055,0.867653314184271,0.137969617353828,0.616250905752862,0.68143832576097],[-20.3763358735478,15.2661912
079058,7.08209827323961,2.89971213598131,-0.0386045466502387,4.72187098446059,5.07643717943567,0.708710430472011,1.82403909563393,-0.0615865480092705,0.3667819467581,1.81990959106231,-1.08295280633745,-0.786358566778843,-2.09146629691798,-0.768679839903691,7.54719314935912,7.91315979983441,12.2562135296334,13.9349956085093,5.25715394297957,13.5754695532672,-0.0573051700918432,13.9183410857819,7.40759444967929,4.86705343184202,8.31786911081112,-0.0921779710372337,-5.49529437917675,-0.741531088382254,0.813916047081085,1.13384473148927,-0.662449232040073,2.13411026862056,10.2706004237249,-6.34905307934535,-3.91024799155998,-1.63256534854797,-5.80246142502617,-2.44123377803402,6.69429744028739,5.30084262105264,4.33656069856555,6.41724061737209,2.19327087647735,-1.52080653490538,-2.5387838418584,1.59302593018972,13.282531588344,2.51992077682629,2.04927843549565,0.584944899935996,2.78275012503811,10.9875608068923,31.0082406889146,17.4086495498037,18.2292732099256,2.38168434099106,1.77396914760237,-1.22865400556278,-2.31216254864619,1.70282968196541,-1.56664748917513,-0.120108215746706,-0.805109674598852,1.42282215817698,1.46963866195904,2.18941141367834,-1.4704444003227,-0.0232619742797118,0.169181784399746,-0.0117340294199324,-0.45278366105824,0.200176607024282,0.0744015686133059,-0.580295627951262,0.498893545242064,1.06267394963633,2.87841037888462,0.435185806056949,1.39582678320427,-0.42109559776714,-1.15946192221906,-0.617422384201217,-0.0570051719151421,-0.682680364200851,2.23624293771632,0.420556878232913,0.192619079307059,-0.355132508480889,0.23173698529568,-1.98169254611257,0.50692632422593,-0.428428545545561,-0.240031992272861,2.70756097729162,-1.56335183989712,1.31135752134495],[2.71213729023382e-92,1.28460349324273e-52,1.41987903178997e-12,0.00373505507138245,0.969205677293984,2.33684907289934e-06,3.84577868043044e-07,0.478504190755438,0.0681461619437777,0.950892089739903,0.713781674959612,0.0687727740895259,0.278829371169269,0.431657449944204,0.036486284586946
8,0.442083395735483,4.44738781175377e-14,2.50936921638431e-15,1.55602410180066e-34,3.88224273576554e-44,1.46301841870611e-07,5.59837782212193e-42,0.954302101985326,4.90148820048742e-44,1.28610885890239e-13,1.13274411704915e-06,8.95588349625088e-17,0.926556640084005,3.90059191830421e-08,0.458371489177141,0.415693047058802,0.256859683827558,0.507683360763876,0.032833755512602,9.56077488669944e-25,2.16644280677404e-10,9.22014107095433e-05,0.102560444026227,6.5348431074372e-09,0.0146371759934672,2.1671023350891e-11,1.15269423360644e-07,1.44729473354018e-05,1.38766507829408e-10,0.0282878686447778,0.128308394029073,0.0111238527981055,0.111154374123206,2.92340771366556e-40,0.0117381250109282,0.040434896086225,0.558584766570924,0.00539003005032738,4.3861986039746e-28,4.17401173917852e-211,7.0937586594307e-68,3.02311865885249e-74,0.0172336628302848,0.0760682537388012,0.219201557825924,0.0207687272919541,0.0885999464209091,0.117197111081993,0.904397424283851,0.420756393429216,0.154787716230285,0.141659643370902,0.0285669508222886,0.141441433631831,0.981441303628771,0.865653657197036,0.990637813932904,0.650704530047182,0.84134246178871,0.940690860691743,0.561715275258579,0.617854381551097,0.287929838458694,0.00399684827927198,0.663427564477708,0.162766662953175,0.673685276034929,0.246267948957969,0.536956161357117,0.95454107525034,0.494808840075324,0.0253358620125325,0.674078687625047,0.847257301616563,0.722490323628602,0.816742303093722,0.0475136619791692,0.612206514386816,0.66833914573928,0.810305455226594,0.00677796181105015,0.117969861255238,0.189737003158691]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>Estimate<\/th>\n      <th>Std. Error<\/th>\n      <th>z value<\/th>\n      <th>Pr(>|z|)<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"pageLength":5,"searching":false,"ordering":false,"columnDefs":[{"targets":1,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? 
data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 3, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[1,2,3,4]},{"orderable":false,"targets":0}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[5,10,25,50,100]}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render"],"jsHooks":[]}</script>

---
### Example - `adult` data

- Confusion matrix (decision threshold at 0.5)
  - predicted values in rows, observed values in columns

<div id="htmlwidget-666e045c120c3af5eef0" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-666e045c120c3af5eef0">{"x":{"filter":"none","vertical":false,"data":[["1","2"],["FALSE","TRUE"],[23021,1699],[3106,4735]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th> <\/th>\n      <th>Predicted<\/th>\n      <th>Observed: <=50K<\/th>\n      <th>Observed: >50K<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"paging":false,"searching":false,"ordering":false,"columnDefs":[{"className":"dt-right","targets":[2,3]},{"orderable":false,"targets":0}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":[],"jsHooks":[]}</script>

---
### True/false positives/negatives

| | Observed = 0 | Observed = 1 |
|-|:-:|:-:|
| Predicted = 0 | `$n_{VN}$` | `$n_{FN}$` |
| Predicted = 1 | `$n_{FP}$` | `$n_{VP}$` |

#### Details

- `$VN$`: true negatives (*vrais négatifs*; predicted 0 and observed 0)
- `$FN$`: false negatives (*faux négatifs*; predicted 0 and observed 1)
- `$FP$`: false positives (*faux positifs*; predicted 1 and observed 0)
- `$VP$`: true positives (*vrais positifs*; predicted 1 and observed 1)
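
To make the four cells concrete, here is a small sketch that counts them from two vectors of 0/1 labels. The function name `confusion_counts` and the toy data are hypothetical, chosen only for illustration.

```python
def confusion_counts(predicted, observed):
    """Count the four cells of a binary confusion matrix (labels are 0 or 1)."""
    counts = {"VN": 0, "FN": 0, "FP": 0, "VP": 0}
    for pred, obs in zip(predicted, observed):
        if pred == 0 and obs == 0:
            counts["VN"] += 1      # true negative
        elif pred == 0 and obs == 1:
            counts["FN"] += 1      # false negative
        elif pred == 1 and obs == 0:
            counts["FP"] += 1      # false positive
        else:
            counts["VP"] += 1      # true positive (pred == 1 and obs == 1)
    return counts


# Hypothetical toy data, only to illustrate the four cases.
predicted = [0, 0, 1, 1, 0, 1, 0, 1]
observed = [0, 1, 0, 1, 0, 1, 0, 0]
counts = confusion_counts(predicted, observed)
print(counts)
```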

---
### Precision, Recall, Sensitivity, Specificity

Three criteria for the *quality* of a prediction

- *Precision*
  - Proportion of correctly predicted 1s among all predicted 1s
  - `$Precision = \frac{n_{VP}}{n_{VP} + n_{FP}}$`
  - Is what we predict as positive actually positive?

- *Sensitivity* or *Recall*
  - Proportion of correctly predicted 1s among all observed 1s
  - `$Sensitivity = \frac{n_{VP}}{n_{FN} + n_{VP}}$`
  - Do we detect all the positive cases?

- *Specificity*
  - Proportion of correctly predicted 0s among all observed 0s
  - `$Specificity = \frac{n_{VN}}{n_{VN} + n_{FP}}$`
  - Do we detect all the negative cases?
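
As a worked example, the three formulas can be applied to the counts of the full-model confusion matrix shown earlier (threshold 0.5). The variable names are illustrative and simply mirror the subscripts used in the formulas.

```python
# Counts from the full-model confusion matrix (rows: predicted, columns: observed).
n_vn, n_fn = 23021, 3106   # predicted 0: observed 0, observed 1
n_fp, n_vp = 1699, 4735    # predicted 1: observed 0, observed 1

precision = n_vp / (n_vp + n_fp)      # correct among predicted positives
sensitivity = n_vp / (n_fn + n_vp)    # detected among observed positives (recall)
specificity = n_vn / (n_vn + n_fp)    # detected among observed negatives

print(f"precision={precision:.3f} sensitivity={sensitivity:.3f} "
      f"specificity={specificity:.3f}")
```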

---
### Precision, Recall, Sensitivity, Specificity

<div id="htmlwidget-3e84aec9ca5efd47f030" style="width:100%;height:auto;" class="datatables html-widget"></div>
<script type="application/json" data-for="htmlwidget-3e84aec9ca5efd47f030">{"x":{"filter":"none","vertical":false,"data":[[0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1],[0.240809557446024,0.715856392616934,0.794078805933479,0.83130124996161,0.848960412763736,0.852430822149197,0.848591873713952,0.836860047295845,0.818770922268972,0.797672061668868,0.759190442553976],[0.240809557446024,0.456937068912898,0.544416640600563,0.616723006561941,0.682345601996257,0.735934100093255,0.792680474562638,0.849212924606462,0.904841402337229,0.949748743718593,0],[1,0.95472516260681,0.887896951919398,0.791098074225226,0.697487565361561,0.603877056497896,0.502741997194235,0.392169366152276,0.276495344981507,0.168728478510394,0],[0,0.640088996763754,0.764320388349515,0.844053398058252,0.897006472491909,0.931270226537217,0.9582928802589,0.977912621359223,0.990776699029126,0.997168284789644,1]],"container":"<table class=\"display\">\n  <thead>\n    <tr>\n      <th>Threshold<\/th>\n      <th>Correct<\/th>\n      <th>Precision<\/th>\n      <th>Recall<\/th>\n      <th>Specificity<\/th>\n    <\/tr>\n  <\/thead>\n<\/table>","options":{"paging":false,"searching":false,"ordering":false,"columnDefs":[{"targets":0,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 2, 3, \",\", \".\", null);\n  }"},{"targets":1,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 2, 3, \",\", \".\", null);\n  }"},{"targets":2,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 2, 3, \",\", \".\", null);\n  }"},{"targets":3,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? data : DTWidget.formatSignif(data, 2, 3, \",\", \".\", null);\n  }"},{"targets":4,"render":"function(data, type, row, meta) {\n    return type !== 'display' ? 
data : DTWidget.formatSignif(data, 2, 3, \",\", \".\", null);\n  }"},{"className":"dt-right","targets":[0,1,2,3,4]}],"order":[],"autoWidth":false,"orderClasses":false}},"evals":["options.columnDefs.0.render","options.columnDefs.1.render","options.columnDefs.2.render","options.columnDefs.3.render","options.columnDefs.4.render"],"jsHooks":[]}</script>
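
A table of this kind can be produced by recomputing the confusion matrix at each threshold. The sketch below is one possible way to do it, not the code used for the slide: the function name, the toy probabilities and labels are hypothetical, and two conventions are assumed (an observation is predicted positive when its probability is at least the threshold, and an undefined ratio such as 0/0 is reported as 0).

```python
def metrics_at_threshold(probs, labels, threshold):
    """Accuracy, precision, recall and specificity for one decision threshold."""
    vp = fp = vn = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0   # assumed decision rule
        if pred == 1 and y == 1:
            vp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            vn += 1
        else:
            fn += 1

    def ratio(num, den):
        # Assumed convention: an undefined ratio (0/0) is reported as 0.
        return num / den if den else 0.0

    return {
        "correct": ratio(vp + vn, len(labels)),
        "precision": ratio(vp, vp + fp),
        "recall": ratio(vp, vp + fn),
        "specificity": ratio(vn, vn + fp),
    }


# Hypothetical predicted probabilities and observed labels.
probs = [0.05, 0.2, 0.35, 0.4, 0.6, 0.65, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

table = {t: metrics_at_threshold(probs, labels, t) for t in (0.0, 0.5, 1.0)}
for threshold, row in table.items():
    print(threshold, row)
```

The two extreme thresholds behave as in the table above: at 0 every observation is predicted positive (recall 1, specificity 0), and at 1 none is (recall 0, specificity 1).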

---
### Graphical representations

Some common graphical representations

- ROC

- Precision/Recall

- Sensitivity/Specificity

- Lift

- *Performance*

---
### ROC curve

- True positive rate vs. false positive rate
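
As a rough illustration, ROC points can be derived from the threshold table shown earlier, since the false positive rate is one minus the specificity. The sketch below copies the recall and specificity columns (rounded to four decimals) and approximates the area under the curve with the trapezoidal rule; with only 11 thresholds this is a coarse estimate, not the exact AUC of the model.

```python
# Columns copied from the threshold table (thresholds 0, 0.1, ..., 1), rounded.
recall = [1.0, 0.9547, 0.8879, 0.7911, 0.6975, 0.6039,
          0.5027, 0.3922, 0.2765, 0.1687, 0.0]
specificity = [0.0, 0.6401, 0.7643, 0.8441, 0.8970, 0.9313,
               0.9583, 0.9779, 0.9908, 0.9972, 1.0]

# ROC points: x = false positive rate (1 - specificity), y = true positive rate.
points = sorted((1.0 - s, r) for r, s in zip(recall, specificity))

# Trapezoidal approximation of the area under the curve.
auc = 0.0
for (x0, y0), (x1, y1) in zip(points, points[1:]):
    auc += (x1 - x0) * (y0 + y1) / 2.0

print(round(auc, 3))
```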

---
### Precision/Recall curve

- Precision (number of true positives over number of predicted positives) vs. true positive rate (recall)

---
### Sensitivity/Specificity curve

- Sensitivity vs. specificity
  - True positive rate vs. true negative rate

---
### Lift curve

- Lift (true positive rate divided by the rate of positive predictions) vs. rate of positive predictions
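
As a sketch of one point on this curve, the lift can be computed from the full-model confusion matrix shown earlier (threshold 0.5). The variable names are illustrative; the second computation checks the equivalent form, precision divided by the prevalence of the positive class.

```python
# Counts from the full-model confusion matrix (threshold 0.5).
n_vn, n_fn = 23021, 3106   # predicted 0: observed 0, observed 1
n_fp, n_vp = 1699, 4735    # predicted 1: observed 0, observed 1
total = n_vn + n_fn + n_fp + n_vp

true_positive_rate = n_vp / (n_vp + n_fn)           # share of actual positives caught
rate_positive_predictions = (n_vp + n_fp) / total   # share of observations predicted positive
lift = true_positive_rate / rate_positive_predictions

# Equivalent form: precision divided by the prevalence of the positive class.
precision = n_vp / (n_vp + n_fp)
prevalence = (n_vp + n_fn) / total
lift_alt = precision / prevalence

print(round(rate_positive_predictions, 3), round(lift, 2))
```

A lift of about 3 at this threshold means that positives are about three times more frequent among the observations predicted positive than in the whole sample.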

---
### *Performance* curve

- True positive rate vs. rate of positive predictions
- When 40% of the observations are predicted positive, 89% of the actual positives are captured
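
The figures quoted above can be cross-checked against the threshold table shown earlier, assuming the curve is drawn for the same full model. The rate of positive predictions is not listed in that table, but it follows from recall, specificity and the prevalence of the positive class (read here from the `Correct` column at threshold 0, where every observation is predicted positive).

```python
# Row for threshold 0.2 of the threshold table.
recall_02 = 0.887896951919398
specificity_02 = 0.764320388349515

# Prevalence of the positive class (share correct when everything is predicted positive).
prevalence = 0.240809557446024

# Predicted positives = true positives + false positives, as shares of the sample.
rate_pos_pred = (recall_02 * prevalence
                 + (1.0 - specificity_02) * (1.0 - prevalence))

print(round(rate_pos_pred, 3), round(recall_02, 3))
```

With these values, a threshold of 0.2 flags about 39% of the observations as positive and recovers about 89% of the actual positives, which is in line with the point quoted on this slide.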