Quantitative variable: histogram and shapes
Learning objectives
- Make and interpret displays of the distribution of a variable.
- We display the distribution of a quantitative variable with a histogram or dotplot.
- We understand distributions in terms of their shape, center, and spread.
- Describe the shape of a distribution.
- A symmetric distribution has roughly the same shape reflected around the center.
- A skewed distribution extends farther on one side than on the other.
- A unimodal distribution has a single major hump or mode; a bimodal distribution has two; multimodal distributions have more.
- Outliers are values that lie far from the rest of the data.
- Report any other unusual feature of the distribution such as gaps.
- Summarize a distribution by computing the median and IQR.
- The median is the middle value; half the values are above and half are below the median. It is a better summary of the center than the mean when the distribution is skewed or has outliers.
- The IQR is the difference between the upper and lower quartiles (Q3 - Q1).
- Find a 5-number summary and, using it, make a boxplot.
- A 5-number summary consists of the median, the quartiles, and the extremes (minimum and maximum) of the data.
- A boxplot shows the quartiles as the upper and lower ends of a central box, the median as a line across the box, and “whiskers” that extend to the most extreme values that are not nominated as outliers.
- Use the boxplot’s outlier nomination rule to identify cases that may deserve special attention.
- Boxplots display separately any case that lies more than 1.5 IQRs below the lower quartile or above the upper quartile. These cases should be considered possible outliers.
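
The summaries above (median, quartiles, IQR, 5-number summary, and the 1.5 IQR outlier nomination rule) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `five_number_summary` and `nominate_outliers` are hypothetical, the sample data are made up, and the quartiles use the "inclusive" method of the standard library's `statistics.quantiles`. Textbooks and software packages use several slightly different quartile conventions, so results from other tools may differ a little.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # Assumption: "inclusive" is one of several common quartile conventions.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


def nominate_outliers(values):
    """Return the values more than 1.5 IQRs below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1                    # IQR = Q3 - Q1
    lower_fence = q1 - 1.5 * iqr     # anything below this is nominated
    upper_fence = q3 + 1.5 * iqr     # anything above this is nominated
    return [v for v in values if v < lower_fence or v > upper_fence]


# Made-up sample data, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]

print(five_number_summary(sample))  # (1, 3.25, 5.5, 7.75, 40)
print(nominate_outliers(sample))    # [40]
```

Here the IQR is 7.75 - 3.25 = 4.5, so the fences sit at -3.5 and 14.5. The value 40 lies beyond the upper fence and is nominated as a possible outlier; in a boxplot, the upper whisker would then stop at 9, the most extreme value that is not nominated.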