The data set is `epa2021`. It is available on openintro.org/data and in the `openintro` R package.
This week's data set of the week is `epa2021`. These results are gathered each year for vehicles tested under the oversight of the Environmental Protection Agency. The data is collected at the National Vehicle and Fuel Emissions Laboratory in Ann Arbor, Michigan. The data set covers 12 manufacturers, including Volvo, General Motors, and Toyota. Most of the variables are categorical, but a few are quantitative, such as `city_mpg` and `hwy_mpg`. There are 28 variables in total.
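For a first look in R, the following minimal sketch loads the data and tallies the variable types. It assumes the `openintro` package is installed and exposes the data frame under the name `epa2021`.

```r
library(openintro)  # assumed to provide the epa2021 data frame

# Number of rows (vehicles) and columns (variables)
dim(epa2021)

# Tally the columns by their first class (character, numeric, and so on)
var_types <- vapply(epa2021, function(x) class(x)[1], character(1))
table(var_types)
```

Comparing the tally with the description above is a quick way for students to confirm which variables are categorical and which are quantitative.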
The data set is an excellent one to use for essential graphical summaries at the start of the semester. One way to present it to the class is to have the students answer the following questions.
- What type of graph would you use to explore the center and variability of miles per gallon for the vehicles tested at the National Lab?
- What type of graph would you use to explore the relationship between the miles per gallon for vehicles in the city versus on the highway?
You can then have them work in groups or pairs to make the graphs. Here is a histogram of the miles per gallon in the city.
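One way to draw such a histogram in R is sketched below. It assumes the `openintro` and `ggplot2` packages are installed and that the city fuel economy column is named `city_mpg`; the bin width of 2 mpg is an arbitrary starting point that students can vary.

```r
library(openintro)  # assumed to provide the epa2021 data frame
library(ggplot2)

# Histogram of city fuel economy; binwidth = 2 is an illustrative choice
city_hist <- ggplot(epa2021, aes(x = city_mpg)) +
  geom_histogram(binwidth = 2, color = "white") +
  labs(x = "City fuel economy (mpg)", y = "Number of vehicles")
city_hist

# Numerical companions to the picture: a measure of center and of spread
city_median <- median(epa2021$city_mpg, na.rm = TRUE)
city_iqr <- IQR(epa2021$city_mpg, na.rm = TRUE)
city_median
city_iqr
```

Asking students to rerun the plot with a few different bin widths tends to prompt a useful discussion of how the choice affects what they see about center and variability.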
Here is a scatterplot for exploring the relationship between miles per gallon in the city and on the highway.
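A scatterplot like this can be sketched in R as follows. The column names `city_mpg` and `hwy_mpg` are assumptions about the `epa2021` data frame, and the transparency setting is only there to reduce overplotting.

```r
library(openintro)  # assumed to provide the epa2021 data frame
library(ggplot2)

# City fuel economy on the x-axis, highway fuel economy on the y-axis
city_hwy <- ggplot(epa2021, aes(x = city_mpg, y = hwy_mpg)) +
  geom_point(alpha = 0.4) +  # semi-transparent points reveal dense regions
  labs(x = "City fuel economy (mpg)", y = "Highway fuel economy (mpg)")
city_hwy
```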
However, don't stop there. The Guidelines for Assessment and Instruction in Statistics Education (GAISE) include an emphasis to "Give students experience with multivariable thinking." Have the students think about the different manufacturers of the cars. Ask them whether they think the relationship between miles per gallon in the city and on the highway would differ from one manufacturer to another.
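To let students check their guesses, one option is to draw the same scatterplot separately for a few manufacturers. The sketch below assumes the manufacturer column is named `mfr_name`, and it picks the four manufacturers with the most vehicles in the data rather than hard-coding names, since the exact spellings in the data are not given here. The fitted lines are a simple least-squares summary added for comparison.

```r
library(openintro)  # assumed to provide the epa2021 data frame
library(ggplot2)

# The four manufacturers with the most tested vehicles (mfr_name is an assumed column)
top_mfrs <- names(sort(table(epa2021$mfr_name), decreasing = TRUE))[1:4]
subset_epa <- epa2021[epa2021$mfr_name %in% top_mfrs, ]

# One panel per manufacturer, each with its own least-squares line
by_mfr <- ggplot(subset_epa, aes(x = city_mpg, y = hwy_mpg)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ mfr_name) +
  labs(x = "City fuel economy (mpg)", y = "Highway fuel economy (mpg)")
by_mfr
```

Students can swap in other manufacturers by editing `top_mfrs`, then compare the slopes and the spread of points across panels.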
To finish the activity, you could ask them what surprised them or if this is what they expected to see.
Source: Fuel Economy Data from fueleconomy.gov. Retrieved 6 May 2021.