The data set is `biontech_adolescents`. It is available at openintro.org/data and in the `openintro` R package.
On March 31, 2021, Pfizer and BioNTech announced that "in a Phase 3 trial in adolescents 12 to 15 years of age with or without prior evidence of SARS-CoV-2 infection, the Pfizer-BioNTech COVID-19 vaccine BNT162b2 demonstrated 100% efficacy and robust antibody responses, exceeding those recorded earlier in vaccinated participants aged 16 to 25 years old, and was well tolerated." These results are from a Phase 3 trial in 2,260 adolescents 12 to 15 years of age in the United States. In the trial, 18 cases of COVID-19 were observed in the placebo group (n = 1,129) versus none in the vaccinated group (n = 1,131).
This data set gives students a chance to practice basic statistics with real-world data of the kind they see in the news every day. Instead of taking the news at face value, they can carry out the computations themselves with the raw data. In particular, the data set lets them conduct a chi-square test of independence to check whether getting the vaccine is associated with contracting COVID-19. Below are a few lines of code, with the output, that students can run to show that there is a relationship between not getting the vaccine and getting COVID-19 at the 0.001 significance level (99.9% confidence).
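As a minimal sketch of the test described above, the 2x2 table can be rebuilt directly from the counts reported in the trial summary (18 cases among 1,129 placebo recipients, none among 1,131 vaccine recipients) and passed to base R's `chisq.test()`. The object names `counts` and `result` are illustrative only, and this is one reasonable way to run the test rather than the only one.

```r
# Counts from the trial summary: 18 COVID-19 cases among 1,129 placebo
# recipients and 0 cases among 1,131 vaccine recipients.
counts <- matrix(
  c(18, 1129 - 18,
     0, 1131 - 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    group   = c("placebo", "vaccine"),
    outcome = c("COVID-19", "no COVID-19")
  )
)

# Chi-square test of independence. For a 2x2 table, chisq.test()
# applies Yates' continuity correction by default.
result <- chisq.test(counts)

# Expected counts under independence (useful for checking that
# no expected cell count is too small for the approximation).
print(result$expected)

# Test statistic, degrees of freedom, and p-value.
print(result)

# TRUE if the association is significant at the 0.001 level.
print(result$p.value < 0.001)
```

With the `openintro` package loaded, the same table could presumably be built from the data frame itself, for example with `table(biontech_adolescents$group, biontech_adolescents$outcome)`, assuming the two columns are named `group` and `outcome`. Because one cell is zero and the case counts are small, students may also want to compare the result with `fisher.test(counts)`.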