Data Set: Pierce County Home Sales

Karen Higgins, November 14th, 2021

You may find this data set in the `openintro` R package (via the `usdata` package) or on this page.

The Pierce County sales data set (`pierce_county_house_sales`) contains 2020 house sales from Pierce County, the second-largest county by population in Washington State. Realtors, prospective buyers, homeowners, and tax assessors use house sales data every day to determine current value and to predict future value.

There are many variables to analyze. Number of bedrooms, number of bathrooms, square footage, size of the attached garage, heating source, view, and year built may all be factors in the house price. Correlation tests can help identify which variables are most strongly associated with sale price.
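As a rough illustration of the correlation idea, the sketch below computes the Pearson correlation between sale price and a few numeric variables, then ranks the variables by the strength of that correlation. It is a minimal sketch under stated assumptions, not the analysis used for this data set: the rows are made-up values for demonstration only, and the column names (`sale_price`, `house_square_feet`, `bedrooms`, `year_built`) are assumed rather than taken from the text above. Categorical variables such as heating source or view would need a different treatment.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(rows, target, features):
    """Rank numeric features by the absolute value of their correlation with target."""
    ys = [row[target] for row in rows]
    scores = {f: pearson([row[f] for row in rows], ys) for f in features}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical rows for illustration only; these are NOT values from the data set,
# and the column names are assumptions.
sample_rows = [
    {"sale_price": 310_000, "house_square_feet": 1_250, "bedrooms": 3, "year_built": 1978},
    {"sale_price": 425_000, "house_square_feet": 1_900, "bedrooms": 3, "year_built": 1995},
    {"sale_price": 380_000, "house_square_feet": 1_600, "bedrooms": 4, "year_built": 1962},
    {"sale_price": 560_000, "house_square_feet": 2_600, "bedrooms": 4, "year_built": 2008},
    {"sale_price": 290_000, "house_square_feet": 1_100, "bedrooms": 2, "year_built": 1984},
]

features = ["house_square_feet", "bedrooms", "year_built"]
for name, r in rank_by_correlation(sample_rows, "sale_price", features):
    print(f"{name}: r = {r:.3f}")
```

Ranking by absolute value treats strong negative and strong positive relationships alike. A correlation coefficient only measures linear association, so a low value does not by itself rule out that a variable matters.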

The data has outliers, which illustrates why the real estate industry uses median sale price instead of mean sale price. The scatterplot below helps us identify sale price outliers in each month. Two data points are over $6 million, far in excess of the other sales.

A scatter plot of house sale prices in Pierce County, Washington in 2020. The x-axis represents the sale date, starting on January 1, 2020 and ending on December 31, 2020. The y-axis represents the sale price of the home, measured in thousands of US dollars. There are two clear outliers, one in April and one in December, where the house sold for over six million dollars.
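To see why outliers favor the median over the mean, the sketch below compares the two on a small list of prices that includes one very large sale. It also flags outliers with the common 1.5 × IQR rule. That rule is a choice made here for illustration and is not a method named in the text above. The prices are made-up values, not figures from the data set.

```python
from statistics import mean, median, quantiles


def summarize(prices):
    """Return the mean and median of a list of sale prices."""
    return {"mean": mean(prices), "median": median(prices)}


def iqr_outliers(prices, k=1.5):
    """Return prices outside [Q1 - k*IQR, Q3 + k*IQR] (the usual IQR rule)."""
    q1, _, q3 = quantiles(prices, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [p for p in prices if p < low or p > high]


# Hypothetical prices for illustration only; NOT values from the data set.
sample_prices = [250_000, 300_000, 320_000, 350_000, 400_000, 450_000, 6_200_000]

stats = summarize(sample_prices)
print(f"mean:   {stats['mean']:,.0f}")
print(f"median: {stats['median']:,.0f}")
print("flagged as outliers:", iqr_outliers(sample_prices))
```

In this made-up sample the single multi-million-dollar sale pulls the mean well above every other price, while the median stays at the middle sale. The same effect makes the median the more representative summary when a few sales are far larger than the rest.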