The FRQ is a great way to prep for the AP exam! Review FRQ practice writing samples and corresponding feedback from Fiveable teacher Jerry Kosoff.
Natural gas is used in many households to heat water, provide cooking fuel, and heat the home. January is typically a month in which many homes have their highest usage of natural gas. A utility company reviewed data from their customers in a certain city for a period of 15 years. For each year, the company recorded the average daily high temperature in the month of January (in degrees Fahrenheit), as well as the average household usage of natural gas (measured in therms, a unit of heat energy). The company displayed the data on a scatterplot, with the average daily high temperature on the x-axis and the average household usage of natural gas on the y-axis. The company noticed a negative, strong, linear association between the variables.
a. In the context of this situation, describe the meaning of the words “negative,” “strong,” and “linear.”
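b. A least-squares regression line created from the company’s data is shown below.
ŷ = 85.372 - 1.698x, where x is the average daily high temperature in January (in degrees Fahrenheit) and ŷ is the predicted average household usage of natural gas (in therms).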
Interpret the meaning of the slope of this regression line in the context of this problem.
c. In one of the years in the data set (2015), the average daily high temperature for the month of January was 20 degrees Fahrenheit, and the average household natural gas usage was 45 therms. Calculate and interpret the residual for the year 2015.
a.) The word negative means that as the daily temperature increases, the usage of natural gas decreases. By using the word strong, the company is implying that a lot (if not all) of the points decrease negatively at the same correlation. The word linear shows that the line of best fit would be a straight line.
b.) As the temperature decreases, it is expected that each household’s usage of natural gas will decrease around 1/698 degrees Fahrenheit on average.
c.) The residual was -6.412 therms. According to the least squares regression line, the predicted usage per household of natural gas was 51.412 therms. The real average household natural gas usage was 45 therms, which is 6.412 therms less than the predicted usage.
Teacher feedback
In part (a), you do a good job of describing “negative,” but your descriptions of “strong” and “linear” fall a little bit short. For “strong,” it’s not clear what you mean by “same correlation.” If you’re referring to a similar rate of change, that would fit under “linear.” The word “strong” refers to the idea that the predicted values of # of therms used are typically close to the actual # of therms used; that is, there are small residuals based on the regression line. For “linear,” I’m actually unsure if your description would get credit or not - you should be referring to how the points on the scatterplot appear to create a straight line.
In parts (b) and (c), you have strong explanations and would earn full credit for both parts. Overall, this is well done!
a) Negative means as the average daily high temperature increases, the average household usage of natural gas decreases.
Strong means the average household usage of natural gas tends to be close to the predicted average household usage of natural gas.
Linear means as the average daily high temperature increases, the average household usage of natural gas tends to decrease at a constant rate.
b) For each additional degree in temperature, we expect the average household usage of natural gas to decrease by 1.698.
c) The average natural gas usage at 45 therms was 6.42 degrees less than predicted.
Teacher feedback
In part (a), your responses are clear and concise. For the description of “negative,” you could strengthen your response by saying “tends to decrease,” but your response would earn credit as-is.
In part (b), I would be careful to include units in your response (“degree” should be sufficient but adding “Fahrenheit” can strengthen, and you should include “therms” for the natural gas). Your sentence works as written, and “expected” will work as a substitute for “predicted” or “on average,” which is what’s typically written on AP rubrics.
For part c, you’ve correctly interpreted the residual, but reversed the units. The residual was for the number of therms being used (so we used 6.42 therms less than predicted); assuming you show your math on the real exam (hard to do in this forum, I know), you would likely earn partial credit for that mix-up.
(a) The word strong means that the points on a scatterplot are close to the line of best fit and that there’s small residuals. The word linear means that the points form a straight line and that as temperature increases, the usage of natural gas decreases at a constant rate. The word negative means that as the temperature increases, the average usage of natural gas decreases and has a inverse relationship.
(b) As the temperature increases by 1 degree Fahrenheit, we expect the household usage of natural gas to decrease by 1.698 therms on average.
Residual=6.412. My observed y of 51.412 degrees Fahrenheit when it is 20 degrees Fahrenheit outside is 6.412 degrees Fahrenheit above what the model shows.
Teacher feedback
Your answer to part (a) communicates everything it needs to, and includes context of the problem. In part (b), you’ve stated everything clearly, and including the expression “on average” at the end of the sentence makes it clear that this is a prediction, not a guarantee. You would earn “essentially correct” (E) for both parts (a) and (b). For part (c), be careful: the 51.412 you’ve calculated is the number of therms, and is a predicted value (not an observed value). Therefore, the residual should be negative instead of positive (as we’d do 45 - 51.412). Given both the reversal of signs and the incorrect units in your interpretation, it is likely that you would be scored “incorrect” (I) for that part of the problem.
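The residual arithmetic in part (c) can be checked with a short sketch. It assumes the regression equation that appears in the student calculations (y hat = 85.372 - 1.698x); the function and variable names are made up for illustration and are not part of the original problem.

```python
# Coefficients assumed from the equation used in the student work
# (y hat = 85.372 - 1.698x); names here are illustrative only.
INTERCEPT = 85.372  # predicted therms at 0 degrees Fahrenheit
SLOPE = -1.698      # predicted change in therms per 1 degree Fahrenheit


def predicted_usage(temp_f):
    """Predicted average household usage (therms) for a January high temp."""
    return INTERCEPT + SLOPE * temp_f


def residual(actual_therms, temp_f):
    """Residual = actual - predicted, measured in therms (not degrees)."""
    return actual_therms - predicted_usage(temp_f)


# Year 2015 from the prompt: 20 degrees Fahrenheit, 45 therms observed.
y_hat = predicted_usage(20)
res = residual(45, 20)

print(f"predicted = {y_hat:.3f} therms")  # predicted = 51.412 therms
print(f"residual  = {res:.3f} therms")    # residual  = -6.412 therms
```

The negative sign says the actual usage fell below the prediction, and the units stay in therms because the residual is a difference of two y-values.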
a.) In the context of this situation, negative means that average daily high temp. increase (x), while the average usage of natural gas (y-hat) decreases. Strong means that the predicted average usage of natural gas is close to the actual average usage of natural gas. Linear means that the average daily temperature increases (x), while the usage of natural gas decreases at a constant rate.
b.) As the average daily high-temperature increases by 1-degree Fahrenheit, the average household usage of natural gas will decrease by 1.698 therms.
c.) y (hat)=85.372-1.698(20)=51.412 therms
Actual-Predicted= 45-51.412=-6.412 therms
The least-regression line shows that the average household natural gas usage was 6.412 therms less than the predicted household natural gas usage.
Teacher feedback
In part (a), you make good use of context throughout your answers. Be careful that when you’re describing the scatterplot, you are making references to x and y (not y-hat), since the scatterplot contains actual values. You should also say things like “as the average daily high temperature increases…” in your description of “negative” and “linear” (your answers left off the “as”).
In part (b), your sentence framing is good and uses correct units, but misses a key thing that “gets” a lot of students on these types of problems: you’re missing “non-deterministic” language for the “decrease by 1.698 therms” part of your sentence. That’s the AP rubric’s fancy way of saying “you make it sound like it will decrease by this much, when it is only predicted to decrease by this much on average.” You’ll need to use “predicted,” “on average,” “estimated,” or similar words when describing slope or intercept. Your answer would earn partial credit.
In part (c), you correctly calculate and clearly interpret the meaning of your answer in the context of this problem. Nicely done!
a. “negative”: As the average daily high-temperature increases, the average household usage of natural gas decreases
“strong”: The points relating the daily high-temperature and the average household usage of natural gas fit the line of best fit well. (small residuals)
“linear”: The points relating the daily high-temperature and the average household usage of natural gas follow a straight line pattern.
b. Based on the LSRL, as the average daily high-temperature increases by 1, the average household usage of natural gas decreases by 1.698
c. x= 20 Predicted average household usage of natural gas= 51.412
Residual= Actual - Predicted = 45-51.412 = -6.412
The LSRL overestimated the average household usage of natural gas when it was 20 degrees F during 2015 for being 51.412. The prediction is 6.412 larger than the actual average household usage of natural gas.
Teacher feedback
In part (a), your answers are clearly-written and presented in context. All three descriptions are correct; you would earn full credit.
In part (b), you are missing one key word that would make your answer complete, and it would be enough to mark you down to partial credit: something like “predicted” or “expected” when talking about the change in response variable. We should say something like “…the average household usage of natural gas is predicted to decrease by 1.698 therms”. This is a common error, so watch out for it.
In part (c), you perform the residual calculation correctly and describe the meaning of it in context. Full points!
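To see why the slope in part (b) is described as a predicted change, here is a small sketch. It assumes the regression equation used in the student calculations (y hat = 85.372 - 1.698x); the function name and the sample temperatures are hypothetical.

```python
def predict(temp_f, intercept=85.372, slope=-1.698):
    """Predicted average household usage (therms); coefficients are assumed
    from the equation used in the student work."""
    return intercept + slope * temp_f


# Compare predictions one degree Fahrenheit apart at a few sample temperatures.
sample_temps = (10, 20, 30)
changes = [predict(t + 1) - predict(t) for t in sample_temps]

for t, change in zip(sample_temps, changes):
    print(f"{t} -> {t + 1} F: predicted change = {change:.3f} therms")
```

Every one-degree step changes the prediction by the same amount, about -1.698 therms. That is a statement about the line, not about any individual year, which is why the interpretation needs a word like “predicted” or “on average.”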
a) The meaning of the word negative in this statement means that as the temperature is increasing the amount of natural gas used is decreasing. The meaning of the word strong in this statement means that the residuals of the given points are very small and are close together to the line of best fit implying that there is a very close relationship with temperature and the natural gas usage, and the word linear means that the relationship is best described by a linear straight line as the data when plotted is seen to have an approximately straight line.
b) For every increase of 1 Fahrenheit in the temperature, the predicted natural gas usage decreased by 1.698
c) The residual is represented by the formula Actual - Predicted, and the residual value which is calculated would be -6.412 therms, which represents that the actual value for the number of therms is less than the one predicted. It can be said that the actual value for the number of therms would be 45 which is lower than the predicted value of 51.412 by the LSRL.
Teacher feedback
In part (a), you clearly detail the meaning of the words “negative,” “linear,” and “strong,” including units and context along the way.
In part (b), you give a correct description of slope - including the key phrase predicted. You can strengthen your answer by including units on the response variable (therms, in this case).
In part (c), you correctly calculate the residual. On the “real exam,” you’d want to show where the -6.412 came from. In your explanation of what it stands for, you should also use the number 6.412 in some way (by saying, for example, “which is 6.412 therms lower than the predicted value…”).
A) Negative: for every increase in average daily high temperature in degrees Fahrenheit, the average household usage of natural gas tends to decrease.
Strong: the predicted number of therms used and the actual number of therms used are close in value creating small residual sizes and strong association
Linear: the points in the scatterplot appear to create a straight line. For every increase in average daily temperature, the average household usage of natural gas decreases at a constant rate.
B) As average daily high temperature increases by 1 degree Fahrenheit, the average household usage of natural gas is predicted to decrease by 1.698 therms.
C) y hat = 85.372 - 1.698 (20)
y hat = 51.412
y = 45
residual = 45 - 51.412
residual = -6.412
context : The actual average household usage of natural gas is 6.412 therms less than the predicted average household usage of natural gas by the LSRL.
Teacher feedback
Your answer is very thorough! All three parts address what is asked in the question (without anything unnecessary added), and you use appropriate language such as “predicted” in part (b) and provide answers in context and with appropriate units. You would likely earn full credit on all parts. Well done!