11 min read • January 12, 2021

Jerry Kosoff

Practicing with FRQs is a great way to prep for the AP exam! Work through this FRQ from Unit 6, then review sample student responses and corresponding feedback from Fiveable teacher Jerry Kosoff!

After completing a sale, a car company likes to send a follow-up survey where customers can indicate their level of satisfaction with their experience. One of the questions in the survey asks “would you recommend our company to a friend looking to purchase a vehicle?” The company wonders if people would answer the question differently based on whether they bought a new or used vehicle. From a list of all 2018 vehicle sales, the company randomly selects 105 customers who bought a new vehicle and 120 customers who bought a used vehicle. 88 of the customers who bought new vehicles answered “yes,” while 85 of the customers who bought used vehicles answered “yes.”

P(yes|new) = 88/105 = 0.838

P(yes|used) = 85/120 = 0.708

margin of error = +/- (1.960)sqrt[(0.838(0.162)/105)+(0.708(0.292)/120)]

margin of error = +/- 0.108

confidence interval = 0.05 +/- 0.108 = (-0.058, 0.113)

No, the data do not provide convincing statistical evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales. Since 0 is captured in the 95% confidence interval of -0.058 and 0.113, the data shows that the true difference in proportions could be 0.

Teacher Feedback

I’ll give feedback on your work below, but I want to start by noting that you used a 2-sample confidence interval to answer the question. That is a totally valid strategy for a situation like this, but only because the alternative hypothesis was “different”; had the scenario asked “higher” or “lower,” the confidence interval would not work in the same way. Typically, when given a significance level and asked if there is “convincing statistical evidence” of something, we should be running a hypothesis test. That said, you will still be scored for your work with the confidence interval.

The scoring for a “convincing statistical evidence…” scenario includes:

- Stating null/alternative hypotheses
- Defining the parameters in the null/alternative hypotheses
- Choosing an appropriate test/interval by name
- Checking the conditions to run the chosen test/interval
- Writing the results from the chosen test/interval
- Correctly interpreting the results from the chosen test/interval in terms of whether we do or don’t have evidence for the alternative hypothesis.

Given that list (some parts are scored together to create a question with 3-4 scoring components), you can likely see that your work doesn’t have enough to earn much of the available credit. You calculate the appropriate margin of error, and therefore obtain a confidence interval, but never name the interval, check conditions (random samples, approximately normal sampling distribution [at least 10 successes/at least 10 failures], 10% condition), or write hypotheses. Additionally, you used “0.05” as the center of the interval, instead of using your difference of proportions (0.838 - 0.708 = 0.13) to add/subtract the margin of error. That would have led you to a different confidence interval where 0 was not included. Given that your interval did include 0, though, your conclusion that we do not have convincing evidence would be scored as correct, because you correctly interpreted the answer you got. Unfortunately, you would not get credit for the other components of the question.
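To see what the interval looks like when it is centered on the observed difference of proportions, here is a minimal Python sketch. The function name `two_prop_ci` is made up for illustration, and it uses the exact sample proportions rather than the rounded 0.838 and 0.708, so the last digit may differ slightly from hand work.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_ci(x1, n1, x2, n2, conf_level=0.95):
    """Two-sample z interval for p1 - p2 (hypothetical helper).

    Uses the unpooled standard error, as in the margin of error
    formula from the student response above.
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Critical value z* for the chosen confidence level (about 1.960 for 95%).
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    std_err = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin = z_star * std_err
    return diff, margin, (diff - margin, diff + margin)


# New vehicles: 88 "yes" out of 105. Used vehicles: 85 "yes" out of 120.
diff, margin, (low, high) = two_prop_ci(88, 105, 85, 120)
print(f"difference = {diff:.3f}, margin of error = {margin:.3f}")
print(f"95% CI = ({low:.3f}, {high:.3f})")
```

With these counts the interval comes out to roughly (0.022, 0.237), which does not contain 0.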

p_1 = the proportion of customers who bought a new vehicle and answered yes to the survey question

p_2 = the proportion of customers who bought a new vehicle and answered yes to the survey question

2-sample z test for p_1 - p_2

H_0: p_1 - p_2 = 0, H_a: p_1 - p_2 not equal 0

- Random - Stated that the company “randomly selects” customers for the survey
- 10% Condition for Independence - satisfied since it is safe to assume that there are at least 105(10) = 1050 customers who bought a new vehicle at the car company, and at least 120(10) = 1200 customers who bought a used vehicle at the car company.
- Large Counts Condition - satisfied since
  - n_1*p-hat_1 = 105*0.838 = 87.99 >= 10
  - n_1*(1-p-hat_1) = 105*0.162 = 17.01 >= 10
  - n_2*p-hat_2 = 120*0.708 = 84.96 >= 10
  - n_2*(1-p-hat_2) = 120*0.292 = 35.04 >= 10
- With Large Counts Condition satisfied, the sampling distribution of p-hat_1 - p-hat_2 is approximately normal.

p-hat_1 = 88/105 = 0.838, n_1=105

p-hat_2 = 85/120 = 0.708, n_2=120

z* = 2.303

P-val = P(z>=2.303 or z<=-2.303) = 0.021247

Since 0.021247 < alpha of 0.05, we reject the null hypothesis, because there is convincing statistical evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales.

Teacher Feedback

Strong execution from top to bottom, presented clearly. One thing that you’re going to facepalm about: you defined the parameters p1 and p2 as the exact same thing. “p2” should say “used.”
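The Large Counts and 10% checks from this response are simple enough to script. The sketch below is only an illustration; the helper names `large_counts_ok` and `min_population_for_10_percent` are invented here, and the threshold of 10 is the one used throughout this guide.

```python
def large_counts_ok(successes, n, threshold=10):
    """Large Counts: at least `threshold` successes and failures."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


def min_population_for_10_percent(n):
    """10% condition: the population must be at least 10 times the sample."""
    return 10 * n


# (number of "yes" answers, sample size) for each group in the problem
samples = {"new": (88, 105), "used": (85, 120)}

for label, (yes_count, n) in samples.items():
    print(
        f"{label}: successes = {yes_count}, failures = {n - yes_count}, "
        f"large counts met = {large_counts_ok(yes_count, n)}, "
        f"population must be at least {min_population_for_10_percent(n)}"
    )
```

Whether the population really is that large still has to be argued in words; the code only reports the size that the 10% condition requires.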

Test type- two sample proportional difference hypothesis z-test

- H0= p1-p2=0
- Ha=p1-p2 does not = 0
- p1=the proportion of all customers from a list of 2018 vehicle sales who bought a new vehicle and would recommend our company to a friend
- p2=the proportion of all customers from a list of 2018 vehicle sales who bought a used vehicle and would recommend our company to a friend

Conditions

- Since we are dealing with a two-sample proportional difference hypothesis test, we will have to pool/combine our proportions.
- pc=88+85/105+120=0.75
- qc=1-0.75=0.25
- *we have to check for the independence of our pooled data --> .75(225)=168.75 >=10 & .25(225)=56.25 >= 10

Conditions for new cars:

- simple random sample: stated in the problem- “the company randomly selects 105 customers…”
- independence: 10(105)=1050 Assume that the population of new cars purchased in 2018 is greater than 1050.
- normal: .84(105)=(88 >= 10), .162(105)=(17 >= 10) All of the conditions for new cars are met.

Conditions for old cars:

- simple random sample: stated in the problem- “the company randomly selects…120 customers…”
- independence: 10(120)=1200 Assume that the population of new cars purchased in 2018 is greater than 1200.
- normal: .708(120)=(85 >= 10), .292(120)=(35 >= 10)
- All of the conditions for old cars are met.

Solve

- z=.84-.708/(sqrt.(.25x.75)/105 + (.25x.75)/120) = 2.28 --> *I looked at table z to find the p-value of -2.28, p-value=.0129(2)=.0258

Conclusion

- (.0258<.05 our significance level) --> Reject the H0 in favor of the Ha. We have significant evidence that the proportion of all customers from a list of 2018 vehicle sales who bought a new vehicle and said they would recommend the company is different than the proportion of all customers from a list of 2018 vehicle sales who bought a used vehicle and said that they would recommend our company to a friend.

Teacher Feedback

This is about as thorough a response as I’ve seen! Very well done - you’ve nailed all of the components.

p1= Proportion of customers who bought a new car and answers “yes” to the survey question.

p2= Proportion of customers who bought a used car and answered “yes” to the survey question.

Ho= p1-p2=0 Ha= p1-p2 does not equal 0

We are interested in conducting a 2 sample z test for a difference in population proportions.

Conditions:

- Random- A random sample of 105 customers who bought a new vehicle and 120 customers who bought a used vehicle is taken
- Normal-
- Sample of new cars: np = 105 * 0.838= 88 is greater than or equal to 10.
- n(1-p)= 105(0.162)= 17 is greater than or equal to 10.

- Sample of used cars: np= 120* 0.708= 85 is greater than or equal to 10.
- n(1-p)= 120(0.292)= 35 is greater than or equal to 10.

Calculator: 2-Prop Z Test {x1=88, n1=105, x2=85, n2=120, p1 does not equal p2} = p: 0.0212

Since the p-value of 0.0212 is less than our alpha level of 0.05, we have convincing statistical evidence to reject the null hypothesis. The proportion of customers who would answer “yes” to the survey question is different for new vs. used vehicle sales.

Teacher Feedback

Nice job! You’ve defined parameters, checked conditions, named the test, obtained an appropriate test statistic and p-value, and made an appropriate conclusion.
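For anyone without a graphing calculator handy, the pooled two-proportion z-test can be reproduced in a few lines of Python. This is a sketch under the usual textbook assumptions (pooled standard error under the null hypothesis, two-sided alternative); the function name `two_prop_z_test` is hypothetical and is not claimed to be what the calculator runs internally.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided, pooled two-proportion z-test (hypothetical helper)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Under H0: p1 = p2, combine both samples into one pooled estimate.
    p_pooled = (x1 + x2) / (n1 + n2)
    std_err = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / std_err
    # Two-sided p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_prop_z_test(88, 105, 85, 120)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

With x1 = 88, n1 = 105, x2 = 85, n2 = 120, this gives z of about 2.30 and a p-value of about 0.021, in line with the calculator output quoted in the response.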

H_o: p_1 = p_2

H_a: p_1 ≠ p_2

Where p_1 is the true proportion of customers who bought a new vehicle and answered “yes” to the survey question.

Where p_2 is the true proportion of customers who bought a used vehicle and answered “yes” to the survey question.

- Independence:
- We have 2 independent random samples of customers from 2018 vehicle sales.
- Population of new vehicle customers is at least 1050 and the population of used vehicle customers is at least 1200.

- Normality:
- n_1 * p-hat_1 = 105 * 0.8381 = 88 ≥ 10
- n_1 * (1-p-hat_1) = 105 * (0.1619) = 16.9995 ≥ 10
- n_2 * p-hat_2 = 120 * 0.7083 = 84.996 ≥ 10
- n_2 * (1-p-hat_2) = 120 * 0.2917 = 35.004 ≥ 10
- Since all 4 are greater than 10, the sampling distribution is approximately normal.

p_hat_combined = (105(0.8381) + 120(0.7083)) / (105 + 120) = 0.7689

z = ((0.8381 - 0.7083) - 0) / sqrt((0.7689 * (1-0.7689) / 105) + (0.7689 * (1-0.7689) / 120)) = 2.3036

p-value = 2*normalcdf(2.3036, 1E99, 0, 1) = 0.0212

alpha = 0.05

p-value<alpha

Since the p-value<alpha, we reject the H_o. There is sufficient evidence to suggest that the proportion of customers who bought a new vehicle and answered “yes” to the survey question is different from the proportion of customers who bought a used vehicle and answered “yes” to the survey question.

Teacher Feedback

Well done from top to bottom - you’ve got parameters, conditions, appropriate calculations, and appropriate conclusions. You’re ready!

H null: p1-p2=0

H alternative: p1-p2 doesn’t equal 0.

Anything with subscript 1 pertains to the new vehicle group while anything with subscript 2 pertains to the used vehicle group.

The test we will be using is a two sample z-test (for difference in proportion).

- np1 hat and nq1 hat both greater than 10 (or 15). np2 hat and nq2 hat both greater than 10 or (15). Calculations omitted, but it is clear that both groups both have more than 10 successes and failures.
- Samples are random and independent. This is met as the question specifies that the selection of customers from both groups was random, and it is clear that the selection of a customer from one group does not affect the selection of a customer from the other group.
- It is also safe to assume that the number of customers who bought new and used cars is at least 105*10 = 1050 and 120*10 = 1200 people, respectively.

Z= (p1 hat - p2 hat)/(phat *(1-phat)(1/n1 + 1/n2))^.5 where phat is the pooled proportion = 173/235 = 0.7361

p1 hat = 0.8381

p2 hat = 0.7083

Z=0.1298/((0.7361)(0.2639)(0.0179))^.5

Z=2.19

pvalue = 2(1-0.9857) = 0.0286.

Conclusion: Since the pvalue of 0.0286 is less than the alpha level of 0.05, we reject the null hypothesis, leading us to the conclusion that the data does provide convincing statistical evidence that the proportion of customers who would answer yes to the survey question is different for new vs used vehicle sales.

Teacher Feedback

Good work! You’ve done all required parts of a hypothesis test and answered appropriately. The only place where you might not receive full credit is in your hypotheses: you’ve used symbols, and named which group is which, but did not define the parameter in context. The parameter here was “proportion of people who would say yes to the survey,” which can then be differentiated between new/used sales.

Ho: p1=p2

Ha: p1=/p2 (=/ means not equal to)

Alpha = 0.05

p1: The proportions of customers that answered “yes” to “would you recommend our company to a friend looking to purchase a vehicle?” from buying the new vehicle

p2: The proportions of customers that answered “yes” to “would you recommend our company to a friend looking to purchase a vehicle?” from buying the used vehicle

2 sample z test for proportions

Random: met, since stated in the problem that company selects customers randomly.

10%: 105<= 1/10 all customers who bought new vehicle

120<= 1/10 all customers who bought used vehicle

Large Counts: For new: 105(88/105)>=10 105(1-(88/105))>=10

For Used: 120(85/120)>=10 120(1-(85/120)) >=10

Since, Large counts conditions are met for both we can assume that the distribution is approximately normal.

z score = 109/840 / root 0.769(1-0.769)/105 + 0.769(1-0.769)/120 = 2.3039

p-value=0.0212

Teacher Feedback

Nicely done! You’ve done all components of a hypothesis test correctly (and shout-out to the state-plan-do-conclude method that’s in the textbook I use as well).

2-sample z test for proportion

Ho: p1 = p2

Ha: p1 =/= p2

Conditions:

- Random: We are told that the customers are “randomly selected”
- Normal: It is approximately normal because:
- (88/105)(105) = 88 ≥ 10 (17/105)(105) = 17 ≥ 10
- (85/120)(120) = 85 ≥ 10 (35/120)(120) = 35 ≥ 10

- 10% condition: There are more than 10 x 120 and 10 x 105 customers. The two samples are independent from each other.

Calculation: Pc = 88+85/105+120 = 173/225

z= (88/105-85/120)-0/ sqrt(173/225)(52/225) x sqrt(1/105+1/120) = 2.304

p-value: .0212

Interpret: Because our p-value (.0212) is below the alpha (0.05) we reject the Ho. There is significant evidence that the proportion of customers who would answer “yes” to the survey question is different for new vs used vehicle sales.

Teacher Feedback

The only part you need more on is your hypotheses; you don’t define what you mean by “p1” and “p2”, so you’d only get partial credit there (you didn’t define the parameter). Every other part is done appropriately and would earn full credit.

pnew = proportion of customers who bought a new car and would recommend to a friend.

pused = proportion of customers who bought a used car and would recommend to a friend.

H0 = pnew − pold = 0
Ha = pnew − pold ≠ 0

- Random: “The company randomly selects 105 customers who bought a new vehicle and 120 customers who bought a used vehicle”
- Normal: For the new vehicle population, there are 88 ≥ 10 successes and 105 − 88 = 17 ≥ 10 failures. For the old vehicle population, there are 85 ≥ 10 successes and 120 − 85 = 35 ≥ 10 failures.
- Independent: It is reasonable to assume that the population of the people who bought new and old cars is at least 1050 and 1200 respectively.

We will be conducting a 2 sample z-test for difference of population proportions.

Calculations:

p=0.021

Teacher Feedback

Very good work. One small thing: in the “calculations” part, it is typical to show both the test statistic (in this case, your z) and the p-value.
