Lecture 2

Point and Interval Estimates

Review

Friday’s Lab

\[ \color{gray} \begin{array}{r|rrrrrr|r} j & 1 & 2 & 3 & 4 & 5 & 6 & \bar{y} \\ y_{j} & 0 & 1 & \textcolor[RGB]{7,59,76}{1} & 1 & \textcolor[RGB]{7,59,76}{0} & \textcolor[RGB]{7,59,76}{1} & \textcolor{black}{4/6} \\ \end{array} \]

  • On Friday, we talked about how to use sampling to summarize a population without surveying everyone in it.
    • We estimated the proportion of people in a population of six who prefer chocolate to sour candy.
    • In the fake population shown above, 4/6 people prefer chocolate.
    • Those who prefer chocolate are in the top row, at y=1. Those who prefer sour candy are below at y=0.
  • We tried a few different sampling schemes.
    • sampling with replacement, sampling without replacement, coin-flip randomization, convenience sampling
  • Ultimately we chose coin-flip randomization.
    • We flipped a coin for each person in our population to decide whether they would be sampled.
    • Of the 3 people who flipped heads, 2 prefer chocolate to sour candy.
    • So our estimate of the proportion of the population who preferred chocolate was 2/3.

Fake Data Simulation

\[ \color{gray} \begin{array}{r|rrrrrr|r} j & 1 & 2 & 3 & 4 & 5 & 6 & \bar{Y} \\ y_{j} & 0 & 1 & \textcolor[RGB]{7,59,76}{1} & 1 & \textcolor[RGB]{7,59,76}{0} & \textcolor[RGB]{7,59,76}{1} & \textcolor[RGB]{7,59,76}{2/3} \\ y_{j} & 0 & 1 & 1 & \textcolor[RGB]{239,71,111}{1} & 0 & \textcolor[RGB]{239,71,111}{1} & \textcolor[RGB]{239,71,111}{2/2} \\ y_{j} & \textcolor[RGB]{17,138,178}{0} & 1 & \textcolor[RGB]{17,138,178}{1} & \textcolor[RGB]{17,138,178}{1} & 0 & \textcolor[RGB]{17,138,178}{1} & \textcolor[RGB]{17,138,178}{3/4} \\ \end{array} \]

  • This was a huge time-saver. But is our estimate trustworthy?
  • How close was our estimate to the actual proportion in the population of 6?
  • We looked into it ahead of time by running a simulation of our survey on a fake population.
    • We did exactly what we did in our actual survey, but we did it over and over.
    • This let us look at what our estimator does, knowing what we want it to do: be \(\textcolor{gray}{ \approx 4/6}\).
    • Above, we’ve shown the results of three of these simulations.
  • We call the distribution of estimates we get like this our estimator’s sampling distribution.
    • These three estimates are three draws from the sampling distribution.
    • That’s three equally-likely outcomes of our survey. And we do ok in these three.
    • But that’s just three. To get a better sense of what’s likely, we should look at a lot more.
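Here's a rough R sketch of that simulation on the fake population above. The name `one.survey`, the seed, and the choice to drop surveys in which nobody flips heads are our own assumptions, not necessarily what we did in lab.

```r
y = c(0, 1, 1, 1, 0, 1)   # fake population: 1 = prefers chocolate

# One simulated survey: flip a fair coin per person, average the sampled responses
one.survey = function(y) {
  heads = rbinom(length(y), 1, 0.5) == 1
  mean(y[heads])           # NaN when nobody flips heads
}

set.seed(1)                # arbitrary seed, only for repeatability
estimates = replicate(1000, one.survey(y))
estimates = estimates[!is.nan(estimates)]   # assumption: drop empty samples

# Fraction of simulated surveys landing on each possible estimate
table(estimates) / length(estimates)
```

Each entry of `estimates` is one draw from the sampling distribution, so the table is a numerical version of the dot plot.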

Visualizing the Sampling Distribution

  • Above, we plot 1000 draws from the sampling distribution as 1000 dots.
  • To highlight what we want, we’ve drawn the population proportion as a green line.
  • Eyeballing it, we can see that estimates are usually pretty close, but not always.
  • To be more precise, we could count the dots in each column. But it’s easier to ask our computer to do it for us.

  • We can draw in the proportion of dots in each column as a bar graph.
  • If you really want to count dots in columns, this is the way to do it.
  • But bar graphs like this can be a bit counterintuitive visually because of the uneven spacing of the columns.
    • Most of the dots are in the middle, but there are a lot of columns there. A lot of slightly different estimates.
    • As a result, no bar there is particularly high. The highest bar is out at \(x=1\).
  • How do we visualize the distribution of dots without this problem?

  • We can group together nearby columns into equal-width bins and count the dots in each bin.
  • This gives you a sense of the density of estimates near each value of \(x\).
  • We call this kind of plot a histogram.
  • If you want to know the fraction of dots in some interval, this makes it easy.
    • As long as it’s the interval between two bin edges.
    • It’s the fraction of the histogram’s area that’s between them.
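To make the 'area between two bin edges' idea concrete, here is a small sketch in R. The bin width of 0.1, the interval from 0.5 to 0.8, and the handling of empty samples are illustrative choices of ours.

```r
y = c(0, 1, 1, 1, 0, 1)
set.seed(1)
estimates = replicate(1000, mean(y[rbinom(length(y), 1, 0.5) == 1]))
estimates = estimates[!is.nan(estimates)]   # assumption: drop empty samples

# Ten equal-width bins covering [0, 1]
h = hist(estimates, breaks = (0:10) / 10, plot = FALSE)

# Area of the bars between the bin edges 0.5 and 0.8 (bins 6, 7 and 8)
bins = 6:8
area = sum(h$density[bins] * diff(h$breaks)[bins])

# The same fraction, counting dots directly; hist() bins are right-closed by default
direct = mean(estimates > 0.5 & estimates <= 0.8)

c(area = area, direct = direct)
```

The two numbers should agree, since each bar's area is the fraction of dots that fall in its bin.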

Sampling Distributions in Real Studies

Your Estimator in a Simulated Study
Your Estimator in a Real Study
  • It’s easy to think about sampling distributions when we’re running a simulation of our study.

    • We can run our simulation as many times as we want and plot the sampling distribution.
    • And we can use our knowledge of the population to see if we’re happy with it.
  • In a real study, we don’t have any of this. All we get is one estimate.

  • But that doesn’t stop us from filling in the rest of the picture by …

    1. thinking of it as one draw from our estimator’s sampling distribution.
    2. working out where that sampling distribution is in relation to the estimation target.
    3. using the data we have to estimate the sampling distribution.

What Do We Do with a Sampling Distribution?

  • A sampling distribution is an odd summary of the proportion of people who prefer chocolate.
    • It’s not what you asked for. You asked for a single number describing a population.
    • What you got was 1000 numbers describing 1000 samples.
  • But it does tell you something if you look at it right.
    • How often is the estimate exactly equal to the population proportion? 20% of the time.
    • How often is it off by \(\textcolor[RGB]{17,138,178}{1/3}\) or less? 93% of the time.
  • To make this sound like a statement about the population, we report two things.
    1. An interval you can expect the population proportion to be in.
    • e.g., we think it’s in the interval \(\textcolor[RGB]{7,59,76}{2/3} \pm \textcolor[RGB]{17,138,178}{1/3}\) because \(\textcolor[RGB]{7,59,76}{2/3}\) is our sample proportion.
    2. The degree of confidence you have that it’s actually in it.
    • e.g., it’s in an interval calculated exactly like this in 93% of surveys run exactly like this.
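A sketch of how you might count these things in R, using the same six-person fake population. Counting a survey in which nobody flips heads as a miss is an assumption of ours, and with a different seed or a different rule the percentages will move a little.

```r
y = c(0, 1, 1, 1, 0, 1)
target = mean(y)           # population proportion, 4/6

set.seed(1)
estimates = replicate(1000, mean(y[rbinom(length(y), 1, 0.5) == 1]))

tol = 1e-9                 # guards against floating-point rounding at the boundary
ok = !is.nan(estimates)    # assumption: an empty sample counts as a miss

# How often is the estimate exactly equal to the target?
hit = ok & abs(estimates - target) < tol
# How often does the interval 'estimate ± 1/3' contain the target?
covers = ok & abs(estimates - target) <= 1/3 + tol

c(exact = mean(hit), within.third = mean(covers))
```

The second number is the simulated coverage of intervals of the form estimate ± 1/3.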

Other Sampling Schemes

One Sampling Distribution per Sampling Scheme

Figure panels, one per sampling scheme: coinflip.randomization, convenience, with.replacement, without.replacement

  • Let’s take a moment to get a sense of how different sampling schemes affect our estimate.
    • To do this, we’ll look at 1000 draws from the sampling distribution we get using each.
    • To make it a bit easier to read, we’ll display a histogram of each too.
  • Three of the four schemes we’ve plotted give us fairly similar sampling distributions.
    • They all have a peak around 2/3, which is the population proportion.
    • That’s where most of our estimates wind up being—where we want them.
  • The one that’s different is convenience sampling.
    • It was most convenient for me to sample the first three people in our population.
    • That doesn’t change, so neither does our estimate. In this case, it’s a good one. That happens sometimes.
    • But you can’t count on it. Without a random mechanism, it’s hard to know your estimate will be good.
    • Using multiple, meaningfully different fake populations can help you catch stuff that works by luck in one.
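Here is one way the four schemes could be simulated in R. The sample size `n = 3` for the fixed-size schemes and the list name `schemes` are assumptions made for illustration.

```r
y = c(0, 1, 1, 1, 0, 1)
n = 3   # assumed sample size for the fixed-size schemes

# Each scheme is a function that runs one survey and returns the sample proportion
schemes = list(
  coinflip.randomization = function() mean(y[rbinom(length(y), 1, 0.5) == 1]),
  convenience            = function() mean(y[1:n]),   # always the first three people
  with.replacement       = function() mean(y[sample(length(y), n, replace = TRUE)]),
  without.replacement    = function() mean(y[sample(length(y), n, replace = FALSE)])
)

set.seed(1)
# 1000 draws per scheme, one column per scheme
draws = sapply(schemes, function(draw) replicate(1000, draw()))

# Average estimate per scheme, ignoring coin-flip surveys that sampled nobody
colMeans(draws, na.rm = TRUE)
```

The convenience column never varies, which is the point: its estimate is only as good as the first three people happen to be.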

We’ll Focus on One

  • Today, we’ll focus on sampling with replacement.

    • That’s just to keep things simple and concrete.
    • Most of what we’ll say will apply to any sampling scheme.
  • We’ll really be talking about the relationship between three things.

    1. Our estimator
    2. Its sampling distribution
    3. Our estimation target
  • As the semester continues, we’ll look into this more generally. We’ll talk about …

    • how we want these three things to be related.
    • when we know they’re related that way.

Polling

Context

  • Suppose that, the week before the 2020 presidential election, you did some polling.
    • You use a list of the \(m \approx 7.23M\) people registered to vote in Georgia.
    • And you make \(n=625\) phone calls.
    • On each call, you select a voter uniformly at random from the list, e.g. by rolling a 7.23M-sided die.
    • And then you ask the potential voter whether they plan to vote.
  • Suppose also that all these registered voters will
    • Pick up the phone when called
    • Respond honestly to your questions
    • Not change their minds about voting
  • That is, suppose they tell us whether they do ultimately vote on election day.

Polling Results

  • You put your polling results in a table.
  • It has one column for each call.
  • In that column, you record …
    • the call number \(i\)—a number from \(1 \ldots 625\).
    • the response \(Y_i\) of the person you called: \(1\) for ‘yes’ and \(0\) for ‘no’.

\[ \begin{array}{r|rrrr|r} i & 1 & 2 & \dots & 625 & \bar{Y}_{625} \\ Y_i & 1 & 1 & \dots & 1 & 0.68 \\ \end{array} \]

  • You summarize the results with an extra column: the mean of the responses.
    • Remember that the mean of a binary variable is a frequency. Or a proportion.¹
    • You found that 68% of the people polled said they would vote.

Sample Versus Population

  • We want to estimate the proportion of all registered voters — our population — who will vote.
  • To do this, we use the proportion of polled voters — our sample — who said they would.
  • When the election occurs, we get to see who turns out to vote.
    • 5.05M people, or roughly 70% of registered voters, actually vote.

 

Before the Election

\[ \begin{array}{r|rrrr|r} i & 1 & 2 & \dots & 625 & \bar{Y}_{625} \\ Y_i & 1 & 1 & \dots & 1 & 0.68 \\ \end{array} \]

Outcomes for our sample of polled voters.
To enumerate our sample, we …

  • give each call a number \(i \in 1 \ldots 625\).
  • write \(Y_i\) for the turnout of the \(i\)th person we called.

After the Election

\[ \begin{array}{r|rrrrrr|r} j & 1 & 2 & 3 & 4 & \dots & 7.23M & \bar{y}_{7.23M} \\ y_{j} & 1 & 1 & 1 & 0 & \dots & 1 & 0.70 \\ \end{array} \]

Outcomes for the population of registered voters.
To enumerate our population, we …

  • give each registered voter a number \(j \in 1 \ldots 7.23M\).
  • write \(y_j\) for the turnout of the person with ID \(j\).

Success!

Our sample proportion \(425 / 625 = 0.68\) is close to the population proportion \(5.05M / 7.23M \approx 0.70\).


  • That’s pretty accurate. Our reputation as a turnout pollster is intact for now.
  • But unless we’re looking to retire, one success isn’t enough. We’re going to poll again.
  • If our methods aren’t reliable, we’ve got work to do fixing them.
  • Even if they are, if we overstate our accuracy, we’re going to have to answer for it.
    • e.g. we could say we’ll be off by at most 2%, since that’s what happened this time.
    • But if it’s 4% next time, that’s not going to look good.
  • That leaves us with two questions.
    1. Was it luck that we got as close as we did?
    2. Could we have predicted how close we’d get before the election happened?
  • To find answers, we’ll think about what’d happen if our friends had run identical polls.
    • Each friend will have a different random sample \(Y_1 \ldots Y_{625}\).
    • And estimate the population proportion using the proportion in their sample.
  • We’ll see how accurate these estimates tend to be.
  • This ‘friends’ stuff is just an informal way of talking about the sampling distribution of our estimator.
    • The sampling distribution is the probability distribution of our estimator.
    • i.e., the distribution of the turnout frequency in a sample of size \(n=625\)
    • … drawn with replacement from the population \(y_1 \ldots y_{7.23M}\).
  • Each friend’s estimate is, like ours, a random variable with this probability distribution.

Review: Connecting Sample and Population

  • For each call \(i\), we randomly select a voter with an id we’ll call \(J_i\).
  • And we record as that call’s outcome the turnout of that voter: \(Y_i=y_{J_i}\).
  • On each call, each registered voter has a \(1/7.23M\) chance of being called.
    • This is called sampling with replacement because we could call the same person twice.
    • In our poll, this is unlikely because we’re making a small number of calls relative to population size.
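To put a number on 'unlikely', here is a short calculation in R. It assumes exactly the setup above: 625 independent calls, each uniform over a list of about 7.23M voters. The variable names are ours.

```r
m = 7.23e6   # registered voters (approximate, from the slides)
n = 625      # calls

# P(all n calls reach different people) = (1 - 0/m)(1 - 1/m) ... (1 - (n-1)/m)
p.no.repeat = prod(1 - (0:(n - 1)) / m)

# P(at least one person is called twice)
p.repeat = 1 - p.no.repeat
p.repeat   # a little under 3%
```

So in the large majority of polls like ours, nobody is called twice.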

The 625 voter IDs \(J_1 \ldots J_{625}\) we called, starting with 869369 and 4428455 and ending with 1268868:

869369 4428455 13218 2793742 5321915 3321825 5555477 6928310 7168330 454473 5154517 2376809 5768046 6610597 2194282 2872271 2149204 2275654 4837930 2476143 1279900 1246714 2866041 5667398 2822681 4565370 2550195 2717062 1549984 883330 2889106 1398478 6641857 6806854 5687039 7040838 6776423 4899281 6534669 2571993 3095897 1483886 5490535 1864845 6072220 7030064 2138669 1319199 6679697 4782567 5659475 5879600 6976064 6159850 6257663 6734140 3310301 2177674 37675 3803418 978362 1871768 2751536 1555125 5975254 2562785 5414234 2249771 32797 5080982 4540188 2388541 3006833 4329558 6469959 7184739 3332908 3170665 3908850 3644150 3164978 5955719 3323445 989651 6506626 3087425 2838887 6833276 5844578 4296723 1065378 4912671 2278243 3002771 7120459 1045544 541490 5984890 1567882 5131843 6580985 2054373 2514911 2384861 766078 880124 6639220 1042703 92737 6669073 5470335 293929 5914476 1990517 4687396 3163945 1881045 4132993 3448532 6885358 6906297 5721205 3424726 6650418 5179166 68680 2302426 2570741 2423624 3637703 6532217 4750875 2646031 3681574 4002182 205859 4998249 3675463 1636383 7186981 1822708 4015252 1899647 2713148 6755265 1798478 2004236 6094160 2853578 6401306 3439733 5621977 1571621 7067919 910063 5250031 5363297 6264392 3412536 6927079 4616155 7007666 1267042 6565213 4877826 146287 2477054 6823183 1779127 4517084 5056421 2795755 5265921 5875885 6354003 5894551 4415184 3735143 5209762 6074861 2256714 4565436 2512945 6297047 4052641 2289547 811255 3218149 3915098 5864304 2558626 4423736 1590182 5266904 4726701 7206622 4742859 2362848 6788099 3316371 2107776 3342463 213709 3913537 4159976 3438078 308842 2825763 1867637 2278052 4774263 6461091 1257622 5572051 7031551 1140268 2518813 5953700 2067935 3319927 5273829 3303159 5824457 1311253 5158771 4150744 5100615 5923674 1953261 3798287 349730 1318401 1072845 4009077 2909658 2598722 2634361 524024 1609695 1302254 312650 7018516 3592760 5154717 87937 3926085 3751385 6572000 2566285 1589380 2741222 317364 4217126 
7185805 497841 5378992 3077724 3327790 6673547 490199 3957056 5492851 4466114 743875 5895026 5107533 2001015 6367006 1679469 3136007 490582 2939782 1733548 6101258 1945729 386961 3927757 3525338 67552 3059556 2862341 5659814 5976476 770147 220975 1661773 190958 941693 6407334 3331069 220352 1657941 5556893 3548368 1311491 5837911 6541815 3173526 3066371 3665412 2342970 3384805 4222063 6615382 5552505 5906123 355376 6488377 387028 6767414 6313320 683723 3521237 5554257 3491986 7035208 5676038 3307770 6561561 98087 6910248 3829590 2291211 5031667 122344 4533647 3575990 2678845 5995068 107145 3147589 5911914 6088770 6601503 5807829 5665990 3091836 5566917 4097177 1064126 2150010 2612179 5306516 4146677 5658900 1989253 1727329 4219453 6089954 3620388 1268053 6667828 5758626 3029076 5485346 5138078 2691981 7007396 5966349 455250 3935489 4473534 7159670 2295138 4108570 6295571 5274995 833842 2517988 4640854 4750688 6767156 6340017 1791244 6157885 3636771 169167 2490455 5772801 5985780 563190 852596 6214904 5195221 2832844 6098774 6034096 766428 722073 3221593 504886 1987397 785438 582850 571753 2661419 1391993 1752106 6680124 6370627 3870081 4289747 4516194 6821882 762985 5347234 3072574 818471 220230 3219504 5655917 5790672 1441178 3108622 329270 4698709 1516355 6815013 864283 5016403 196604 1843773 4497133 6202062 3813623 5464607 164136 1562333 651725 2713286 4841594 6693362 1180024 5319863 5515950 1475802 5134301 2709021 4498774 4217713 5628429 1391360 3775814 7024681 6329624 1407777 7106899 516492 4898960 2789013 335320 5890300 4687893 414563 2749217 6109317 302724 714345 128843 614690 87676 4957062 7189050 4751847 5433349 5273929 3489260 3519577 787340 2270984 110782 6183088 3797118 4602501 3539591 2542252 7032543 3755522 5446836 3702195 3850793 1590691 2057979 5194759 5820650 5384442 6475097 838192 3297774 2217616 4896691 5765763 4589150 1384254 2037143 5738764 1831304 1504269 1167210 3830082 6860339 763274 3619116 1471046 518935 796254 7128930 472332 3298518 
1495348 145668 3431893 4976379 5747597 5927446 3612858 6449504 326196 3051703 2466139 612765 4986041 3604236 4131163 6711022 2713344 3958320 2424 4781693 7075339 6799674 6496945 2409271 4810517 3956708 1501660 2038809 1527291 4930426 3768837 1969873 343470 1885011 5877211 4205121 570950 4129260 2711012 4079940 4750409 6950457 2542910 2295866 1204100 2254391 6063672 996483 2684772 1250190 891310 6291564 32832 7057891 5383064 4899442 1483058 4314599 5057144 1335201 1804103 261209 1908638 2417056 6769802 2888343 571256 2102125 3548491 5454676 3577186 4729007 842757 990595 1676788 7179921 4203694 3102088 5390720 127150 5580324 3592944 4421710 3933771 2898320 1617275 2330309 5898719 7059483 114281 6906208 2964823 5503430 4486867 6365553 5510887 5443674 6247975 6810058 718293 4122388 2432071 6577949 5243400 2449529 2093976 4558167 4840715 634894 2766800 3011995 6846465 4401703 1446015 1028392 1268868

\[ \begin{array}{r|rr|rr|r|rr|r} \text{call} & 1 & & 2 & & \dots & 625 & & \\ \text{variable} & J_1 & Y_1 & J_2 & Y_2 & \dots & J_{625} & Y_{625} & \overline{Y}_{625} \\ \text{value} & 869369 & \underset{\textcolor{gray}{y_{869369}}}{1} & 4428455 & \underset{\textcolor{gray}{y_{4428455}}}{1} & \dots & 1268868 & \underset{\textcolor{gray}{y_{1268868}}}{1} & 0.68 \\ \end{array} \]

Our Poll

\[ \begin{array}{r|rr|rr|r|rr|r} \text{call} & 1 & & 2 & & \dots & 625 & & \\ \text{variable} & J_1 & Y_1 & J_2 & Y_2 & \dots & J_{625} & Y_{625} & \overline{Y}_{625} \\ \text{value} & 869369 & \underset{\textcolor{gray}{y_{869369}}}{1} & 4428455 & \underset{\textcolor{gray}{y_{4428455}}}{1} & \dots & 1268868 & \underset{\textcolor{gray}{y_{1268868}}}{1} & 0.68 \\ \end{array} \]

Our First Friend’s Poll

\[ \begin{array}{r|rr|rr|r|rr|r} \text{call} & 1 & & 2 & & \dots & 625 & & \\ \text{variable} & J_1 & Y_1 & J_2 & Y_2 & \dots & J_{625} & Y_{625} & \overline{Y}_{625} \\ \text{value} & 600481 & \underset{\textcolor{gray}{y_{600481}}}{0} & 6793745 & \underset{\textcolor{gray}{y_{6793745}}}{1} & \dots & 1377933 & \underset{\textcolor{gray}{y_{1377933}}}{1} & 0.71 \\ \end{array} \]

Our Second Friend’s Poll

\[ \begin{array}{r|rr|rr|r|rr|r} \text{call} & 1 & & 2 & & \dots & 625 & & \\ \text{variable} & J_1 & Y_1 & J_2 & Y_2 & \dots & J_{625} & Y_{625} & \overline{Y}_{625} \\ \text{value} & 3830847 & \underset{\textcolor{gray}{y_{3830847}}}{1} & 5887416 & \underset{\textcolor{gray}{y_{5887416}}}{1} & \dots & 4706637 & \underset{\textcolor{gray}{y_{4706637}}}{1} & 0.70 \\ \end{array} \]

The Sampling Distribution of our Estimate

\[ \begin{array}{r|rr|rr|r|rr|r} \text{call} & 1 & & 2 & & \dots & 625 & & \\ \text{pollster} & J_1 & Y_1 & J_2 & Y_2 & \dots & J_{625} & Y_{625} & \overline{Y}_{625} \\ \hline \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}869369 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}4428455 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}\dots & \color[RGB]{7,59,76}1268868 & \color[RGB]{7,59,76}1 & \color[RGB]{7,59,76}0.68 \\ \color[RGB]{239,71,111}2 & \color[RGB]{239,71,111}600481 & \color[RGB]{239,71,111}0 & \color[RGB]{239,71,111}6793745 & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}\dots & \color[RGB]{239,71,111}1377933 & \color[RGB]{239,71,111}1 & \color[RGB]{239,71,111}0.71 \\ \color[RGB]{17,138,178}3 & \color[RGB]{17,138,178}3830847 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}5887416 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}\dots & \color[RGB]{17,138,178}4706637 & \color[RGB]{17,138,178}1 & \color[RGB]{17,138,178}0.70 \\ {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} \\ \color[RGB]{6,214,160}1M & \color[RGB]{6,214,160}2533350 & \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}5539770 & \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}\dots & \color[RGB]{6,214,160}7068692 & \color[RGB]{6,214,160}1 & \color[RGB]{6,214,160}0.71 \\ {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} \\ \end{array} \]

The Sampling Distribution in R

We run 10,000 simulated polls and store who we call (J) and what they say (Y).

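# Assumes y (0/1 turnout for all m registered voters), m, and n = 625 are already defined.
Js = array(dim=c(10000, n))   # row rr: the voter IDs called in poll rr
Ys = array(dim=c(10000, n))   # row rr: the responses in poll rr
for(rr in 1:10000) {
  Js[rr,] = sample(m, n, replace=TRUE)   # n calls, each uniform on 1..m, with replacement
  Ys[rr,] = y[Js[rr,]]                   # Y_i = y_{J_i}
}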

We calculate the sample proportion for each poll.

meanY.samples = rowMeans(Ys)

And we histogram the result.

library(ggplot2)
ggplot() + geom_bar(aes(x=meanY.samples, y=after_stat(prop)), alpha=.2)
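The snippets above need `y`, `m`, and `n` to exist. Here is a self-contained sketch that builds a stand-in population from the rounded counts on the slides (5.05M voters out of 7.23M) and runs a smaller batch of polls. The stand-in `y`, the seed, and the 1000 polls are assumptions for illustration.

```r
m = 7230000                                  # registered voters (rounded)
y = rep(c(1L, 0L), c(5050000, m - 5050000))  # stand-in turnout vector: 5.05M ones
n = 625

set.seed(1)
polls = 1000   # fewer than the 10,000 above, to keep the sketch quick

# Each poll: n calls with replacement, then the sample proportion
meanY.samples = replicate(polls, mean(y[sample(m, n, replace = TRUE)]))

mean(meanY.samples)   # location: should sit near mean(y)
sd(meanY.samples)     # spread: should sit near sqrt(mean(y) * (1 - mean(y)) / n)
```

The order of the ones and zeros in `y` doesn't matter here, because every call picks an ID uniformly at random.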

Before we look at the real thing, let’s sketch the histogram we expect to see.

Checklist

  • Is the histogram you drew in the location you think it should be?
  • What about the shape? Is its spread what you think it should be?

Annotating the Sampling Distribution Histogram

  • There are a few features we may want to highlight.
    • The mean of the sampling distribution is the solid blue line.
    • The middle 2/3 of the sampling distribution lies between the dashed blue lines.
    • The middle 95% of the sampling distribution lies between the dotted blue lines.
  • Our estimation target — the turnout frequency in the population — is drawn as a wide green line.
    • It’s in exactly the same place as the solid blue line.²
    • What does this tell us about our estimator?

Revisiting our Questions

Observation. Our estimate — the black dot — is close to our estimation target. It’s within 2%.

Q. Did we get lucky?

Not really.

  • In 68% of polls, the estimator is within 2% of the target.
  • In 95% of polls, the estimator is within 4% of the target.

Q. Could we have predicted how close we’d get before the election happened?

Yes, in a sense.

  • We will use an interval estimate—a range of values the estimation target is likely to be in.
  • The width of this interval speaks to the ‘how close’ question.
  • The coverage probability — the probability our estimate is actually that close — qualifies this answer.

Interval Estimation

  • Our point estimate of the turnout frequency in our population is the turnout frequency in our sample: \(\overline{Y}_{625}\).
  • So let’s try an interval of width 0.02 centered on it: \(\ \ \overline{Y}_{625} \pm 0.01\).
    • This is just a width we chose arbitrarily.
    • Maybe it’s wishful thinking. Being off by at most 1% sounds good.
  • What we want is for our interval to cover our estimation target.
    • i.e. for the population frequency to be in our interval.
  • This one doesn’t. Is that just bad luck? Or is it typical of \(\pm 0.01\) intervals?
  • Let’s see what happens when our friends try intervals like this.

Interval Estimation: Our Friends’ Polls

  • Here are the interval estimates of our first two friends.
  • That is, the interval estimates based on the pink and teal rows in our sampling distribution table.
  • One of these intervals covers the estimation target. The teal one.
  • So between ours and our two friends, we’re covering 1/3 of the time. Not great.
  • But that’s just three polls. To get a better sense of how often this happens, let’s do it for a hundred.

Interval Estimation: 100 Polls

  • \(45/100\) cover the estimation target.
  • This gives us a sense of the probability that our interval covers the target.
  • If we want to be more precise, we could do the same for millions of different polls.
  • Let’s not. Instead, let’s find a more direct way to calculate the coverage probability.
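A sketch of the 100-poll experiment in R. It uses a shortcut we haven't discussed: when you sample with replacement, the number of 'yes' responses in a poll is Binomial(n, θ), so we can draw it with `rbinom` instead of building the whole population. The name `theta` and the rounded value 5.05/7.23 are assumptions, so the count won't match the figure exactly.

```r
n = 625
theta = 5.05 / 7.23   # estimation target, from the rounded counts on the slides

set.seed(1)
# 100 friends' point estimates, using the binomial shortcut
estimates = rbinom(100, n, theta) / n

# Each friend's interval estimate: point estimate ± 0.01
lower = estimates - 0.01
upper = estimates + 0.01
covers = lower <= theta & theta <= upper

sum(covers)   # how many of the 100 intervals cover the target
```

Rerunning with a different seed gives a different count, which is why 100 polls only give a rough sense of the coverage probability.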

Coverage Probability and Sampling Distributions

Activity. Explain how to calculate this coverage probability using the sampling distribution of your point estimate \(\bar Y_{625}\).
 

  • Suppose you want to use a diagram like this to count how many of our 100 intervals cover the estimation target.
  • But you don’t want to look at the horizontal segments for each poll. They’re small and hard to see.
  • You just want to use the dots representing each poll’s point estimate \(\bar Y_{625}\).
  • Sketch something on top of our diagram to help you count.
  • An interval estimate is a point — a ‘body’ — with ‘arms’ of a certain length. Not so different from you.
  • Suppose you had an identical twin. And you’re not sure whether you’re standing close enough to touch them.
  • But you don’t want to put your arms out. You’re tired from all that polling. Could your twin check for you?
  • Now you’ve worked out how to count how many of 100 intervals cover the estimation target.
  • What would you do if you had a million? Or a billion? That’d be good enough.
    • If x% of a billion intervals cover, you’re pretty safe saying that x% is the coverage probability.
  • You can’t look at a billion dots one by one, but you can look at a histogram of a billion dots.
  • Explain how to use that histogram to calculate the coverage probability. Use your sketch from Step 1.

Calculating Coverage

  • Let’s shade in an interval of width .02 centered on the estimation target.
    • This gives it ‘arms’ the same length as our interval estimates have.
    • And its arms touch a point estimate if and only if the point estimate’s arms touch it.
  • That means we can count the intervals that cover by counting the point estimates between the dotted lines.
  • What, in terms of the sampling distribution of the point estimate, is the coverage probability?

The Coverage Probability

It’s the probability that a random draw from the sampling distribution lies in the green shaded area. \[ \text{coverage probability} = P\qty(\overline{Y}_{625} \in \overline{y}_{7.23M} \pm .01) \]

And it’s about 43%. We can get that by counting dots.

mean(mean(y) - .01 <= meanY.samples & meanY.samples <= mean(y) + .01)
[1] 0.4278

Or by finding the area of the histogram that’s shaded green.
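The two ways of counting can be checked against each other in a few lines of R. This sketch again draws point estimates with the binomial shortcut for with-replacement sampling and uses the rounded target 5.05/7.23, both assumptions, so its coverage will differ a bit from the number above.

```r
n = 625
theta = 5.05 / 7.23   # rounded estimation target (assumption)

set.seed(1)
estimates = rbinom(10000, n, theta) / n   # 10,000 simulated point estimates

# Count intervals that cover the target ...
covers = estimates - 0.01 <= theta & theta <= estimates + 0.01
# ... or count point estimates inside the band of width .02 around the target
in.band = theta - 0.01 <= estimates & estimates <= theta + 0.01

c(intervals = mean(covers), dots = mean(in.band))
```

Both counts pick out the same polls, because the interval's arms reach the target exactly when the target's arms reach the point estimate.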

Ad-hoc Interval Estimates

What we just did was choose a width and calculate a coverage probability.

  • The coverage probability we found — 43% — probably wasn’t what we wanted.
  • Think of how you’d advertise your polling services.
  • ‘I’m right about half the time. Actually, a bit less than that’.
  • 95% sounds a lot better. For that, we’ll have to use a wider interval.

Calibrated Interval Estimates

  • Instead of choosing a width and calculating the coverage probability, let’s go backward.
    • We’ll choose a coverage probability — 95% is conventional.
    • And we’ll calculate the width we need to get it.
  • An interval estimate calibrated like this—to have a given coverage—is called a confidence interval.
    • Let’s think about how to do that. Again, we’ll use the sampling distribution of our point estimate.
    • Let’s take a look at our annotated histogram of point estimates again.

Using the Sampling Distribution for Calibration

Question.

  • Suppose you and your friends want to draw 95% confidence intervals around your point estimates.
  • How wide do you have to make them to actually get 95% coverage?

Review—Our Annotations

  • The mean of the sampling distribution is the solid blue line—and is the same as the estimation target.
  • The middle 2/3 of the sampling distribution lies between the dashed blue lines.
  • The middle 95% of the sampling distribution lies between the dotted blue lines.

Using the Sampling Distribution for Calibration

Answer.

  • The width of a 95% interval should be the width between the dotted blue lines.
  • That’s the width of the ‘arms’ containing 95% of the estimates drawn from the sampling distribution.
  • You can check that these intervals have the coverage we want.
# dotted.width is the distance between the dotted blue lines
coverage = mean(mean(y) - dotted.width/2 <= meanY.samples & meanY.samples <= mean(y) + dotted.width/2)
coverage
[1] 0.9516
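The check above uses `dotted.width` without showing where it comes from. One plausible way to get it, assuming the dotted lines sit at the 2.5% and 97.5% quantiles of the simulated estimates, is sketched below. The binomial shortcut and the rounded target `theta` are also assumptions.

```r
n = 625
theta = 5.05 / 7.23   # rounded estimation target (assumption)

set.seed(1)
meanY.samples = rbinom(10000, n, theta) / n   # stand-in for the simulated polls

# Dotted lines: the 2.5% and 97.5% quantiles bracket the middle 95% of estimates
dotted = quantile(meanY.samples, c(0.025, 0.975))
dotted.width = unname(diff(dotted))

# Coverage of intervals with arms of length dotted.width / 2
coverage = mean(theta - dotted.width/2 <= meanY.samples &
                meanY.samples <= theta + dotted.width/2)

c(width = dotted.width, coverage = coverage)
```

Because the estimates only take values that are multiples of 1/625, the coverage comes out close to 95% rather than exactly 95%.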

A Problem

  • We can’t calibrate intervals like this in real life.
    • When we run a poll, we get a single point estimate \(\bar Y_{625}\) based on our sample.
    • We don’t know the sampling distribution of this point estimate until election day.
  • But what we actually do is almost the same.
    • We do the same thing.
    • But we use an estimate of the sampling distribution in place of the thing itself.
  • That’s what we’ll talk about next class.

Communication

Why Confidence Intervals?

  • Talking about calibrated interval estimates, a.k.a. confidence intervals, has some advantages.

    1. It focuses on what we actually want to know: where the estimation target is.
    2. It reminds us that we’re not (usually) going to be able to know it exactly.
    3. It gives us a sense of how close we can expect that we’ve gotten.
  • But there’s something about them that is a bit infuriating when you’re not used to them.

    • Fundamentally, you’re talking about what would happen in surveys you aren’t doing.
    • You can imagine someone saying ‘I don’t have time for imaginary surveys. What did this one tell you?’
    • And being pretty unhappy when the answer is ‘If that’s how you want to think about it, almost nothing.’
  • This isn’t a problem with intervals. This is something that’s fundamentally uncomfortable about sampling.

    • It might feel like a miracle that you can say anything about 7.23M people after 625 phone calls.
    • But once someone thinks you can, it can be hard for them to accept that you also kind of can’t.
    • We really are just saying ‘I think it’s between 64% and 72%, but I’m wrong sometimes.’
      • We are being clear about what ‘sometimes’ means.
      • But that feels like it’s about you more than it is about what they want to know.

Don’t Make this Mistake

  • Talking about surveys you aren’t doing is pretty awkward. So awkward that you’ll want to try to avoid it.
  • It’s tempting to say something nonsensical.
    • e.g. “the turnout frequency in the election is in my confidence interval, 0.64-0.72, 95% of the time.”
    • This is a really weird thing to say. It’s like saying that 95% of the time 2 is between 1 and 3. It either is or it isn’t.
    • The only way it could make sense is if the turnout frequency were random. Because 0.64 and 0.72 aren’t.
    • People do that. Don’t. And please don’t encourage it by saying stuff like this.
  • This doesn’t come up that often because you can just say ‘confidence interval’ most of the time.
  • So you really only have to deal with all of this awkwardness in two situations.
    1. When you say ‘confidence interval’ and someone asks what that means.
    2. When someone else says ‘confidence interval’ but clearly doesn’t know what it means.
  • You’ll get to practice this a bit. Homework, quizzes, maybe exams.

Footnotes

  1. This is one of those instances where language is complicated. We say the ‘frequency a person said yes’ or the ‘proportion of people who said yes’. If we’re leaving the people out of the sentence, we say either: ‘the sample frequency’ or ‘the sample proportion’. Using only one or the other really limits how you phrase things.

  2. That’s why it looks like the solid blue line is ‘highlighted’ in green.