= read.csv("https://qtm285-1.github.io/assets/data/nba_sample_1.csv")
sam = read.csv("https://qtm285-1.github.io/assets/data/nba_population.csv")
pop
= function(W,L,...) { W / (W+L) > 1/2 }
indicator
library(purrr)
= sam |> pmap_vec(indicator)
Y = pop |> pmap_vec(indicator)
y
= length(Y)
n = length(y) m
Week 2 Homework
Summary
This week, we’re going to get into the details of real-world calibration. The kind that doesn’t require that you know your estimator’s actual sampling distribution. We’ll start simple. In Exercise 1 and Exercise 2, you’ll implement a calibrated interval estimator for a population proportion and calculate its actual coverage probability. Then, after a detour on how exactly we go from an estimate of the sampling distribution to an interval width (Exercise 3), we’ll really get into how using an estimate affects the calibration of our interval estimates.
Here we take advantage of a very nice feature of the sampling distributions we get when we’re estimating a proportion \(\theta\) after sampling with or without replacement. They’re determined by our estimation target \(\theta\), i.e. it’s Binomial or Hypergeometric with success probability \(\theta\). As a result, once we have a point estimate \(\hat\theta\), we’ve essentially determined our estimate of the sampling distribution, i.e., it’s Binomial or Hypergeometric with success probability \(\hat\theta\). This means we can think of properties of our interval, e.g. its width or its upper and lower bounds, as functions of \(\hat\theta\). And we can visualize these functions. That’s what Figure 2 and Figure 3 are about, and you’ll be using them in Exercises 4–6 to understand a phenomenon we saw on the last slide of Lecture 3: around some point estimates, we get a wider-than-perfectly-calibrated interval, which effectively helps it cover the estimation target, and around others we get a narrower-than-perfectly-calibrated interval, which does the opposite. By looking at the width of the sampling distribution as a function of \(\theta\), we can understand where and why this happens—no programming required.
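To make that concrete, here’s a minimal sketch of the with-replacement case. The helper name half.width is ours, not the homework’s: it summarizes the estimated sampling distribution Binomial\((n, \hat\theta)\) by the shortest arms around \(\hat\theta\) that capture 95% of its probability, so the interval’s width is a function of \(\hat\theta\) alone.

half.width = function(theta.hat, n, level = 0.95) {
  # estimated sampling distribution of the frequency X/n: X ~ Binomial(n, theta.hat)
  x = 0:n
  p = dbinom(x, n, theta.hat)
  # shortest arm w with P(|X/n - theta.hat| <= w) >= level
  arms = x / n
  captured = sapply(arms, function(w) sum(p[abs(x / n - theta.hat) <= w]))
  min(arms[captured >= level])
}
# the interval at theta.hat is theta.hat +/- half.width(theta.hat, n),
# so its width, 2 * half.width(theta.hat, n), is a function of theta.hat alone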
We’ll conclude by using our visualizations to calculate the actual coverage probability of our imperfectly-calibrated interval estimates. First, in Exercise 7, we’ll do it at a single value of \(\theta\)—effectively repeating Exercise 2 using our understanding of what these intervals look like rather than brute-force simulation. Then, in Exercise 8, we’ll repeat the process for a range of \(\theta\) values, allowing us to plot the actual coverage probability of our real-world interval estimates as a function of \(\theta\).
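Here’s a sketch, in the same spirit, of where Exercise 7 and Exercise 8 are headed, reusing half.width from above (again, our names, and again assuming sampling with replacement). The interval at \(\hat\theta = x/n\) covers \(\theta\) exactly when \(|\hat\theta - \theta| \le\) half.width\((\hat\theta, n)\), so the actual coverage probability at \(\theta\) is a sum of Binomial probabilities over the \(x\) whose intervals cover it.

coverage = function(theta, n, level = 0.95) {
  # sum dbinom over the observations x whose intervals cover theta
  x = 0:n
  covers = sapply(x / n, function(th) abs(th - theta) <= half.width(th, n, level))
  sum(dbinom(x, n, theta)[covers])
}
# e.g. at a single theta, then over a range of thetas:
# coverage(0.6, n = 50)
# thetas = seq(0, 1, by = 0.01)
# plot(thetas, sapply(thetas, coverage, n = 50), type = 'l')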
This is all for the case of sampling with replacement. Some extra credit exercises, Exercises 10–12, repeat this process for the case of sampling without replacement.
The Point
For a lot of estimators, calibration is a bit more complicated than it is here. For example, the width of the sampling distribution of a difference of two proportions depends on more than that difference—it depends on the two proportions themselves. We lose our one-to-one mapping between the point estimate and the sampling distribution, so there’s more to the question of how estimating this sampling distribution affects calibration. My hope is that, having some shared visual intuition for how calibration works in this simple case, we’ll have an easier time talking about how it works more generally.
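To see the complication concretely, consider two independent samples drawn with replacement, with sizes \(n_1\) and \(n_2\). The standard deviation of the difference of sample proportions is \[ \operatorname{sd}(\hat\theta_1 - \hat\theta_2) = \sqrt{\frac{\theta_1(1-\theta_1)}{n_1} + \frac{\theta_2(1-\theta_2)}{n_2}} \] which is not a function of the difference \(\theta_1 - \theta_2\) alone: the pairs \((\theta_1, \theta_2) = (0.5, 0.3)\) and \((0.9, 0.7)\) have the same difference but different spreads.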
The Data
We’ll be using the NBA data from last week’s homework. But to stay in familiar territory, we’ll need a binary outcome, so we’ll use a frequency. In particular, we’ll look at the frequency with which a player’s team wins over half of the games they play in.
You may have caught on that ‘your sample’ from last week’s homework probably isn’t sampled with replacement from the population after all. So we’ll work with ‘my sample’. Here’s some code you can run to get the data set up. Just like in class, we’ll use (little) y for the population and (big) Y for the sample.

sam = read.csv("https://qtm285-1.github.io/assets/data/nba_sample_1.csv")
pop = read.csv("https://qtm285-1.github.io/assets/data/nba_population.csv")

indicator = function(W,L,...) { W / (W+L) > 1/2 }

library(purrr)
Y = sam |> pmap_vec(indicator)
y = pop |> pmap_vec(indicator)

n = length(Y)
m = length(y)
pink = '#ef476f' # rgb(239,71,111)
teal = '#118ab2' # rgb(17,138,178)
Warm Up
To get started, we’ll redo a few exercises from last week with a few modifications. First, we’ll use this new binary outcome instead of points. And second, we’ll calibrate our intervals using estimates of our sampling distribution rather than the actual sampling distribution.
To remind you of the notation, here’s how we write out our estimation target and estimator again. \[ \begin{aligned} \underset{\text{estimation target}}{\theta} &= \frac{1}{m}\sum_{j=1}^m y_j \\ \underset{\text{estimate}}{\hat\theta} &= \frac{1}{n}\sum_{i=1}^n Y_i \end{aligned} \]
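In code, with y and Y from the setup above, both quantities are just means of binary vectors. The names theta and theta.hat are ours, not part of the homework’s setup code.

theta     = mean(y)   # estimation target: the population proportion
theta.hat = mean(Y)   # estimate: the sample proportion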
The first one is a version of last week’s Exercise 3.
The second one is a version of last week’s Exercise 4.
Improving Speed and Precision at the Same Time
So far, to calculate our interval widths, we’ve used a ‘dot counting’ method. We’ve sampled 10,000 times from our (estimated) sampling distribution, then tried drawing arms of different lengths around the sampling distribution’s mean until we got one that included 9500 (95%) of those dots. You can think of it as an iterative process like this.
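Here’s a minimal sketch of that loop for the with-replacement case, assuming theta.hat and n are defined as above. The 10,000 draws and the 95% threshold come straight from the description; the step size of 1/n is our choice.

# sample 10,000 dots from the estimated sampling distribution
draws = rbinom(10000, n, theta.hat) / n
center = mean(draws)
# grow the arms one grid step at a time until they include 95% of the dots
w = 0
while (mean(abs(draws - center) <= w) < 0.95) {
  w = w + 1/n
}
c(theta.hat - w, theta.hat + w)   # the resulting interval estimate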