The objective for the first three weeks is to step back and discuss statistics and data analysis from a philosophical and epistemological perspective
Last class we had Team 1 talk about “data exploration”
Essentially, a protocol for “pre-modeling” data analysis
Why is it important?
What kind of data do you have and how have you thought about approaching it?
What kinds of datasets (or distributions) would be exceptions to these rules? *Rules presented last week
How can you distinguish an actual outlier from observational error?
How might you prevent these errors before they happen?
For Thursday we will read Box (1976), which includes the following quote:
“Since all models are wrong, the scientist must be alert to what is importantly wrong. It is inappropriate to be concerned about mice when there are tigers abroad.”
How does this quote relate to last week’s lecture?
The goal is not to make a perfect model (impossible) but to detect the problems that will meaningfully distort your conclusions
Now we will have Team 2 give a lecture on Tredennick et al. (2021)
We use prior information in our daily lives all the time
You check the score: the Browns are up 21 to 7 at halftime against the Ravens.
At the same time, the Chiefs are up 21 to 7 at halftime against the Jets
In frequentist probability, you would think:
“Teams up 21-7 at halftime have historically won about 80% of the time, so the probability of winning is about 80%”
But! You have priors… if you are a Browns fan:
\[ \text{Posterior belief} \propto \text{Prior (they find ways to lose)} \times \text{New data (they’re winning now)} \]
Essentially, you know that they tend to find ways to lose
You may be expecting heartbreak, despite the data (21-7)
A Chiefs fan, at this point, might not be worried at all
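A quick numerical sketch of this update using Bayes' theorem (every probability below is a made-up illustrative value, not a number from the lecture or from Tredennick et al. 2021):

```python
# A minimal sketch of the halftime example as a Bayesian update.
# All numbers are illustrative assumptions, not values from the
# lecture or from Tredennick et al. (2021).

def posterior_win(prior_win, p_lead_given_win, p_lead_given_loss):
    """P(win | up 21-7 at halftime) via Bayes' theorem."""
    numerator = p_lead_given_win * prior_win
    evidence = numerator + p_lead_given_loss * (1 - prior_win)
    return numerator / evidence

# Likelihoods chosen so a neutral 50/50 prior reproduces the ~80%
# frequentist answer: 0.40*0.5 / (0.40*0.5 + 0.10*0.5) = 0.80
p_lead_given_win = 0.40   # assumed: P(up 21-7 at half | team wins)
p_lead_given_loss = 0.10  # assumed: P(up 21-7 at half | team loses)

# Priors encode each fan's belief before kickoff
browns_fan = posterior_win(0.30, p_lead_given_win, p_lead_given_loss)
chiefs_fan = posterior_win(0.70, p_lead_given_win, p_lead_given_loss)

print(f"Browns fan posterior P(win): {browns_fan:.2f}")  # ~0.63
print(f"Chiefs fan posterior P(win): {chiefs_fan:.2f}")  # ~0.90
```

Same score and same 80% base rate, but the pessimistic prior keeps the Browns fan's posterior well below the Chiefs fan's, which is exactly the heartbreak intuition above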