This week, Paul Smaldino of UC Merced invites us to reflect on the importance of good theory. “The thing is”, he says, “we don’t just want science to be reproducible”, which requires better methods, “we want it to help us make better sense of the world. For that, we must create better hypotheses – and those require better models and better measurements”. I agree that we want science to help us make better sense of the world. I don’t agree that the missing ingredients are better hypotheses and models. We need more diverse models and hypotheses, not better ones.
The piece is very thought-provoking, and a welcome addition to the discussion on reproducibility. Reading it allowed me to check in with some long-settled assumptions of mine that, as I age, I realize have been slowly calcifying into a worldview. I am certainly sympathetic to a rebellion against the current focus on methodological reproducibility. Theory building and modeling are, indeed, at least as important.
It doesn’t matter how good an idea is
Smaldino’s central contention is that “generating better hypotheses is at least as important as reducing methodological errors for minimizing false discoveries”. A subsidiary point is that “to generate good hypotheses, we need good theory”.
I would answer that it really doesn’t matter how good or bad our hypotheses are. In the short term, we can save some time by focusing on hypotheses with a higher prior probability, and by ignoring some of the more apparently outlandish ones. The longer-term cost of that strategy is that we are more likely to miss something really important that is completely counter-intuitive.
So there is a trade-off. You can maximize the likelihood that you will pursue a hypothesis that will pay off, or you can maximize the likelihood that you will get productive but unexpected results. In one case you save time but sacrifice discovery. In the other, you flounder around most of the time, but you get the big breaks once in a while. One favours normal science, and the other favours paradigm shift, in a very coarse sense.
Hypothesis testability is not a measure of quality
I don’t disagree that out of the mass of good, bad, or indifferent hypotheses, we should intentionally focus at least some of our effort on the subset that is currently testable. That’s our day-to-day work.
But note that I explicitly do not tie the quality of a hypothesis to its testability. Testability, as Smaldino recognizes, is a function of the current state of our measurement methods and of our models. The quality of a hypothesis, whether it is good or bad, is something that can only be evaluated by testing it.
Before a test, a hypothesis can be measured against our sense of prior probabilities, but I would argue that this is not a measure of quality. It is a measure of our current state of ignorance about the system we are studying.
I partly agree when Smaldino says that the two requirements of good theory are that “it can be used to build mathematical or computational models that derive clear, testable consequences from our assumptions”, and that it “must make sense, or at least acknowledge its contradictions”.
I would qualify that by saying that good theory should conceivably allow the building of models that derive clear testable consequences, and that it should at least be internally consistent if we accept its bedrock assumptions, however unrealistic they might be. We really shouldn’t limit our sense of good theory to what we can use right now to produce testable models that make sense to us in the current state of our knowledge and understanding. That’s a little bit too instrumentalist for me.
We should spend as much time dreaming up stuff that could conceivably be testable as we spend testing currently testable stuff. Neither should we limit our dreaming to stuff that could be useful in the current state of our knowledge.
Today’s completely useless speculation, like the indivisibility of matter dear to Democritus, is tomorrow’s absolutely crucial source of insight. At any one time, we need a healthy pool of seemingly useless speculation and weird modeling to help us deal with our changing environment. We should value the apparently useless as much as we value the immediately useful, but we should use it differently in our work.
Theory building should not be constrained by our ability to measure
Consequently, I disagree that “If we can’t reliably measure something, it’s hard to build a theory around it”. In fact, I would reverse that and say that it is easier to build theory around stuff we can’t observe and measure, because we have many fewer constraints. Whether that theory has any relation to reality is a different matter.
In class, I often invoke Arbuthnot’s maxim that if we can’t express our understanding of things numerically, “it’s a sign that our knowledge of them is very small and confused”. But I always specify that “very small and confused” is not the same as bad or undesirable.
We should always strive toward better measurement, but we shouldn’t let ourselves be intimidated by its absence. Until such time as our knowledge is bigger and clearer about something, we should limit our conclusions accordingly, and if we so choose, we should seek better ways of measuring and modeling it to reach more solid conclusions.
I should spend more time on the water
Finally, I am really not sure how to act on Smaldino’s call to action that “it is time to focus on better practices for hypothesis generation”. Should we canoe across more lakes? Eat things that disagree with us and then go to sleep? Spend more afternoons looking at branches undulating in the wind? Because those are some of the activities that produce hypotheses for me.
The mechanisms for production of hypotheses and insights of all kinds are infinitely varied. It doesn’t matter where a hypothesis comes from or how it was produced. It only matters what we do with it. Models can be close to our understandings and perceptions, or they can be speculative and bizarre. It only matters what we learn when we run them.
The only reliable way to produce hypotheses and models that increase our understanding of the world is to create a greater diversity of hypotheses and models. Pre-selection of hypotheses and models through “better” mechanisms for producing them is a diversity reduction measure. How we use that diversity, that’s another question.
Argh! Typo in title!!
Simon
Fixed 🙂
Perhaps he’s implying a game changer. 😳😉😁
Ouch 🙂