The tyranny of numbers
A problem I keep coming back to is one of the first things one notices when trying to analyse social science data: it is hard to come up with any concrete conclusion – or, conversely, it is easy to make up any story you want and have the data confirm it. It is trivial to keep adding confounders and claim, "aha, this is confounded by X, which behaves like this and that, and which you did not take into account when doing the study". In the same vein, one can selectively ignore confounders, present only the right data (see How to Lie with Statistics or The Tyranny of Numbers: Mismeasurement and Misrule), and make the analysis deliver whatever conclusion you want. At this point, what even counts as inference? It is all made up, a theology of sorts built on top of latent variables.
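To make the confounding point concrete, here is a minimal sketch (my own toy simulation, not anything from the studies alluded to above): an unobserved variable z drives both a "treatment" x and an outcome y, so a naive regression of y on x reports a sizeable effect even though x has none; including z makes the effect vanish.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical setup: z drives both x and y; x itself has no effect on y.
z = rng.normal(size=n)
x = 1.5 * z + rng.normal(size=n)
y = 2.0 * z + rng.normal(size=n)  # note: no x term at all

def ols(design, target):
    """Least-squares fit of target on the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

ones = np.ones((n, 1))

# Naive model: regress y on x alone -> a sizeable "effect" appears.
naive = ols(np.column_stack([ones, x]), y)
print(f"effect of x, ignoring z:      {naive[1]:+.2f}")      # roughly +0.9

# Adjusted model: once z is included, the effect of x collapses to ~0.
adjusted = ols(np.column_stack([ones, x, z]), y)
print(f"effect of x, adjusting for z: {adjusted[1]:+.2f}")   # roughly 0.0
```

Run it in either direction and you can tell whichever story you like: leave z out and x "matters", put it in and it does not – which is exactly the game described above.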
What drives these latent variables, though? While the inference is lacking, a narrative is put together that "makes sense". A fiction built on other fictions, which are in turn built on yet other fictions – a latent space you can reason and plan in. How one comes up with this latent space might prove to be a central theme in AI.
People will soon realize that automated prompt engineering is akin to latent variable inference.
Then they will realize that latent variable inference is akin to reasoning/planning.
Then they will realize that transformers were missing that piece all along.
— Yann LeCun (@ylecun), October 4, 2022