Goodhart’s Law and AI
OpenAI has a blog post on this topic here; quoting from the article:
Goodhart’s law famously says: “When a measure becomes a target, it ceases to be a good measure.” Although originally from economics, it’s something we have to grapple with at OpenAI when figuring out how to optimize objectives that are difficult or costly to measure. It’s often necessary to introduce some proxy objective that’s easier or cheaper to measure, but when we do this, we need to be careful not to optimize it too much.
In my experience, the problem is not that proxy objectives are easier to measure (which would merely require aligning the proxy with the true objective); it’s that in societal issues the “real” objective is almost never available, outside of very controlled settings. This creates massive distortions: the real objective is unknowable and must be discovered iteratively, potentially over multiple lifetimes. By the time the detrimental effects of optimising for a proxy objective become apparent, it is already too late to change course: power relationships have been established, and every ruler’s incentives point towards the broken proxy objective, inevitably leading to conflict (hence the “Blind God” of the Gnostics).
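The divergence between a proxy and an unknowable true objective can be sketched with a toy simulation. Everything below is hypothetical: the `proxy` and `true_objective` functions are invented stand-ins, with a quadratic penalty standing in for the hidden costs that heavy proxy optimisation incurs.

```python
# Toy illustration of Goodhart's law: a proxy and a (hidden) true
# objective agree at first, then diverge under heavy optimisation.
# Both functions are invented for illustration only.

def proxy(x):
    return x  # the cheap, measurable stand-in we actually optimise

def true_objective(x):
    # In reality this is unknowable; here a quadratic penalty models
    # the hidden cost that aggressive proxy optimisation incurs.
    return x - 0.1 * x * x

# Naive hill-climbing on the proxy alone.
x = 0.0
history = []
for step in range(20):
    x += 1.0  # every step improves the proxy
    history.append((proxy(x), true_objective(x)))

# The proxy rises monotonically, but the true objective peaks at x = 5
# and then declines -- past that point, optimising the measure is harm.
peak = max(history, key=lambda pair: pair[1])
print(f"final proxy: {history[-1][0]:.1f}, final true: {history[-1][1]:.1f}")
print(f"true objective peaked when the proxy read {peak[0]:.1f}")
```

The optimiser never sees the turning point: its only signal (the proxy) keeps improving all the way down, which is exactly the "too late to change course" dynamic described above.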