Average, or worst?

Over the last few years, deep-learning-based AI has progressed extremely rapidly in fields like natural language processing and image generation. However, self-driving cars seem stuck in perpetual beta mode, and aggressive predictions there have repeatedly been disappointing. Google’s self-driving project started four years before AlexNet kicked off the deep learning revolution, and it still isn’t deployed at large scale, thirteen years later. Why are these fields getting such different results?

~ Alyssa Vance, from https://www.lesswrong.com/posts/28zsuPaJpKAGSX4zq/humans-are-very-reliable-agents

Vance draws an interesting distinction between average-case performance and worst-case performance. People are really good by both measures (click through to see what that means via Fermi approximations). AI (true AI, autonomous driving systems, language models like GPT-3, etc.) is getting really good on average cases. But it’s in the worst-case situations that humans perform reasonably well… and current AI fails spectacularly.

ɕ