The language and image recognition capabilities of AI systems have developed very rapidly.
The chart shows how we got here by zooming into the last two decades of AI development. The plotted data stems from a number of tests in which human and AI performance were evaluated in five different domains, from handwriting recognition to language understanding.
Within each of the five domains, the initial performance of the AI system is set to -100, and human performance in these tests is used as a baseline that is set to zero. When an AI system’s performance crosses the zero line, it scored more points in the relevant test than the humans who took the same test.2
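To make the scaling concrete, here is a minimal sketch of how such a normalization could be computed. It assumes a simple linear rescaling between the two anchor points; the text above only states the anchors (-100 and zero), so the linear form, the function name `normalize_score`, and the example numbers are illustrative assumptions rather than the chart's actual method or data.

```python
def normalize_score(score: float, initial_ai: float, human: float) -> float:
    """Rescale a raw test score so that the AI's initial score maps to -100
    and the human baseline maps to 0 (assumed linear rescaling)."""
    if human == initial_ai:
        raise ValueError("initial AI score and human baseline must differ")
    return (score - human) / (human - initial_ai) * 100.0


# Hypothetical raw scores for one domain over time (not real benchmark data).
raw_ai_scores = [60.0, 75.0, 90.0, 95.0]
human_baseline = 90.0  # hypothetical human score on the same test

normalized = [
    normalize_score(s, initial_ai=raw_ai_scores[0], human=human_baseline)
    for s in raw_ai_scores
]

# Values above zero mean the AI system outscored the human baseline.
for raw, norm in zip(raw_ai_scores, normalized):
    status = "above human baseline" if norm > 0 else "at or below human baseline"
    print(f"raw={raw:5.1f}  normalized={norm:7.1f}  ({status})")
```

Under this assumption, a normalized value of -50 would mean the system has closed half the gap between its starting point and human performance, and any positive value means it scored higher than the humans.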
Just 10 years ago, no machine could reliably provide language or image recognition at a human level. But, as the chart shows, AI systems have become steadily more capable and are now beating humans in tests in all these domains.
Outside of these standardized tests, the performance of these AI systems is mixed. In some real-world cases, they still perform much worse than humans. On the other hand, some implementations of such AI systems are already so cheap that they are available on the phone in your pocket: image recognition categorizes your photos and speech recognition transcribes what you dictate.