- TL;DR Summary
- New tests compare AI performance to human performance on tasks that are natural and easy for humans but currently difficult for AI. These tests are known as the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI). Current AI models are just beginning to compete with humans on simple tasks. An example is given.
One of the goals of AI development is to achieve AI at the level of human intelligence: artificial general intelligence (AGI). In the thread "Is AI Hype" I glibly stated that we will know we have AGI when we see it. That is of no value in trying to develop AGI. But what is intelligence? The characteristics of intelligence include performing tasks by learning, applying, adapting, and reasoning, something that standard LLMs are not expected to do and don't.
Current AI systems can perform certain complex human tasks above human levels by acquiring information about those tasks. Basically, the AI is given a skill for a certain task. However, it cannot leverage its information to develop new skills, at least not extensively. Most cannot reason, that is, go through a series of queries, checking for leads, resulting in the completion of a task. I try to avoid anthropomorphic terms like "thinking" to avoid making it more than it is.
What can be done, though, is to develop tests that both humans and AI can perform, and compare the results. In particular, such a test should be easy for humans but difficult for AI. Such a test, the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI), was introduced in 2019. The test is based on a standard IQ test of inductive reasoning ability: using sample patterns to find a rule for how the patterns are related, then applying this knowledge to predict the pattern in a new situation.
A simple example: given these sample patterns (images below), what would you predict for this new one? A toy sketch of the task format follows.
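To make the format concrete, here is a minimal sketch in Python of a hypothetical ARC-style task (my own toy construction, not a task from the actual ARC-AGI set): the solver sees a few input/output grid pairs, infers the transformation rule, and applies it to a new input.

```python
# Toy ARC-style task (hypothetical example, not from the real ARC-AGI set).
# Each task provides a few input -> output grid pairs; the solver must infer
# the hidden rule and apply it to an unseen test input.

# Training pairs: here the hidden rule is "reflect the grid left-to-right".
train_pairs = [
    ([[1, 0, 0],
      [0, 2, 0],
      [0, 0, 3]],
     [[0, 0, 1],
      [0, 2, 0],
      [3, 0, 0]]),
    ([[4, 5],
      [0, 0]],
     [[5, 4],
      [0, 0]]),
]

def mirror(grid):
    """Candidate rule: reflect each row left-to-right."""
    return [list(reversed(row)) for row in grid]

# Verify the candidate rule explains every training pair...
assert all(mirror(inp) == out for inp, out in train_pairs)

# ...then apply it to the unseen test input to predict the output.
test_input = [[7, 0, 0],
              [0, 0, 8]]
print(mirror(test_input))  # [[0, 0, 7], [8, 0, 0]]
```

Humans typically spot a rule like this at a glance; the benchmark's point is that inferring the rule from just a few examples, with no prior training on the task, is what AI systems find hard.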
A discussion of this test can be found here.
A quick explanation is given by its inventor, Francois Chollet, in the video below.
When AI began to show near-human levels of performance on the first edition of ARC (models went from 20% of human capability to 86% in a few months), ARC-AGI was revised to be significantly more difficult. The revised version became available at the end of March of this year.
In anticipation of more powerful models, ARC-AGI is being revised again and will be available next year.
This test/benchmark is used on Kaggle, a platform for data science competitions where AI developers compete for cash prizes by developing the best model for a specified problem. The prizes for the ARC-AGI competition total $1M.
Finally, the test is not just about showing AI performing better than humans at cognitive tasks, but doing so efficiently, i.e., lowering the compute (computer resources) required for a task to an acceptable level.
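As a rough illustration of what "efficiently" means here (the numbers below are made up for illustration, not official ARC Prize figures), one way to frame it is cost per solved task rather than raw accuracy:

```python
# Sketch of an efficiency comparison with assumed, illustrative numbers.
# Accuracy alone is not the goal; compute cost per solved task matters too.

def cost_per_solved_task(total_cost_usd, tasks_attempted, accuracy):
    """Average compute spend for each task actually solved."""
    solved = tasks_attempted * accuracy
    return total_cost_usd / solved if solved else float("inf")

# Hypothetical: a large model scores higher but burns far more compute.
print(cost_per_solved_task(total_cost_usd=20_000, tasks_attempted=100, accuracy=0.86))  # ~232.56
print(cost_per_solved_task(total_cost_usd=50, tasks_attempted=100, accuracy=0.60))      # ~0.83
```

By a measure like this, the cheaper model is far more efficient even though its score is lower, which is the trade-off the benchmark is meant to surface.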