On Progress Toward AGI

AI Thread Summary
The development of artificial general intelligence (AGI) aims to achieve AI that matches human intelligence, characterized by learning, reasoning, and adaptability. Current AI systems excel in specific tasks but lack the ability to generalize skills or perform complex reasoning. The Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI) serves as a benchmark to compare human and AI performance, focusing on inductive reasoning. The test has evolved to increase in difficulty as AI capabilities improve, with significant cash prizes for competition participants. A key challenge in AI development is memory management, as models struggle to integrate past actions into current tasks effectively. While some companies, like British Petroleum, recognize AI's potential, they hesitate to deploy it due to concerns about understanding AI errors. This reflects a broader societal skepticism toward AI compared to human judgment, despite AI often making fewer mistakes. Concerns about the rapid growth of AI data centers and their impact on energy infrastructure are also highlighted, particularly in Texas, where demand is expected to surge.
gleem
Science Advisor
Education Advisor
TL;DR Summary
New tests compare AI performance to human performance on tasks that are natural and easy for humans but currently difficult for AI. These tests are known as the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI). Current AI models are just beginning to compete with humans on simple tasks. An example is given.
One of the goals of AI development is to achieve AI at the level of human intelligence: artificial general intelligence (AGI). In the thread "Is AI Hype" I glibly stated that we will know we have AGI when we see it. This is of no value in trying to develop AGI. But what is intelligence? The characteristics of intelligence include performing tasks by learning, applying, adapting, and reasoning, something that standard LLMs are not expected to do and don't.

Current AI systems can perform certain complex human tasks above human levels by acquiring information about these tasks. Basically, the AI is given a skill (or skills) for a certain task (or tasks). However, it cannot leverage its information to develop new skills, at least not extensively. Most cannot reason, that is, go through a series of queries, checking for leads, until a task is completed. I try to avoid uniquely human, anthropomorphic terms like "thinking," to avoid making it more than it is.

What can be done, though, is to develop tests that both humans and AI can perform and compare the results. In particular, such a test would be easy for humans but difficult for AI. Such a test, called the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI), was introduced in 2019. The test is based on a standard IQ test of inductive reasoning ability: use sample patterns to find a rule for how the patterns are related, then apply this knowledge to predict the pattern in a new situation.

A simple example: Given these sample input/output grids [images in the original post], what would you predict for this new test grid [image in the original post]?
A discussion of this test can be found here.
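
To make the task format concrete, here is a minimal toy sketch in Python. This is not an actual ARC task; the grids and the hidden rule here are invented for illustration. Grids are small arrays of color codes, a solver is shown a few input/output pairs, and it must infer the transformation and apply it to a new input.

Code:
# Toy ARC-style task. Grids are lists of lists of ints (color codes 0-9).
# Hidden rule in this invented example: every 1 becomes 2; other cells unchanged.

train_pairs = [
    ([[0, 1], [1, 0]], [[0, 2], [2, 0]]),
    ([[1, 1, 0]],      [[2, 2, 0]]),
]
test_input = [[0, 0, 1]]

def infer_color_map(pairs):
    """Infer a cell-wise color substitution consistent with all training pairs."""
    mapping = {}
    for grid_in, grid_out in pairs:
        for row_in, row_out in zip(grid_in, grid_out):
            for a, b in zip(row_in, row_out):
                if mapping.setdefault(a, b) != b:
                    raise ValueError("no consistent cell-wise color map")
    return mapping

def apply_color_map(grid, mapping):
    return [[mapping.get(c, c) for c in row] for row in grid]

rule = infer_color_map(train_pairs)
print(apply_color_map(test_input, rule))  # -> [[0, 0, 2]]

Real ARC tasks are much harder: the rule is rarely a cell-wise substitution, and the point of the benchmark is that no fixed family of rules covers all the tasks.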

A quick explanation is given by its inventor, François Chollet, in the video below.



When AI began to show near-human levels of performance on the first edition of ARC, with models going from 20% of human capability to 86% in a few months, ARC-AGI was revised to be significantly more difficult. The revised version became available at the end of March of this year.
In anticipation of the more powerful models expected to be released, ARC-AGI is being revised again and will be available next year.

This test/benchmark is used on Kaggle, a platform for data science competitions where AI developers compete for cash prizes by developing the best model for a specified problem. The prizes for the ARC-AGI competition total $1M.

Finally, the test is not just about showing AI better than humans at cognitive tasks, but about doing so efficiently, i.e., lowering the compute (computing resources) needed for a task to an acceptable level.
 
gleem said:
I glibly stated that we will know we have AGI when we see it.
I think that this is the best way to look at it for now. No matter what tests are created, people just keep moving the goalposts anyway.

The main problem for models in performing planning is memory. I'm not referring to training data or context windows in this respect. When a model is asked to perform a task, it needs long-term memory to know what it has done in the past, integrate that into its current actions, and update its plan and memories for the next step. Figuring out which memories are relevant and when to use them in the current context window is no easy task (how do you generalize your thought processes for every situation?). As those architectures get better, I think that we'll see something closer to AGI. A rough sketch of the retrieval step is given below.
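
As a toy illustration of that retrieval step, here is a hypothetical sketch in Python: memories are stored as vectors, and the agent pulls the top-k most similar ones into its context before each action. The embedding function is a placeholder, and real agent-memory architectures are far more involved than this.

Code:
import math

# Hypothetical agent memory: each stored memory is (text, vector);
# before each step, recall the most relevant ones by cosine similarity.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

class MemoryStore:
    def __init__(self, embed):
        self.embed = embed  # placeholder: any text -> vector function
        self.items = []     # list of (text, vector)

    def add(self, text):
        self.items.append((text, self.embed(text)))

    def recall(self, query, k=3):
        q = self.embed(query)
        ranked = sorted(self.items, key=lambda m: cosine(q, m[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

# Usage (embed would be a real embedding model):
# store = MemoryStore(embed)
# store.add("Step 1: opened the dataset; three columns were missing.")
# context = store.recall("what went wrong with the dataset?")

The hard part the post points to is not this lookup but deciding what to write down, when to update it, and which memories actually matter in a new context.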
 
I always defined intelligence as "having imagination". The smarter you are, the more imagination you have, and vice versa.
 
Benj Edwards at Ars Technica made a run-through of how different AI groups and players currently try to define what AGI is (or is not):
https://arstechnica.com/ai/2025/07/...fine-and-thats-a-multibillion-dollar-problem/

I must admit it had skipped my attention that Microsoft and OpenAI apparently quite literally define AGI as being achieved when the technology generates $100 billion in profits (per year, I guess?). That definition seems conceptually fairly uncorrelated with the more technical definitions that try to compare against the intellectual performance of humans. On the other hand, it very clearly underscores (if anyone had any doubt) that for the big players it is pretty much only the huge amount of potential profit that drives the tech and the hype around it.
 
I must admit it had skipped my attention that Microsoft and OpenAI apparently quite literally define AGI as being achieved when the technology generates $100 billion in profits (per year, I guess?).
Does anyone else find this funny?
 
  • Like
Likes ShadowKraz and PeroK
This is about an article in Fortune.

I think it is generally accepted that we want AI to be as capable at any task as any human. Humans are not perfect and make mistakes: "to err is human...". Some companies will deploy AI when this is demonstrated. Some, however, have reservations; take, for example, British Petroleum. They are thinking of using AI to advise on safety and reliability. They let an LLM take their safety certification exam, and it scored 92%, well above the average. So are they using it? No, because they could not figure out why it got 8% wrong. My question is, do they ask humans why they got their answers wrong? OK, the humans will learn from their mistakes, but can't the LLM be retrained or corrected?

The article brings up a good point: the mistakes of AI seem unhuman-like. So when a mistake is made, we think that if a human had been making the decision, the mistake would not have happened. We fail to recognize that, on average, the AI makes fewer mistakes. We accept human fallibility over AI because we believe that human activities can be made perfectly safe by rules, regulations, or training (which falls into the classic definition of insanity).

Machines should always be predictable. It seems that whenever AI passes a test for "intelligence," humans say the test was not good enough and develop another one.

The attitude of not trusting AI over human judgment is in many cases like the problem some have with vaccines. While some vaccines harm a few people, they save far more lives. Some are willing to accept a much larger risk rather than a much smaller one for which they might feel responsible.
 
I think society should require large AI data centers to generate their own electricity at their own expense.
 
bob012345 said:
I think society should require large AI data centers to generate their own electricity at their own expense.
I think Colossus, the supercomputer that Musk just built in Memphis, TN, is powered on site by 35 gas turbine generators. He is also purchasing a 2 GW gas power plant from Europe and shipping it to Memphis to power the 1M GPUs he ultimately plans for Colossus.
 
  • Like
Likes bob012345 and PeroK
bob012345 said:
I think society should require large AI data centers to generate their own electricity at their own expense.
What does that mean? These are computing services companies; of course they already buy (pay for the generation of) their electricity as part of their normal expenses... like any other business or household does.

As said, they sometimes build/buy their own plants, but whether they do or don't, there's not much real-world difference.
 
  • #10
russ_watters said:
What does that mean? These are computing services companies; of course they already buy (pay for the generation of) their electricity as part of their normal expenses... like any other business or household does.

As said, they sometimes build/buy their own plants, but whether they do or don't, there's not much real-world difference.
There is concern that the rate of growth of AI data centers will outstrip the infrastructure, putting a strain on the grid and possibly reducing reliability. In Texas, ERCOT predicts a doubling of grid capacity by 2031, with 50% of the projected additional power needed for data centers, largely driven by AI. Texas is also becoming a crypto center.
 
  • #11
Speaking of progress, Grok 4 announced yesterday. Elon expects it to invent new technology and discover new physics within a year or so.

 
  • Informative
  • Sad
  • Haha
Likes ShadowKraz, PeroK and gleem
  • #12
bob012345 said:
Speaking of progress, Grok 4 announced yesterday. Elon expects it to invent new technology and discover new physics within a year or so.
While the video describing Grok 4 is impressive, how much of what Musk says about his products can you take to the bank? Tesla's full self-driving mode is still in development, his robot Optimus is behind schedule, and Starship has hit a wall. His Boring Company is, well, boring. His solar company is not quite stellar. I guess we will have to wait and see how well Grok 4 does, at least until Musk can keep it from hallucinating. BTW, Musk cofounded OpenAI and had access to much of the development of GPT, so it is not like he built Grok from scratch.
 
  • Like
Likes russ_watters and ShadowKraz
  • #13
bob012345 said:
Speaking of progress, Grok 4 announced yesterday. Elon expects it to invent new technology and discover new physics within a year or so.

Uh huh.
https://techcrunch.com/2025/07/10/g...-elon-musk-to-answer-controversial-questions/

During xAI’s launch of Grok 4 on Wednesday night, Elon Musk said — while livestreaming the event on his social media platform, X — that his AI company’s ultimate goal was to develop a “maximally truth-seeking AI.” But where exactly does Grok 4 seek out the truth when trying to answer controversial questions?

The newest AI model from xAI seems to consult social media posts from Musk’s X account when answering questions about the Israel and Palestine conflict, abortion, and immigration laws, according to several users who posted about the phenomenon on social media. Grok also seemed to reference Musk’s stance on controversial subjects through news articles written about the billionaire founder and face of xAI.
 
  • #15
gleem said:
While the video describing Grok 4 is impressive, how much of what Musk says about his products can you take to the bank? Tesla's full self-driving mode is still in development, his robot Optimus is behind schedule, and Starship has hit a wall. His Boring Company is, well, boring. His solar company is not quite stellar. I guess we will have to wait and see how well Grok 4 does, at least until Musk can keep it from hallucinating. BTW, Musk cofounded OpenAI and had access to much of the development of GPT, so it is not like he built Grok from scratch.
Well, at least Musk is doing things. It seems to me most people (including myself) just watch while they critique and criticize.
 
  • #16
bob012345 said:
Well, at least Musk is doing things. It seems to me most people (including myself) just watch while they critique and criticize.
Some of the things he's doing are better left undone.
 
  • Like
  • Wow
Likes ShadowKraz, Borg, bob012345 and 1 other person
  • #17
While Grok 4 significantly beat all other GPT models on the ARC-AGI-2 benchmark (see post #1), it scored only 16.2%, about double Anthropic's Claude Opus 4. Musk is relying on the supposition that the more GPUs you have, the more powerful the GPT will be. Some believe that GPT is beginning to hit a wall as far as AGI is concerned, and that the only significant progress will come from the development of neuromorphic hardware. This may also be the only way to make AGI cost-effective.
 
  • Like
  • Skeptical
  • Love
Likes russ_watters, ShadowKraz, bob012345 and 1 other person
  • #18
Hornbein said:
Does anyone else find this funny?
And disappointingly predictable. At the risk of getting this comment pulled or a warning issued, this is the problem with the relationship between corporations and research. Bean counters attempt to evaluate the results but only use the metric of "how much money can we make off of it, here and now".
 
  • Like
Likes russ_watters
  • #19
bob012345 said:
Well, at least Musk is doing things. It seems to me most people (including myself) just watch while they critique and criticize.
Yes, yes he is... but it's his motivations that bother many of us. Why is he doing these things? Based upon his actions and words, it is decidedly not for the betterment of our species. That is the true issue here. It's the same issue with any corporation working on AI. It isn't for the betterment of our species but merely to find new ways to line their already bulging pockets.
AI, in and of itself, is not an issue; it is the people behind the AI, the ones calling the shots on its development and uses who are the issue.
 
  • #20
Good article on LLMs possibly hitting a wall:

https://www.newyorker.com/culture/open-questions/what-if-ai-doesnt-get-much-better-than-this

It’s hard to overstate how completely the A.I. community came to believe that it would inevitably scale its way to A.G.I. In 2022, Gary Marcus, an A.I. entrepreneur and an emeritus professor of psychology and neural science at N.Y.U., pushed back on Kaplan’s paper, noting that “the so-called scaling laws aren’t universal laws like gravity but rather mere observations that might not hold forever.” The negative response was fierce and swift. “No other essay I have ever written has been ridiculed by as many people, or as many famous people, from Sam Altman and Greg Brockman to Yann LeCun and Elon Musk,” Marcus later reflected. He recently told me that his remarks essentially “excommunicated” him from the world of machine learning. Soon, ChatGPT would reach a hundred million users faster than any digital service in history; in March, 2023, OpenAI’s next release, GPT-4, vaulted so far up the scaling curve that it inspired a Microsoft research paper titled “Sparks of Artificial General Intelligence.” Over the following year, venture-capital spending on A.I. jumped by eighty per cent.



After that, however, progress seemed to slow. OpenAI did not unveil a new blockbuster model for more than two years, instead focussing on specialized releases that became hard for the general public to follow. Some voices within the industry began to wonder if the A.I. scaling law was starting to falter. “The 2010s were the age of scaling, now we’re back in the age of wonder and discovery once again,” Ilya Sutskever, one of the company’s founders, told Reuters in November. “Everyone is looking for the next thing.” A contemporaneous TechCrunch article summarized the general mood: “Everyone now seems to be admitting you can’t just use more compute and more data while pretraining large language models and expect them to turn into some sort of all-knowing digital god.” But such observations were largely drowned out by the headline-generating rhetoric of other A.I. leaders.
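
For context, the "scaling laws" being debated here are empirical power-law fits from Kaplan et al. (2020), not derived laws. Below is a minimal sketch of the model-size law using the paper's published constants; treating such curve fits as if they were laws of nature is precisely what Marcus pushed back on.

Code:
# Kaplan et al. (2020) fit for LLM test loss vs. model size (non-embedding params):
#   L(N) ~ (N_c / N) ** alpha_N, with alpha_N ~ 0.076 and N_c ~ 8.8e13.
# This is a curve fit to observed training runs, not a universal law.

ALPHA_N = 0.076
N_C = 8.8e13

def predicted_loss(n_params: float) -> float:
    return (N_C / n_params) ** ALPHA_N

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss {predicted_loss(n):.2f}")

The slow power-law decay is the whole pitch: every 10x in parameters buys a fixed multiplicative drop in loss, for as long as the fit keeps holding.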
 
  • Informative
  • Like
Likes russ_watters, nsaspook and jack action
