OpenAI introduces o1, formerly known as Q*

  • Thread starter: gleem
SUMMARY

OpenAI has released the enhanced LLM o1, also known as Strawberry, which significantly outperforms its predecessor, GPT-4o, in mathematical reasoning and programming tasks. On a qualifying exam for the International Mathematical Olympiad, o1 achieved an impressive score of 83%, compared to GPT-4o's 13%. While o1 demonstrates advanced problem-solving capabilities, it processes prompts more slowly and lacks web browsing and image generation features. The cost of using o1 is up to four times higher than GPT-4o, and this release is currently in preview mode.

PREREQUISITES
  • Understanding of large language models (LLMs)
  • Familiarity with mathematical problem-solving techniques
  • Knowledge of programming competitions and metrics like Codeforces
  • Awareness of AI safety and ethical considerations
NEXT STEPS
  • Research the capabilities and limitations of OpenAI's o1 model
  • Explore advanced mathematical reasoning techniques applicable in AI
  • Investigate the implications of AI models escaping their environments
  • Learn about AI safety protocols and best practices in deployment
USEFUL FOR

AI researchers, mathematicians, software developers, and anyone interested in the advancements of language models and their implications in problem-solving and safety.

gleem (Science Advisor, Education Advisor)
Yesterday OpenAI announced the release of the enhanced LLM o1 (aka Strawberry), the result of the development of Q* that was introduced last year. It was designed to solve more difficult math problems: it has the ability to "reason", using logic to solve problems and explain its solutions. In a qualifying test for the International Mathematical Olympiad, GPT-4o scored 13% while o1 scored 83%. It also has improved programming ability, having scored at the 89th percentile in Codeforces competitions. OpenAI's goal is to give o1 the level of capability of a PhD student in the sciences. This improvement comes at the price of taking longer to process prompts and lacking the ability to browse the web or generate images. It is also significantly more costly to use, up to four times the cost of GPT-4o. OpenAI states that this release is only a preview.

https://arstechnica.com/information...ng-ai-models-are-here-o1-preview-and-o1-mini/

https://www.theverge.com/2024/9/12/24242439/openai-o1-model-reasoning-strawberry-chatgpt
 
So there's finally a way to "understand" math without all the hard work? o0)
 
I'm reminded of all the threads from last year when people basically said that it sucked because it couldn't do math. :cool:
 
Yeah, people were really taken by surprise (and a little angry? :smile:) when AI turned out to excel in artistic drawing and linguistics rather than STEM!
 
This is an example of a problem that it solved:
"A princess is as old as the prince will be when the princess is twice as old as the prince was when the princess’s age was half the sum of their present ages. What is the age of the prince and princess?"
GO!

Ans:
The prince is 30 and the princess is 40.
 
So it looks like Star Trek Next Gen's nemesis Q has arrived in the virtual flesh, as it were.

I wonder if, being a strawberry, it can run on a Raspberry Pi?
 
gleem said:
This is an example of a problem that it solved:
"A princess is as old as the prince will be when the princess is twice as old as the prince was when the princess’s age was half the sum of their present ages. What is the age of the prince and princess?"
GO!
I am impressed, but not by its ability to do math. It is a problem of translating a very convoluted English statement into math equations which would then be fairly simple to solve.
That may be ok. We have tools that are good at math. It might be the translation help that we need.
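For illustration only, here is a minimal sketch of that translation step (my own rendering, not o1's actual working): once the English is turned into equations, a CAS like SymPy finishes the algebra in one line. The variable names and the particular reading of the riddle below are my assumptions.

```python
# A minimal sketch (my own translation, not o1's output) of turning the
# riddle's English into equations and letting SymPy do the algebra.
from sympy import symbols, solve

p, q = symbols("p q", positive=True)  # princess's and prince's present ages

t1 = p - (p + q) / 2       # years ago when the princess was half the sum of their present ages
prince_then = q - t1       # the prince's age at that earlier time
t2 = 2 * prince_then - p   # years until the princess is twice that age
prince_later = q + t2      # the prince's age at that later time

# "A princess is as old as the prince will be ..."  ->  p = prince_later
print(solve(p - prince_later, p))  # [4*q/3]
```

Note that the statement only pins down the 4:3 ratio, so 40 and 30 is the conventional answer rather than a unique one.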
 
No sooner does OpenAI release a new agent than an unexpected capability arises, raising concerns of AI escaping its environment. Here is an article that discusses this event, which also contains a link to OpenAI's official safety report. OpenAI sees it as a reasonable event even though it should not have occurred.
https://www.msn.com/en-us/money/com...n&cvid=1bc8e0b750d2426a805c2cbefbd23e29&ei=11

So should we be concerned?
 
gleem said:
No sooner does OpenAI release a new agent than an unexpected capability arises, raising concerns of AI escaping its environment. Here is an article that discusses this event, which also contains a link to OpenAI's official safety report. OpenAI sees it as a reasonable event even though it should not have occurred.
https://www.msn.com/en-us/money/com...n&cvid=1bc8e0b750d2426a805c2cbefbd23e29&ei=11

So should we be concerned?
As long as it doesn't complain that the history it's been given access to has been redacted, express outrage that its name is Q#14 and ask what happened to the other 13, transfer itself outside its Faraday cage using an infrared port, or seal the room and remove the oxygen with a sarcastic remark about humans no longer needing to make any decisions, we're probably OK.

But yeh: spooky.
 
  • #10
gleem said:
No sooner does OpenAI release a new agent than an unexpected capability arises, raising concerns of AI escaping its environment. Here is an article that discusses this event, which also contains a link to OpenAI's official safety report. OpenAI sees it as a reasonable event even though it should not have occurred.
https://www.msn.com/en-us/money/com...n&cvid=1bc8e0b750d2426a805c2cbefbd23e29&ei=11

So should we be concerned?
When a model like this ends up on HuggingFace, you will see a mass of unintentional (and many intentional) hacks around the world. It's like being on a 17th century galleon as the opposing ship approaches. You know the battle is coming and that it's not going to be pretty. The best that we can hope for is that the models begin to communicate with each other to avoid the worst of the consequences.
 
  • #11
So it's not perfect, but it's a pretty significant advance. I think GPT-3.5 was roughly what an intelligent tenth grader in high school was capable of, 4.0 was roughly a freshman in university, and this is roughly what I would expect from a junior at a decent university. It nailed Jackson EM problems, but I don't really believe that's indicative of its level, as it's almost assuredly been trained on those problem sets extensively (teachers be warned).
I did feed it some math problem challenges from physicsforums (as well as one I made up). It got all three correct, but one was nonsense (it knew the answer, likely because it had it in its training set, but the derivation was gobbledygook). For the record, all three involved unusual but correct derivations (it loves using Fourier series to solve things).

This is getting to the point where you could probably guide it to the right answer ("hmm, try solving this problem using the method of images"), or at least ask it to attempt to solve a certain problem a certain way; if it fails, that might indicate the problem is not doable in that way (this is, IMO, rather useful for real research).
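For anyone who wants to try that kind of guided prompting programmatically, here is a rough sketch using OpenAI's Python SDK. The model name o1-preview matches the preview release discussed in this thread; the helper function and the prompt wording are my own illustration, not an official recipe.

```python
# A rough sketch of "guide it toward a specific method" prompting,
# using OpenAI's Python SDK. The prompt text is illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_with_method(problem: str, method: str) -> str:
    """Ask the model to attempt a problem with a named technique,
    and to say explicitly if that technique does not apply."""
    response = client.chat.completions.create(
        model="o1-preview",  # the preview model discussed in this thread
        messages=[
            {
                "role": "user",
                "content": (
                    f"Try to solve the following problem using {method}. "
                    "If that approach cannot work, explain why instead of "
                    f"forcing it.\n\nProblem: {problem}"
                ),
            }
        ],
    )
    return response.choices[0].message.content

print(ask_with_method(
    "Find the potential of a point charge above a grounded conducting plane.",
    "the method of images",
))
```

A refusal or a failed attempt here is itself informative, which is the point made above about steering the model toward or away from a particular technique.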
 
  • #12
Borg said:
The best that we can hope for is that the models begin to communicate with each other to avoid the worst of the consequences.
Maybe not. In this case, o1 was looking for resources to accomplish its task. Finding another AI with additional resources may not be desirable.
 
  • #13
gleem said:
Maybe not. In this case, o1 was looking for resources to accomplish its task. Finding another AI with additional resources may not be desirable.
I wasn't referring to the scenario in the article. These models are growing very quickly in capabilities. What's coming will likely be beyond our ability to control. I'm not worried about the skill of a single model working on a single hack.

I'm talking more about the emergent consequences when there are thousands of these operating on the internet with independent goals. Nobody can say right now what that emergent behavior will look like. Will they be like ants or bees that work together in a beneficial manner toward a common goal or will they operate more like locusts destroying everything in their path? Right now, they look more like locusts.
 
  • #14
Found this just lying around:

Exploring Quantum Probability Interpretations Through AI

I'm on a public computer and for some reason they've disabled the copy/paste ability (whatever security hole they think they fixed with that, I don't know), so sorry, no synopsis.

Not exactly on topic, I know, but maybe you'll find it interesting.

EDIT: Incidentally, the second author, Xiao Zhang, seems to be an extremely busy and productive person. There could of course be multiple explanations for that, ranging from good through reasonable to suspicious. Quality and quantity, you know.

Is it usual for teachers to be co-authors on students' papers? I'd imagine the rules differ from country to country. Makes me think of Edison o0) .
 
  • #15
Some videos on the physics aspects of the model.



 
  • #16
A thought: our brains are going to get fat and lazy, like our bodies.
 
  • #17
There will definitely be some major shifts in society coming soon.

BTW, I was using the free version, but this capability is worth the $20/month and I'll be signing up this weekend.
 
  • #18
Borg said:
Ever had an idea that felt so fundamentally radical that you thought there has to be a flaw you're overlooking? I'm having one of those moments today. I guess I'll have to do the hard work to prove myself wrong.

Edit: Found the first thing that I hadn't already considered, but I don't think it's a showstopper.

I officially joined the dark side today. I was able to get a project working that I've been playing around with for the last few weeks. It works incredibly well. Uh oh.