Why are LLMs so bad at deriving physics formulas?

  • Thread starter: manuel-sh
SUMMARY

Large Language Models (LLMs) such as GPT-5 and Claude Opus struggle significantly with deriving fundamental physics formulas, including black-body radiation, Kepler's laws, and the Lorentz transformation. The thread attributes this partly to the absence of structured derivation datasets specifically tailored for physics, as existing datasets primarily consist of physics papers, math problems, and Wikipedia articles. Users report frequent trivial mistakes and nonsensical outputs when querying these models for derivations. The discussion highlights the limitations of LLMs in logical reasoning and their reliance on probabilistic language generation rather than structured problem-solving.

PREREQUISITES
  • Understanding of Large Language Models (LLMs)
  • Familiarity with fundamental physics concepts and formulas
  • Knowledge of dataset structures and their impact on AI training
  • Basic comprehension of natural language processing techniques
NEXT STEPS
  • Research the development of structured physics datasets for LLM training
  • Explore methods to enhance LLM accuracy in scientific derivations
  • Investigate the implications of using Retrieval-Augmented Generation (RAG) in LLMs
  • Examine the limitations of LLMs in logical reasoning and their applications in physics
USEFUL FOR

Researchers, educators, and developers in the fields of artificial intelligence and physics, particularly those interested in improving the reliability of LLMs for scientific applications.

manuel-sh
Hi, I've recently been using AI to generate derivations of physics results and, in general, even frontier models such as GPT-5 or Claude Opus are quite bad at it, which is surprising. By bad, I mean they are not able to derive common formulas (say, black-body radiation, Kepler's laws, the Lorentz transformation, etc.) without making trivial mistakes or going down nonsensical rabbit holes.

I believe one of the reasons is that there isn't any structured dataset of physics derivations that LLMs can learn from. All the physics datasets out there (at least the ones I know about) are physics papers, maths problems, Wikipedia articles... I actually compiled a list of them. Even when I ask an LLM for a derivation and want to amend it, it's hard to find well-written open sources for derivations.
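For illustration, one entry of such a structured derivation dataset might look like the sketch below. The schema and field names ("result", "symbols", "steps", "justification") are hypothetical, not taken from any existing dataset; the point is that each derivation step carries an explicit claim and its justification:

```python
import json

# Hypothetical schema for one entry of a structured derivation dataset.
# Field names are illustrative, not from any published dataset.
entry = {
    "result": "Planck's law of black-body radiation",
    "symbols": {
        "h": "Planck constant",
        "k_B": "Boltzmann constant",
        "T": "temperature",
        "nu": "frequency",
    },
    "steps": [
        {
            "claim": "Mode energies are quantized: E_n = n * h * nu",
            "justification": "Planck's quantization postulate",
        },
        {
            "claim": "Mean energy per mode: <E> = h nu / (exp(h nu / (k_B T)) - 1)",
            "justification": "Boltzmann-weighted average over n",
        },
    ],
}

# Entries serialize cleanly to JSON, so a corpus is just one entry per line.
serialized = json.dumps(entry)
```

A format like this would let a model (or a human amending its output) check each step's justification individually, instead of treating the derivation as one opaque block of prose.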
 
manuel-sh said:
Hi, I've recently been using AI to generate derivations of physics results and, in general, even frontier models such as GPT-5 or Claude Opus are quite bad at it, which is surprising. [...]
You have found a limitation of the LLM model: garbage in, garbage out.
LLMs are not, at the very least, thinking about the subject they are asked about.
 
manuel-sh said:
Hi, I've recently been using AI to generate derivations of physics results and, in general, even frontier models such as GPT-5 or Claude Opus are quite bad at it, which is surprising. [...]
My understanding of LLMs (taken from a lot of 3Blue1Brown, other YouTube videos, and one book, so take my knowledge with a pinch of salt) is that they work by taking the language that came before and predicting what the next token is likely to be.
There's a bit of randomness in this (particularly because, depending on the settings, the model might not always choose the most probable next word, just to make the writing more interesting), so they won't get everything right all the time. When it comes to derivations, after one step there is usually only one correct next step, so in longer or more complicated derivations you're bound to end up down a rabbit hole eventually. After all, LLMs aren't made to reason or think through logic. They're made to give the illusion that they are speaking your language; really they're just glorified biased random word generators.
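The "biased random word generator" idea above can be sketched in a few lines. This is a minimal, self-contained illustration of temperature-scaled sampling over a token distribution, not how any particular model actually implements it: low temperature approaches always picking the most likely token, higher temperature makes unlikely tokens more probable.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample a token index from raw scores (logits) with temperature scaling.

    Temperature near 0 approaches greedy argmax; higher values flatten
    the distribution, so less likely tokens get picked more often.
    """
    rng = rng or random.Random(0)
    # Scale scores by temperature, then apply a numerically stable softmax.
    scaled = [score / temperature for score in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting probabilities.
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1

# At very low temperature the choice is effectively deterministic (argmax);
# at temperature 1.0 the second-best token is sometimes chosen instead.
greedy_choice = sample_next_token([1.0, 5.0, 2.0], temperature=0.01)
```

This is exactly why a long derivation is fragile: one unlucky draw at any step sends the rest of the output down the wrong path.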
 
Which prompts did you use and what sort of results did you get? Can you share the chat log?
 
I was at a conference today where the discussion was about RAG setups with physics laws as a constraint on the LLM, both to improve accuracy and to reduce compute by trimming possible inferences to exclude physically impossible scenarios. The context was oil & gas, particularly infrastructure assets.
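As a rough sketch of that idea (assumed structure, not from any actual system): a hard physics constraint can prune retrieved candidate scenarios before they ever reach the LLM, so impossible cases cost no inference. The candidate format and the conservation check below are invented for illustration.

```python
# Hypothetical pre-filter for a RAG pipeline: drop retrieved candidate
# scenarios that violate a physics constraint (here, energy conservation
# within a tolerance) before passing them to the LLM. Field names are
# illustrative.
def physically_plausible(candidate, tol=1e-6):
    """Return True if the candidate conserves energy within tolerance."""
    return abs(candidate["energy_in"] - candidate["energy_out"]) <= tol

def prune_candidates(candidates):
    """Keep only candidates that pass the physics constraint."""
    return [c for c in candidates if physically_plausible(c)]

candidates = [
    {"id": "a", "energy_in": 10.0, "energy_out": 10.0},
    {"id": "b", "energy_in": 10.0, "energy_out": 12.5},  # violates conservation
]
kept = prune_candidates(candidates)
```

The same pattern extends to any law you can evaluate cheaply on a candidate: the constraint acts as a filter, not as part of the model.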
 
