Why are LLMs so bad at deriving physics formulas?

  • Thread starter: manuel-sh

Discussion Overview

The discussion revolves around the challenges and limitations of large language models (LLMs) in deriving physics formulas. Participants explore the reasons behind the perceived inadequacies of LLMs, particularly in the context of well-known physics concepts and formulas.

Discussion Character

  • Exploratory
  • Debate/contested
  • Technical explanation

Main Points Raised

  • Some participants express frustration with LLMs like GPT5 and Claude Opus, noting their inability to derive common physics formulas without errors or confusion.
  • There is a suggestion that the lack of a structured derivation dataset for physics may contribute to the poor performance of LLMs in this area.
  • One participant mentions that existing datasets primarily consist of physics papers, math problems, and Wikipedia articles, which may not be suitable for training LLMs on derivations.
  • Another participant highlights the inherent limitations of LLMs, suggesting they do not truly "think" about the subjects they address, leading to inaccuracies.
  • A participant shares insights from a conference discussing the use of RAGs (retrieval-augmented generation) with physics laws as constraints to enhance LLM accuracy and efficiency.
  • Some participants share links to related discussions on the reliability of LLMs, indicating a broader concern within the community regarding their performance.
  • One participant reflects on their understanding of LLMs, describing them as sophisticated random word generators rather than true reasoning entities, which may explain their struggles with logical derivations.

Areas of Agreement / Disagreement

Participants generally agree on the limitations of LLMs in deriving physics formulas, but there are multiple competing views regarding the reasons for these limitations and potential solutions. The discussion remains unresolved with no consensus on the effectiveness of LLMs in this context.

Contextual Notes

Participants note the absence of structured datasets specifically for physics derivations and the reliance on existing datasets that may not adequately support LLM training for this purpose. There are also mentions of the randomness in LLM outputs affecting their reliability in producing accurate derivations.

manuel-sh
Hi, I've recently been using AI to generate derivations of physics results and, in general, even frontier models such as GPT5 or Claude Opus are quite bad at that, which is surprising. By bad, I mean they are not able to derive common formulas (say black-body radiation, Kepler's laws, the Lorentz transformation, etc.) without making trivial mistakes or going down nonsensical rabbit holes.

I believe that one of the reasons for this is that there isn't any structured physics-derivation dataset which LLMs can learn from. All the physics datasets out there (at least the ones I know about) are physics papers, maths problems, Wikipedia articles... I actually did a compilation of them. Even when I ask an LLM for a derivation and want to amend it, it's hard to find well-written open sources for them.
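To make it concrete, here is a rough sketch of what I mean by "structured": each derivation broken into explicit, machine-checkable steps. The schema and field names below are purely hypothetical, just to illustrate the idea; no such public dataset exists as far as I know.

```python
# Hypothetical record format for a structured physics-derivation dataset.
# Purely illustrative: the schema, field names, and step granularity
# are assumptions, not any existing dataset.
derivation_record = {
    "result": "Planck's law of black-body radiation",
    "target": r"B_\nu(T) = \frac{2 h \nu^3}{c^2} \frac{1}{e^{h \nu / k_B T} - 1}",
    "assumptions": [
        "photons in a cavity form a gas of bosons",
        r"mode energies are quantized: E_n = n h \nu",
    ],
    "steps": [
        {
            "claim": r"\langle E \rangle = \frac{h \nu}{e^{h \nu / k_B T} - 1}",
            "justification": "Boltzmann-weighted average over the quantized mode energies",
            "depends_on": ["assumptions[1]"],
        },
        # ... one entry per algebraic or physical step, down to "target" ...
    ],
}
```

Something in this shape would give a model (or a verifier) the dependency structure of a derivation, rather than the flattened prose of a paper or a Wikipedia article.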
 
  • Likes: 256bits
manuel-sh said:
... I believe that one of the reasons for this is that there isn't any structured physics-derivation dataset which LLMs can learn from. ...
You have found a limitation of the LLM model: garbage in, garbage out.
At the very least, LLMs are not thinking about the subject they are asked about.
 
  • Likes: AlexB23 and manuel-sh
manuel-sh said:
... even frontier models such as GPT5 or Claude Opus are quite bad at that, which is surprising. By bad, I mean they are not able to derive common formulas ... without making trivial mistakes or going down nonsensical rabbit holes. ...
My understanding of LLMs (taken from a lot of 3Blue1Brown, other YouTube, and one book, so take my knowledge with a pinch of salt; I really know nothing) is that they work by taking the language that came before and predicting what the next token is likely to be.
There's a bit of randomness in this (particularly because, depending on the settings, the AI might not always choose the most probable next word, just to make the writing more interesting), and thus they won't tend to get it all right all the time. After all, when it comes to derivations, after one step there is really only one correct next step, so in longer or more complicated derivations you are bound to end up in a rabbit hole at some point. LLMs aren't made to reason or think through logic; they're made to give the illusion that they are speaking your language. Really, they're just glorified biased random word generators.
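As a toy illustration of that sampling randomness, here is a minimal sketch of temperature-based next-token sampling (the scores and tokens are made up, not taken from any real model):

```python
import math
import random

def sample_next_token(logits, temperature=0.8):
    """Pick the next token from raw model scores.

    With temperature > 0 the most probable token is not always chosen,
    which is exactly the randomness that can knock a derivation off
    the single correct next step.
    """
    tokens = list(logits.keys())
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = [score / temperature for score in logits.values()]
    # Softmax, subtracting the max for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sample instead of always taking the argmax.
    return random.choices(tokens, weights=probs, k=1)[0]

# Toy scores for candidate continuations of "E = m".
logits = {"c^2": 4.0, "a": 1.5, "v^2 / 2": 0.5}
print(sample_next_token(logits))  # usually "c^2", but not every time
```

At temperature 0 (pure argmax) the randomness goes away, but a long derivation can still compound a single early mistake.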
 
Which prompts did you use and what sort of results did you get? Can you share the chat log?
 
Was at a conference today and the discussion was about RAGs set up with physics laws as a constraint on the LLM, both to improve accuracy and to reduce compute by trimming the possible inferences to exclude physically impossible scenarios. The context was oil & gas, particularly infrastructure assets.
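A minimal sketch of what such a constraint might look like in code, assuming a hypothetical conservation-of-energy check (the function names, fields, and numbers are invented for illustration):

```python
def violates_energy_conservation(scenario, tolerance=1e-6):
    """Hypothetical physics-law check: flag scenarios where more
    energy comes out than went in, beyond numerical tolerance."""
    return scenario["energy_out"] > scenario["energy_in"] + tolerance

def prune_candidates(candidates, checks):
    """Keep only candidate inferences that pass every physics check,
    so the LLM never has to rank physically impossible scenarios."""
    return [c for c in candidates if not any(check(c) for check in checks)]

# Toy candidate scenarios a retrieval step might surface for a pipeline asset.
candidates = [
    {"label": "pump with losses", "energy_in": 10.0, "energy_out": 9.2},
    {"label": "pump that creates energy", "energy_in": 10.0, "energy_out": 11.0},
]
print(prune_candidates(candidates, [violates_energy_conservation]))
# Only the "pump with losses" scenario survives.
```

Filtering before inference is what saves the compute: impossible branches are discarded by a cheap rule check rather than by the model itself.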
 
