manuel-sh
Hi, I've recently been using AI to generate physics derivations, and in general even frontier models such as GPT-5 or Claude Opus are surprisingly bad at it. By bad, I mean they can't derive common results (say, black-body radiation, Kepler's laws, the Lorentz transformations) without making trivial mistakes or going down nonsensical rabbit holes.
I believe one of the reasons is that there isn't any structured physics-derivation dataset that LLMs can learn from. All the physics datasets out there -at least the ones I know about- are physics papers, maths problems, Wikipedia articles... I actually did a compilation of them. Even when I ask an LLM for a derivation and want to check or amend it, it's hard to find well-written open sources for these derivations.