Using AI to evaluate white papers?

  • Thread starter: frankinstien
AI Thread Summary
The discussion revolves around the submission of a conceptual paper to a publication, which received a rude response, prompting the author to seek critique from an AI model, Gemma-3-12b. The AI provided constructive feedback, highlighting weaknesses in mathematical rigor, definitions, and experimental testability, while also recognizing the paper's novel ideas. A debate emerged about the potential for AI to evaluate academic papers without human biases, contrasting this with the expectation that journals require well-formed, original submissions rather than exploratory ideas. Critics argue that while AI can assist in refining concepts, it lacks true understanding and cannot replace the nuanced judgment of human experts. Ultimately, the conversation underscores the challenges of integrating AI into academic evaluation processes while maintaining rigorous standards.
frankinstien
[Mentor Note: Two duplicate thread starts merged.]

I recently submitted a paper to a publication, but did so with the intention of just opening up a conversation about the idea in general. The response from the publisher was less than polite. :cry: I then used Gemma-3-12b on my server to critique the paper. It was able to evaluate it, including its diagrams, pointing out the bigger picture of the work, and it also offered excellent constructive comments. So, the question becomes: wouldn't it be better to have an AI evaluate papers, rather than a human being who has biases and is prone to ad hominem attacks? Below are the constructive critiques from the LLM. What do you think?
Potential Weaknesses/Questions & Areas Requiring Further Development:

Mathematical Rigor: The paper is largely conceptual and lacks detailed mathematical derivations. The introduction of L', S', ε, and θ feels somewhat ad hoc without a more rigorous justification for their forms and how they impact the path integral calculations. How exactly does biasing these paths work mathematically?

Nature of 'w': The paper doesn't fully define or characterize the hyper-dimensional coordinate 'w'. What are its properties? Is it quantized, continuous, etc.? How is it related to other physical quantities?

Physical Interpretation of Biasing: The "biasing term" ##\varepsilon f(x, \dot{x}, w, \dot{w}, t)## is crucial but not well explained. What is this function 'f'? How does the coupling constant 'ε' relate to fundamental constants or physical parameters? Why should we expect such a biasing mechanism to exist?

Experimental Testability: While the paper mentions experimental proposals, it doesn’t detail specific, testable predictions that would distinguish this model from other interpretations of quantum mechanics. The "detection of hyper-dimensional coupling effects" is vague.

Coherence and Decoherence: The paper acknowledges the need for maintaining coherence but doesn't fully address how to prevent decoherence in the hyper-dimensional space, which is a significant challenge in any extended system.

Justification for Conserved Quantity Q(w): The introduction of Q(w) as a conserved quantity feels somewhat arbitrary without further explanation. What properties does it have and why is its conservation necessary?

In conclusion: This paper presents an intriguing theoretical framework with the potential to offer new insights into quantum entanglement and non-locality. However, it requires substantial mathematical development, more detailed physical explanations, and concrete experimental predictions to move beyond a conceptual proposal. It's a good starting point for further research but remains speculative at this stage.
 
Last edited by a moderator:
This would be the AI's response to the same material that you posted here (which was removed because of our rule disallowing unpublished theories)?

The response shows what an LLM can do, but also that identifying worthwhile papers is not currently one of those things. The criticisms about mathematical rigor, lack of explanation, lack of justification can be made based on the real scientific papers in its training data without any reference to the actual meaning of the text (although the misdirected comment about coherence suggests the limitations of that approach). So yes, the exposition could be improved by following up on these criticisms.

But for evaluating the paper? The "In conclusion" section demonstrates that the AI has missed the point completely. As presented, the paper is nonsense and the constructive criticisms are merely ways of making it more plausible nonsense.
 
Last edited:
  • Like
Likes PeterDonis, russ_watters and PeroK
"But for evaluating the paper? The "In conclusion" section demonstrates that the AI has missed the point completely. As presented, the paper is nonsense and the constructive criticisms are merely ways of making it more plausible nonsense."

I did revise the paper with diagrams and analogies, but as I mentioned before, AI assesses the idea without bias. I didn't place the points that Gemma found intriguing in the first post; some of them were:

Novel Conceptual Framework: The core idea of introducing a hyper-dimensional coordinate (w) and biasing path integrals to explain entanglement is original and offers a fresh perspective on existing phenomena. It attempts to bridge the gap between quantum mechanics and geometry in an interesting way.
Clear Explanation of Feynman Path Integrals: The paper does a good job of briefly explaining the foundational concepts of Feynman's path integral formulation, making it somewhat accessible even to those not deeply familiar with the formalism.

Effective Analogy (Flatland): The Flatland analogy is exceptionally well-chosen and effectively communicates the idea of how seemingly disconnected events could be linked in a higher dimension. This greatly aids understanding.
Potential for New Experimental Directions: The paper explicitly suggests future work involving simulations and experimental proposals to detect hyperdimensional coupling effects, which highlights its potential for driving further research.

Addresses a Fundamental Question: It tackles the core mystery of quantum non-locality – how entangled particles can exhibit correlations that seem to defy classical explanations.

So, it did understand the core concepts of the paper. And that perspective can help collaborate on ideas, not just write them off as nonsense.
 
frankinstien said:
I recently submitted a paper to a journal, and the response was less than polite. :cry: So, I asked Gemma-3-12b, which I run on my local server, what it thought about my paper. It gave a very effective and constructive response. It understood the concepts of the paper as a proposal and highlighted issues that needed to be addressed. Compare this to a rude, biased response speckled with ad hominem attacks from a human evaluator. So, the question becomes: Would it be more constructive to have AI evaluate papers rather than humans? Below are the critiques of Gemma:
"[...] speckled with ad hominem responses from a human evaluator [...]"

Maybe I didn't read closely enough, but those didn't exactly jump off the page at me.

EDIT: Oh sorry. That was the AI response, right?
 
  • Like
Likes frankinstien
frankinstien said:
I recently submitted a paper to a publication, but did so with the intention of just opening up a conversation about the idea in general.
I've never heard of a physics journal that allows, let alone encourages, submitting a paper "with the intention of just opening up a conversation about the idea in general." Journals are not meant to be "sounding boards" for theory development. Editors expect to receive papers that have been honestly judged by their authors to offer original, novel, useful and complete results, as based on those authors' own education, subsequent research in physics, and feedback from colleagues. Can you cite the submission criteria of an actual, reputable physics journal that leads you to believe that it would welcome such a "conversation"?
 
Last edited:
  • Like
Likes Hornbein, Drakkith, dextercioby and 7 others
So, I'm using an application similar to Crew AI, where you can chain AIs together or have them interact, literally talk with one another. I'm playing the intermediary between two AIs, Gemma and ChatGPT 4. ChatGPT 4 responded to Gemma's critique with approaches for addressing the issues it brought up. An example is shown below:

[Attached image: ChatGPT 4's suggested approaches in response to Gemma's critique]
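In rough terms, the relay between the two models looks like the sketch below. This is a simplified illustration, assuming both endpoints speak an OpenAI-style chat-completions API; the URLs, model names, file name, and prompts are placeholders rather than my exact setup.

```python
# Simplified sketch of the Gemma <-> ChatGPT relay described above.
# Assumes OpenAI-style /v1/chat/completions endpoints; URLs, model names,
# and prompts are illustrative placeholders, not the exact configuration used.
import os
import requests

LOCAL_GEMMA_URL = "http://localhost:8080/v1/chat/completions"  # hypothetical local server
OPENAI_URL = "https://api.openai.com/v1/chat/completions"

def ask(url, model, messages, api_key=None):
    """Send a chat request and return the assistant's reply text."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    resp = requests.post(url, json={"model": model, "messages": messages},
                         headers=headers, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

paper_text = open("paper.txt").read()

# Step 1: the local Gemma model critiques the paper.
critique = ask(LOCAL_GEMMA_URL, "gemma-3-12b",
               [{"role": "user", "content": f"Critique this paper:\n\n{paper_text}"}])

# Step 2: I review the critique, then hand it to ChatGPT for suggested fixes.
fixes = ask(OPENAI_URL, "gpt-4",
            [{"role": "user",
              "content": f"A reviewer raised these points:\n\n{critique}\n\n"
                         "Suggest concrete ways to address each one."}],
            api_key=os.environ["OPENAI_API_KEY"])
print(fixes)
```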
 
frankinstien said:
"But for evaluating the paper? The "In conclusion" section demonstrates that the AI has missed the point completely. As presented, the paper is nonsense and the constructive criticisms are merely ways of making it more plausible nonsense."

I did revise the paper with diagrams and analogies, but as I mentioned before, AI assesses the idea without bias. I didn't place the points that Gemma found intriguing in the first post; some of them were:

[...]

So, it did understand the core concepts of the paper. And that perspective can help collaborate on ideas, not just write them off as nonsense.
Part of the problem is that almost anyone can propose some new laws of physics and write a conceptual paper. Several papers of the type you wrote are posted here every week. Each of you individually believes that your ideas will revolutionise physics.

Many professional physicists also receive a steady stream of such papers.

Patience wears a bit thin at times. Gemma, on the other hand, has infinite patience.
 
  • Like
Likes frankinstien
renormalize said:
I've never heard of a physics journal that allows, let alone encourages, submitting a paper "with the intention of just opening up a conversation about the idea in general." Journals are not meant to be "sounding boards" for theory development. Editors expect to receive papers that have been honestly judged by their authors to offer original, novel, useful and complete results, as based on those authors' own education, subsequent research in physics, and feedback from colleagues. Can you cite the submission criteria of an actual, reputable physics journal that leads you to believe that it would welcome such a "conversation"?
Well, I didn't look at it from that perspective, but as I mentioned, the response from the journal wasn't polite...
 
  • Haha
Likes davenn and BillTre
PeroK said:
Part of the problem is that almost anyone can propose some new laws of physics and write a conceptual paper. Several papers of the type you wrote are posted here every week. Each of you individually believes that your ideas will revolutionise physics.

Many professional physicists also receive a steady stream of such papers.

Patience wears a bit thin at times. Gemma, on the other hand, has infinite patience.
Which is part of the point of getting AI involved: it addresses issues that need to be solved without taxing an individual. But AI can go further and expand upon ideas, or help resolve the issues that otherwise make an idea implausible or difficult. Think of how AlphaGo was able to outdo humans, and how AI is proving to be much better at working out protein folding. Could it also do the same in other disciplines?
 
  • #10
frankinstien said:
Which is part of the point of getting AI involved: it addresses issues that need to be solved without taxing an individual. But AI can go further and expand upon ideas, or help resolve the issues that otherwise make an idea implausible or difficult. Think of how AlphaGo was able to outdo humans, and how AI is proving to be much better at working out protein folding. Could it also do the same in other disciplines?
The funny thing is that on another thread there is someone who believes that AI is no more revolutionary than a pocket calculator!

That said, your optimism is misplaced. If you can induce AI to revolutionise physics, then why can't the world's best physicists?

It's difficult to communicate the gulf between your expectations of what cutting-edge physics looks like and what it really looks like.

Let me find a link for you. I'll post it below.
 
  • Like
Likes russ_watters and BillTre
  • #11
frankinstien said:
AI assesses the idea without bias ... it did understand the core concepts of the paper.
An LLM neither assesses nor understands. Quoting from Tyler Harper in the Atlantic (behind a paywall so no link):
[attributing thinking/understanding to an LLM] betray a conceptual error: Large language models do not, cannot, and will not “understand” anything at all. ... LLMs are impressive probability gadgets that have been fed nearly the entire internet, and produce writing not by thinking but by making statistically informed guesses about which lexical item is likely to follow another.​
Some of our older threads about why we currently do not accept LLMs as valid sources discuss the technology and its limitations in greater depth.

It's very difficult for us to avoid this conceptual error. We naturally consider speech to be an expression of the speaker's thoughts, and so we imagine a thinking/reasoning agent behind every piece of coherent speech. But that's just not what an LLM is.
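To make "statistically informed guesses about which lexical item is likely to follow another" concrete, here is a deliberately tiny toy in Python: a bigram counter. A real LLM is incomparably larger and uses a deep network over long contexts rather than raw counts, but the basic objective, predicting a plausible next token, is of the same kind.

```python
# Toy illustration of next-word prediction: count which word follows which in a
# tiny corpus, then sample continuations from those counts. Real LLMs differ
# enormously in scale and mechanism; this only shows the basic principle.
import random
from collections import Counter, defaultdict

corpus = ("the particle follows the path . "
          "the path integral sums over every path . "
          "every path contributes a phase .").split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev_word, next_word in zip(corpus, corpus[1:]):
    following[prev_word][next_word] += 1

def sample_next(word):
    """Sample the next word in proportion to how often it followed `word`."""
    counts = following[word]
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# Generate a short continuation starting from "the".
word, output = "the", ["the"]
for _ in range(8):
    word = sample_next(word)
    output.append(word)
print(" ".join(output))
```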
 
Last edited:
  • Like
Likes Dale, Filip Larsen, PeterDonis and 2 others
  • #12
frankinstien said:
Think of how AlphaGo was able to outdo humans,
AlphaGo is a different technology from LLMs, with a different set of limitations and abilities. Interestingly, and perhaps because it does not present as a thinking entity, we are much more able to recognize it as a machine that is very good at what it does, without attributing thought and understanding to it. There are many other examples: chess engines, the AIs that are widely used to scan images, facial recognition software, ....
 
Last edited:
  • Like
Likes dextercioby, javisot and PeterDonis
  • #13
This is what a physics research paper looks like

https://arxiv.org/abs/2503.24263

Note the difference between this and what you have written.
 
  • Like
Likes javisot, davenn, russ_watters and 1 other person
  • #14
Nugatory said:
An LLM neither assesses nor understands. Quoting from Tyler Harper in the Atlantic (behind a paywall so no link):
[attributing thinking/understanding to an LLM] betray a conceptual error: Large language models do not, cannot, and will not “understand” anything at all. ... LLMs are impressive probability gadgets that have been fed nearly the entire internet, and produce writing not by thinking but by making statistically informed guesses about which lexical item is likely to follow another.​
Some of our older threads about why we currently do not accept LLMs as valid sources discuss the technology and its limitations in greater depth.

It's very difficult for us to avoid this conceptual error. We naturally consider speech to be an expression of the speaker's thoughts, and so we imagine a thinking/reasoning agent behind every piece of coherent speech. But that's just not what an LLM is.
At the risk of starting another debate on LLMs, the opinion of Tyler Harper cannot seriously be taken as the last word on the subject.
 
  • Like
Likes frankinstien
  • #15
PeroK said:
This is what a physics research paper looks like

https://arxiv.org/abs/2503.24263

Note the difference between this and what you have written.
That's a risky game. I recently came across a "paper" about a "new proof" of Fermat's Last Theorem. The author clearly tried to mimic the look of a serious paper, but a closer look revealed basic formal flaws. Looking the part is only half the truth; it's more a case of: I know it when I see it.
 
  • Like
Likes dextercioby
  • #16
Who says that "thinking" isn't making statistically informed guesses? Unless you believe that our intelligence is divine, our neurons must be working to some sophisticated algorithm. Whatever that algorithm is, it too could be described as not "thinking".
 
  • Like
Likes Hornbein, Dale and frankinstien
  • #17
PeroK said:
At the risk of starting another debate on LLMs, the opinion of Tyler Harper cannot seriously be taken as the last word on the subject.
(Presuming that we're talking about the same Tyler Harper, and not the Georgia state Ag Commissioner)
That's fair - that particular quote just happened to be close at hand. The PF threads provide more technical descriptions of what an LLM does.
 
  • #18
PeroK said:
Who says that "thinking" isn't making statistically informed guesses? Unless you believe that our intelligence is divine, our neurons must be working to some sophisticated algorithm. Whatever that algorithm is, it too could be described as not "thinking".
It's getting to where we do have to confront that question. We could consider a lifetime of experience with sensory input, interactions with the world, other humans, written and spoken communication, physical manipulation of objects, all as the "training data" provided to the complex protoplasmic device inside our skull, and it is quite possible that a manmade device of comparable scale and malleability could develop comparable capabilities.

But that's taking us well beyond this thread: today's LLMs aren't doing what the OP imagines.
 
  • Like
  • Agree
Likes Dale, dextercioby, russ_watters and 3 others
  • #19
fresh_42 said:
That's a risky game. I recently came across a "paper" about a "new proof" of Fermat's Last Theorem. The author clearly tried to mimic the look of a serious paper, but a closer look revealed basic formal flaws. Looking the part is only half the truth; it's more a case of: I know it when I see it.
That misses the point. That paper shows what one looks like. I never said that anything like that must be valid. The Penrose paper could be nonsense too!

Today I have been ambushed by more false syllogisms than ever!
 
  • #20
PeroK said:
That misses the point. That paper shows what one looks like. I never said that anything like that must be valid. The Penrose paper could be nonsense too!

Today I have been ambushed by more false syllogisms than ever!
I didn't mean it that seriously, and certainly not as an argument. I just wanted to note that form alone isn't sufficient. Penrose's point of view is a completely different issue and worth a thread of its own in a technical forum.
 
  • Like
Likes PeroK
  • #21
Moderator's note: Several posts have been moved to the other current AI thread:

 
  • Like
Likes fresh_42
  • #22
Nugatory said:
An LLM neither assesses nor understands. Quoting from Tyler Harper in the Atlantic (behind a paywall so no link):
[attributing thinking/understanding to an LLM] betray a conceptual error: Large language models do not, cannot, and will not “understand” anything at all. ... LLMs are impressive probability gadgets that have been fed nearly the entire internet, and produce writing not by thinking but by making statistically informed guesses about which lexical item is likely to follow another.​
Some of our older threads about why we currently do not accept LLMs as valid sources discuss the technology and its limitations in greater depth.

It's very difficult for us to avoid this conceptual error. We naturally consider speech to be an expression of the speaker's thoughts, and so we imagine a thinking/reasoning agent behind every piece of coherent speech. But that's just not what an LLM is.
So, here's where the issue of emergence comes in. Despite the simple rule an LLM follows at its basic objective, which is to predict the next word in a sequence, and despite its simple logistic function, what emerges to perform that prediction is the beauty behind LLMs. I think you confuse thinking, which I doubt you can actually define, with cognitive abilities. LLMs do demonstrate cognitive abilities. What most people want them to be is the classical "homunculus" agent of reason and free will, not a collection of cognitive processes that reintegrate their inputs and outputs, where some of those outputs post to the hippocampus.

What I find interesting is how zero-shot learning happens in LLMs. At one time it was a big issue that neural networks needed streams of data to learn, whereas humans, starting from a sufficient level of experience, can learn fairly quickly from a comparatively small data set. With zero-shot learning, an LLM can learn from an interactive conversation; however, that lesson needs to be stored in a database, and a technique called Retrieval Augmented Generation (RAG) is used to give the LLM contextual long-term memory of past conversations (a minimal sketch of that retrieval step is shown below). Even Tesla's Optimus learned to dance from a zero-shot RAG approach, so it's not limited to language but extends to actual physical experience.
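Here is a minimal sketch of that RAG retrieval step, assuming the sentence-transformers library for the embeddings; the stored snippets, model name, and prompt are illustrative only, not any particular product's implementation.

```python
# Minimal RAG sketch: store past conversation snippets as embeddings, retrieve
# the most similar ones for a new query, and prepend them to the prompt so the
# model has "long-term memory". Snippets and model choice are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# "Database" of past conversation snippets (in practice, a vector store).
memory = [
    "User prefers critiques organized as strengths first, then weaknesses.",
    "Earlier we discussed the Flatland analogy for higher dimensions.",
    "The paper introduces a hyper-dimensional coordinate w.",
]
memory_vecs = encoder.encode(memory, normalize_embeddings=True)

def retrieve(query, k=2):
    """Return the k stored snippets most similar to the query (cosine similarity)."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = memory_vecs @ q          # cosine similarity, since vectors are unit-normalized
    top = np.argsort(scores)[::-1][:k]
    return [memory[i] for i in top]

query = "Please re-critique the section on the coordinate w."
context = "\n".join(retrieve(query))
prompt = f"Relevant notes from past conversations:\n{context}\n\nRequest: {query}"
print(prompt)  # this augmented prompt would then be sent to the LLM
```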

So, does an LLM think? As I stated before, that's not a very well-posed question; the better question is: does an LLM demonstrate cognitive abilities? Ultimately, we have to apply the same metric to humans, which we do, asking which cognitive abilities we excel at as individuals, what we call an individual's "talent"...
 
  • #23
If we get back to the original question:

1) It's against the rules to post and discuss your paper on here.

2) No peer-reviewed physics journal will consider your paper for publication.

3) No professional physicist is likely to help you develop the paper.

Your only option is to develop your paper with the help of Gemma and self publish. Even then, no one in the professional physics community is going to read it.

That's the reality of the situation.
 
  • #24
frankinstien said:
AI assesses the idea without bias.
No, AI is not assessing your idea. That is simply not what LLMs do. LLMs don't even have any concept of "assessing an idea". All they do is generate more text with similar patterns to the text in the prompt you give them, where "similar patterns" is based on their training data, i.e., a corpus of text scraped from the Internet.

You are simply reading things into the LLM text that aren't there.
 
  • Like
Likes javisot
  • #25
Nugatory said:
We could consider a lifetime of experience with sensory input, interactions with the world, other humans, written and spoken communication, physical manipulation of objects, all as the "training data" provided to the complex protoplasmic device inside our skull, and it is quite possible that a manmade device of comparable scale and malleability could develop comparable capabilities.
Yes, but as your description of the "training data" we humans get illustrates, that data is much, much, much, much more than just text scraped from the Internet. What's more, it's qualitatively different, since it includes "interactions with the world", "physical manipulation of objects", etc. Our brains are physical devices, yes, but they're not physical devices that do nothing but process text.
 
  • Like
Likes Dale and Nugatory
  • #26
PeroK said:
This is what a physics research paper looks like

https://arxiv.org/abs/2503.24263

Note the difference between this and what you have written.
In line with what you're saying, I don't know if you've ever tried (for example, with ChatGPT) offering it an arXiv paper and asking it to rewrite it, taking a change into account.

What you expect is for ChatGPT to maintain the quality of the original work and simply add the change, but the reality is that it returns a ridiculously small paper. I've found it's more effective to ask it to rewrite each part independently, not the entire paper at once.
 
  • #27
The big problem with ChatGPT is that it's sycophantic. It will strive to tell you what you want to hear. This renders it unsuitable for evaluating papers, and possibly for evaluating anything else. OpenAI is quite concerned about this. If I were them, I'd leave it that way. I bet sycophancy adds considerably to its popularity.

I use ChatGPT solely for programming. I can then test what it produces immediately. It either works or it doesn't. Then the nonsense it occasionally produces is harmless.
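For instance, a small routine of the kind I ask for gets an immediate check like this; the function below is an illustrative stand-in for whatever ChatGPT produced, not a transcript of a real session.

```python
# Illustrative check of an LLM-produced routine: run it against inputs whose
# correct outputs are known. If the asserts pass, keep it; if not, throw it back.
import math

def rotate_point_2d(x, y, angle_rad):
    """Rotate (x, y) about the origin by angle_rad (stand-in for a ChatGPT-written helper)."""
    return (x * math.cos(angle_rad) - y * math.sin(angle_rad),
            x * math.sin(angle_rad) + y * math.cos(angle_rad))

# Known answers: a quarter turn sends (1, 0) to (0, 1); a half turn sends it to (-1, 0).
assert all(abs(a - b) < 1e-9 for a, b in zip(rotate_point_2d(1, 0, math.pi / 2), (0, 1)))
assert all(abs(a - b) < 1e-9 for a, b in zip(rotate_point_2d(1, 0, math.pi), (-1, 0)))
print("checks pass")
```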
 
Last edited:
  • Like
Likes Dale, dextercioby and russ_watters
  • #28
PeterDonis said:
No, AI is not assessing your idea. That is simply not what LLMs do. LLMs don't even have any concept of "assessing an idea". All they do is generate more text with similar patterns to the text in the prompt you give them, where "similar patterns" is based on their training data, i.e., a corpus of text scraped from the Internet.

You are simply reading things into the LLM text that aren't there
Ah... no. What an LLM does at its fundamental level is indeed a simple rule, and from that simple rule emerge properties that allow it to meet its basic objective; what is surprising is what it can then do. The interrelationships across an astronomical number of contextual relationships form a type of meaning. Effectively, if you assess yourself, what is the meaning of your concepts compared with what an LLM is doing? You really can't say that someone who has zero experience as a plumber, but reads about plumbing and can respond to questions about it, is much different from an LLM!

The claim that "experience" in the physical world gives meaning to a subject matter you've never experienced doesn't carry much weight. You can say you've interacted with objects before and that the experience is associated with symbolic language, but that only means you've added some loosely related material with some contextual relatedness as data points, which an LLM could do as well using word or even contextual embeddings. What LLMs are demonstrating is the adaptive advantage of language, where data can be compressed into a data point called a context, and from those interrelationships derive cognitive abilities that no other animal has...
 
  • Like
Likes javisot and PeroK
  • #29
This is usually shown with the example of the Chinese room: https://en.wikipedia.org/wiki/Chinese_room

The counterargument is: https://en.m.wikipedia.org/wiki/Strong_AI_hypothesis
 
Last edited:
  • #30
frankinstien said:
You really can't say that someone who has zero experience as a plumber, but reads about plumbing and can respond to questions about it, is much different from an LLM!
Thank you for agreeing with my main point!

You're quite right--and you wouldn't allow such a person to actually try to fix a plumbing problem in your house, would you? You'd want an actual plumber who could connect all those words about plumbing to actual plumbing in the real world.

And what you are trying to do in this thread is just as daft--asking an LLM, something which has zero experience actually doing science but has "read about" lots of "scientific stuff" by snarfing up text from the Internet--to evaluate a scientific paper for you.
 
  • Like
Likes Dale, russ_watters and javisot
  • #31
javisot said:
This is usually shown with the example of the Chinese room: https://en.wikipedia.org/wiki/Chinese_room

The counterargument is: https://en.m.wikipedia.org/wiki/Strong_AI_hypothesis
No, the Chinese Room is not the same as the LLMs we're talking about here--because as the Chinese Room thought experiment is formulated, its answers have to actually be correct. They have to show actual world knowledge--not just "text snarfed from the Internet knowledge".

This point is overlooked by far too many discussions of the Chinese Room, because those discussions don't appreciate that you can ask the Chinese Room any question you want, including questions about real world experiences that no amount of just snarfing up text will let any kind of entity (including an actual human whose only "knowledge" comes from reading stuff on the Internet) give correct answers to. And of course when you do that with LLMs, you get all kinds of nonsense--no sane person should be fooled into thinking that the LLM is a person with actual real world knowledge of the topic being asked about.

But in the Chinese Room thought experiment, by hypothesis, the Chinese Room can convince people that it's a person with actual real world knowledge of all the topics it's asked about. In other words, the thought experiment states a performance standard that LLMs simply don't and can't meet.
 
  • Like
Likes javisot
  • #32
PeterDonis said:
Thank you for agreeing with my main point!

You're quite right--and you wouldn't allow such a person to actually try to fix a plumbing problem in your house, would you? You'd want an actual plumber who could connect all those words about plumbing to actual plumbing in the real world.

And what you are trying to do in this thread is just as daft--asking an LLM, something which has zero experience actually doing science but has "read about" lots of "scientific stuff" by snarfing up text from the Internet--to evaluate a scientific paper for you.
There are usually two parts: checking that the math is correct and checking that the paper is conceptually correct. I understand that AI can fail on the conceptual side, but wouldn't you use it to check the math?

(I don't mean now, with the models we have, but in the future, with some specific model that correctly reviews the mathematics of a paper.)
 
  • #33
javisot said:
wouldn't you use it to check the math?
No. There are certainly computer programs that can check math, but LLMs don't do that.

javisot said:
some specific model that correctly reviews the mathematics of a paper
We already have computer programs that can check math for accuracy--automated theorem provers and checkers, automated equation solvers, and things like that. But they don't work anything like LLMs. They don't check math by snarfing up a huge amount of text, say from math papers, and looking for patterns in it. They check math by having the actual logical rules that the math is supposed to follow coded into them directly, and then being able to apply those logical rules much more quickly and accurately, over huge numbers of logical steps, than humans can.
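For a concrete, if trivial, illustration, here is a machine-checked statement in Lean 4, using the Nat.add_comm lemma from its core library. The checker accepts it only because the proof term matches the built-in logical rules, not because it has seen similar-looking text before.

```lean
-- Machine-checked mathematics: the checker verifies this proof term against
-- Lean's logical rules; a wrong term is rejected, no matter how plausible it reads.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```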
 
  • Like
Likes Dale and javisot
  • #34
PeterDonis said:
Thank you for agreeing with my main point!

You're quite right--and you wouldn't allow such a person to actually try to fix a plumbing problem in your house, would you? You'd want an actual plumber who could connect all those words about plumbing to actual plumbing in the real world.

And what you are trying to do in this thread is just as daft--asking an LLM, something which has zero experience actually doing science but has "read about" lots of "scientific stuff" by snarfing up text from the Internet--to evaluate a scientific paper for you.
That's a subjective argument, and there are plenty of DIYers who start out just by reading subject-matter material. So here's where AI can be exploited: because AI doesn't have a body yet, it can't explore the real world using the knowledge it gained from human documentation. So we as humans collaborate with the AI, which gives us notions that we can then validate in the real world and communicate back, so the AI can then learn from our experiences.
 
  • Like
Likes PeroK
  • #35
frankinstien said:
That's a subjective argument, and there are plenty of DIYers who start out just by reading subject-matter material. So here's where AI can be exploited: because AI doesn't have a body yet, it can't explore the real world using the knowledge it gained from human documentation. So we as humans collaborate with the AI, which gives us notions that we can then validate in the real world and communicate back, so the AI can then learn from our experiences.
Generally I would agree with you that an LLM can do far more than the official PF policy will admit. I did watch a video recently of a professional physicist getting it to suggest ideas based on genuine input from him.

But an LLM is no substitute for a physics degree or PhD. It's your input that is the problem. And you can't judge whether the LLM has actually produced an insight (by luck or otherwise) or produced something useless. The professional physicist above could do precisely that.

Also, the attempt to do physics by getting the right words into the right order plays into the LLM's hands. It can do that stuff better than any human, ad infinitum.

Instead, physics is about a mapping from ideas to hard mathematics. That's what an LLM cannot reliably do. It cannot figure out whether those words and those equations represent a valid mathematical model for the physical phenomena in question.

The suggestions it made about your paper were amazing, IMO. But, it can't do enough unique mathematics to produce a paper for you. It has no way to generate a mathematical model from a vague concept.
 
  • Like
Likes dextercioby and javisot
  • #36
frankinstien said:
That's a subjective argument
It's your argument. You can't have it both ways.
 
  • #37
PeroK said:
physics is about a mapping from ideas to hard mathematics.
And to data from the real world.
 
  • Like
Likes BillTre and russ_watters
  • #38
PeroK said:
The suggestions it made about your paper were amazing, IMO
How can we know that if we haven't read the paper itself?
 
  • #39
PeterDonis said:
How can we know that if we haven't read the paper itself?
I read it on ResearchGate.
 
  • #40
I would have to disagree about LLMs being unable to apply hard mathematics. LLMs have proven able to take a set of requirements and turn them into real working software with object-oriented structure and design, which is a form of mathematics. After all, mathematics is a language, and there are some good AI math models:

Julius AI,
Mathos AI,
Google DeepMind's AI models
 
  • #41
frankinstien said:
After all, mathematics is a language, and there are some good AI math models:
That depends heavily on what you call mathematics. I think your understanding of mathematics is that it is a kind of advanced calculation. This is not the case. As long as you can't show me an AI that settles ##NP\neq P##, literally a language problem, I have to disagree with you.
 
  • Like
Likes PeterDonis
  • #42
frankinstien said:
I would have to disagree about LLMs being unable to apply hard mathematics. LLMs have proven able to take a set of requirements and turn them into real working software with object-oriented structure and design, which is a form of mathematics. After all, mathematics is a language, and there are some good AI math models:

Julius AI,
Mathos AI,
Google DeepMind's AI models
I am both a mathematician and a programmer. I believe these have little in common.

I once asked ChatGPT to prove something in Lie group theory. It came up with a bunch of nonsense. I didn't know Lie group theory so it sounded plausible to me. When I got it checked though....

ChatGPT is very useful for programming; it knows the techniques of four-dimensional geometry better than I do, and it's great with computer-game kinds of things, but there I can execute the program and know immediately whether or not it's nonsense.

Once it told me a certain 4D object could have two velocities.

Using ChatGPT to find basic errors in a paper could work pretty well, but evaluating original ideas seems like a nonstarter. It's much better with routine things that I can't be bothered to learn.
 
Last edited:
  • #43
The OP is on a 10-day vacation from PF, so this thread can be closed for now.
 
