Using AI to evaluate white papers?

  • Thread starter: frankinstien

Summary
The discussion revolves around the submission of a conceptual paper to a publication, which received a rude response, prompting the author to seek critique from an AI model, Gemma-3-12b. The AI provided constructive feedback, highlighting weaknesses in mathematical rigor, definitions, and experimental testability, while also recognizing the paper's novel ideas. A debate emerged about the potential for AI to evaluate academic papers without human biases, contrasting this with the expectation that journals require well-formed, original submissions rather than exploratory ideas. Critics argue that while AI can assist in refining concepts, it lacks true understanding and cannot replace the nuanced judgment of human experts. Ultimately, the conversation underscores the challenges of integrating AI into academic evaluation processes while maintaining rigorous standards.
  • #31
javisot said:
This is usually shown with the example of the Chinese Room: https://en.wikipedia.org/wiki/Chinese_room

His counterargument is https://en.m.wikipedia.org/wiki/Strong_AI_hypothesis
No, the Chinese Room is not the same as the LLMs we're talking about here--because as the Chinese Room thought experiment is formulated, its answers have to actually be correct. They have to show actual world knowledge--not just "text snarfed from the Internet knowledge".

This point is overlooked by far too many discussions of the Chinese Room. Those discussions don't appreciate that you can ask the Chinese Room any question you want, including questions about real-world experiences that no amount of snarfing up text will let any kind of entity (including an actual human whose only "knowledge" comes from reading stuff on the Internet) answer correctly. And of course when you do that with LLMs, you get all kinds of nonsense--no sane person should be fooled into thinking that the LLM is a person with actual real-world knowledge of the topic being asked about.

But in the Chinese Room thought experiment, by hypothesis, the Chinese Room can convince people that it's a person with actual real world knowledge of all the topics it's asked about. In other words, the thought experiment states a performance standard that LLMs simply don't and can't meet.
 
Likes: javisot
  • #32
PeterDonis said:
Thank you for agreeing with my main point!

You're quite right--and you wouldn't allow such a person to actually try to fix a plumbing problem in your house, would you? You'd want an actual plumber who could connect all those words about plumbing to actual plumbing in the real world.

And what you are trying to do in this thread is just as daft--asking an LLM, something which has zero experience actually doing science but has "read about" lots of "scientific stuff" by snarfing up text from the Internet--to evaluate a scientific paper for you.
There are usually two parts: checking that the math is correct and checking that the paper is conceptually correct. I understand that AI can fail on the conceptual side, but wouldn't you use it to check the math?

(I don't mean now, with the models we have; I mean in the future, with some specific model that correctly reviews the mathematics of the works.)
 
  • #33
javisot said:
wouldn't you use it to check the math?
No. There are certainly computer programs that can check math, but LLMs don't do that.

javisot said:
some specific model that correctly reviews the mathematics of the works
We already have computer programs that can check math for accuracy--automated theorem provers and checkers, automated equation solvers, and things like that. But they don't work anything like LLMs. They don't check math by snarfing up a huge amount of text, say from math papers, and looking for patterns in it. They check math by having the actual logical rules that the math is supposed to follow coded into them directly, and then being able to apply those logical rules much more quickly and accurately, over huge numbers of logical steps, than humans can.
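
For instance, a proof assistant like Lean accepts a statement only when every step follows from the inference rules encoded in its kernel; it never pattern-matches against text it has seen before. A minimal sketch (a toy theorem chosen purely for illustration, nothing to do with the paper under discussion):

```lean
-- The checker verifies this by applying its built-in rules for natural-number
-- arithmetic; a proof term that didn't follow from those rules would be rejected.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The same goes for computer algebra systems and equation solvers: they manipulate expressions according to explicit rules, which is why their answers can be trusted in a way an LLM's output cannot.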
 
Likes: Dale and javisot
  • #34
PeterDonis said:
Thank you for agreeing with my main point!

You're quite right--and you wouldn't allow such a person to actually try to fix a plumbing problem in your house, would you? You'd want an actual plumber who could connect all those words about plumbing to actual plumbing in the real world.

And what you are trying to do in this thread is just as daft--asking an LLM, something which has zero experience actually doing science but has "read about" lots of "scientific stuff" by snarfing up text from the Internet--to evaluate a scientific paper for you.
That's a subjective argument, and there are plenty of DIYers who start out just by reading subject-matter material. So here's where AI is being exploited: because AI doesn't have a body yet, it can't explore the real world using the knowledge it gained from human documentation. So we as humans collaborate with the AI, which gives us notions that we can then validate in the real world and communicate back to the AI, which can then learn from our experiences.
 
Likes: PeroK
  • #35
frankinstien said:
That's a subjective argument, and there are plenty of DIYers who start out just by reading subject-matter material. So here's where AI is being exploited: because AI doesn't have a body yet, it can't explore the real world using the knowledge it gained from human documentation. So we as humans collaborate with the AI, which gives us notions that we can then validate in the real world and communicate back to the AI, which can then learn from our experiences.
Generally I would agree with you that an LLM can do far more than the official PF policy will admit. I did watch a video recently of a professional physicist getting it to suggest ideas based on genuine input from him.

But, an LLM is no substitute for a physics degree or PhD. It's your input that is the problem. And, you can't judge when the LLM has actually produced an insight (by luck or otherwise) or has produced something useless. The professional physicist above could do precisely that.

Also, the attempt to do physics by getting the right words into the right order plays into the LLM's hands. It can do that stuff better than any human, ad infinitum.

Instead, physics is about a mapping from ideas to hard mathematics. That's what an LLM cannot reliably do. It cannot figure out whether those words and those equations represent a valid mathematical model for the physical phenomena in question.

The suggestions it made about your paper were amazing, IMO. But, it can't do enough unique mathematics to produce a paper for you. It has no way to generate a mathematical model from a vague concept.
 
Likes: dextercioby and javisot
  • #36
frankinstien said:
That's a subjective argument
It's your argument. You can't have it both ways.
 
  • #37
PeroK said:
physics is about a mapping from ideas to hard mathematics.
And to data from the real world.
 
Likes: BillTre and russ_watters
  • #38
PeroK said:
The suggestions it made about your paper were amazing, IMO
How can we know that if we haven't read the paper itself?
 
  • #39
PeterDonis said:
How can we know that if we haven't read the paper itself?
I read it on Research Gate.
 
  • #40
I would have to disagree about LLMs being unable to apply hard mathematics. LLMs have proven able to take a set of requirements and turn them into real working software with object-oriented structure and design, which is a form of mathematics. After all, mathematics is a language, and there are some good AI math models:

Julius AI,
Mathos AI,
Google DeepMind's AI models
 
  • #41
frankinstien said:
After all, mathematics is a language, and there are some good AI math models:
That depends heavily on what you call mathematics. I think your understanding of mathematics is as a kind of advanced calculation. That is not what it is. As long as you can't show me an AI that resolves ##NP \neq P##, literally a language problem, I have to disagree with you.
 
Likes: PeterDonis
  • #42
frankinstien said:
I would have to disagree about LLMs being unable to apply hard mathematics. LLMs have proven able to take a set of requirements and turn them into real working software with object-oriented structure and design, which is a form of mathematics. After all, mathematics is a language, and there are some good AI math models:

Julius AI,
Mathos AI,
Google DeepMind's AI models
I am both a mathematician and a programmer. I believe these have little in common.

I once asked ChatGPT to prove something in Lie group theory. It came up with a bunch of nonsense. I didn't know Lie group theory, so it sounded plausible to me. When I got it checked, though....

ChatGPT is very useful with programming; it knows the techniques of four-dimensional geometry better than I do, and it's great with computer-game kinds of things, but there I can execute the program and know immediately whether or not it's nonsense.

Once it told me a certain 4D object could have two velocities.

Using ChatGPT to find basic errors in a paper could work pretty well, but evaluating original ideas seems like a nonstarter. It's much better with routine things that I can't be bothered to learn.
 
  • #43
OP is on a 10-day vacation from PF, so this thread can be closed for now.
 
