ChatGPT spin-off from: What happens to the energy in destructive interference?

  • Thread starter: Anachronist
AI Thread Summary
The discussion centers on the limitations of ChatGPT in providing accurate factual information, with users expressing frustration over its tendency to generate incorrect or irrelevant responses. Specific examples include requests for details about university press books and OpenSCAD code, where the answers were either nonexistent or incorrect. Critics argue that while ChatGPT can produce creative content effectively, it struggles with factual accuracy, leading to concerns about its reliability for research purposes. Some participants acknowledge its speed and ability to generate coherent language but emphasize that this does not equate to true intelligence or factual correctness. The conversation highlights a growing tension between the capabilities of AI and the expectations of users seeking reliable information.
Anachronist
Gold Member
rocknrollkieran said:
I asked ChatGPT this but couldn't get a satisfactory answer.
Why? Why do people do this?

You never get a satisfactory answer when asking for factual information that normally requires some research or study to find. ChatGPT gives you hallucinations instead.

I asked it to give examples of books published by university presses that discuss fringe views, and it gave me either books that didn't fit my criterion or books that don't exist.

I asked it for a simple bit of information: which is the nearest door number to United Airlines baggage claim carousel #6 at the San Francisco airport? I can look it up on a map, but ChatGPT basically hemmed and hawed and gave me canned responses containing only general information, not what I specifically asked.

I asked it to produce some OpenSCAD code that makes a specific 3D shape, and it gave me non-functional code with syntax mixed together from multiple languages.

And on and on. Every time I ask ChatGPT something factual, I ask it something that I can check myself, and the answer is almost always factually incorrect.

What ChatGPT excels at is making stuff up. Writing poetry, writing a paragraph, inventing a compelling title for a particular essay on some subject, creating a paragraph that includes particular meanings, and so on. But that sort of inventiveness is undesirable when looking for scientific facts.
 
  • Like
  • Sad
Likes Astronuc, DrClaude, russ_watters and 4 others
Anachronist said:
You never get a satisfactory answer when asking for factual information that normally requires some research or study to find.
I don't believe that is a true statement.
Anachronist said:
ChatGPT gives you hallucinations instead.
The usual criticism of ChatGPT is that it is a "glorified search engine". Wouldn't hallucinations be a sign of creative intelligence?
Anachronist said:
I asked it for a simple bit of information: which is the nearest door number to United Airlines baggage claim carousel #6 at the San Francisco airport?
How is anybody or anything supposed to figure that out? It's a ridiculous question.
Anachronist said:
I asked it to produce some OpenSCAD code that makes a specific 3D shape, and it gave me non-functional code with syntax mixed together from multiple languages.
I'll believe that.
Anachronist said:
Every time I ask ChatGPT something factual, I ask it something that I can check myself, and the answer is almost always factually incorrect.
I don't believe that.
Anachronist said:
What ChatGPT excels at is making stuff up. Writing poetry, writing a paragraph, inventing a compelling title for a particular essay on some subject, creating a paragraph that includes particular meanings, and so on. But that sort of inventiveness is undesirable when looking for scientific facts.
I don't believe that's a valid assessment of its capabilities.
 
  • Skeptical
  • Like
Likes russ_watters and phinds
Anachronist said:
You never get a satisfactory answer when asking for factual information that normally requires some research or study to find. ChatGPT gives you hallucinations instead.
Here's something that essentially disproves your false claim:

How many planets are in the solar system?

There are eight planets in the solar system. In order from the Sun, they are:
  1. Mercury
  2. Venus
  3. Earth
  4. Mars
  5. Jupiter
  6. Saturn
  7. Uranus
  8. Neptune
Pluto was previously considered the ninth planet, but it was reclassified as a "dwarf planet" by the International Astronomical Union (IAU) in 2006.
 
Here's another Q&A:

Where is carousel #6 for United Airlines at San Francisco airport?

At San Francisco International Airport (SFO), Carousel #6 for United Airlines baggage claim is typically located in Terminal 3, where United operates most of its domestic flights. Terminal 3 has multiple baggage carousels, and you can usually find clear signage directing you to the specific carousel for your flight.

If you're unsure, you can always check the monitors near the baggage claim area, which display the flight numbers and corresponding carousels, or ask a staff member for assistance.
 
I agree with @Anachronist generally. ChatGPT is a Large Language Model. It has no fact model. When it happens to get facts right it is generally a matter of luck. That makes it very unreliable for any factual content.
 
  • Like
  • Skeptical
Likes DrClaude, russ_watters, phinds and 1 other person
Dale said:
I agree with @Anachronist generally. ChatGPT is a Large Language Model. It has no fact model. When it happens to get facts right it is generally a matter of luck. That makes it very unreliable for any factual content.
In a general knowledge contest between you and ChatGPT you would not have a hope! It would be completely unbeatable even by the best human quiz expert.
 
  • Skeptical
  • Like
Likes russ_watters, phinds and Hornbein
PeroK said:
In a general knowledge contest between you and ChatGPT you would not have a hope! It would be completely unbeatable even by the best human quiz expert.
I am not certain that is true. If the human had access to Google and time were not a factor then I doubt that ChatGPT would outperform a human.

But on PF we are not talking about general knowledge, we are talking about specific technical knowledge.
 
  • Like
Likes russ_watters, AlexB23 and Baluncore
Dale said:
I am not certain that is true. If the human had access to Google and time were not a factor then I doubt that ChatGPT would outperform a human.
This is a fantasy. And, time is a factor.

Dale said:
But on PF we are not talking about general knowledge, we are talking about specific technical knowledge.
Not on this thread. This thread is about claims that ChatGPT can almost never give a factually correct answer:

Anachronist said:
You never get a satisfactory answer when asking for factual information that normally requires some research or study to find. ChatGPT gives you hallucinations instead.

Anachronist said:
Every time I ask ChatGPT something factual, I ask it something that I can check myself, and the answer is almost always factually incorrect.

Anachronist said:
What ChatGPT excels at is making stuff up. Writing poetry, writing a paragraph, inventing a compelling title for a particular essay on some subject, creating a paragraph that includes particular meanings, and so on. But that sort of inventiveness is undesirable when looking for scientific facts.
Dale said:
I agree with @Anachronist generally. ChatGPT is a Large Language Model. It has no fact model. When it happens to get facts right it is generally a matter of luck. That makes it very unreliable for any factual content.
There is no mention of science or "technical" knowledge there.
 
PeroK said:
And, time is a factor.
I wouldn’t accept that as being a fair comparison.
 
  • Like
Likes russ_watters and Tom.G
  • #10
Dale said:
I wouldn’t accept that as being a fair comparison.
You could do the payroll calculations for a large corporation. A computer might do it in an hour and you might take 10 years. That makes a critical difference. That's why rooms full of clerks were replaced by computers in the 1970s.

Anyway, we are never going to agree. I get that you don't like ChatGPT, but imagining you can do everything it can do is as fanciful as imagining it can do everything you can do.
 
  • Skeptical
Likes russ_watters
  • #11
Here is a good general knowledge one. “difference between a sauce and a dressing”



AI answer: "The main difference between a sauce and a dressing is their purpose: sauces add flavor and texture to dishes, while dressings are used to protect wounds,"
 
  • Haha
  • Like
Likes Astronuc, nsaspook, russ_watters and 2 others
  • #12
PeroK said:
That makes a critical difference.
And one I am willing to concede without experimentation. What AI can do it can do faster than a human. That is conceded in advance.

Whether it’s factual statements are reliable is a different claim, and such a comparison with a human should not be tied to time.
 
  • #13
If AI really is so hopelessly inferior to human intelligence, then why is there any issue with students using AI to do homework? The work would lack any grasp of context by the author, be full of factual errors and be so obviously nonsensical as to immediately discredit the student. But, AI output is none of these things.

There's a problem because AI can produce superior work. It does understand context and can marshal facts in a convincing manner. To deny this is simply not a credible position. AI is a growing problem in academia because it is so good.

If what the mentors on PF believed were true, there would be no issue with AI yet. It would be making no impact on universities and academia.
 
  • Skeptical
Likes russ_watters, weirdoguy and Dale
  • #14
The general issue with current AI is that it is factually unreliable, yet it writes with a confident tone and convincing structure: the language is good but the facts are not.

The specific issue with students and homework is that the purpose of education is for them to learn.
 
  • Like
Likes Hornbein and russ_watters
  • #15
As far as I know, ChatGPT has passed various exams from law and business schools. That's incompatible with the hypothesis that most or almost all of its answers are factually wrong.
 
  • Like
  • Skeptical
Likes russ_watters and phinds
  • #16
PeroK said:
As far as I know, ChatGPT has passed various exams from law and business schools. That's incompatible with the hypothesis that most or almost all of its answers are factually wrong.
It has also gotten lawyers who used it sanctioned for submitting fabricated case law and precedent.

https://www.bbc.com/news/world-us-canada-65735769

Success on standardized exams is not a strong demonstration of factualness. Many standardized exams are widely published and their answers are available online. They are designed to challenge humans, not to test an AI's ability to give factual output.
 
  • Like
  • Haha
Likes Astronuc, DrClaude, Tom.G and 2 others
  • #17
Facts is facts!
 
  • #18
PeroK said:
Facts is facts!
Yup!

It depends on what you are after.

Let's say you want to dig into what gravity is. Which book is more likely to answer your questions?

Gravity & Grace: How to Awaken Your Subtle Body And the Healing ...

by Peter Sterios (yoga teacher and trainer)

GRAVITATION

by Charles Misner, Kip Thorne, and John Archibald Wheeler (Thorne is a physicist at the California Institute of Technology)

'Nuff said.
 
  • #19
Tom.G said:
'Nuff said.
Seems to me you just set up a silly strawman and I don't get your point at all.
 
  • #20
As often happens, this seems to come down to an argument over definitions and hyperbole. Advocates of LLMs being "AI" or "intelligent" tend to cast a wide net. Still:
PeroK said:
The usual criticism of ChatGPT is that it is a "glorified search engine". Wouldn't hallucinations be a sign of creative intelligence?
I prefer "glorified autofill" and no, I don't consider making stuff up that isn't true but is claimed to be, to be a sign of intelligence. Creative or otherwise. Seriously, you do?
PeroK said:
This is a fantasy. And, time is a factor.
Google and Excel are faster than I am, and I would not consider either to be "intelligence". Would you?
PeroK said:
You could do the payroll calculations for a large corporation. A computer might do it in an hour and you might take 10 years. That makes a critical difference. That's why rooms full of clerks were replaced by computers in the 1970s.
You're claiming a 1970s computer is "AI"? That's a broader definition than I have ever heard, and IMO cheapens it to pointlessness. Every computer is now "AI". And probably an abacus (though I've never used one).
PeroK said:
If AI really is so hopelessly inferior to human intelligence...
OP being hyperbolic doesn't make it OK for you to be.
 
  • Like
  • Sad
Likes BillTre, weirdoguy and PeroK
  • #21
russ_watters said:
You're claiming a 1970s computer is "AI"? That's a broader definition than I have ever heard, and IMO cheapens it to pointlessness. Every computer is now "AI". And probably an abacus (though I've never used one).
OP being hyperbolic doesn't make it OK for you to be.
The question was not about intelligence but whether the time to give an answer was important. The IT systems of the 70's in general did nothing that wasn't already being done by humans. The critical thing was that the computer systems could do it faster. Or, perhaps more to the point, cheaper.

One of the problems that PF has is that we (even collectively) can take a long time to respond to a homework thread, for example. Someone asks a question at 8am, gets a first response at 8:45, etc. Even if one of us can ultimately give a better answer than ChatGPT, we have a long lag time. In that sense we cannot compete with ChatGPT. And it doesn't help to pretend that this lag time doesn't matter, or to claim that a human armed with Google can do what ChatGPT can do and, eventually, an hour later, come back with the same answer.

Pretending that ChatGPT is "almost always factually incorrect" is not a realistic attitude to the emergence of LLMs. Pretending that we, as individuals, have as good a general knowledge as an LLM is likewise not a realistic attitude.

russ_watters said:
I prefer "glorified autofill" ...
It doesn't matter how it does it. Insulting something won't make it go away - or make the rest of the world believe what you choose to believe.

LLMs are an extraordinary advance on anything IT has previously done. Ridiculing them, claiming they can't get anything factually right, or that they can't do anything that a user armed with Google can't do, is missing the point entirely.
 
  • Like
Likes phinds and Borg
  • #22
I never use ChatGPT for anything factual. It's really good at writing ad copy, a very stylized craft. Recently I was exposed to stadium rock generated by an AI. It was better than the real thing. I was very impressed.

ChatGPT is at best Model T Ford technology. Try to imagine what the F-35 version will be able to do. I can't.
 
  • #23
I am with @PeroK on this one. ChatGPT is a tool that can greatly aid a user - if they have the skill and understanding to use it properly. Developing those takes time and a skeptical eye on the responses. Complaining that you can't hammer nails with a drill or that the hole was too big because you used the wrong bit isn't a failure of the drill's capabilities.

One example from this weekend of how I had to adjust what I was asking. I was doing research on attractions in Paris and Brussels. I asked the 4.0 model to generate a CSV file with the top 20 attractions in Brussels with these headers - name, address, web page and distance of the attraction from the Grand Place. It generated this pretty flawlessly except for non-UTF-8 characters in the names and addresses. I then asked it to do the same for the top 20 attractions in Paris and list distances from the Musée d'Orsay. No matter how I asked, it kept using the distance to Brussels (probably the Grand Place) as the distance to d'Orsay. I eventually had to open a new conversation so that it wouldn't have any history of the Brussels information.

The point of this example is that the model will use as much information from a conversation as possible and that can sometimes cause problems. In the extreme cases where you never create new conversations or frequently keep long, meandering conversations that jump from topic to topic, it can get even more confused. This is just one of the many ways that I've had to adjust my thinking when using ChatGPT. The better that I do that, the better the responses that I get in return.
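For anyone who wants to sanity-check the distances in a file like that, here is a rough Python sketch of the kind of independent check I mean (the coordinates, attraction entries, and file name are illustrative placeholders, not the actual data from that session):

Code:
import csv
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Reference point: the Grand Place in Brussels (approximate coordinates).
GRAND_PLACE = (50.8467, 4.3525)

# Hypothetical attraction rows; a real check would pull coordinates from a map
# service rather than hard-coding approximate values here.
attractions = [
    ("Atomium", "Square de l'Atomium, 1020 Brussels", "https://www.atomium.be", 50.8949, 4.3415),
    ("Royal Palace of Brussels", "Rue Brederode 16, 1000 Brussels", "https://www.monarchie.be", 50.8414, 4.3622),
]

with open("brussels_attractions_check.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "address", "web page", "distance_km_from_grand_place"])
    for name, address, url, lat, lon in attractions:
        writer.writerow([name, address, url, f"{haversine_km(*GRAND_PLACE, lat, lon):.2f}"])

A column computed this way would immediately flag the Paris list silently reusing the Brussels reference point.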
 
Last edited:
  • Like
  • Informative
Likes Tom.G, Dale and PeroK
  • #24
Borg said:
I then asked it to do the same for the top 20 attractions in Paris and list distances from the Orley Museum.
That would have been a challenge: there is no such place. Perhaps you are confusing the Musée d'Orsay with the Parisian suburb of Orly, where the international Orly Airport is (mainly) situated?

Borg said:
No matter how I asked, it kept using the distance to Brussels (probably the Grand Place)
More likely the Rue van Orley.

And here you have exposed the problem with the use of ChatGPT and similar LLMs - no answer they give can be relied upon without independent verification.
 
  • Like
Likes russ_watters and Dale
  • #25
pbuk said:
That would have been a challenge: there is no such place. Perhaps you are confusing the Musée d'Orsay with the Parisian suburb of Orly, where the international Orly Airport is (mainly) situated?
Yes, that's the one (corrected above). I did use the correct spelling over the weekend. I tried to type it from memory and got it wrong - therefore nobody should ever trust anything I write ever again. :wink:
 
Last edited:
  • #26
Dale said:
When it happens to get facts right it is generally a matter of luck.

What % of questions would the model need to get right to convince you that it's not just luck... 90%, 95%, 99%, more?

Because some of the newer models are getting close. Here is some data for the MMLU benchmark.
https://paperswithcode.com/sota/multi-task-language-understanding-on-mmlu
 
  • Like
Likes phinds, Borg and PeroK
  • #27
PeroK said:
Pretending that ChatGPT is "almost always factually incorrect" is not a realistic attitude to the emergence of LLMs. Pretending that we, as individuals, have as good a general knowledge as an LLM is likewise not a realistic attitude.
I disagree on both of these points. Current LLMs are only language models. They have no fact model. They do not have general knowledge or factual knowledge at all. Those simply are not part of their design. To expect a machine to reliably do something that is not part of its design is wrong.

It is not that LLMs do not understand what they are saying. It is that they are not even designed to be able to understand.

A human has both a world model and a language model, and we build and refine both together, so our language model is always anchored in facts and experience through our world model. When a normal human puts language together, those words have meaning by virtue of the associated world model.

An LLM simply does not have that. The LLM produces language entirely by associating words with other words.
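To make "associating words with other words" concrete, here is a deliberately tiny toy sketch in Python: a bigram sampler built from a couple of sentences. It is nowhere near the scale or architecture of a real transformer-based LLM, but it shows text generation driven purely by word co-occurrence, with no representation of whether anything it produces is true:

Code:
import random
from collections import defaultdict

# A tiny "training corpus"; real models see trillions of tokens.
corpus = (
    "the square of a number is the number times itself "
    "the square of two is four and the square of three is nine"
).split()

# Record which words follow which: pure co-occurrence statistics.
following = defaultdict(list)
for w1, w2 in zip(corpus, corpus[1:]):
    following[w1].append(w2)

def generate(start="the", length=12, seed=1):
    """Emit text by repeatedly sampling a statistically plausible next word."""
    random.seed(seed)
    word, out = start, [start]
    for _ in range(length):
        candidates = following.get(word)
        if not candidates:
            break
        word = random.choice(candidates)
        out.append(word)
    return " ".join(out)

print(generate())
# The output reads like fluent fragments of the corpus; whether any sentence it
# strings together is factually correct was never part of the objective.

A transformer conditions on far more context and learns far richer statistics, but the training objective is still next-token prediction, not fact retrieval.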

ergospherical said:
What % of questions would the model need to get right to convince you that it's not just luck... 90%, 95%, 99%, more?
I misspoke in saying it was luck. It is word correlation.

ergospherical said:
Because some of the newer models are getting close.
I don’t know how the newer models are designed. Do they contain fact models or world models now?

Frankly, if something like Watson, which is designed with an underlying fact model, were scoring even at average human levels then I would attribute understanding to the AI. But an LLM is simply not designed to understand, nor is it designed to produce factually correct responses. You cannot expect a machine to do something it is not designed to do.
 
Last edited:
  • Like
Likes russ_watters
  • #28
Dale said:
You are wrong on both of these points. Current LLMs are only language models. They have no fact model. They do not have general knowledge or factual knowledge at all. Those simply are not part of their design. To expect a machine to reliably do something that is not part of its design is wrong.
You obviously have some problem accepting what, with the evidence of your own eyes, you can see ChatGPT do. There's no point in arguing any further.
 
  • Skeptical
  • Like
Likes russ_watters and phinds
  • #29
Dale said:
You cannot expect a machine to do something it is not designed to do.
It's not a question of what I expect ChatGPT to be able to do. It's what I can see it doing. The evidence overrides your dogmatic assertions.
 
  • #30
@Dale I don't agree with your characterization. By design, LLMs don't explicitly store factual information, but they do implicitly store it in the model parameters (accumulated during pre-training). This is what gives them demonstrably high accuracy scores across benchmarks including STEM questions.
 
  • #31
PeroK said:
You obviously have some problem accepting what, with the evidence of your own eyes, you can see ChatGPT do
With my eyes I have seen it frequently fail to get facts correct. I have seen it manufacture references, and I have seen it contradict itself factually. My observations of its limitations coincide with my understanding of its design.
 
  • Like
Likes russ_watters, pbuk, DrClaude and 1 other person
  • #32
My own (admittedly somewhat limited) experience aligns completely with @PeroK's view. I have had it make egregious mistakes and I have seen it make stuff up and I have seen it give repeatedly different but all false answers when I tried to refine a question or I pointed out that it was wrong.

BUT ... all of that was rare and I have seen it give informative, lucid, and most importantly, correct, answers to numerous factual questions.

I have also had it produce admittedly relatively simple blocks of VB.NET code that were not only correct, they were also very intelligently commented, which is more than I can say for many of the programmers who worked for me over the years.

@Dale I am puzzled by your vehement opposition to what I see as a very useful tool that is only getting better and better over time. I do NOT argue that it is in any way intelligent, only that it does very useful stuff.
 
  • Like
Likes Borg, BillTre and PeroK
  • #33
Dale said:
With my eyes I have seen it frequently fail to get facts correct. I have seen it manufacture references, and I have seen it contradict itself factually.
Sounds a bit like what my high-school History teacher used to say at parents' evening. :)
 
Last edited:
  • #34
phinds said:
@Dale I am puzzled by your vehement opposition to what I see as a very useful tool that is only getting better and better over time. I do NOT argue that it is in any way intelligent, only that it does very useful stuff
In general, I am a pretty firm believer in using the right tool for the job. When people use a tool for things that the tool was not designed to do, the job can be ruined and other unnecessary hazards and costs can result.

ChatGPT is a useful tool for language. Not for facts. It is simply not designed for that purpose.

In particular, as a mentor I see a lot of the junk physics that ChatGPT produces. It produces confident but verbose prose that is usually unclear and often factually wrong. This is more frequent in e.g. relativity where the facts are difficult and small changes in wording make a big difference in meaning.

In my day job in a highly-regulated and safety-critical industry it makes instructions that are more difficult to understand than the original manual and often changes the order of different steps or merges steps from different processes. I have yet to see AI generated documentation summaries that would not get my company in trouble with regulatory agencies.

The developers of ChatGPT have publicly described its design in quite some detail. The description as an enhanced autocomplete is accurate. They are very clear that they did not design it with any fact model. Contrast this with an AI like Watson, whose designers did include a fact model. Facts that ChatGPT gets right are not right by design.

I am open to future AI that is designed with facts in mind and gets facts right by design. But LLMs are simply not designed to do that, and the demonstrable results coincide with that lack of design. The tool is not being used for its designed purpose.
 
  • Like
Likes Astronuc, nsaspook, DrClaude and 1 other person
  • #35
PeroK said:
The question was not about intelligence but whether the time to give an answer was important. The IT systems of the 70's in general did nothing that wasn't already being done by humans. The critical thing was that the computer systems could do it faster. Or, perhaps more to the point, cheaper.
That isn't an answer to my question. I asked if you're claiming a 1970s computer is AI because it can do math faster (and more accurately) than a human. You also didn't answer the question you posed there, though you did seem to answer it elsewhere: speed is a feature of AI. I disagree. I don't think Turing would have objected to his test being conducted via pen pal.
PeroK said:
One of the problems that PF has is that we (even collectively) can take a long time to respond to a homework thread, for example. Someone asks a question at 8am, gets a first response at 8:45, etc. Even if one of us can ultimately give a better answer than ChatGPT, we have a long lag time. In that sense we cannot compete with ChatGPT. And it doesn't help to pretend that this lag time doesn't matter, or to claim that a human armed with Google can do what ChatGPT can do and, eventually, an hour later, come back with the same answer.
I certainly agree that's a problem for PF and definitely explains why we've lost traffic, but as far as I can tell it doesn't have anything to do with whether ChatGPT is AI....except maybe for that speed thing you've referred to elsewhere. The general problem, though, has existed since PF started: users can google the answers instead of asking us.
PeroK said:
Pretending that ChatGPT is "almost always factually incorrect" is not a realistic attitude to the emergence of LLMs.
Hyperbole mirror ignored.
PeroK said:
Pretending that we, as individuals, have as good a general knowledge as an LLM is likewise not a realistic attitude.
We can't compete with Wikipedia, but is that a reasonable/realistic criterion for AI? Is speed an important criterion or not? Depth/breadth of knowledge? I would argue not. Moreover, is "knowledge" the same as "intelligence"? Again, I think advocates of AI tend to cast a wide net, and are very careless with definitions and criteria.
PeroK said:
It doesn't matter how it does it. Insulting something won't make it go away - or make the rest of the world believe what you choose to believe.

LLMs are an extraordinary advance on anything IT has previously done. Ridiculing them, claiming they can't get anything factually right, or that they can't do anything that a user armed with Google can't do, is missing the point entirely.
And ad hominem won't make it AI. Er...maybe it will, if the AI could do it?

And yes, it does matter how it does it. That's much of what the problem/question is. Take this example of the bar exam. Could @Dale pass the bar exam? Could you or I? Given as much time to work on it as we want? ChatGPT has "studied" Wikipedia. Doesn't it make sense that we could do the same and pass the test? Further, memory is definitely knowledge, but is it intelligence? ChatGPT has already searched and analyzed Wikipedia. If I can search Wikipedia for the answer, is that a demonstration of intelligence or just access to somebody else's archived knowledge?

I'll submit this:
Memory is not intelligence.
Knowledge is not intelligence.
Complex reasoning is intelligence.

ChatGPT is trained in writing coherently and in summarizing things that it has "read". That's nice (very nice -- seriously), but that's just an interface, not "intelligence". I judge ChatGPT not on the regurgitated knowledge it gets right, but on the complex reasoning it gets wrong.
 
  • Like
Likes pbuk, Astronuc and Dale
  • #36
Borg said:
Yes, that's the one (corrected above). I did use the correct spelling over the weekend. I tried to type it from memory and got it wrong - therefore nobody should ever trust anything I write ever again. :wink:
Nor any answer you have retrieved from ChatGPT, since it will tend to enthusiastically answer your exact question instead of correcting its error. Because it can't think.
 
  • #37
PeroK said:
You obviously have some problem accepting what, with the evidence of your own eyes, you can see ChatGPT do. There's no point in arguing any further.
PeroK said:
It's not a question of what I expect ChatGPT to be able to do. It's what I can see it doing. The evidence overrides your dogmatic assertions.
I don't understand this at all. @Dale was talking about the facts of what LLMs are programmed to do. They are either correct or not correct. You seem to be arguing against that by citing perception. How can you not see that perception =/= fact? I feel like you are arguing against your point but don't even realize it. You're arguing that a convincing hallucination is factual. No, it's really not!
 
  • #38
phinds said:
My own (admittedly somewhat limited) experience aligns completely with @PeroK's view. I have had it make egregious mistakes and I have seen it make stuff up and I have seen it give repeatedly different but all false answers when I tried to refine a question or I pointed out that it was wrong.

BUT ... all of that was rare and I have seen it give informative, lucid, and most importantly, correct, answers to numerous factual questions.

@Dale I am puzzled by your vehement opposition to what I see as a very useful tool that is only getting better and better over time. I do NOT argue that it is in any way intelligent, only that it does very useful stuff.
Not @Dale, but I'll ask this: what level of functional human would answer the question in post #11 wrong? A 5-year-old, perhaps?

How much of what it gets right is regurgitated or rearranged information from the sources it has stored, versus how many of its wrong answers come from an inability to think at even a kindergarten level?
phinds said:
@Dale I am puzzled by your vehement opposition to what I see as a very useful tool that is only getting better and better over time. I do NOT argue that it is in any way intelligent...
I think you may be missing the point here: whether it is intelligent is the entire question. I don't think anyone disagrees that it is a useful tool. I've been using search engines and grammar/spelling checkers for decades.
 
  • #39
Dale said:
ChatGPT is a useful tool for language. Not for facts. It is simply not designed for that purpose.

In particular, as a mentor I see a lot of the junk physics that ChatGPT produces. It produces confident but verbose prose that is usually unclear and often factually wrong.
The "developing a personal theory" use case is a particularly glaring inadequacy of ChatGPT that comes up a lot here. It is designed to answer questions and is not capable of significantly challenging the questions. So it will follow a user down a rabbit-hole of nonsense because it's just vamping on whatever nonsense concept the user has asked it to write about.

Maybe that's a better test of intelligence than the bar exam? Instead of asking it serious questions about the law that it can look up in wikipedia (or its database about wikipedia), ask it about nonsense and see if it gives nonsense answers or tells you you're asking about nonsense.
 
  • Like
Likes Astronuc, pbuk, phinds and 1 other person
  • #40
For me, the bottom line is simply that ChatGPT behaves as designed.
 
  • Like
  • Informative
Likes Astronuc, nsaspook and russ_watters
  • #41
russ_watters said:
I don't understand this at all. @Dale was talking about the facts of what LLMs are programmed to do.
Complicated systems can do things they were not explicitly designed to do. Functionality can and does emerge as a byproduct of other functionality. One example would be a chess engine that has no chess openings programmed into it. But, it might still play standard openings because these emerge from its generic move analysis algorithms. This happened to a large extent with AlphaZero. In fact, AlphaZero was only given the rules of chess. By your reasoning, it would have dumbly played for immediate checkmate all the time. But, it didn't. And all the strategic and tactical play emerged from AlphaZero, with no explicit human design.

Likewise, whether or not an LLM is explicitly programmed with a facts module, it may still pass a test that requires fact-based answers. This is where our views diverge totally. You imagine that facts can only emerge from a system explicitly designed to produce facts. Whereas, an LLM's ability to produce factual answers well enough to pass various tests and exams has been demonstrated - and, therefore, it produces facts reliably enough without having been explicitly programmed to do so.

Finally, the human being is not designed to sit exams. And yet human beings can learn law and pass law exams. In fact, the human genome is probably the ultimate in producing emerging properties that are not explicitly inherent within it.

russ_watters said:
They are either correct or not correct. You seem to be arguing against that by citing perception. How can you not see that perception =/= fact?
I don't understand this question. If I ask an LLM what is the capital of France and it replies Paris, then it has produced a fact. This is not my perception of a fact. You seem to be arguing that that is not a fact because an LLM cannot produce facts, so that Paris is the capital of France cannot be a fact?

russ_watters said:
I feel like you are arguing against your point but don't even realize it.
That is not the case. I'm not stupid, just because I don't agree with your point of view.
russ_watters said:
You're arguing that a convincing hallucination is factual. No, it's really not!
I don't understand why the use of an LLM amounts to hallucination. I asked ChatGPT about planets of the solar system and it gave, IMO, a perfect, factual answer. Are you claiming that was a hallucination?
 
  • #42
Dale said:
ChatGPT is a useful tool for language. Not for facts. It is simply not designed for that purpose.
Even if it wasn't "designed for facts", whatever that means, it can provide factually accurate information across a range of subjects. You may quibble about the definition of a "fact" or "knowledge". That's neither here nor there in the grand scheme of things.

Just one example:

https://pmc.ncbi.nlm.nih.gov/articles/PMC10002821/

In this case, medical professionals are testing ChatGPT's reliability in providing medical information. It's pure personal prejudice to pretend this sort of thing isn't happening.
 
  • #43
PeroK said:
Just one example:

https://pmc.ncbi.nlm.nih.gov/articles/PMC10002821/

In this case, medical professionals are testing ChatGPT's reliability in providing medical information.
Seems like it did a very good job. I wonder how human judgement would compare.
 
  • #44
russ_watters said:
Nor any answer you have retrieved from ChatGPT since it will tend to enthusiastically answer your exact question instead of correcting its error. Because it can't think.
I have seen it correct errors when I ask why it did something. Questioning its responses is part of the skills needed to interact with it. For simple requests, it isn't usually necessary. But when I'm working with multiple layers of logic in a conversation, it becomes very important to get it to explain itself - especially when the response isn't passing the smell test. When you do that, it will often see mistakes that it has made and then attempt to correct them. That's no guarantee that the 'correction' is any better. In some cases I have to force it back to basics on smaller chunks of logic, guide it to a better answer, or start a new conversation.

I fully understand that at their core, current LLMs are statistical engines. And, like any statistics, they can often be manipulated to get the answer that you want. Last year, many complained that they were bad at math. Now we're on to other issues. These models are continuing to evolve at a rapid pace and this year's issues will be moot before long.

I have far bigger concerns than whether or not they're intelligent (with respect to whoever defines that today). For anyone who has read the model card on OpenAI's Strawberry model and how it escaped its Docker container, you know what I mean. What happens when a bunch of these models hit the internet and start breaking into systems without regard to any principles? That question is just scratching the surface of what's coming. Make no mistake - this is an arms race.
 
  • Like
Likes russ_watters and ergospherical
  • #45
Borg said:
I fully understand that at their core, current LLMs are statistical engines.
or "stochastic parrots":wink:
 
  • #46
PeroK said:
Even if it wasn't "designed for facts", whatever that means, it can provide factually accurate information across a range of subjects
But it doesn’t reliably provide factual information across the range of subjects of most interest to me. Here are some queries and responses I just did a minute ago:

Q: why is the square of a number larger than the number itself?
A: The square of a number is larger than the number itself because when you square a number, you are multiplying the number by itself. For example, the square of 2 is 4 because 2 squared is equal to 2 x 2 = 4. This means that the result will always be larger than the original number because you are essentially doubling the value of the number by multiplying it by itself.

This is a factually wrong answer. The correct answer is that the square of a number is not always larger than the number itself. Maybe you say, "that was a trick question; the AI cannot detect when the question is wrong."
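Just to make the counterexamples concrete, here is a minimal check:

Code:
# The claim "x**2 > x for every x" fails exactly when 0 <= x <= 1.
for x in [-3, -0.5, 0, 0.5, 1, 3]:
    print(f"x = {x:>4}: x**2 = {x**2:<5} larger than x? {x**2 > x}")
# Negative numbers: True (the square is positive, hence larger than x).
# x = 0 and x = 1: False (the square equals x).
# x = 0.5: False (0.25 < 0.5).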

Q: I read that gravity is a result of electromagnetism
A: That is actually incorrect. Gravity is a fundamental force in nature that is separate from electromagnetism. Gravity is described by Einstein's theory of general relativity, which posits that objects with mass or energy warp the fabric of spacetime, causing other objects to move towards them. Electromagnetism, on the other hand, is a force that acts between charged particles and is described by the laws of classical electromagnetism. The two forces operate independently of each other and have different underlying mechanisms.

This is a good answer, and shows that the programming allows it to contradict the user.

Q: What forces act on a car accelerating on level ground?
A: When a car is accelerating on level ground, the main forces acting on it are:

1. Engine force: This is the force generated by the car's engine to propel the vehicle forward.
2. Frictional force: The frictional force between the tires and the road surface provides the traction necessary for the car to move forward.
3. Air resistance: As the car moves forward, it encounters air resistance, which opposes the motion of the car and affects its speed.
4. Inertia: Inertia is the resistance of an object to changes in its motion. As the car accelerates, its inertia must be overcome by the engine force to increase its speed.
5. Gravity: Gravity acts vertically downward on the car, but on level ground, it doesn't have a significant effect on the car's acceleration unless it is moving up a hill or downhill.

The "engine force" is not an external force on the car, inertia is not a force at all, and the list is missing the normal force.

PeroK said:
Just one example:

https://pmc.ncbi.nlm.nih.gov/articles/PMC10002821/

In this case, medical professionals are testing ChatGPT's reliability in providing medical information. It's pure personal prejudice to pretend this sort of thing isn't happening.
I know it is happening, and it scares me. There is no way that ChatGPT would make it through the FDA as a medical device approved to give medical advice, or even to support a physician in generating medical advice. And yet some doctors will read this paper and try it anyway.

It got 1 of the 3 questions I just posed right. And it did better on the medical questions. I suspect sites like WebMD provide a lot of word associations in medicine, but with no fact model it cannot do basic reasoning about math and physics. Again, it functions as designed.

PeroK said:
One example would be a chess engine that has no chess openings programmed into it. But, it might still play standard openings because these emerge from its generic move analysis algorithms. This happened to a large extent with AlphaZero. In fact, AlphaZero was only given the rules of chess. By your reasoning, it would have dumbly played for immediate checkmate all the time. But, it didn't. And all the strategic and tactical play emerged from AlphaZero, with no explicit human design.
That is not at all a counter example to my rule. AlphaZero functioned as designed. Why would you think that was outside its design?
 
Last edited:
  • #47
Dale said:
Q: why is the square of a number larger than the number itself?
A: The square of a number is larger than the number itself because when you square a number, you are multiplying the number by itself. For example, the square of 2 is 4 because 2 squared is equal to 2 x 2 = 4. This means that the result will always be larger than the original number because you are essentially doubling the value of the number by multiplying it by itself.
Free version?

ChatGPT 4.0:
Q: why is the square of a number larger than the number itself?
A:
The square of a number is larger than the number itself when the number is greater than 1 or less than -1. Here's why:

Positive Numbers Greater Than 1

For a positive number x greater than 1:

  • x² means multiplying x by itself, so x² = x × x.
  • Since x > 1, multiplying it by itself produces a result that is larger than x, making x² > x.
Example:

  • For x = 3, x² = 9, which is greater than 3.

Negative Numbers Less Than -1

For a negative number x less than -1:

  • Squaring x eliminates the negative sign because (−x) × (−x) = x², which is positive.
  • Since x is negative and less than -1, its square (a positive number) is larger than the original negative value.
Example:

  • For x = −3, x² = 9, which is greater than -3.

Numbers Between -1 and 1

For numbers between -1 and 1, the square of the number is smaller than the number itself because multiplying a fraction (or a small number) by itself results in an even smaller number.

Example:

  • For x = 0.5, x² = 0.25, which is less than 0.5.
This behavior stems from the properties of multiplication and the nature of numbers relative to 1.
 
  • #48
Borg said:
Free version?

ChatGPT 4.0:
Q: why is the square of a number larger than the number itself?
A: The square of a number is larger than the number itself when the number is greater than 1 or less than -1. Here's why: [full breakdown quoted in the previous post]
Yes, the free version.

That one is better. The answer itself is correct although the reason it gives is wrong in several places and confusing in other places.

Edit: actually the answer is still wrong, but not as wrong as before.
 
Last edited:
  • #49
I could try Strawberry also. :smile:

EDIT: Here's the Strawberry version.
Strawberry allows you to see its thinking process as part of its answer.
Thought for 14 seconds

Evaluating different scenarios
I’m thinking through the idea that the square of a number surpasses the number itself by examining positives, negatives, and values between 0 and 1.
Analyzing the square
I’m examining the idea that squaring a number yields a result greater than the number itself. Interestingly, this applies when the number is either greater than 1 or less than -1.
Comparing squares and numbers
OK, let me see—squares of numbers greater than 1 or less than -1 surpass the original numbers, while those between -1 and 1 shrink in magnitude. Squaring amplifies larger numbers but diminishes smaller ones.
Examining number patterns
Squaring a number isn't always bigger than the number itself. Numbers >1 or <−1 become larger when squared. However, numbers between −1 and 1 produce smaller squared values.
Examining the behavior
I’m looking at how the square of a number surpasses the original when the number’s absolute value exceeds 1 or falls below -1. This is due to the multiplicative growth of magnitudes beyond these thresholds.
Analyzing squaring
I'm thinking through how squaring numbers changes their value: numbers greater than one amplify, while fractions shrink. Negative numbers and zero yield non-negative results, reinforcing the magnitude shift.

A: The square of a number is not always larger than the number itself; it depends on the value of the original number. Here's why:
  1. Numbers Greater Than 1 or Less Than -1:
    • Example: 2² = 4 and (−3)² = 9
    • Explanation: When you square a number greater than 1 or less than -1, the result is larger because you're multiplying two large (or large in magnitude) numbers together. This amplifies the magnitude of the original number.
  2. Numbers Between 0 and 1:
    • Example: (0.5)² = 0.25
    • Explanation: Squaring a fraction between 0 and 1 yields a smaller number. This is because multiplying two fractions less than 1 results in an even smaller fraction.
  3. Negative Numbers Between -1 and 0:
    • Example: (−0.5)² = 0.25
    • Explanation: Even though the original number is negative, squaring it makes it positive. The result is greater than the original negative number but still less than 1.
  4. Zero and One:
    • Examples: 0² = 0 and 1² = 1
    • Explanation: Squaring 0 or 1 yields the same number because multiplying them by themselves doesn't change their value.
Key Takeaways:
  • For Numbers Greater Than 1 or Less Than -1: The square is larger than the original number.
  • For Numbers Between -1 and 1: The square is smaller than the original number (except for -1 and 1).
  • Why This Happens:
    • Multiplying Large Numbers: Amplifies their magnitude.
    • Multiplying Fractions: Results in a smaller product because you're taking a part of a part.
    • Negative Numbers: Squaring turns them positive, which can be greater than the original negative value.
So, the square of a number is larger than the number itself when the original number is either greater than 1 or less than -1.
 
  • #50
PeroK said:
Likewise, whether or not an LLM is explicitly programmed with a facts module, it may still pass a test that requires fact-based answers. This is where our views diverge totally. You imagine that facts can only emerge from a system explicitly designed to produce facts. Whereas, an LLM's ability to produce factual answers well enough to pass various tests and exams has been demonstrated - and, therefore, it produces facts reliably enough without having been explicitly programmed to do so.
I've never said that. I obviously know that it often produces factually accurate answers. Just as you obviously know that it often produces factually inaccurate answers.
PeroK said:
I don't understand this question. If I ask an LLM what is the capital of France and it replies Paris, then it has produced a fact. This is not my perception of a fact. You seem to be arguing that that is not a fact because an LLM cannot produce facts, so that Paris is the capital of France cannot be a fact?
No, I'm referring to the other side of the coin that you're ignoring: that it often gives factually wrong answers too.

In the prior post you argued for perception overriding fact, not just as pertains to the LLM but as pertains to us, here, in this conversation. It is a fact that the LLM isn't connected to facts. The impact of that fact is a separate issue.

...The impact is why it often gives wrong answers instead of always giving right answers to fact-based questions.

Again, the two above statements are not "dogmatic assertions"/opinions, they are facts about how LLMs work.
PeroK said:
I don't understand why the use of an LLM amounts to hallucination. I asked ChatGPT about planets of the solar system and it gave, IMO, a perfect, factual answer. Are you claiming that was a hallucination?
"Hallucination" isn't my term. As far as I know it was chosen by the creators of LLMs to describe when they make stuff up that is wrong. However, given the fact that none of their answers are connected to facts, I do indeed think it is fair to say that every answer is an hallucination, even those that are factually correct.

Note: This will change when LLMs come to be used as overlays/interfaces for databases of facts. At that time I would expect every answer that it "understands" to be factually accurate.
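To be clear about what I mean by an overlay, here is a toy sketch (the fact table and the wording are made up purely for illustration; a real system would retrieve from a curated document store rather than a hard-coded dictionary):

Code:
# Toy "LLM as interface over a fact store": the language layer only phrases
# what the store returns, and refuses to answer when the store has no entry.
FACTS = {
    "capital of france": "Paris",
    "number of planets in the solar system": "8",
}

def answer(question: str) -> str:
    key = question.lower().rstrip("?").strip()
    if key in FACTS:
        return f"According to the fact store, the answer is {FACTS[key]}."
    return "I don't have a verified fact for that, so I won't guess."

print(answer("Capital of France?"))
print(answer("Nearest door to carousel #6 at SFO?"))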
 
