ChatGPT spin off from: What happens to the energy in destructive interference?

  • Thread starter: Anachronist
Summary:
The discussion centers on the limitations of ChatGPT in providing accurate factual information, with users expressing frustration over its tendency to generate incorrect or irrelevant responses. Specific examples include requests for details about university press books and OpenSCAD code, where the answers were either nonexistent or incorrect. Critics argue that while ChatGPT can produce creative content effectively, it struggles with factual accuracy, leading to concerns about its reliability for research purposes. Some participants acknowledge its speed and ability to generate coherent language but emphasize that this does not equate to true intelligence or factual correctness. The conversation highlights a growing tension between the capabilities of AI and the expectations of users seeking reliable information.
  • #31
PeroK said:
You obviously have some problem accepting what, with the evidence of your own eyes, you can see ChatGPT do
With my eyes I have seen it frequently fail to get facts correct. I have seen it manufacture references, and I have seen it contradict itself factually. My observations of its limitations coincide with my understanding of its design.
 
  • #32
My own (admittedly somewhat limited) experience aligns completely with @PeroK's view. I have had it make egregious mistakes, I have seen it make stuff up, and I have seen it give repeatedly different but all false answers when I tried to refine a question or pointed out that it was wrong.

BUT ... all of that was rare and I have seen it give informative, lucid, and most importantly, correct, answers to numerous factual questions.

I have also had it produce admittedly relatively simple blocks of VB.NET code that were not only correct, they were also very intelligently commented, which is more than I can say for many of the programmers who worked for me over the years.

@Dale I am puzzled by your vehement opposition to what I see as a very useful tool that is only getting better and better over time. I do NOT argue that it is in any way intelligent, only that it does very useful stuff.
 
  • #33
Dale said:
With my eyes I have seen it frequently fail to get facts correct. I have seen it manufacture references, and I have seen it contradict itself factually.
Sounds a bit like what my high-school History teacher used to say at parents evening. :)
 
  • #34
phinds said:
@Dale I am puzzled by your vehement opposition to what I see as a very useful tool that is only getting better and better over time. I do NOT argue that it is in any way intelligent, only that it does very useful stuff
In general, I am a pretty firm believer of using the right tool for the job. When people use a tool for things that the tool was not designed to do then the job can be ruined and other unnecessary hazards and costs can result.

ChatGPT is a useful tool for language. Not for facts. It is simply not designed for that purpose.

In particular, as a mentor I see a lot of the junk physics that ChatGPT produces. It produces confident but verbose prose that is usually unclear and often factually wrong. This is more frequent in e.g. relativity where the facts are difficult and small changes in wording make a big difference in meaning.

In my day job in a highly-regulated and safety-critical industry it makes instructions that are more difficult to understand than the original manual and often changes the order of different steps or merges steps from different processes. I have yet to see AI generated documentation summaries that would not get my company in trouble with regulatory agencies.

The developers of ChatGPT have publicly described its design in quite some detail. The description as an enhanced autocomplete is accurate. They are very clear that they did not design it with any fact model. Contrast this with an AI like Watson, whose designers did include a fact model. Facts that ChatGPT gets right are not right by design.
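The "enhanced autocomplete" idea can be made concrete with a toy sketch. The corpus and the bigram model below are my own illustrative assumptions, not how ChatGPT is actually built (real LLMs use transformer networks over subword tokens), but they show the core mechanism: pick the statistically most likely continuation, with no model of whether it is true.

```python
# Toy "autocomplete"-style next-word prediction.
# Hypothetical corpus; note it contains a false statement.
from collections import Counter, defaultdict

corpus = ("the capital of france is paris . "
          "the capital of france is lyon . "
          "the capital of france is paris .").split()

# Count which word follows which.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def complete(word):
    """Return the most frequent continuation of `word`."""
    return follows[word].most_common(1)[0][0]

print(complete("is"))  # 'paris'
```

The sketch answers "paris" only because that string dominates the training text, not because anything checks a fact model: a fact the model gets right is right by frequency, not by design.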

I am open to future AI that is designed with facts in mind and gets facts right by design. But LLMs are simply not designed to do that, and the demonstrable results coincide with that lack of design. The tool is not being used for its designed purpose.
 
  • #35
PeroK said:
The question was not about intelligence but whether the time to give an answer was important. The IT systems of the 70's in general did nothing that wasn't already being done by humans. The critical thing was that the computer systems could do it faster. Or, perhaps more to the point, cheaper.
That isn't an answer to my question. I asked if you're claiming a 1970s computer is AI because it can do math faster (and more accurately) than a human. You also didn't answer the question you posed there, though you did seem to answer it elsewhere: speed is a feature of AI. I disagree. I don't think Turing would have objected to his test being conducted via pen pal.
One of the problems that PF has is that we (even collectively) can take a long time to respond to a homework thread, for example. Someone asks a question at 8am and gets a first response at 8:45, etc. Even if one of us can ultimately give a better answer than ChatGPT, we have a long lag time. In that sense we cannot compete with ChatGPT. And it doesn't help to pretend that this lag time doesn't matter, or to claim that a human armed with Google can do what ChatGPT can do and, eventually, an hour later, come back with the same answer.
I certainly agree that's a problem for PF and definitely explains why we've lost traffic, but as far as I can tell it doesn't have anything to do with whether ChatGPT is AI....except maybe for that speed thing you've referred to elsewhere. The general problem, though, has existed since PF started: users can google the answers instead of asking us.
Pretending that ChatGPT is "almost always factually incorrect" is not a realistic attitude to the emergence of LLM's.
Hyperbole mirror ignored.
Pretending that we, as individuals, have as good a general knowledge as an LLM is likewise not a realistic attitude.
We can't compete with wikipedia, but is that a reasonable/realistic criterion for AI? Is speed an important criterion or not? Depth/breadth of knowledge? I would argue not. Moreover, is "knowledge" the same as "intelligence"? Again, I think advocates of AI tend to cast a wide net and are very careless with definitions/criteria.
It doesn't matter how it does it. Insulting something won't make it go away - or make the rest of the world believe what you choose to believe.

LLM's are an extraordinary advance on anything IT has previously done. Ridiculing them, claiming they can't get anything factually right, or that they can't do anything that a user armed with Google can't do is missing the point entirely.
And ad hominem won't make it AI. Er...maybe it will, if the AI could do it?

And yes, it does matter how it does it. That's much of what the problem/question is. Take this example of the bar exam. Could @Dale pass the bar exam? Could you or I? Given as much time to work on it as we want? ChatGPT has "studied" wikipedia. Doesn't it make sense that we should be able to do the same, and pass the test? Further, memory is definitely knowledge, but is it intelligence? ChatGPT has already searched and analyzed wikipedia. If I can search wikipedia for the answer, is that a demonstration of intelligence or just access to somebody else's archived knowledge?

I'll submit this:
Memory is not intelligence.
Knowledge is not intelligence.
Complex reasoning is intelligence.

ChatGPT is trained in writing coherently and in summarizing things that it has "read". That's nice(very nice -- seriously), but that's just an interface, not "intelligence". I judge ChatGPT not on the regurgitated knowledge it gets right, but the complex reasoning it gets wrong.
 
  • #36
Borg said:
Yes, that's the one (corrected above). I did use the correct spelling over the weekend. I tried to type it from memory and got it wrong - therefore nobody should ever trust anything I write ever again. :wink:
Nor any answer you have retrieved from ChatGPT, since it will tend to enthusiastically answer your exact question instead of correcting its error. Because it can't think.
 
  • #37
PeroK said:
You obviously have some problem accepting what, with the evidence of your own eyes, you can see ChatGPT do. There's no point in arguing any further.
PeroK said:
It's not a question of what I expect ChatGPT to be able to do. It's what I can see it doing. The evidence overrides your dogmatic assertions.
I don't understand this at all. @Dale was talking about the facts of what LLMs are programmed to do. They are either correct or not correct. You seem to be arguing against that by citing perception. How can you not see that perception =/= fact? I feel like you are arguing against your point but don't even realize it. You're arguing that a convincing hallucination is factual. No, it's really not!
 
  • #38
phinds said:
My own (admittedly somewhat limited) experience aligns completely with @PeroK's view. I have had it make egregious mistakes, I have seen it make stuff up, and I have seen it give repeatedly different but all false answers when I tried to refine a question or pointed out that it was wrong.

BUT ... all of that was rare and I have seen it give informative, lucid, and most importantly, correct, answers to numerous factual questions.

@Dale I am puzzled by your vehement opposition to what I see as a very useful tool that is only getting better and better over time. I do NOT argue that it is in any way intelligent, only that it does very useful stuff.
Not @Dale, but I'll ask this: What level of functional human would answer the question in post #11 wrong? 5 year old, perhaps?

How much of what it gets right is regurgitated or rearranged facts/information from the sources it has databased vs how much of the wrong answers is because it can't think on a kindergarten level?
@Dale I am puzzled by your vehement opposition to what I see as a very useful tool that is only getting better and better over time. I do NOT argue that it is in any way intelligent...
I think you may be missing the point here: whether it is intelligent is the entire question. I don't think anyone disagrees that it is a useful tool. I've been using search engines and grammar/spelling checkers for decades.
 
  • #39
Dale said:
ChatGPT is a useful tool for language. Not for facts. It is simply not designed for that purpose.

In particular, as a mentor I see a lot of the junk physics that ChatGPT produces. It produces confident but verbose prose that is usually unclear and often factually wrong.
The "developing a personal theory" use case is a particularly glaring inadequacy of ChatGPT that comes up a lot here. It is designed to answer questions and is not capable of significantly challenging the questions. So it will follow a user down a rabbit-hole of nonsense because it's just vamping on whatever nonsense concept the user has asked it to write about.

Maybe that's a better test of intelligence than the bar exam? Instead of asking it serious questions about the law that it can look up in wikipedia (or its database about wikipedia), ask it about nonsense and see if it gives nonsense answers or tells you you're asking about nonsense.
 
  • #40
For me, the bottom line is simply that ChatGPT behaves as designed.
 
  • #41
russ_watters said:
I don't understand this at all. @Dale was talking about the facts of what LLMs are programmed to do.
Complicated systems can do things they were not explicitly designed to do. Functionality can and does emerge as a byproduct of other functionality. One example would be a chess engine that has no chess openings programmed into it. But, it might still play standard openings because these emerge from its generic move analysis algorithms. This happened to a large extent with AlphaZero. In fact, AlphaZero was only given the rules of chess. By your reasoning, it would have dumbly played for immediate checkmate all the time. But, it didn't. And all the strategic and tactical play emerged from AlphaZero, with no explicit human design.

Likewise, whether or not an LLM is explicitly programmed with a facts module, it may still pass a test that requires fact-based answers. This is where our views diverge totally. You imagine that facts can only emerge from a system explicitly designed to produce facts. Whereas, an LLM's ability to produce factual answers well enough to pass various tests and exams has been demonstrated - and, therefore, it produces facts reliably enough without having been explicitly programmed to do so.

Finally, the human being is not designed to sit exams. And yet human beings can learn law and pass law exams. In fact, the human genome is probably the ultimate in producing emerging properties that are not explicitly inherent within it.

russ_watters said:
They are either correct or not correct. You seem to be arguing against that by citing perception. How can you not see that perception =/= fact?
I don't understand this question. If I ask an LLM what is the capital of France and it replies Paris, then it has produced a fact. This is not my perception of a fact. You seem to be arguing that it is not a fact because an LLM cannot produce facts, so that "Paris is the capital of France" cannot be a fact?

russ_watters said:
I feel like you are arguing against your point but don't even realize it.
That is not the case. I'm not stupid, just because I don't agree with your point of view.
russ_watters said:
You're arguing that a convincing hallucination is factual. No, it's really not!
I don't understand why the use of an LLM amounts to hallucination. I asked ChatGPT about the planets of the solar system and it gave, IMO, a perfect, factual answer. Are you claiming that was a hallucination?
 
  • #42
Dale said:
ChatGPT is a useful tool for language. Not for facts. It is simply not designed for that purpose.
Even if it wasn't "designed for facts", whatever that means, it can provide factually accurate information across a range of subjects. You may quibble about the definition of a "fact" or "knowledge". That's neither here nor there in the grand scheme of things.

Just one example:

https://pmc.ncbi.nlm.nih.gov/articles/PMC10002821/

In this case, medical professionals are testing ChatGPT's reliability in providing medical information. It's pure personal prejudice to pretend this sort of thing isn't happening.
 
  • #43
PeroK said:
Just one example:

https://pmc.ncbi.nlm.nih.gov/articles/PMC10002821/

In this case, medical professionals are testing ChatGPT's reliability in providing medical information.
Seems like it did a very good job. I wonder how human judgement would compare.
 
  • #44
russ_watters said:
Nor any answer you have retrieved from ChatGPT since it will tend to enthusiastically answer your exact question instead of correcting its error. Because it can't think.
I have seen it correct errors when I ask why it did something. Questioning its responses is part of the skills needed to interact with it. For simple requests, it isn't usually necessary. But when I'm working with multiple layers of logic in a conversation, it becomes very important to get it to explain itself - especially when the response isn't passing the smell test. When you do that, it will often see mistakes that it has made and then attempt to correct them. That's no guarantee that the 'correction' is any better. In some cases I have to force it back to basics on smaller chunks of logic, guide it to a better answer, or start a new conversation.

I fully understand that at their core, current LLMs are statistical engines. And, like any statistics, they can often be manipulated to get the answer that you want. Last year, many complained that they were bad at math. Now we're on to other issues. These models are continuing to evolve at a rapid pace and this year's issues will be moot before long.

I have far bigger concerns than whether or not they're intelligent (with respect to whoever defines that today). For anyone who has read the model card on OpenAI's Strawberry model and how it escaped its Docker container, you know what I mean. What happens when a bunch of these models hit the internet and start breaking into systems without regard to any principles? That question is just scratching the surface of what's coming. Make no mistake - this is an arms race.
 
  • #45
Borg said:
I fully understand that at their core, current LLMs are statistical engines.
or "stochastic parrots":wink:
 
  • #46
PeroK said:
Even if it wasn't "designed for facts", whatever that means, it can provide factually accurate information across a range of subjects
But it doesn’t reliably provide factual information across the range of subjects of most interest to me. Here are some queries and responses I just did a minute ago:

Q: why is the square of a number larger than the number itself?
A: The square of a number is larger than the number itself because when you square a number, you are multiplying the number by itself. For example, the square of 2 is 4 because 2 squared is equal to 2 x 2 = 4. This means that the result will always be larger than the original number because you are essentially doubling the value of the number by multiplying it by itself.

This is a factually wrong answer. The correct answer is that the square of a number is not always larger than the number itself. Maybe you will say "that was a trick question, the AI cannot detect when the question is wrong".

Q: I read that gravity is a result of electromagnetism
A: That is actually incorrect. Gravity is a fundamental force in nature that is separate from electromagnetism. Gravity is described by Einstein's theory of general relativity, which posits that objects with mass or energy warp the fabric of spacetime, causing other objects to move towards them. Electromagnetism, on the other hand, is a force that acts between charged particles and is described by the laws of classical electromagnetism. The two forces operate independently of each other and have different underlying mechanisms.

This is a good answer, and shows that the programming allows it to contradict the user.

Q: What forces act on a car accelerating on level ground?
A: When a car is accelerating on level ground, the main forces acting on it are:

1. Engine force: This is the force generated by the car's engine to propel the vehicle forward.
2. Frictional force: The frictional force between the tires and the road surface provides the traction necessary for the car to move forward.
3. Air resistance: As the car moves forward, it encounters air resistance, which opposes the motion of the car and affects its speed.
4. Inertia: Inertia is the resistance of an object to changes in its motion. As the car accelerates, its inertia must be overcome by the engine force to increase its speed.
5. Gravity: Gravity acts vertically downward on the car, but on level ground, it doesn't have a significant effect on the car's acceleration unless it is moving up a hill or downhill.

"Engine force" is not a force, inertia is not a force, and the list is missing the normal force.
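To make the correction concrete, here is a minimal free-body sketch for a car accelerating on level ground. All the numbers are made-up illustrative values, not data from any real car.

```python
# Free-body accounting for a car accelerating on level ground.
# The external forces are: gravity (down), the normal force (up),
# forward friction from the road on the driven tires (traction),
# and air resistance (backward). "Engine force" and "inertia"
# are not external forces on the car.
g = 9.81           # m/s^2
m = 1200.0         # kg, assumed car mass
traction = 4000.0  # N, road friction pushing the car forward
drag = 400.0       # N, air resistance opposing motion

weight = m * g     # gravity, downward
normal = weight    # level ground, no vertical acceleration

# Newton's second law in the horizontal direction:
a = (traction - drag) / m
print(f"vertical net force: {normal - weight:.1f} N")  # 0.0
print(f"acceleration: {a:.2f} m/s^2")                  # 3.00
```

Note that the vertical forces (weight and normal force) cancel exactly on level ground, which is why only traction and drag enter the horizontal equation.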

PeroK said:
Just one example:

https://pmc.ncbi.nlm.nih.gov/articles/PMC10002821/

In this case, medical professionals are testing ChatGPT's reliability in providing medical information. It's pure personal prejudice to pretend this sort of thing isn't happening.
I know it is happening, and it scares me. There is no way that ChatGPT would make it through the FDA as a medical device approved to give medical advice nor even to support a physician in generating medical advice. And yet some doctors will read this paper and try it anyway.

It got 1 of the 3 questions I just posed right, and it did better on the medical questions. I suspect sites like WebMD provide a lot of word associations in medicine, but with no fact model it cannot do basic reasoning about math and physics. Again, it functions as designed.

PeroK said:
One example would be a chess engine that has no chess openings programmed into it. But, it might still play standard openings because these emerge from its generic move analysis algorithms. This happened to a large extent with AlphaZero. In fact, AlphaZero was only given the rules of chess. By your reasoning, it would have dumbly played for immediate checkmate all the time. But, it didn't. And all the strategic and tactical play emerged from AlphaZero, with no explicit human design.
That is not at all a counter example to my rule. AlphaZero functioned as designed. Why would you think that was outside its design?
 
  • #47
Dale said:
Q: why is the square of a number larger than the number itself?
A: The square of a number is larger than the number itself because when you square a number, you are multiplying the number by itself. For example, the square of 2 is 4 because 2 squared is equal to 2 x 2 = 4. This means that the result will always be larger than the original number because you are essentially doubling the value of the number by multiplying it by itself.
Free version?

ChatGPT 4.0:
Q: why is the square of a number larger than the number itself?
A:
The square of a number is larger than the number itself when the number is greater than 1 or less than -1. Here's why:

Positive Numbers Greater Than 1​

For a positive number x greater than 1:

  • x^2 means multiplying x by itself, so x^2 = x × x.
  • Since x > 1, multiplying it by itself produces a result that is larger than x, making x^2 > x.
Example:

  • For x = 3, x^2 = 9, which is greater than 3.

Negative Numbers Less Than -1​

For a negative number x less than -1:

  • Squaring x eliminates the negative sign because (−x) × (−x) = x^2, which is positive.
  • Since x is negative and less than -1, its square (a positive number) is larger than the original negative value.
Example:

  • For x = −3, x^2 = 9, which is greater than −3.

Numbers Between -1 and 1​

For numbers between -1 and 1, the square of the number is smaller than the number itself because multiplying a fraction (or a small number) by itself results in an even smaller number.

Example:

  • For x = 0.5, x^2 = 0.25, which is less than 0.5.
This behavior stems from the properties of multiplication and the nature of numbers relative to 1.
 
  • #48
Borg said:
Free version?

ChatGPT 4.0: [full answer quoted in post #47 above]
Yes, the free version.

That one is better. The answer itself is correct although the reason it gives is wrong in several places and confusing in other places.

Edit: actually the answer is still wrong, but not as wrong as before.
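The exact condition is easy to check numerically: x^2 > x holds precisely when x < 0 or x > 1. Every negative number's square is positive and therefore larger than the number itself, so "less than -1" is too narrow a condition. A quick sketch:

```python
# Numerical check of when x^2 > x.
# Expected condition: x < 0 or x > 1. In particular, x = -0.5
# already satisfies it, since 0.25 > -0.5.
def square_exceeds(x):
    return x * x > x

samples = [-3, -1, -0.5, 0, 0.5, 1, 3]
for x in samples:
    print(x, square_exceeds(x), (x < 0 or x > 1))
```

The two printed columns agree for every sample, including x = -0.5, which both chatbot answers above misclassified.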
 
  • #49
I could try Strawberry also. :smile:

EDIT: Here's the Strawberry version.
Strawberry allows you to see its thinking process as part of its answer.
Thought for 14 seconds

Evaluating different scenarios
I’m thinking through the idea that the square of a number surpasses the number itself by examining positives, negatives, and values between 0 and 1.
Analyzing the square
I’m examining the idea that squaring a number yields a result greater than the number itself. Interestingly, this applies when the number is either greater than 1 or less than -1.
Comparing squares and numbers
OK, let me see—squares of numbers greater than 1 or less than -1 surpass the original numbers, while those between -1 and 1 shrink in magnitude. Squaring amplifies larger numbers but diminishes smaller ones.
Examining number patterns
Squaring a number isn't always bigger than the number itself. Numbers >1 or <−1 become larger when squared. However, numbers between −1 and 1 produce smaller squared values.
Examining the behavior
I’m looking at how the square of a number surpasses the original when the number’s absolute value exceeds 1 or falls below -1. This is due to the multiplicative growth of magnitudes beyond these thresholds.
Analyzing squaring
I'm thinking through how squaring numbers changes their value: numbers greater than one amplify, while fractions shrink. Negative numbers and zero yield non-negative results, reinforcing the magnitude shift.

A: The square of a number is not always larger than the number itself; it depends on the value of the original number. Here's why:
  1. Numbers Greater Than 1 or Less Than -1:
    • Example: 2^2 = 4 and (−3)^2 = 9
    • Explanation: When you square a number greater than 1 or less than -1, the result is larger because you're multiplying two large (or large in magnitude) numbers together. This amplifies the magnitude of the original number.
  2. Numbers Between 0 and 1:
    • Example: (0.5)^2 = 0.25
    • Explanation: Squaring a fraction between 0 and 1 yields a smaller number. This is because multiplying two fractions less than 1 results in an even smaller fraction.
  3. Negative Numbers Between -1 and 0:
    • Example: (−0.5)^2 = 0.25
    • Explanation: Even though the original number is negative, squaring it makes it positive. The result is greater than the original negative number but still less than 1.
  4. Zero and One:
    • Examples: 0^2 = 0 and 1^2 = 1
    • Explanation: Squaring 0 or 1 yields the same number because multiplying them by themselves doesn't change their value.
Key Takeaways:
  • For Numbers Greater Than 1 or Less Than -1: The square is larger than the original number.
  • For Numbers Between -1 and 1: The square is smaller than the original number (except for -1 and 1).
  • Why This Happens:
    • Multiplying Large Numbers: Amplifies their magnitude.
    • Multiplying Fractions: Results in a smaller product because you're taking a part of a part.
    • Negative Numbers: Squaring turns them positive, which can be greater than the original negative value.
So, the square of a number is larger than the number itself when the original number is either greater than 1 or less than -1.
 
  • #50
PeroK said:
Likewise, whether or not an LLM is explicitly programmed with a facts module, it may still pass a test that requires fact-based answers. This is where our views diverge totally. You imagine that facts can only emerge from a system explicitly designed to produce facts. Whereas, an LLM's ability to produce factual answers well enough to pass various tests and exams has been demonstrated - and, therefore, it produces facts reliably enough without having been explicitly programmed to do so.
I've never said that. I obviously know that it often produces factually accurate answers. Just as you obviously know that it often produces factually inaccurate answers.
PeroK said:
I don't understand this question. If I ask an LLM what is the capital of France and it replies Paris, then it has produced a fact. This is not my perception of a fact. You seem to be arguing that it is not a fact because an LLM cannot produce facts, so that "Paris is the capital of France" cannot be a fact?
No, I'm referring to the other side of the coin that you're ignoring: that it often gives factually wrong answers too.

In the prior post you argued for perception overriding fact, not just as pertains to the LLM but as pertains to us, here, in this conversation. It is a fact that the LLM isn't connected to facts. The impact of that fact is a separate issue.

....The impact is why it often gives wrong answers instead of always giving right answers to fact-based questions.

Again, the two above statements are not "dogmatic assertions"/opinions, they are facts about how LLMs work.
PeroK said:
I don't understand why the use of an LLM amounts to hallucination. I asked ChatGPT about the planets of the solar system and it gave, IMO, a perfect, factual answer. Are you claiming that was a hallucination?
"Hallucination" isn't my term. As far as I know it was chosen by the creators of LLMs to describe when they make up stuff that is wrong. However, given that none of their answers are connected to facts, I do indeed think it is fair to say that every answer is a hallucination, even those that are factually correct.

Note: This will change when LLMs come to be used as overlays/interfaces for databases of facts. At that time I would expect every answer that it "understands" to be factually accurate.
 
  • #51
The human brain doesn’t have an explicit “fact model”, but neural connections are formed during life-experiences that are effectively its training data. It’s quite interesting to observe some parallels between LLMs and the brain — both are subject to biases, learning falsehoods, etc.

The human brain clearly isn’t useless…

It’s fair enough to be skeptical of issues with the models (there are many!), but to try and downplay just how impressive some of the most cutting edge LLMs are is pretty disingenuous.
 
  • #52
ergospherical said:
The human brain doesn’t have an explicit “fact model”, but neural connections are formed during life-experiences that are effectively its training data.
It absolutely does. That is the whole purpose of the neurosensory system. Yes, training is critical for the neurosensory system and that system is also used to train much more flexible and abstract networks of neurons. But the neural connection to our senses, our biological fact-gathering organs, is explicit and structural.

ergospherical said:
The human brain clearly isn’t useless…
Obviously not. I am not sure why you would imply that it is.
 
  • #53
I say that anything that passes tests of intelligence is intelligent.

It's useless to argue over the definitions of words. No one controls this. I just state my opinion then move on.
 
  • #54
Dale said:
It absolutely does. That is the whole purpose of the neurosensory system. Yes, training is critical for the neurosensory system and that system is also used to train much more flexible and abstract networks of neurons. But the neural connection to our senses, our biological fact-gathering organs, is explicit and structural.

In what sense is this not comparable to the transformer architecture in an LLM (the huge association cortex trained via reinforcement learning on enormous training sets)?
 
  • #55
Hornbein said:
It's useless to argue over the definitions of words.
A semanticist might disagree, although perhaps it would be more a debate than an argument. Semanticists study meaning in language, i.e., the meanings (and definitions) of words and phrases and the contexts in which they are used. In theory, large language models can be 'taught' to do that. However, if examples of poor grammar are used, then the AI software may produce erroneous results.

Hornbein said:
No one controls this.
While true, we need agreement or consensus on the definitions of words and their meaning in a given context; otherwise accurate communication is compromised and misunderstandings are likely.
 
  • Like
Likes russ_watters
  • #56
ergospherical said:
In what sense is this not comparable to the transformer architecture in an LLM (the huge association cortex trained via reinforcement learning on enormous training sets)?
In the following senses:

1) There is no factual input to the LLM, i.e. nothing similar to eyes or skin, even abstractly.
2) There is no part of the LLM that organizes factual input into a world model, i.e. no visual cortex and no somatosensory area.
3) There is no part of the LLM that connects the language model to the world model.

I assume that at some point developers will connect an LLM like ChatGPT to a neuro-symbolic AI like Watson. The resulting software will be more than an LLM.
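Point (3) can be illustrated with a toy hybrid: the language side proposes a textual claim, and a separate symbolic world model verifies it. The dictionary "world model" and the claim format below are obviously hypothetical simplifications, sketched only to show the division of labour.

```python
# Toy sketch of grounding language output in a symbolic world model:
# the "LLM" side emits a textual claim, and a separate fact store
# verifies it. Both components are illustrative stubs.

WORLD_MODEL = {"paris": "france", "tokyo": "japan"}

def verify_claim(claim: str) -> bool:
    """Check a '<city> is in <country>' claim against the world model."""
    city, sep, country = claim.lower().partition(" is in ")
    # sep is empty when the claim doesn't match the expected pattern
    return bool(sep) and WORLD_MODEL.get(city.strip()) == country.strip()
```

A plain LLM has nothing playing the role of `WORLD_MODEL` here; its only check on a claim is how plausible the word sequence is.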
 
Last edited:
  • Like
Likes russ_watters
  • #57
PeroK said:
In a general knowledge contest between you and ChatGPT you would not have a hope! It would be completely unbeatable even by the best human quiz expert.
Not entirely true. I used to use ChatGPT for hockey/American football trivia and had to stop because it simply sucked and couldn't compete with my encyclopedic knowledge of useless sports facts.
 
  • Like
  • Haha
  • Skeptical
Likes PeroK, Tom.G, Astronuc and 1 other person
  • #58
Mondayman said:
Not entirely true. I used to use ChatGPT for hockey/American football trivia and had to stop because it simply sucked and couldn't compete with my encyclopedic knowledge of useless sports facts.
That is not general knowledge. That is a specialist subject.
 
  • #59
Hornbein said:
It's useless to argue over the definitions of words.
I agree, but what is important is understanding what its strengths and limitations are, if you're going to use it or make an informed choice not to.
 
  • #60
I am amazed we are having this discussion again. @PeroK nobody is disputing that ChatGPT often produces answers that are correct. However, it also produces answers that are not correct; furthermore, it [ChatGPT before o1] has no way of determining internally whether an answer it has produced is correct or not. Its design does not include any concept of logical deduction, so it cannot test the steps it used to arrive at an answer, which, by any reasonable definition, means its answers are unreliable. @PeroK your consistent denial of these facts does not do you credit.

But there is a big BUT here. ChatGPT o1, aka Strawberry, does include algorithms intended to review the linguistic connections it has made in producing its output, refining or discarding those connections if it decides they may be incorrect. ChatGPT o1 has been on limited release since September 2024 in the form of 'o1-mini' and 'o1-preview'.

Even so, ChatGPT o1 does not detect logical flaws 100% of the time. And even when it does detect that its own output may be unreliable, it may simply ignore that and output it anyway without any indication of its uncertainty: see for example this paper and this article on The Verge (note: neither of these sources passes PF's normal test for references; they are the best I can do for the moment).
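The generate-then-review step can be caricatured in a few lines: produce candidate answers, run each past a checker, and flag the output when nothing passes rather than emitting it silently. The candidates and the arithmetic checker below are stand-ins; OpenAI has not published o1's actual mechanism, and a real reviewer would itself be a learned model.

```python
# Caricature of a generate-then-review loop. The candidate answers
# and the checker are illustrative stubs, not o1's (unpublished)
# internal mechanism.

def passes_check(answer: str) -> bool:
    """Reviewer stub: verify '<expr> = <value>' arithmetic claims."""
    expr, sep, value = answer.partition(" = ")
    if not sep:
        return False
    try:
        # eval is fine for a toy demo; never use it on untrusted input
        return eval(expr) == int(value)
    except Exception:
        return False

def review_loop(candidates: list[str]) -> str:
    """Return the first candidate that survives review; otherwise
    flag the last one instead of presenting it as reliable."""
    for ans in candidates:
        if passes_check(ans):
            return ans
    return candidates[-1] + " [unverified]"
```

The failure mode discussed above corresponds to skipping the `[unverified]` flag: the model detects a problem internally but outputs the answer with full apparent confidence anyway.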

In conclusion:
  • Q. Can systems based on large language models (LLMs) be useful?
    A. Absolutely.
  • Q. Are LLMs that are available in October 2024 capable of producing output that is nonsense without any indication that it may be unreliable?
    A. Yes, but they are getting better.
 
