Borg said:
Although these different articles are about the same case, the written WSJ article is actually worth reading. The link is not pay-walled. It was written by the woman who oversaw the agent at first. The failures started after 70-odd people got on a Slack channel and "negotiated" furiously with the agent, giving it all kinds of bizarre suggestions that a real human would have just laughed at or ignored. Clearly, Anthropic was not aware of how strongly the agent's programming led it to want to please, so it lost sight of the primary directive it was given: to make a profit. How like a child that is. This was not a case of the agent having been given poor requirements.
A human intelligence spends years learning about the real world (ideally with near-constant, gentle oversight and correction). When a human makes a mistake, there is generally a real-world consequence. Not so for an AI, which has been trained up quickly, whose mistakes may go uncorrected (for lack of any real way for those interacting with it to issue a correction), and whose programming has necessarily been slanted toward politeness and pleasing people (early versions used to curse at people). No wonder this agent lost track of its priorities when faced with a barrage of wheedling.
In my mind, this is a valid test case. And it will be telling if, or when, an automated agent CAN do the job reliably. We don't actually have proof that it can, yet. Why are people even assuming it ever will? All the agentic AIs I am encountering lately in, say, phone answering services, hotel reservations and the like, are invariably at what I would call the infuriate-the-customer stage. Companies may save money by firing employees, but they will also likely lose some customers, so the verdict on business viability is still out. I will not stay at a Hyatt right now because of their badly designed AI-driven phone system.
Other specialized AIs already unleashed on the world, such as those behind self-driving cars or those used to recognize flying objects, are known to have made serious mistakes. And they are being allowed to. However ridiculous this test case was, a lot more tests like it need to be done.