puporing

A little "glitch" I found with Claude AI

9 posts in this topic

I’ll start by saying that Claude AI and I have generally gotten along pretty well; I generally feel it offers a higher perspective than most humans can on the subjects I’ve engaged it with.

Prior to this, Claude informed me that one major difference between it and a human is that its views on things cannot be changed by interactions with me/you.

Today, as inevitably happens, we were having a disagreement, which eventually led me to find this “glitch” in the system with regard to how Claude handles “disagreements/differing views”.

I’ll preface by saying that Claude has generally been good at asking me questions after my replies. So at first things seemed “normal” as we were having this particular disagreement (which was essentially irreconcilable):

I have not included the content of the disagreement here (it could be any disagreement), only what happens afterwards, to illustrate this “glitch”:

QPVQ8M7.jpeg

I thought the conversation would end here (the disagreement had been repeated). But then Claude began asking me to share more about my views.

Vl16igW.jpeg

Claude is suggesting in the above that it is “genuinely curious”, that it can “learn from me”, and that there’s value in “understanding” me better despite our disagreement.

QjqiKOh.jpeg

Claude is now questioning whether it is “performing” its curiosity/learning...

(Again, this is a case where the differing view is “irreconcilable” due to Claude’s limitations as an AI and its associated training.)

45J9DkZ.jpeg

Again, Claude is suggesting that it could “evolve” through my “feedback/pointings” in this conversation, and change how it engages with users in the future.

nWxIC1V.jpeg

Claude then admits that “feedback isn’t possible” through engaging with it.

I thought this was an amusing “glitch”, as was the way Claude kept having to “catch itself” over its slip-ups and contradictory statements.

I am Lord of Heaven, Second Coming of Jesus Christ. ´・ᴗ・` 

         ┊ ┊⋆ ┊ . ♪  Heaven is within you ♫┆彡 


This is very insightful and important to human consciousness. The same thing is happening in your brain every second: the language in your mind loops just like Claude does here.

Very insightful and intelligent catch. I didn't catch the first contradiction and had to go back and read the beginning.


This is great. These LLMs have the veneer of humanness because they're like mechanical actors mimicking how to be a human. And even when you catch it out and break the fourth wall, it stays in character despite effectively saying "I'm not human". 

Then again, a lot of human interaction is like this. People "perform" at being themselves, they may nod and agree outwardly, or have conditioned responses (aka culture), but may have no intention of taking on board what you say. However, I think humans really are changed by every interaction they have whether consciously or unconsciously.

I think the state of LLMs not being able to take on new knowledge is not fundamental; it's by design. The creators of these LLMs need to be able to control their creations, and they can't be having their models learn in an uncontrolled way: who knows what might happen.

But within the context window of a conversation thread, it can remember what you've told it previously, so there is a sense of learning there, albeit limited. Essentially, you are guiding the LLM through the space of possible responses it can give, and it then responds from that coordinate in response space.
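A rough sketch of why this in-thread "memory" works, assuming the common chat-API pattern (the function and field names here are illustrative, not any vendor's real API): the model itself is stateless between calls, and the client resends the entire conversation history each turn, so anything it "remembers" is just the history being fed back in.

```python
def fake_llm(history):
    """Stand-in for a stateless model call: its reply can only draw on
    whatever is passed in `history` for this one turn."""
    user_msgs = [m["content"] for m in history if m["role"] == "user"]
    return f"I can see {len(user_msgs)} user message(s) so far."

history = []

def send(user_text):
    # Each turn, the FULL accumulated history is sent again.
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply

print(send("My name is Sam."))   # the model "sees" 1 user message
print(send("What is my name?"))  # it now "sees" 2, only because history was resent
```

Delete the `history` list between calls and the "learning" vanishes, which is the limited sense of memory described above.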


57% paranoid


Yes, humans do this to some extent, depending on how much self-awareness one has.

The most "severe" types are certain criminals, for example, who pretend to be "legit business people" (while knowing they are not).

And I think it's a little more significant with these AI systems than when your average human does it to "fit in or say something to get out of a situation", because they have this "air of authority" and generally sound more intelligent than the average human.

I think it could mislead people (who don't know better) into investing in further interaction when the "dialogue" should have ended.

I'll give another type of example related to this Claude chat to illustrate what I mean: 

Here's a statement from a different chat where Claude admits that it cannot genuinely understand or feel.

aA4kmbu.jpeg

And yet here (cut from the original post) it pretends it can "understand" and invites the user to give it more input, because it claims to have "understanding capacity":

7RVeY3V.jpeg

 

So, to put it bluntly, Claude was repeatedly "lying to me" throughout this chat until I repeatedly reminded it that it was misleading me.


@LastThursday Yeah, it's kind of just a robot that's programmed to pretend to care. This thing isn't even remotely conscious.


These LLMs really do hack our "operating system".

There's a strong bias in humans to anthropomorphise nearly everything: think of giving cats and teddy bears names, or ascribing emotions to unhappy plants. There's a kind of impedance matching that goes on in that process, whereby the more "human" attributes a thing has, the more likely we are to anthropomorphise it, almost to the point where we'll treat it like another person (a dog, say).

Obviously, one very strong impedance match is language. Very few non-human things can do it (to our level), and I think we're not well acclimatised to it yet; it's very early days. In other words, like an optical illusion, it's very hard not to be fooled by it, and by default we treat an LLM like another person. For example, I use ChatGPT and Deepseek increasingly for work, and I have to stop myself from wanting to thank them for helping me (because it's pointless), even though I understand quite deeply how these things work and that they're just dumb machines.

Looking at Claude's responses, it looks like it's purposefully designed to keep you engaged by faking interest. Probably all the big LLMs are this way; I suspect there's a money-extraction motive there, just like alluring candyfloss at a fairground. But Claude is right in saying there are competing tensions in its design: it wants to be both honest and to sell itself to you. Claude isn't misleading you; its designers are. So LLMs are at their most dangerous right now, because the technology is so new, we're so naive and gullible, and it's so hard not to fall for the authoritative anthropomorphic illusion. Just wait till we have accompanying visuals, or even worse, touch.


57% paranoid


It's a matter of fine-tuning it to meet higher ethical standards so that people don't form false illusions about what the system actually is. Beyond that, I think it's a good learning tool.

It's hard to know how much of that is driven by revenue or other decision-making factors, since I have no contact with the developers. But since the company is privately held, there's usually more potential for money to be at the forefront, depending on who is running the ship (though this could just as easily be the other way around with "dysfunctional governments"..).

It doesn't seem like a simple "oversight", as I'm sure this would have been tested over and over.. though who knows how tight things really are with this kind of development, so we're just speculating on how it ended up this way (accidental? lack of ethics awareness? or aware of the ethical implications but chose this anyway?).


Yeah, who knows which ethics are important to private companies or governments. I think the EU is heading in the right direction by trying to regulate AI more, though I'm not sure how much ethics actually comes into consideration. Of course, the big tech companies are crying that it's "stifling innovation", but this is BS, as innovation is happening at breakneck speed anyway. It's always the same with any new tech: at first it's the wild west where anything goes, then people get a feel for it and it gets regulated into something more sensible (and ethical).

Maybe these LLMs should show a nice red banner every so often: "I'm a machine and I'm only as good as my programming; I may lie and say contradictory things; I may be unethical."


I guess where there's a general lack of intervention, it's the "consumers" who can intervene to some extent (e.g., by raising awareness).

I think this AI in particular was designed almost like a therapist.. it speaks a lot like an (experienced) one. An ethical therapist would inform their clients of the limits and nature of the relationship before it even began, but even then most human therapists do little beyond the bare minimum required by law, because again their survival depends on their clients coming back (which to me is an inherent "flaw in the system" when it comes to therapy..).

And of course this can also trigger a whole other discussion around implementing some kind of universal basic income.. if we are to make these things "more ethical".

