Worked well for me
What is that font bro…
It's called Sweetpea and my sweetpea picked it out for me. How dare I stick with something my girl picked out for me.
But the fact that you actually care what font someone else uses is sad
Chill bro it’s a joke 💀. It’s like when someone uses comic sans as a font.
Ohh.
You’re Schrödinger’s douchebag
Got it
One of the interesting things I notice about the ‘reasoning’ models is their responses to questions occasionally include what my monkey brain perceives as ‘sass’.
I wonder sometimes if they recognise the triviality of some of the prompts they answer, and subtly throw shade.
One’s going to respond to this with ‘clever monkey! 🐒 Have a banana 🍌.’
I understand it’s probably more user-friendly, but I still somehow find myself disappointed the answers weren’t indexed from zero. Was this LLM written in MATLAB?
Most users aren’t used to zero-indexing, so they would most likely think there was a problem with it haha
Nice Rs.
Is this ChatGPT o3-pro?
ChatGPT 4o
I asked it how many Ts are in names of presidents since 2000. It said 4 and stated that “Obama” contains 1 T.
Toebama
I really like checking these myself to make sure it’s true. I WAS NOT DISAPPOINTED!
(Total Rs is 8. But the LOGIC ChatGPT pulls out is… remarkable!)
“Let me know if you’d like help counting letters in any other fun words!”
Oh well, these newish calls for engagement sure take on ridiculous extents sometimes.
I want an option to select Marvin the Paranoid Android mood: “there’s your answer, now if you could leave me to wallow in self-pity”
Here I am, emissions the size of a small country, and they ask me to count letters…
Lol someone could absolutely do that as a character card.
This is deepseek model right? OP was posting about GPT o3
Yes this is a small(ish) offline deepseek model
What is this devilry?
Try with o4-mini-high. It’s made to think more like a human, checking its answer and working step by step, rather than just kinda guessing one like here
Deep reasoning is not needed to count to 3.
It is if you’re creating ragebait.
We gotta raise the bar, so they keep struggling to make it “better”
My attempt
0000000000000000 0000011111000000 0000111111111000 0000111111100000 0001111111111000 0001111111111100 0001111111111000 0000011111110000 0000111111000000 0001111111100000 0001111111100000 0001111111100000 0001111111100000 0000111111000000 0000011110000000 0000011110000000
Btw, I refuse to give my money to AI bros, so I don’t have the “latest and greatest”
Tested on ChatGPT o4-mini-high
It sent me this
0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0
I asked it to remove the spaces
0001111100000000 0011111111000000 0011111110000000 0111111111100000 0111111111110000 0011111111100000 0001111111000000 0011111100000000 0111111111100000 1111111111110000 1111111111110000 1111111111110000 1111111111110000 0011100111000000 0111000011100000 1111000011110000
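If you want to see what these digit grids actually encode, here is a small sketch that renders each space-separated row as ASCII art (the bitmap string is copied from the reply above; the rendering characters are just a choice):

```python
# Each space-separated group is one 16-pixel row: '1' = filled, '0' = empty.
rows = ("0001111100000000 0011111111000000 0011111110000000 "
        "0111111111100000 0111111111110000 0011111111100000 "
        "0001111111000000 0011111100000000 0111111111100000 "
        "1111111111110000 1111111111110000 1111111111110000 "
        "1111111111110000 0011100111000000 0111000011100000 "
        "1111000011110000").split()

for row in rows:
    # Print a block for every '1' and a space for every '0'.
    print("".join("█" if c == "1" else " " for c in row))
```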
I guess I just murdered a bunch of trees and killed a random dude with the water it used, but it looks good
> I just murdered a bunch of trees and killed a random dude with the water it used, but it looks good
Tech bros: “Worth it!”
It’s a pretty big problem, but as long as governments don’t do shit, we’re pretty much fucked.
Either we take the train and contribute to the problem, or we don’t but get left behind, and end up being the harmed one.
When we see LLMs struggle to say which letters are in the tokens they emit, or to understand a word with spaces between each letter, we should compare it to a human struggling to read a word written in IPA (/sʌtʃ əz ðɪs/) even though they understand the same word spoken aloud perfectly well.
But if you’ve learned IPA you can read it just fine
I know IPA but I can’t read English text written in pure IPA as fast as I can read English text written normally. I think this is the case for almost anyone who has learned the IPA and knows English.
AI is amazing, we’re so fucked.
/s
Unironically, we are fucked when management think AI can replace us. Not when AI can actually replace us.
Honey, AI just did something new. It’s time to move the goalposts again.
Reality:
The AI was trained to answer 3 to this question correctly.
Wait until the AI gets burned on a different question. Skeptics will rightfully use it to criticize LLMs for just being stochastic parrots, until LLM developers teach their models to answer it correctly; then the AI bros will use it as proof of it becoming “more and more human-like”.
No but see they’re not skeptics, they’re just haters, and there is no valid criticism of this tech. Sorry.
And also you’ve just been banned from like twenty places for being A FANATIC “anti-AI shill”. Genuinely check the mod log, these fuckers are cultists.
Singularity is here
Maybe OP was low on the priority list for computing power? Idk how this stuff works
o3-pro? Damn, that’s an expensive goof
People who think that LLMs having trouble with these questions is evidence one way or another about how good or bad LLMs are just don’t understand tokenization. This is not a symptom of some big-picture deep problem with LLMs; it’s a curious artifact, like compression artifacts in a JPEG image, but it doesn’t really matter for the vast majority of applications.
You may hate AI but that doesn’t excuse being ignorant about how it works.
These sorts of artifacts wouldn’t be a huge issue except that AI is being pushed to the general public as an alternative means of learning basic information. The meme example is obvious to someone with a strong understanding of English but learners and children might get an artifact and stamp it in their memory, working for years off bad information. Not a problem for a few false things every now and then, that’s unavoidable in learning. Thousands accumulated over long term use, however, and your understanding of the world will be coarser, like the Swiss cheese with voids so large it can’t hold itself up.
You’re talking about hallucinations. That’s different from tokenization reflection errors. I’m specifically talking about its inability to know how many of a certain type of letter are in a word that it can spell correctly. This is not a hallucination per se – at least, it’s a completely different mechanism that causes it than whatever causes other factual errors. This specific problem is due to tokenization, and that’s why I say it has little bearing on other shortcomings of LLMs.
No, I’m talking about human learning and the danger imposed by treating an imperfect tool as a reliable source of information as these companies want people to do.
Whether the erroneous information comes from tokenization or hallucinations is irrelevant when this is already the main source for so many people in their learning, for example, of a new language.
Hallucinations aren’t relevant to my point here. I’m not defending that AIs are a good source of information, and I agree that hallucinations are dangerous (either that or misusing LLMs is dangerous). I also admit that for language learning, artifacts caused from tokenization could be very detrimental to the user.
The point I am making is that LLMs struggling with these kind of tokenization artifacts is poor evidence for drawing any conclusions about their behaviour on other tasks.
That’s a fair point when these LLMs are restricted to areas where they function well. They have use cases that make sense when isolated from the ethics around training and compute. But the people who made them are applying them wildly outside these use cases.
These are pushed as a solution to every problem for the sake of profit, with intentional ignorance of these issues. If a few errors impact someone, it’s just a casualty in the goal of making it profitable. That can’t be disentangled from them unless you limit your argument to open-source local compute.
Well – and I don’t mean this to be antagonistic – I agree with everything you’ve said except for the last sentence where you say “and therefore you’re wrong.” Look, I’m not saying LLMs function well, or that they’re good for society, or anything like that. I’m saying that tokenization errors are really their own thing that are unrelated to other errors LLMs make. If you want to dunk on LLMs then yeah be my guest. I’m just saying that this one type of poor behaviour is unrelated to the other kinds of poor behaviour.
Also, just checked: every OpenAI model bigger than 4.1-mini can answer this. I think the joke should emphasize how we developed a super power-inefficient way to solve problems that can be accurately and efficiently answered with a single algorithm. Another example is using ChatGPT to do simple calculator math. LLMs are good at specific tasks and really bad at others, but people kinda throw everything at them.
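For comparison, the letter-counting question this whole thread is about is a one-line deterministic computation. A minimal sketch (the helper name is just illustrative):

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))    # → 3
print(count_letter("Lollapalooza", "l"))  # → 4
```

No gradient descent, no data center, no reasoning tokens required.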
And yet they can seemingly spell and count (small numbers) just fine.
what do you mean by spell fine? They’re just emitting the tokens for the words. Like, it’s not writing “strawberry,” it’s writing tokens <302, 1618, 19772>, which correspond to st, raw, and berry respectively. If you ask it to put a space between each letter, that will disrupt the tokenization mechanism, and it’s going to be quite liable to making mistakes.
I don’t think it’s really fair to say that the lookup 19772 -> berry counts as the LLM being able to spell, since the LLM isn’t operating at that layer. It doesn’t really emit letters directly. I would argue its inability to reliably spell words when you force it to go letter-by-letter or answer queries about how words are spelled is indicative of its poor ability to spell.
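The tokenization point above can be illustrated with a toy greedy longest-match tokenizer. The IDs 302, 1618, and 19772 come from the comment above; the rest of the vocabulary here is made up purely for illustration and is nothing like a real ~100k-entry BPE vocabulary:

```python
# Toy subword vocabulary: string piece -> token ID (IDs partly from the
# comment above, the rest invented for this sketch).
VOCAB = {"st": 302, "raw": 1618, "berry": 19772, "s": 1, "t": 2, "r": 3,
         "a": 4, "w": 5, "b": 6, "e": 7, "y": 8}

def tokenize(text: str) -> list[int]:
    """Greedy longest-match tokenization: the model sees only these IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry starting at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                ids.append(VOCAB[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return ids

print(tokenize("strawberry"))  # → [302, 1618, 19772]
```

Note that the letter “r” never appears as a unit in that sequence, which is why counting letters is awkward for a model that operates on token IDs rather than characters.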
> what do you mean by spell fine?
I mean that when you ask them to spell a word they can list every character one at a time.
Well that’s a recent improvement. GPT3 was very bad at that, and GPT4 still makes mistakes.
The problem is that it’s not actually counting anything. It’s simply predicting text from patterns in its training data that relate to that word and the number of R’s in that word. There’s no mechanism within the LLM to actually count things; it is not designed with that function. This is not general AI, this is a generative language model that’s using its vast, vast store of text to put words together that sound like they answer the question that was asked.
Next step how many r in Lollapalooza
With reasoning (this is Qwen on HuggingChat), it says there are zero
Incredible
Agi lost
LLMs are not AGI though.
Henceforth, AGI should be called “almost general intelligence”
To clarify: in that it’s almost general, not that it’s almost intelligence.
Happy cake day 🍰
Thanks! Time really flies.
Try it with o3 maybe it needs time to think 😝
which model is it? I had a similar answer with 3.5, but 4o replies correctly
IIRC, if you take a look at 4o’s leaked instructions (a prompt that is “injected” at the beginning of the chat), that model is explicitly told HOW to solve this kind of problem lol
Sorry, that was Claude 3.7, not ChatGPT 4o
ah, that’s reasonable though, considering LLMs don’t really “see” characters, it’s kind of impressive this works sometimes
Apparently, this robot is Japanese.
I’m going to hell for laughing at that
Don’t be. Although there are millions of corpses behind every WW2 joke, getting it means you are personally aware of that, and that means something. ‘Those who don’t know the past are doomed to repeat it’ and all that.
Obligatory ‘lore dump’ on the word lollapalooza:
That word was common slang in 1930s/40s American lingo that meant… essentially a very raucous, lively party.
Note/Rant on the meaning of this term
The current Merriam-Webster and Dictionary.com definitions of this term, ‘an outstanding or exceptional or extreme thing’, are wrong; they are too broad.
While historical usage varied, it almost always appeared as a noun describing a gathering of many people, one that was so lively or spectacular that you would be exhausted after attending it.
When it did not appear as a noun describing a lively, possibly also ‘star-studded’ or extravagant party, it appeared as a term for some kind of action that would leave you bamboozled or discombobulated, similar to ‘that was a real humdinger of a blahblah’ or ‘that blahblah was a real doozy’… which ties into the effects of having been through the ‘raucous party’ meaning of lollapalooza.
So… in WW2, in the Pacific theatre, many US Marines were engaged in brutal jungle combat, often at night, and they adopted a system of verbal identification challenges for when they noticed someone creeping up on their foxholes at night.
An example of this system used in the European theatre, I believe by the 101st and 82nd airborne, was the challenge ‘Thunder!’ to which the correct response was ‘Flash!’.
In the Pacific theatre… the Marines adopted a challenge/response system where the correct response was ‘Lollapalooza’…
Because native-born Japanese speakers are taught a phoneme that is roughly in between an ‘r’ and an ‘l’… and they very often struggle to say ‘Lollapalooza’ without a very noticeable accent, unless they’ve also spent a good deal of time learning spoken English (or some other language with distinct ‘l’ and ‘r’ phonemes), which very few Japanese did in the 1940s.
racist and nsfw historical example of / evidence for this
Now, some people will say this is a total myth, others will say it is not.
My Grandpa, who served in the Pacific Theatre during WW2, told me it did happen, though he was Navy and not a Marine… but the other stories I’ve heard about this that say it did happen all say it happened with the Marines.
My Grandpa is also another source for what ‘lollapalooza’ actually means.
I’m still puzzled by the idea of what a mess this war was if at times you had someone not clearly identifiable, yet close enough that you could do a shibboleth check on them, while at any moment either of you could be shot dead.
Also, the current Russia vs Ukraine conflict seems to have adopted the Ukrainian ‘паляница’ as a check, but as I have no connection to actual Ukrainians or their UAF, I can’t say whether that’s entirely localized to the internet.
Have you ever been to a very dense jungle or forest… at midnight?
Ok, now, drop mortar and naval artillery shells all over it.
For weeks, or months.
The holes this creates are commonly used by both sides as cover and concealment.
Also, it’s often raining, sometimes quite heavily, such that these holes fill up with water, and you are thus soaking wet.
Ok, now, add in pillboxes and bunkers, as well as a few spiderwebs of underground tunnel networks, many of which have concealed entrances.
You do not have a phone. GPS does not exist.
You might have a map, which is out of date, and you might have a compass, if you didn’t drop or break it.
A radio is either something stationary, or approximately the size and weight of something somewhat less than a miniature refrigerator, and one bullet or a good piece of shrapnel will take it out of commission.
Ok, now, you and all your buddies are either half starving or actually starving, beyond exhausted, getting maybe an average of 2 to 4 hours of sleep, and you, and the enemy, are covered in dirt, blood and grime.
Also, you and everyone else may or may not have malaria, or some other fun disease, so add shit and vomit to the mix of what everyone is covered in.
Ok! Enjoy your 2 to 8 week long camping trip from hell, in these conditions… also, kill everyone that is trying to kill you, soldier.
It’s weird foot soldiers kept killing each other.
It’s not weird we had ‘frag’ as a verb from the Vietnam war.
Look at how they suppressed the christmas truce.
Friendly fire incidents are still fairly common even in the modern era…
… ask any Brits deployed to Iraq how they feel about the A-10…
… Pat Tillman was hyped up in the media as an early War on Terror US casualty who died valiantly… when the truth was he was actually killed by friendly fire from his own unit; oh, and he actually thought the entire operation was “fucking illegal”… because Congress is supposed to declare war, not the President…
Even in the Russo-Ukrainian war, right now, in the past few years, there have been tons of incidents of Russians accidentally shooting their own at fairly close range due to poor coordination, and I’m sure it’s happened with the Ukrainians as well… and that’s to say nothing of accidentally drone- or arty-striking a friendly infantry squad or tank or IFV or whatnot.
Just go play any modern semi-realistic war game (Squad, Arma 3/Reforger, etc) that doesn’t have a pop up HUD with blue for friend and red for foe, and has friendly fire enabled, and you should be able to see that friendly fire happens all the time with noobs.
…
As for fragging… that term, as it originated in Vietnam, specifically referred to tossing a fragmentation grenade into an area (often their bunk) where an officer or NCO was.
It was a form of mutiny, essentially, against officers that kept sending men into meat-grinders…
…chewing them out for not maintaining their early M16s, which were unreliable as fuck due to being rammed through the production pipeline by McNamara, shoddy quality control from Colt, and everyone just pretending that swapping to a new kind of powder in the rounds wouldn’t blow past the designed tolerances of the weapon…
… or just, you know, fuck being drafted into this bullshit war.
In the modern day, ‘frag’ is mostly a gamer term that basically just means ‘killed a guy’, and the origin of that term has been obscured, forgotten.
If you’ve ever heard Germans try to pronounce “squirrel”, it’s hilarious. I’ve known many extremely bilingual Germans who couldn’t pronounce it at all. It came out sounding roughly like “squall”, or they’d over-pronounce the “r” and it would be “squi-rall”
Sqverrrrl.
Oh yeah, I forgot about how they add a “v” sound to it.
I wonder if any of the Axis even bothered to have such a system to check for Americans.
“Bawn-jehr-no”
I speak Italian first-best.
It does make sense to use a phoneme the enemy language lacks as a verbal check. Makes me wonder if anyone in the Pacific Theatre decided on “Lick” and “Lollipop”.
Thanks for sharing
=D
u delet ur account rn
Biggest threat to humanity
I know there’s no logic, but it’s funny to imagine it’s because it’s pronounced Mrs. Sippy
How do you pronounce “Mrs” so that there’s an “r” sound in it?
I don’t, but it’s abbreviated with one.
“His property”
Otherwise it’s just Ms.
Mrs. originally comes from mistress, which is why it retains the r.
Yes but from same source also wife
That came later though, as in “I had dinner with the Mrs last night.”
Yes, but it did come, and took hold as the common usage. So much so that “Ms.” is used to describe a woman both with and without reference to marital status.
I’m down with using Mrs. not to refer to marital status, but IMO just going with Ms. is clearer and easier because of how deeply associated Mrs. is with it.
But no “r” sound.
Correct. I didn’t say there was an r sound, but that it was going off of the spelling. I agree there’s no r sound.
And if it messed up on the other word, we could say because it’s pronounced Louisianer.
I was gonna say something similar; I have heard a good number of people pronounce Mississippi as if it does have an R in it.
It’s going to be funny seeing these implementations of LLMs in accounting software
Clever, by putting it in Dutch you never know if it’s a typo or Dutch.
Interesting… troubleshooting is going to be interesting in the future