The image is taken from Zhihu, a Chinese Quora-like site.
The prompt asks for a design for a certain app, and the response seems to suggest some pages, so it doesn’t really reflect the text.
But this in general aligns with my experience coding with LLMs. I was trying to upgrade my ESLint from 8 to 9, asked ChatGPT to convert my ESLint config file, and it proceeded to spit out complete garbage.
I thought this would be a good task for an LLM because ESLint config is very common and well documented, and the transformation is very mechanical, but it just couldn’t do it. So I went and read the docs myself and finished the migration in a couple of hours…
I asked ChatGPT for help with bare-metal 32-bit ARM (for the Pi Zero W) C/ASM, emulated in QEMU for testing, and after the third iteration of “use printf for output” -> “there’s no printf with bare metal as the target” -> “use solution X” -> “doesn’t work” -> “use printf for output” … I had enough.
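(For anyone stuck in the same loop: the boring real answer on bare metal is to do the output yourself by poking the UART registers. Here’s a minimal sketch, assuming the BCM2835’s PL011 UART at its documented 0x20201000 base and that the UART is already usable, which QEMU’s raspi0 machine gives you out of the box; on real hardware you’d also need clock and pin-mux setup, an ASM startup stub, and a linker script, none of which are shown, and the entry point name is just my placeholder.)

```c
/* Minimal bare-metal "print" with no libc: write bytes straight to the
 * PL011 UART data register, polling the flag register for space.
 * Base address and offsets are from the BCM2835 peripherals datasheet. */
#include <stdint.h>

#define UART0_BASE 0x20201000u                                  /* PL011 on BCM2835   */
#define UART0_DR   (*(volatile uint32_t *)(UART0_BASE + 0x00))  /* data register      */
#define UART0_FR   (*(volatile uint32_t *)(UART0_BASE + 0x18))  /* flag register      */
#define FR_TXFF    (1u << 5)                                    /* transmit FIFO full */

static void uart_putc(char c)
{
    while (UART0_FR & FR_TXFF)
        ;                          /* spin until there is room in the TX FIFO */
    UART0_DR = (uint32_t)c;
}

static void uart_puts(const char *s)
{
    while (*s)
        uart_putc(*s++);
}

void kernel_main(void)             /* entered from a small ASM stub (not shown) */
{
    uart_puts("hello from bare metal\r\n");
    for (;;)
        ;                          /* bare metal: nothing to return to */
}
```

Under QEMU that’s enough to get text on the console with something like `qemu-system-arm -M raspi0 -serial stdio -kernel kernel.img` (exact flags vary by QEMU version), and a real printf is then just a formatting layer on top of uart_putc.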
Sounds like it’s perfectly replicated the help forums it was trained on.
I used ChatGPT to help me make a package with SUSE’s Open Build Service. It was actually quite good. I was pulling my hair out for a while until I noticed that the project I wanted to build had changed URLs and I was using an outdated one.
In the end I just had to get one last detail right. And then my ChatGPT 4 allowance dried up and they dropped me back down to 3 and it couldn’t do anything. So I had to use my own brain, ugh.
ChatGPT is the worst of the big chatbots at writing code. In my experience: DeepSeek > Perplexity > Gemini > Claude.
It’s pretty random in terms of what is or isn’t doable.
For me it’s a big performance booster because I genuinely suck at coding and don’t do too much complex stuff. As a “clean up my syntax” and a “what am I missing here” tool it helps, or at least helps in figuring out what I’m doing wrong so I can look in the right place for the correct answer on something that seemed inscrutable at a glance. I certainly can do some things with a local LLM I couldn’t do without one (or at least without getting berated by some online dick who doesn’t think he has time to give you an answer but sure has time to set you on a path towards self-discovery).
How much of a benefit it is for a professional I couldn’t say. I mean, definitely not a replacement. Maybe helping you read something old or poorly commented quickly? Repetitive tasks in very commonplace mainstream languages?
I don’t think it’s useless, but if you ask it to do something by itself you can’t trust that it’ll work without significant additional effort.
Big Beautiful Code
Code that does not work is just text.
No, the spell just fizzled. In my experience it happens far less often if you start with an Abracadabra and end it with an Alakazam!
I’ve never thought of it that way. I’m going to add copywriter to my resume.
Maybe fiction writer as well
To be fair, if I wrote 3000 new lines of code in one shot, it probably wouldn’t run either.
LLMs are good for simple bits of logic under around 200 lines of code, or things that are strictly boilerplate. People who are trying to force them to do things beyond that are just being silly.
You managed to get an AI to do 200 lines of code and it actually compiled?
Uh yeah, like all the time. Anyone who says otherwise really hasn’t tried recently. I know it’s a meme that AI can’t code (and in many cases that’s still true, e.g. I don’t have the AI do anything with OpenCV or complex math) but it’s very routine these days for common use cases like web development.
AI code is specifically annoying because it looks like it would work, but it’s just plausible bullshit.
And that’s what happens when you spend a trillion dollars on an autocomplete: amazing at making things look like whatever it’s imitating, but with zero understanding of why the original looked that way.
I mean, there are about a billion ways it’s been shown to have actual coherent originality at this point, and so it must have understanding of some kind. That’s how I know I and other humans have understanding, after all.
What it’s not is aligned to care about anything other than making plausible-looking text.
Coherent originality does not point to the machine’s understanding; the human is the one capable of finding a result coherent and weighting their program to produce more results in that vein.
Your brain does not function in the same way as an artificial neural network, nor are they even in the same neighborhood of capability. John Carmack estimates the brain to be four orders of magnitude more efficient in its thinking; Andrej Karpathy says six.
And none of these tech companies even pretend that they’ve invented a caring machine that they just haven’t inspired yet. Don’t ascribe further moral and intellectual capabilities to server racks than do the people who advertise them.
> Coherent originality does not point to the machine’s understanding; the human is the one capable of finding a result coherent and weighting their program to produce more results in that vein.
You got the “originality” part there, right? I’m talking about tasks that never came close to being in the training data. Would you like me to link some of the research?
> Your brain does not function in the same way as an artificial neural network, nor are they even in the same neighborhood of capability. John Carmack estimates the brain to be four orders of magnitude more efficient in its thinking; Andrej Karpathy says six.
Given that both biological and artificial neural nets vary by orders of magnitude in size, that means pretty little. It’s true that one is based on continuous floats and the other on dynamic spikes, but the end result is often remarkably similar in function and behavior.
If you would like to link some abstracts you find in a DuckDuckGo search that’s fine.
I actually was going to link the same one I always do, which I think I heard about through a blog or talk. If that’s not good enough, it’s easy to devise your own test and put it to an LLM. The way you phrased that makes it sound like you’re more interested in ignoring any empirical evidence, though.
That’s unreal. No, you cannot come up with your own scientific test to determine a language model’s capacity for understanding. You don’t even have access to the “thinking” side of the LLM.