AI Megathread

CrunkLord420 · Friday at 13:06

I'm not even going to try to list out the number of significant features Opus has added to my engine. Also it made it like 50%+ faster. It's not magic. I knew all the things I needed to do, they were all just incredibly laborious and fiddly. Now Opus can just one-shot most of it.

The Gay Oboma Creature · Friday at 14:00

CrunkLord420 said:
I'm not even going to try to list out the number of significant features Opus has added to my engine. Also it made it like 50%+ faster. It's not magic. I knew all the things I needed to do, they were all just incredibly laborious and fiddly. Now Opus can just one-shot most of it.

AI gamedev? Probably a godsend for artistic types who are good at music and art but refuse to learn2code. Even so, I do think a little bit is lost there, since tuning little things here and there in the code to get a desired effect is a huge part of making a great game, and LLMs often botch the small, hard-to-notice things that give a game the feeling of a labor of love.

Definitely good for rapid prototyping things, though, when you have an idea and want to know if a basic demo is fun.

SchizoDaemon · Friday at 15:49

CrunkLord420 said:
It's not magic. I knew all the things I needed to do, they were all just incredibly laborious and fiddly. Now Opus can just one-shot most of it.

This is the spot I am in at work. I often know exactly what needs to be done. But it is normally extremely tedious and often cannot be scripted.

The Gay Oboma Creature said:
Even so, I do think a little bit is lost there, since tuning little things here and there in the code to get a desired effect is a huge part of making a great game, and LLMs often botch the small, hard-to-notice things that give a game the feeling of a labor of love.

It is really good at spotting lots of little things and allowing you to iterate quickly. It is more like a power multiplier if you already know what needs to be done.

AmpleApricots · Saturday at 15:55

AI assistance always works best if you outline exactly what you want the AI to do and how. Then you can tell it to iterate and change something if it's not good. It is an immense timesaver. Since AI got better in the last few years I have a lot of small programs and scripts I could've written myself easily, but just never bothered because of how tedious it would've been. Real QoL stuff. The other day I had it write an actually working ebuild (Gentoo) for BasiliskII (old Mac emulator) and write two patches for another piece of software where apparently the combination of USE flags I used wasn't really considered by the maintainer and uncovered a bug. I had to do almost nothing for the latter except provide the failed build log and check whether the AI's theory of what went wrong actually made sense. Again, I could have done this myself and these were trivial problems, but this saved me real, actual time and it just accumulates. I also simply couldn't have done it that fast. In the time it would've taken me to write the command for creating the diff alone, the AI had already written the two required patches by itself.

The only real downside (for me personally) is that AI sucks at writing idiomatic Lisp. It basically writes Python with a lot of parentheses. You'd think that's because there probably isn't a lot of Lisp in the datasets, but somehow the same models are surprisingly good at Smalltalk, which makes Lisp look like a mainstream language. Generally, current, bigger models are somehow all really good at Smalltalk. Either it's a language that naturally aligns to LLMs somehow or there are some big and secret codebases out there that ended up in the datasets. This is a bit apropos of nothing but I wanted to mention it somewhere because of how odd it is.

The Gay Oboma Creature · Saturday at 16:26

AmpleApricots said:
Generally current, bigger models are somehow all really good at smalltalk. Either it's a language that naturally aligns to LLMs somehow or there are some big and secret codebases out there that ended up in the datasets. This is a bit apropos of nothing but I wanted to mention it somewhere because of how odd it is.

I think this is about transference. Smalltalk is a simple OOP language, whereas LISP is unique and weird. If LLMs can't write good LISP (try prompting them mentioning the issue and telling them to do it the right way), that'd be a sign that they're not as smart as people say.

DavidS877 · Yesterday at 18:30

Well, my home AI is doing great.
Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 1.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 10.7%, Prefix cache hit rate: 91.2%
Woo, 1.8 tokens per second. I am not actually running it on a Commodore 64. I may have a bit more tuning I need to do. Gemma 4, 31b on 4x Intel B70.
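
For anyone who wants to track numbers like these while tuning, here is a minimal sketch that pulls the fields out of a stats line shaped like the one quoted above and turns the generation rate into a wait time. The function names and the regex are hypothetical helpers written against that one quoted line only, not part of any serving framework, and other log formats may need a different pattern.

```python
import re

# Sample copied from the post above; the field names come from that line only.
SAMPLE = (
    "Engine 000: Avg prompt throughput: 0.0 tokens/s, "
    "Avg generation throughput: 1.8 tokens/s, Running: 1 reqs, "
    "Waiting: 0 reqs, GPU KV cache usage: 10.7%, Prefix cache hit rate: 91.2%"
)

# Matches "<name>: <number> <unit>" where unit is tokens/s, reqs, or %.
FIELD = re.compile(r"([A-Za-z][A-Za-z ]*?):\s*([\d.]+)\s*(tokens/s|reqs|%)")


def parse_engine_stats(line: str) -> dict[str, float]:
    """Return a mapping of field name to numeric value (units dropped)."""
    return {name.strip(): float(value) for name, value, _unit in FIELD.findall(line)}


def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Time to generate `tokens` at a steady rate, ignoring prompt processing."""
    return tokens / tokens_per_second


stats = parse_engine_stats(SAMPLE)
rate = stats["Avg generation throughput"]
print(f"{rate} tokens/s means a 500-token reply takes {seconds_for(500, rate):.0f} s")
```

Logging the parsed values after each settings change makes it easier to see which change actually moved the generation rate.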

Post Reply · Yesterday at 18:59

DavidS877 said:
Well, my home AI is doing great.
Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 1.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 10.7%, Prefix cache hit rate: 91.2%
Woo, 1.8 tokens per second. I am not actually running it on a Commodore 64. I may have a bit more tuning I need to do. Gemma 4, 31b on 4x Intel B70.

Oof. I debated buying a B60 to throw in my NAS, but from what I'm reading the software support just isn't there yet. Are you using vLLM?
I can squeeze Gemma 4 31B Q3M on my 9070 and get like 12t/s. It can't do anything well at that quantization though.

macrodegenerate · Yesterday at 19:57

DavidS877 said:
Well, my home AI is doing great.
Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 1.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 10.7%, Prefix cache hit rate: 91.2%
Woo, 1.8 tokens per second. I am not actually running it on a Commodore 64. I may have a bit more tuning I need to do. Gemma 4, 31b on 4x Intel B70.

Okay, so what platform are you serving your LLM from? Ollama, KoboldCpp?

The Gay Oboma Creature · Yesterday at 20:02

Thoughts on this: https://news.ycombinator.com/item?id=48636377?

Seems like people are getting 5.2 at home, now. I could see 5.2 abliterated being a neat service to offer.

DavidS877 · Yesterday at 20:12

macrodegenerate said:
Okay, so what platform are you serving your LLM from? Ollama, KoboldCpp?

Post Reply said:
Oof. I debated buying a B60 to throw in my NAS, but from what I'm reading the software support just isn't there yet. Are you using vLLM?
I can squeeze Gemma 4 31B Q3M on my 9070 and get like 12t/s. It can't do anything well at that quantization though.

It's Llama.cpp or vLLM or Intel's LLM Scaler. Intel's official LLM Scaler though is months behind and doesn't understand newer models. vLLM works fine after you spend 3-4 weeks trying to find the exact settings that work, but slowly. Llama.cpp works more consistently but is slow.
The software does work eventually. I think single cards work better. I use my 4 cards for image generation with Comfy UI sometimes, but that's 4 independent workstreams, not any of the magical MultiGPU nodes.
GPT-OSS-120B runs fine at "full size" which is actually MXFP4. It's pretty quick but still seems dumb from time to time.
I tried Step3.7-Flash at Q4 which was very quick, but also seemed to not be great at code at that quantization. I also have a Mistral4Small at Q5 I want to try and a Minimax M3 at Q3.
But all of them I want to figure out why I'm seeing such weird errors first.

The Gay Oboma Creature said:
Thoughts on this: https://news.ycombinator.com/item?id=48636377?

Seems like people are getting 5.2 at home, now. I could see 5.2 abliterated being a neat service to offer.

256GB RAM

I'll get right on that. The only reason my current system even has 128GB RAM was that it was purchased in the before-times. Also I can't imagine how useful a 1-bit quant is going to be. Maybe if I can get the simple models more stable I'll try it since I do technically have 256GB total between VRAM and RAM.
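
As a rough way to sanity-check whether a given quant could fit in a memory budget like the 256GB mentioned above, the weights alone take about parameters × bits per weight ÷ 8 bytes. The sketch below is only that arithmetic. The helper names are hypothetical, the bits-per-weight figures are illustrative assumptions (real quant formats carry extra scale data), and KV cache and runtime overhead are deliberately ignored, so treat the results as a floor.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal GB."""
    # 1e9 params * bits / 8 bits-per-byte / 1e9 bytes-per-GB simplifies to this.
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, budget_gb: float) -> bool:
    """True if the weights alone fit in the budget (no KV cache, no overhead)."""
    return weight_gb(params_billion, bits_per_weight) <= budget_gb


# Parameter counts are read off the model names in the thread;
# the bit widths are assumed round numbers for illustration.
for name, params, bits in [("31B at 4-bit", 31, 4), ("120B at 4-bit", 120, 4)]:
    print(f"{name}: ~{weight_gb(params, bits):.1f} GB, fits in 256 GB: {fits(params, bits, 256)}")
```

The parameter count of the model in the linked post is not given in this thread, so it is left out of the loop on purpose.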

crazedaze · Yesterday at 21:24

If I hear another list of three things or "it's not blank, it's blank" one more time I swear to God.

SchizoDaemon · Today at 6:23

The Gay Oboma Creature said:
Thoughts on this: https://news.ycombinator.com/item?id=48636377?

Seems like people are getting 5.2 at home, now. I could see 5.2 abliterated being a neat service to offer.

The wait for those Macs to run the model is about 3 months in the UK. I've seen Sentdex on YouTube using a monster Dell workstation with what looked like 3 or 4 RTX 6000s in it. Which is about £50,000 worth of GPUs.

SchizoDaemon · Today at 9:58

SchizoDaemon said:
The wait for those Macs to run the model is about 3 months in the UK. I've seen Sentdex on YouTube using a monster Dell workstation with what looked like 3 or 4 RTX 6000s in it. Which is about £50,000 worth of GPUs.

BTW, this is the video that I am referencing.

El Kay Why · Today at 11:08

Anyone else feel a model plateau over the past month? Up until the summer started there was a new and interesting model at least every two weeks. Need Gemini 3.5 Pro, Grok 5 & Sonnet/Opus 5 to get my dick hard.

Slurred · Today at 12:14

El Kay Why said:
Anyone else feel a model plateau over the past month? Up until the summer started there was a new and interesting model at least every two weeks. Need Gemini 3.5 Pro, Grok 5 & Sonnet/Opus 5 to get my dick hard.

"No," says the man in Washington, "we took Dario's cybersecurity hype marketing at face value."

Irrational Exuberance · Today at 12:50

Slurred said:
"No," says the man in Washington, "we took Dario's cybersecurity hype marketing at face value."

"No," says the man in the Vatican, "it could undermine human dignity. Also, you should do penance for having such lascivious thoughts."

macrodegenerate · Today at 13:12

El Kay Why said:
Anyone else feel a model plateau over the past month? Up until the summer started there was a new and interesting model at least every two weeks. Need Gemini 3.5 Pro, Grok 5 & Sonnet/Opus 5 to get my dick hard.

I'm negrating you if you monkey's paw Llama 5 into existence in July.
