AI Megathread

I'm not even going to try to list out the number of significant features Opus has added to my engine. Also it made it like 50%+ faster. It's not magic. I knew all the things I needed to do, they were all just incredibly laborious and fiddly. Now Opus can just one-shot most of it.


 
I'm not even going to try to list out the number of significant features Opus has added to my engine. Also it made it like 50%+ faster. It's not magic. I knew all the things I needed to do, they were all just incredibly laborious and fiddly. Now Opus can just one-shot most of it.
AI gamedev? Probably a godsend for artistic types who are good at music and art but refuse to learn2code. Even so, I do think a little bit is lost there, since tuning little things here and there in the code to get a desired effect is a huge part of making a great game, and LLMs often botch the small, hard-to-notice things that give a game the feeling of a labor of love.

Definitely good for rapid prototyping things, though, when you have an idea and want to know if a basic demo is fun.
 
It's not magic. I knew all the things I needed to do, they were all just incredibly laborious and fiddly. Now Opus can just one-shot most of it.
This is the spot I am in at work. I often know exactly what needs to be done. But it is normally extremely tedious and often cannot be scripted.
Even so, I do think a little bit is lost there, since tuning little things here and there in the code to get a desired effect is a huge part of making a great game, and LLMs often botch the small, hard-to-notice things that give a game the feeling of a labor of love.
It is really good at spotting lots of little things and allowing you to iterate quickly. It is more like a power multiplier if you already know what needs to be done.
 
AI assistance always works best if you outline exactly what you want the AI to do and how. Then you can tell it to iterate and change something if it's not good. It is an immense timesaver. Since AI got better in the last few years I have a lot of small programs and scripts I could've easily written myself, but just never bothered to because of how tedious it would've been. Real QoL stuff. The other day I had it write an actually working ebuild (Gentoo) for BasiliskII (old Mac emulator) and write two patches for another piece of software where the combination of USE flags I used apparently wasn't considered by the maintainer and uncovered a bug. I had to do almost nothing for the latter except provide the failed build log and check whether the AI's theory of what went wrong actually made sense. Again, I could have done this myself and these were trivial problems, but this saved me real, actual time and it just accumulates. I also simply couldn't have done it that fast. In the time it would've taken me to write the command for creating the diff alone, the AI had already written the two required patches by itself.

The only real downside (for me personally) is that AI sucks at writing idiomatic Lisp. It basically writes Python with a lot of parentheses. You'd think that's because there probably isn't a lot of Lisp in the datasets, but somehow the same models are surprisingly good at Smalltalk, which makes Lisp look like a mainstream language. Generally, the current, bigger models are somehow all really good at Smalltalk. Either it's a language that naturally aligns with LLMs somehow or there are some big and secret codebases out there that ended up in the datasets. This is a bit apropos of nothing but I wanted to mention it somewhere because of how odd it is.
 
Generally, the current, bigger models are somehow all really good at Smalltalk. Either it's a language that naturally aligns with LLMs somehow or there are some big and secret codebases out there that ended up in the datasets. This is a bit apropos of nothing but I wanted to mention it somewhere because of how odd it is.
I think this is about transference. Smalltalk is a simple OOP language, whereas LISP is unique and weird. If LLMs can't write good LISP (try prompting them mentioning the issue and telling them to do it the right way), that'd be a sign that they're not as smart as people say.
 
Well, my home AI is doing great.
Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 1.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 10.7%, Prefix cache hit rate: 91.2%
Woo, 1.8 tokens per second. I am not actually running it on a Commodore 64. I may have a bit more tuning I need to do. Gemma 4, 31b on 4x Intel B70.
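For anyone who wants to put a number on how painful 1.8 tokens per second is, here is a minimal sketch that parses a stats line like the one above and estimates how long a reply would take. The function names and the 500-token reply length are made up for illustration, and the parser only assumes the "label: number" layout visible in that one line, not any documented log format.

```python
import re

# The stats line exactly as posted above.
LOG_LINE = (
    "Engine 000: Avg prompt throughput: 0.0 tokens/s, "
    "Avg generation throughput: 1.8 tokens/s, Running: 1 reqs, "
    "Waiting: 0 reqs, GPU KV cache usage: 10.7%, Prefix cache hit rate: 91.2%"
)


def parse_engine_stats(line: str) -> dict[str, float]:
    """Pull every 'label: number' pair out of one stats line."""
    stats: dict[str, float] = {}
    # Drop the "Engine 000: " prefix, then split the rest on commas.
    _, _, body = line.partition(": ")
    for field in body.split(", "):
        label, _, value = field.rpartition(": ")
        number = re.match(r"[0-9.]+", value)
        if label and number:
            stats[label] = float(number.group())
    return stats


def seconds_for_reply(tokens: int, tokens_per_second: float) -> float:
    """Time to generate `tokens` at a steady generation rate."""
    return tokens / tokens_per_second


stats = parse_engine_stats(LOG_LINE)
rate = stats["Avg generation throughput"]
# 500 tokens is a hypothetical reply length, not something from the log.
print(f"{seconds_for_reply(500, rate) / 60:.1f} minutes for a 500-token reply")
```

This ignores prompt processing time entirely, so the real wait would be at least this long.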
 
Well, my home AI is doing great.
Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 1.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 10.7%, Prefix cache hit rate: 91.2%
Woo, 1.8 tokens per second. I am not actually running it on a Commodore 64. I may have a bit more tuning I need to do. Gemma 4, 31b on 4x Intel B70.
Oof. I debated buying a B60 to throw in my NAS, but from what I'm reading the software support just isn't there yet. Are you using vLLM?
I can squeeze Gemma 4 31B Q3M on my 9070 and get like 12t/s. It can't do anything well at that quantization though.
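A rough way to see why a 31B model needs such a low quant to fit on one consumer card is to multiply parameter count by bits per weight. The sketch below does only that, assuming flat bits-per-weight figures; real quant formats carry per-block overhead and mix precisions, and the KV cache and activations need memory on top, so treat the results as a lower bound. The function name is hypothetical.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Illustrative bit widths only; actual quant formats do not land on round numbers.
for bits in (16, 8, 4, 3):
    print(f"31B at {bits} bits/weight: ~{weight_gb(31, bits):.1f} GB")
```

Compare the result against the card's VRAM minus whatever the context length costs in KV cache.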
 
Well, my home AI is doing great.
Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 1.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 10.7%, Prefix cache hit rate: 91.2%
Woo, 1.8 tokens per second. I am not actually running it on a Commodore 64. I may have a bit more tuning I need to do. Gemma 4, 31b on 4x Intel B70.
Okay, so what platform are you serving your LLM from? Ollama, KoboldCpp?
 
Okay, so what platform are you serving your LLM from? Ollama, KoboldCpp?
Oof. I debated buying a B60 to throw in my NAS, but from what I'm reading the software support just isn't there yet. Are you using vLLM?
I can squeeze Gemma 4 31B Q3M on my 9070 and get like 12t/s. It can't do anything well at that quantization though.
It's Llama.cpp or vLLM or Intel's LLM Scaler. Intel's official LLM Scaler though is months behind and doesn't understand newer models. vLLM works fine after you spend 3-4 weeks trying to find the exact settings that work, but slowly. Llama.cpp works more consistently but is slow.
The software does work eventually. I think single cards work better. I use my 4 cards for image generation with ComfyUI sometimes, but that's 4 independent workstreams, not any of the magical MultiGPU nodes.
GPT-OSS-120B runs fine at "full size" which is actually MXFP4. It's pretty quick but still seems dumb from time to time.
I tried Step3.7-Flash at Q4 which was very quick, but also seemed to not be great at code at that quantization. I also have a Mistral4Small at Q5 I want to try and a Minimax M3 at Q3.
But all of them I want to figure out why I'm seeing such weird errors first.
Thoughts on this: https://news.ycombinator.com/item?id=48636377?

Seems like people are getting 5.2 at home, now. I could see 5.2 abliterated being a neat service to offer.
I'll get right on that. The only reason my current system even has 128GB RAM was that it was purchased in the before-times. Also I can't imagine how useful a 1-bit quant is going to be. Maybe if I can get the simple models more stable I'll try it since I do technically have 256GB total between VRAM and RAM.
 
Anyone else feel a model plateau over the past month? Up until the summer started there was a new and interesting model at least every two weeks. Need Gemini 3.5 Pro, Grok 5 & Sonnet/Opus 5 to get my dick hard.
 
Anyone else feel a model plateau over the past month? Up until the summer started there was a new and interesting model at least every two weeks. Need Gemini 3.5 Pro, Grok 5 & Sonnet/Opus 5 to get my dick hard.
"No," says the man in Washington, "we took Dario's cybersecurity hype marketing at face value."
 
Anyone else feel a model plateau over the past month? Up until the summer started there was a new and interesting model at least every two weeks. Need Gemini 3.5 Pro, Grok 5 & Sonnet/Opus 5 to get my dick hard.
I'm negrating you if you monkey's paw Llama 5 into existence in July.
 