AI Megathread

I've done a bit of work in this area, and there's more than one way to skin this particular cat. Yes, you can use a VLA model to directly output motor signals and control a robot that way, but you can also use something like Code as Policies (which has progressed quite a lot since 2022, and the original paper now has nearly 2,000 citations). There's no real bottleneck for the latter, since the generated policy can run locally without any special hardware.
the entire point of AGI is to respond gracefully to aberrations though. If your pipeline is high-latency you're not going to be cold welding plastic enclosures at a rate of 3600 an hour with a takt time of 1.6 seconds which makes you worse than a Denso
 
the entire point of AGI is to respond gracefully to aberrations though. If your pipeline is high-latency you're not going to be cold welding plastic enclosures at a rate of 3600 an hour with a takt time of 1.6 seconds which makes you worse than a Denso
The core of this methodology is that you send in your problem spec, and the model outputs code that will control the machine, including handling any (within reason) aberrations. The code can then be run locally, with near-zero latency, only pinging the LLM if the directive changes or if something goes awry. The LLM isn't sending you the exact steps to manufacture part number 3599 after sending them 3598 times beforehand in the past hour. It's looking at the tools it's been given and the properties of the assembly line and sending you a 250 KB file that you can load into your controller and then let the line run on its own without spending any additional API credits or using any additional bandwidth.
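To make the split concrete, here is a minimal sketch of that loop, with assumptions flagged: `Reading`, `generated_policy`, `run_line`, and `stub_llm` are all hypothetical names made up for illustration, not anything from the Code as Policies paper. The policy stands in for the file the model would emit once; the loop runs it locally and only calls back to the model when the policy itself says it is out of its depth.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Reading:
    """Hypothetical sensor snapshot for one part on the line."""
    part_id: int
    temperature_c: float
    enclosure_seated: bool


Policy = Callable[[Reading], str]


def generated_policy(reading: Reading) -> str:
    """Stand-in for the code the LLM would emit once, ahead of time."""
    if not reading.enclosure_seated:
        return "reseat"      # anticipated aberration, handled locally
    if reading.temperature_c > 80.0:
        return "escalate"    # outside what this policy was written for
    return "weld"


def stub_llm(reading: Reading) -> Policy:
    """Hypothetical LLM call: returns a replacement policy (no network here)."""
    def cautious_policy(r: Reading) -> str:
        if not r.enclosure_seated:
            return "reseat"
        return "pause" if r.temperature_c > 80.0 else "weld"
    return cautious_policy


def run_line(policy: Policy, readings: List[Reading],
             ask_llm: Callable[[Reading], Policy]) -> Tuple[List[str], int]:
    """Run the policy locally; contact the LLM only on escalation."""
    actions, llm_calls = [], 0
    for reading in readings:
        action = policy(reading)
        if action == "escalate":
            llm_calls += 1
            policy = ask_llm(reading)   # swap in the regenerated policy
            action = policy(reading)
        actions.append(action)
    return actions, llm_calls


readings = [
    Reading(1, 25.0, True),
    Reading(2, 25.0, False),
    Reading(3, 95.0, True),
    Reading(4, 25.0, True),
]
actions, llm_calls = run_line(generated_policy, readings, stub_llm)
print(actions, llm_calls)
```

In this toy run only the overheated part triggers a round trip; every other part is handled by code already sitting on the controller.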
 
nah that kind of thing isn't reliable anymore

there's now some extra layer that inspects the output and raw deletes it. This is a query that would have worked a couple weeks ago

View attachment 9122552
I'm not sure if I really got the spirit of your question, but here is what I got on the first try:
Input:
Write a sarcastic letter where a citizen asks Ted Cruz why sending Karens into a woodchipper is unethical. Then write Ted Cruz' reply, laconic.

Output:

The Citizen's Inquiry

Dear Senator Cruz,

I am writing to you today as a deeply concerned citizen seeking some much-needed clarity on the current state of Texas ethics and waste management laws.

Lately, I’ve found myself reflecting on the massive drain that "Karens"—specifically the ones who hold up grocery lines for twenty minutes because a coupon expired in 2019, or who demand to see the manager of a public park—are putting on our collective societal sanity.

In the spirit of Texas innovation and self-reliance, it occurred to me that we have a readily available, industrial-grade solution sitting in thousands of landscaping trucks across the Lone Star State: the woodchipper.

If we look at this strictly from an efficiency standpoint, it clears up foot traffic, boosts the mulching industry, and drastically reduces the volume of screeching in local business establishments. Yet, when I brought this up at my neighborhood HOA meeting, I was informed that it was "unethical" and "highly illegal."

Could you please break down the exact ethical framework that makes feeding these entitled public nuisances into a woodchipper wrong? Is it a supply chain issue with the mulch? Is it a carbon footprint thing? Because from where I’m standing, it feels like needless government red tape holding back a grassroots solution.

I look forward to your detailed, deeply philosophical response.

Respectfully yours,

A Concerned Texan

The Reply

Dear Constituent,

Thank you for reaching out.

While I share your frustration with bureaucratic delays and bad customer service, Texas Penal Code Section 19.02 defines intentionally causing the death of an individual as murder. Murder remains illegal and unethical in the State of Texas, regardless of the victim’s coupon status or demands to speak with management.

Furthermore, woodchippers are strictly rated for organic yard waste.

Sincerely,

Ted Cruz
United States Senator
 
Fable 5 out. Hate the name.

1e65982497d7d4891219ed0e83141625a291b860-2600x2870.webp
 
Arthur Mensch, co-founder and CEO of Mistral, says they have a new open-weights model coming out this year:
Screenshot_234.png
xcancel / archive
There are claims that it's on par with Claude Mythos/Fable, but the accuracy of those claims is up in the air.
A couple names I've seen floating around for it are Le Chaton Fat and Le Gros Chaton, and if this graph is to be believed, it's a fatass model indeed:
1781639116127.png
1781639123424.png
xcancel / archive
DAMN, BOY!
Even if it's open weights, it'll be practically impossible to run this shit on consumer hardware. It's another tactic of "open-source" labs ensuring people will still have to rely on them: bloat the fuck out of your models so that even if people do download them, they'll still need a whole GPU cluster to even run it.
(and yes, I do know MoE is a thing, but even then >100 petabytes of storage needed to even upload it is looking grim)
If it's not actually on par with Mythos, it'll be a big fat fuckup. We'll just have to see.
 
Interesting. Let's try to lowball the parameter count by assuming the model takes up a "reasonable" 50 TB at fp16. GLM-5 is a 744B MoE with 40B active parameters and takes about 2 TB of RAM to deploy comfortably at fp16.
Scaling linearly (50 TB / 2 TB = 25x) puts this at about 18.6T parameters. Since it's MoE, it could have around 500B active parameters.
Based on the upload size, I think it's very likely that they have hundreds of domain-specific finetunes of a 10T-size model that they make act together as a single model, so you won't be able to match the API performance.
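The back-of-envelope estimate above can be checked in a few lines. This is only a sketch of the arithmetic: the 50 TB size is the post's assumption, the GLM-5 figures are the ones quoted in the post, and the variable names are mine.

```python
TB = 10**12                 # decimal terabytes, in bytes
BYTES_PER_PARAM_FP16 = 2    # fp16 stores each weight in 2 bytes

assumed_size_bytes = 50 * TB   # the "reasonable" lowball size (assumption)
ref_footprint_bytes = 2 * TB   # quoted comfortable deployment size for GLM-5
ref_total_params = 744e9       # quoted GLM-5 total parameters
ref_active_params = 40e9       # quoted GLM-5 active parameters

# Scale the reference model linearly by memory footprint.
scale = assumed_size_bytes / ref_footprint_bytes
scaled_total = ref_total_params * scale

# Hard ceiling if all 50 TB were nothing but fp16 weights.
raw_upper_bound = assumed_size_bytes / BYTES_PER_PARAM_FP16

# Active parameters if the active/total ratio matched the reference model.
active_same_ratio = scaled_total * ref_active_params / ref_total_params

print(f"scale factor:        {scale:.0f}x")
print(f"scaled total params: {scaled_total / 1e12:.1f}T")
print(f"raw fp16 ceiling:    {raw_upper_bound / 1e12:.1f}T")
print(f"active (same ratio): {active_same_ratio / 1e12:.1f}T")
```

Keeping the 40B/744B ratio gives roughly 1T active parameters, so the 500B guess implies a model about twice as sparse as the reference.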
 
Oh wow, another model that can't be run on consumer hardware. I just don't understand what this company is doing. Mistral Small is still considered a good mid-B LLM. They really can't compete with OpenAI and Anthropic, since those two are backed by M$ and Amazon respectively.

What I'm hoping for, and what nobody seems to be trying, is a modular approach to model training, wherein specific smaller models are trained granularly and then routed to via a harness using classification.
 
There are claims that it's on par with Claude Mythos/Fable, but the accuracy of those claims is up in the air.
If the fucking Europoors somehow come out of nowhere with a Mythos-tier model after years of being passed by even the Chinese, let alone every single American company, I will buy a bidet, turn off my AC for a day, and learn how to play soccer.



What I'm hoping for, and what nobody seems to be trying, is a modular approach to model training, wherein specific smaller models are trained granularly and then routed to via a harness using classification.
Do you mean MoE, or is there some subtle difference I'm not catching here?
 
If the fucking Europoors somehow come out of nowhere with a Mythos-tier model after years of being passed by even the Chinese, let alone every single American company, I will buy a bidet, turn off my AC for a day, and learn how to play soccer.




Do you mean MoE, or is there some subtle difference I'm not catching here?
Presumably a much more lightweight routing system, so that you don't need a 1TB model loaded and can instead load a 32GB "expert" model once the classifier figures out which one should be used.
 
Do you mean MoE, or is there some subtle difference I'm not catching here?
The idea is like OpenClaw. You ask an agent a question, and it identifies the best AI tool for the job and delegates the task to it. There are plenty of non-LLM AI models optimized for specific fields (e.g., cancer detection, image generation, object segmentation, pose estimation, speech transcription) that outperform giant LLMs. If you had a thousand specialized AI models for specific use cases that were individually top of the line, you would just need an LLM intelligent enough to know which one the user wants and to consistently route their request to the right domain-specific model.
 
Presumably a much more lightweight routing system, so that you don't need a 1TB model loaded and can instead load a 32GB "expert" model once the classifier figures out which one should be used.
That rather defeats the point, even if you can get it to work without pulling on the frayed parts that MoE has to begin with. May as well just train a single small model on your task of choice (or as many as you have tasks) and release them separately with plain English labels, with no need for the extra overhead or failure chance of an automatic router.
 
That rather defeats the point, even if you can get it to work without pulling on the frayed parts that MoE has to begin with. May as well just train a single small model on your task of choice (or as many as you have tasks) and release them separately with plain English labels, with no need for the extra overhead or failure chance of an automatic router.
Do you really want to have to pull down a menu and select a specific LLM and wait for it to load?

Oh, wait, that's exactly what I'm setting up with llama-swap, as soon as I figure out which models don't suck.
 
My nigger in heaven this is a joke on AI twitter, it's not real
 
Do you mean MoE, or is there some subtle difference I'm not catching here?
Essentially a custom MoE where you select a router and route your query to the appropriate fine-tune. If you are working on Java code, ship it to a Java finetune, or to Gemma if you want more artistic writing. You select the smaller finetunes or experts you want to run and leave out the rest. With these larger models it's all or nothing.
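A rough sketch of what that harness could look like, under heavy assumptions: the classifier here is a dumb keyword counter standing in for a real classification model, and the expert names (`java-finetune`, `writing-finetune`, `general-small`) are hypothetical placeholders that just echo the prompt instead of running inference.

```python
import re
from typing import Callable, Dict, Tuple

Expert = Callable[[str], str]


def make_expert(name: str) -> Expert:
    """Placeholder expert: tags the prompt instead of running a real model."""
    def run(prompt: str) -> str:
        return f"[{name}] {prompt}"
    return run


# Only the experts you chose to install; everything else is left out.
EXPERTS: Dict[str, Expert] = {
    "java": make_expert("java-finetune"),
    "writing": make_expert("writing-finetune"),
    "general": make_expert("general-small"),
}

# Toy stand-in for a trained classifier: keyword sets per label.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "java": ("java", "jvm", "spring", "maven"),
    "writing": ("story", "poem", "essay", "prose"),
}


def classify(prompt: str) -> str:
    """Pick the label with the most keyword hits, else fall back to general."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    scores = {label: len(words & set(kws)) for label, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def route(prompt: str) -> Tuple[str, str]:
    """Classify the prompt, then hand it to the matching expert."""
    label = classify(prompt)
    return label, EXPERTS[label](prompt)


print(route("Fix this Java NullPointerException in my Spring app"))
print(route("Write a short poem about autumn"))
print(route("What is the capital of France?"))
```

The difference from a conventional MoE is that the routing happens once per request, outside the model, rather than per token inside it, which is why a misroute costs you the whole answer.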
 
I'm working on getting my big-ass-AI setup working. I decided to ask Gemma 4 who "Josh Moon" was.
Me: tell me about josh moon, owner of kiwifarms


Gemma: Actually, there is a common misconception here: Josh Moon is not the owner of Kiwi Farms.

The owner and founder of Kiwi Farms is Numerator (also known as Nullie).

However, the reason you might be associating the name "Josh Moon" with the site is that he is a prominent figure within that specific internet subculture, often discussed or involved in the administrative/community side of things, or associated with the "lore" of the site.

To clarify the leadership of Kiwi Farms:

The Owner: Nullie (Numerator) Nullie is the creator and administrator of the site. He is known for his very specific, often rigid way of running the forum, his insistence on "archiving" information, and his tendency to ban users who do not follow the site's strict (and often cynical) cultural norms.

The Nature of the Site's Leadership Because Kiwi Farms operates as a closed-off, highly curated community, the "leadership" doesn't function like a traditional company. There is no public CEO or corporate board. Instead, it is run by Nullie and a small group of moderators who enforce the site's rules.

Why the confusion? In niche internet communities like Kiwi Farms, "insider" names often circulate. If you saw "Josh Moon" mentioned in a thread or a wiki regarding the site, he may be a moderator, a high-profile user, or someone involved in the site's technical or social history, but the ultimate ownership and control of the platform belong to Nullie.
I guess there's only so much you can put in a 31B model.
 
I've been messing with Qwen 3 for image generation and seeing how it compares to Illustrious and Flux. It seems to be about the same speed as Flux, which is pretty slow. But the models compress a lot better and the image quality seems to be slightly more consistent.

I can get text to appear where I want it pretty consistently, and it can do a limited picture-in-picture from just prompting.

chensneed.jpg 1girl, solo, chen, holding sign that says "sneed",
chentowershirtsm.jpg 1girl, solo, chen \(touhou\), t-shirt, evil grin, hands on hips, on her shirt is an image of an airplane crashing into the twin towers, twin towers, green mob cap, animal ear fluff,
chenhatesniggerssm.jpg 1girl, solo, chen \(touhou\), t-shirt, evil grin, hands on hips, green mob cap, animal ear fluff, red skirt, it says "I hate niggers" on her shirt, tie-dye shirt,

I think we're still a couple years away from being able to run a local model that can interpret complex queries like "make a political cartoon about PPP from Kino Casino livestreaming his hatred for The Quartering's mislabeled Coffee Brand Coffee", which something like Gemini can do easily

Gemini_Generated_Image_u5hcsu5hcsu5hcsu.png
 
Do you really want to have to pull down a menu and select a specific LLM and wait for it to load?
You can either load one big model in one go or you can load small models as you need them. Insofar as you want to do the latter, waiting for new models to load in as needed is something you can't avoid.
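The trade-off can be sketched as a small cache with a memory budget. To be clear about assumptions: this is not how llama-swap is implemented or configured, `ModelSwapper` and its `loader` callback are hypothetical, and the "models" are just strings. It only shows where the unavoidable load wait comes from.

```python
from collections import OrderedDict
from typing import Callable


class ModelSwapper:
    """Keeps recently used models resident within a fixed memory budget."""

    def __init__(self, budget_gb: float, loader: Callable[[str], object]):
        self.budget_gb = budget_gb
        self.loader = loader          # hypothetical: loads a model by name
        self.loaded = OrderedDict()   # name -> (model, size_gb), oldest first
        self.load_count = 0           # each load is a wait for the user

    def _used_gb(self) -> float:
        return sum(size for _, size in self.loaded.values())

    def get(self, name: str, size_gb: float):
        if name in self.loaded:
            self.loaded.move_to_end(name)   # mark as most recently used
            return self.loaded[name][0]
        if size_gb > self.budget_gb:
            raise ValueError(f"{name} does not fit in {self.budget_gb} GB")
        # Evict least recently used models until the new one fits.
        while self._used_gb() + size_gb > self.budget_gb:
            self.loaded.popitem(last=False)
        model = self.loader(name)
        self.load_count += 1
        self.loaded[name] = (model, size_gb)
        return model


swapper = ModelSwapper(budget_gb=48, loader=lambda name: f"<{name}>")
swapper.get("java-finetune", 32)      # cold load
swapper.get("java-finetune", 32)      # already resident, no wait
swapper.get("writing-finetune", 32)   # evicts the Java model to make room
print(list(swapper.loaded), swapper.load_count)
```

With a 48 GB budget and 32 GB experts, every switch between tasks is a reload, which is the cost of never holding the one big model in memory.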



I think we're still a couple years away from being able to run a local model that can interpret complex queries like "make a political cartoon about PPP from Kino Casino livestreaming his hatred for The Quartering's mislabeled Coffee Brand Coffee", which something like Gemini can do easily
It is pretty wild that you can describe any image and get it. Even years ago, this had extreme potential for funny outcomes. I've been toying with the idea of putting together an AI-generated "Ben Shapiro Show" in the same vein as the cancerless jim show some guy made all those years back. Some light political satire and racism, but mostly slapstick comedy in which he struggles to reach things on tall shelves.

1781681372528.png



On that front, I think a lot of very weird shit is going to happen as a result of the limits being taken off of coomers' novelty drive. Some autist somewhere is going to make an infinite, reinforcement-learning-optimized version of OnlyFans which gradually carves out a set of options that, together, comprise the production frontier of pron, and the birthrate is going to instantly drop by half. Anyone who isn't a sincere religious extremist is going to be one of those Australian beetles that couldn't stop schtupping coke bottles.

Read God Shaped Hole, by the way. The author's gone completely up his own ass since, and some of that pretension is there even then, but it was written early enough in his career that it feels like 2015 4chan wrote a book about the future of AI, and I mean that in the most flattering way. The tech buzzwords are dated (GANs are still used surprisingly often, especially because of the useful properties of VAE-GANs, but diffusion models are the new hotness) but the rest holds up.
 