More

jimbob45 · 2026-03-14T21:27:57 1773523677

Llama should be mentioned in the same conversations that ChatGPT, Claude, and DeepSeek dominate. If only it wasn’t so inaccessible…

FartyMcFarter · 2026-03-14T23:27:02 1773530822

They have an app for it. I think it's as accessible as any of the other models you mentioned. It's just not as good.

wolvoleo · 2026-03-15T02:59:08 1773543548

Not so good right now but they haven't released one in a while. Llama 3.1 was pretty great when it came out.

the_black_hand · 2026-03-17T04:37:00 1773722220

Yeah strategy is weird. PyTorch and llama 1-3 were strong successes. Llama 4 was a dud but that happens sometimes. Google fumbles a few times before Gemini too. What I don’t get is why they didn’t prioritize those projects. They weren’t making money, but it was a solid start and a good way to get a foothold in the game. Instead they’ve gone balls deep in slop bullshit.

jimbob45 · 2026-03-11T18:23:00 1773253380

Would having a locally-hosted model offset any of these costs?

kennywinker · 2026-03-11T18:59:17 1773255557

Yes, but that comes at the cost of using a dumber llm. The state of the art ones are only available via commercial api, and the best self-hostable models require $10,000+ gpus.

This is a problem for coding as smarter really has an impact there, but there are so so so many tasks that an 8b model that runs on a $200 gpu can handle nicely. Scrape this page and dump json? Yeah that’s gonna be fine.

This is my conclusion based on a week or so of using ollama + qwen3.5:3b self hosted on a ~10 year old dell optiplex with only the built-in gpu. You don’t need state of the art to do simple tasks.

gbro3n · 2026-03-12T06:27:58 1773296878

I saw that the Hetzner matrix like has GPU servers < £300 per month (plus set up fee). I haven't tried it but I think if I was getting up to that sort of spend I'd be setting up Ollama on one of those with a larger Qwen3 max model (which I hear is on par with Opus 4.5?? - I haven't been able to try Qwen yet though so that could be b*****ks).

hhh · 2026-03-12T07:42:03 1773301323

I have tried most of the major open source models now and they all feel okay, but i’d prefer Sonnet or something any day over them. Not even close in capability for general tasks in my experience.

gbro3n · 2026-03-14T08:37:34 1773477454

I suspected that might be the case. I'm sure one day soon though there will be a local model as capable as Opus 4.6

TheDong · 2026-03-12T00:41:54 1773276114

> Scrape this page and dump json? Yeah that’s gonna be fine.

Only gonna be fine on a trusted page, an 8b model can be prompt injected incredibly trivially compared to larger ones.

kennywinker · 2026-03-12T01:01:15 1773277275

Relying on the model to protect you seems like a bad idea…

TheDong · 2026-03-12T03:11:20 1773285080

I mean, clawbots are inherently insecure. Using a better model is defense in depth.

Obviously you should also take precautions, like never instructing it to invoke the browser tool on untrusted sites, avoiding feeding it untrusted inputs where possible in other places, giving it dedicated and locked-down credentials where possible....

But yeah, at this point it's inherent to LLMs that we cannot do something like SQL prepared statements where "tainted" strings are isolated. There is no perfect solution, but using the best model we can is at least a good precaution to stack on top of all our other half-measures.

TheDong · 2026-03-12T00:12:57 1773274377

Generally the benefit you get out of claws involves untrusted input, i.e. it using the browser tool to scrape websites, etc.

Claude 4.6 is at least a bit resilient to prompt injection, but local models are much worse at that, so using a local model massively increases your chance of getting pwned via a prompt injection, in my estimation.

You're kinda forced to use one of the better proprietary models imo, unless you've constrained your claw usage down to a small trusted subset of inputs.

robthompson2018 · 2026-03-11T20:04:04 1773259444

Our starter plan gives you a machine with 2GB of RAM. You will not be able to run a local LLM. OpenRouter has free models (eg Z.ai: GLM 4.5 Air), I recommend those.

jimbob45 · 2026-03-08T23:41:24 1773013284

Laziness. Grocery stores offer incredibly healthy, tasty, and cheap options now that take two minutes to cook and can be stored for a week.

jimbob45 · 2026-03-06T08:12:48 1772784768

And nobody else because the geniuses at Reuters thought it was a good idea to make it an exclusive. Also paywalled.

jimbob45 · 2026-02-28T00:06:07 1772237167

MySpace would have won had they not been outcompeted by virtue of their momentum though.

jimbob45 · 2026-02-23T05:22:10 1771824130

Then it should be “This is your first and final warning. The next time we catch you, it’s a ban.”. People are building their lives around this stuff and kneejerk bans erode good faith in your platform.

lelanthran · 2026-02-23T07:17:45 1771831065

> Then it should be “This is your first and final warning. The next time we catch you, it’s a ban.”. People are building their lives around this stuff and kneejerk bans erode good faith in your platform.

This is actually the soft-touch approach: the users of these vibe-coded products need to understand that they are delegating their authority to the tool to work on their behalf.

In this case, they delegated to a tool that broke the ToS. The result could have been a lot worse, and in return they learned that the tool is acting with their full authority.

-----------------

EDIT:

One of the users got this response from google support:

> Our product engineering team has confirmed that your account was suspended from using our Antigravity service. This suspension affects your access to the Gemini CLI and any other service that uses the Cloud Code Private API.

Their decision? To break ToS on some other provider:

> I guess it is time to move on to Codex or Claude Code.

So, yeah, perhaps the users really are too stupid to understand what's going on, and even this soft-touch approach has done nothing to clue them in.

pandini · 2026-02-23T07:59:47 1771833587

Except it's expressly NOT against the TOS of codex to use it via oAuth with Openclaw (the jury is currently out re Anthropic)

Shooti · 2026-02-23T20:38:04 1771879084

The difference is ChatGPT Pro/Plus plans have one shared pool of token limits shared across all use cases.

In contrast Google's AI plans give you at least three seperate pools of token usage limits: Gemini App + Antigravity/Other Code Assist tools like Android Studio + AI Studio free usage limits.

Google limit the context of where you can use their tokens but in exchange they give you substantially more.

jimbob45 · 2026-02-20T19:38:05 1771616285

Wouldn’t it still provide massive benefits if they could convince/coerce their most popular downloaded models to move to torrenting?

intrasight · 2026-02-21T11:48:48 1771674528

Benefit to you, but great downside to the three letter agencies that inject their goods into these models.

jimbob45 · 2026-02-20T08:36:06 1771576566

I can push 130WPM with some serious warmup on QWERTY. Even still…I can feel its inadequacy. The semicolon sitting unused under my pinky is just such a massive waste. The period there instead would be a game-changer.

silon42 · 2026-02-20T17:24:19 1771608259

You must not program C/Java/... ;)

jimbob45 · 2026-02-20T21:54:50 1771624490

It still feels bad because you most often have to jump and aim your pinky to hit enter afterwards. I guess those who write minified JS are laughing straight to the bank though.

jimbob45 · 2026-02-19T02:31:53 1771468313

I’m not even a neophyte here but why don’t precompiled shaders solve that?

MindSpunk · 2026-02-19T04:17:09 1771474629

Depends what you're precompiling.

For Vulkan you already ship "pre-compiled" shaders in SPIR-V form. The SPIR-V needs to be compiled to GPU ISA before it can run.

You can't, in general, pre-compile the SPIR-V to GPU ISA because you don't know the target device you're running on until the app launches. You would have to precompile ISA for every GPU you ever plan to run on, for every platform, for every driver version they've ever released that you will run on. Also you need to know when new hardware and drivers come out and have pre-compiled ISA ready for them.

Steam tries to do this. They store pre-compiled ISA tagged with the GPU+Driver+Platform, then ship it to you. Kinda works if they have the shaders for a game compiled for your GPU/Driver/Platform. In reality your cache hit rate will be spotty and plenty of people are going to stutter.

OpenGL/DirectX11 still has this problem too, but it's all hidden in the driver. Drivers would do a lot of heroics to hide compilation stutter. They'd still often fail though and developers had no way to really manage it out outside of some truly disgusting hacks.

Gigachad · 2026-02-19T04:45:01 1771476301

There's two tiers of precompiled though. Even if you can't download them precompiled, you can compile before the game launches so there are no stutters after.

MindSpunk · 2026-02-19T05:09:37 1771477777

Yes, many games do that too. Depending on how many shaders the game uses and how fast the user's CPU is an exhaustive pre-compile could take half an hour or more.

But in reality the exhaustive pre-compile will compile way more than will be used by any given game session (on average) and waste lots of time. Also you would have to recompile every time the user upgraded their driver version or changed hardware. And you're likely to churn a lot of customers if you smack them with a 30+ minute loading screen.

Precisely which shaders get used by the game can only be correctly discovered at runtime in many games, it depends on the precise state of the game/renderer and the quality settings and often hardware vendor if there are vendor-specific code paths.

Some games will get QA to play a bunch of the game, or maybe setup automated scripts to fly through all the levels and log which shaders get used. Then that log gets replayed in a startup pre-compile loading screen so you're at least pre-compiling shaders you know will be used.

Gigachad · 2026-02-19T05:19:23 1771478363

I don't think this is as much of an issue as you are making it out to be. I have my Steam Deck on the main branch release which seems to exclude it from downloading precompiled shaders. When a game updates it has to compile the shaders first, but even on a big game this does not take an unreasonable amount of time. Less time than it takes for game updates to download at least.

Steam could improve the experience here by having the shaders compile overnight in the background so it presents zero delay but the current way doesn't bother me much at all.

MindSpunk · 2026-02-19T06:37:59 1771483079

I remember Star Wars Jedi Survivor had a 5-6 minute shader pre-compile on my 5950X. I heard of people well into the 30 minute mark on lower core count machines. Battlefield 6 was a few minutes on my 9950X, higher again on lower core count CPUs.

Really depends on the game.

There's no easy way around this problem. It never came up as much in the OpenGL/D3D11 era because we didn't make as many shaders back then. Shader graphs and letting artists author shaders really opened pandoras box on this problem, but OpenGL was already on its way out by the time these techniques were proliferating so Vulkan gets lumped in as the cause.

rufo · 2026-02-19T05:58:30 1771480710

You're getting lucky with the games you're playing, then; there are absolutely PC games that have had 20-30 minute long shader compilation times _on high-end gaming hardware_. (I think some of Sony's ports were known for this; Googling tells me Borderlands 4, Stalker 2, and Starfield also had notably long shader times.) Typically those occur within the game's UI after launch but before the game starts playing, though, which makes me wonder if Valve might still be caching a non-GPU-specific intermediate of the DX12 to Vulkan conversion, and _that's_ what Linux Steam clients are compiling pre-launch and/or sharing with other clients. That's pure speculation on my part though, as I haven't played any of the worst-case-scenario games on my Deck, nor have I done anything that would cause the shader downloading to not operate.

reorder9695 · 2026-02-19T09:48:59 1771494539

So is this why on my laptop when I start a game after an update it starts "compiling vulkan shaders" for a few minutes? I've never understood what that was actually for but it takes 100% CPU on all cores so it's clearly doing something

raincole · 2026-02-19T03:12:04 1771470724

It kinda does. Kinda. Steam constantly downloads precompiled shaders for your games. Especially on Linux.

ozarkerD · 2026-02-19T03:00:37 1771470037

Can't precompile for all the combinations of hardware, driver version, operating systems, etc... It's not really a vulkan specific problem and it's hard to solve. (for desktops anyways)

jimbob45 · 2026-02-18T05:11:45 1771391505

Shhhh we’re pretending that China wasn’t behind the massive propaganda campaign against Tesla to boost the popularity of their own BYD brand.

wolvoleo · 2026-02-18T07:21:48 1771399308

No, he's done that all by himself sorry.

Turns out nazi salutes don't make one popular here in Europe.

schubidubiduba · 2026-02-18T18:09:24 1771438164

I find it highly unlikely that the Chinese drove him to donthe Nazi salute

throwerxyz · 2026-02-18T06:38:56 1771396736

I'm surprised people are forgetting that the person who predicted the coming wars against his personality was himself. He basically told everyone what was coming... then it came and people still fell for it.

Elon is a loser socially and that's about it.

Tarq0n · 2026-02-18T07:20:09 1771399209

DOGE and the nazi sale happened. That's not just problematic social media posting.

throwerxyz · 2026-02-19T00:01:31 1771459291

DOGE promised the world and delivered small wins. The normal politician does this and does not deliver a single win.

Braxton1980 · 2026-02-19T12:41:13 1771504873

And people hate politicians