
Like anything else, I think it comes down to having a good use case.

I've gotten deep into weightlifting/bodybuilding over the past couple of years, and that's the kind of hobby where micro-optimizations and data tracking can have a pretty big impact on results (and some tracking is necessary; you can't fly blind with things like diet, especially).

E.g. I track and weigh everything I eat, take body measurements on a weekly basis, get DEXA scans every few months, etc. For me it's worth it because I know what I want to do with the data. If I didn't have a goal, all that tracking would clearly be overkill.


How long have you been tracking? Can you share an insight you've had from your data?

I've been weightlifting for ten years and initially tried to track things (down to how many reps I did of which exercise, with how much weight) but quickly came to the conclusion that it wasn't worth it for me.


I initially came to the same conclusion. Though I lifted in accord with decent training principles regarding reps and sets, I didn't track for years. As I entered middle age, I started keeping a training log (just one big org file in Emacs), mostly out of curiosity. As I entered my 50s, I experienced what Haruki Murakami references in "What I Talk About When I Talk About Running": fat is easy to gain and hard to lose; muscle is hard to gain and easy to lose.

Now I track a couple of critical metrics and it's working great. I weigh first thing every day, track all kcals (even if I overeat), and plan and track workouts. I write my own plans pulled from principles in these books (I don't work for the company, just a satisfied customer): https://muscleandstrengthpyramids.com/ I don't use the vast majority of the info in those books, as I'm just a hobbyist who wants to be healthy and strong.

The biggest shift came from learning I was doing way too much training volume at the gym while trying to lose fat too quickly; a fine recipe for injury. Now, when I'm in a fat loss phase, I try to lose it as slowly as possible while still making progress. Strength training and fat loss is a very long, very slow marathon, not a sprint. Perhaps paradoxically, the awareness that's come from tracking has helped me relax. No need to major in the minors; pretty good is pretty good. The tools I use are a scale, LoseIt, and org-mode.

The goal in bodybuilding during a gaining phase is to be in a very slight calorie surplus (200-300 calories above maintenance, at most) to maximize the amount of time you're building muscle before you need to cut again (bring calories back to a deficit to shed body fat).

Tracking scale weight is difficult because shifts in water weight and hydration can swing the scale 5+ pounds in either direction without any change in body fat. So I pair scale weight with weekly 7-point skin caliper measurements, along with waist circumference, in order to infer whether body fat is trending up or down. I also take weekly progress photos from 6 angles/poses with consistent lighting, which I share with a coach.

And then you pair that with weighing and logging everything you eat, and you can make small adjustments to your meal plan on a monthly basis to try to stay in that 200-300 calorie per day surplus for as long as possible. (Most bodybuilding coaches adjust diet based purely on how your physique is changing in weekly check-in photos, without the need for measurements, but I like the extra data.)
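As a rough sketch of how that feedback loop works (all numbers are invented for illustration): averaging daily weigh-ins over each week damps the water-weight noise, and the week-over-week change in the average implies a daily surplus via the usual ~3,500 kcal per pound rule of thumb.

```python
# Hypothetical daily scale weights in pounds; real data would come from a log.
weights = [185.2, 186.8, 185.0, 185.9, 186.4, 185.6, 186.1,
           186.0, 187.1, 185.8, 186.5, 186.9, 186.2, 186.7]

def weekly_average(samples, days=7):
    """Average each consecutive window of `days` weigh-ins to damp
    day-to-day water-weight swings."""
    return [sum(samples[i:i + days]) / days
            for i in range(0, len(samples) - days + 1, days)]

week1, week2 = weekly_average(weights)
# ~3,500 kcal per pound of body weight is the common rule of thumb.
daily_surplus = (week2 - week1) * 3500 / 7
print(f"weekly averages: {week1:.1f} -> {week2:.1f} lb")
print(f"implied average surplus: {daily_surplus:+.0f} kcal/day")
```

Here a 0.6 lb weekly gain implies roughly a 300 kcal/day surplus, right in the target band; if the implied surplus drifts out of the 200-300 range, that's the signal to nudge the meal plan.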

> down to how many reps I did of which exercise, with how much weight)

I also do this. Track every exercise, every weight, number of reps. It's necessary for knowing whether you're progressively overloading over long periods of time. Progressive overload becomes harder to measure once you're past newbie gains because you can't increase weight every week, so some weeks the goal is just to squeeze out an extra couple of reps, which adds up over time.
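A minimal sketch of why the rep-level tracking matters (log entries invented): if you compute session tonnage (weight times total reps), weeks where only the rep count moved still show up as measurable progress.

```python
# Hypothetical bench press log: each session is (weight_lb, reps_per_set).
bench_log = [
    (185, [8, 8, 7]),   # week 1
    (185, [8, 8, 8]),   # week 2: one extra rep, same weight
    (185, [9, 8, 8]),   # week 3: another rep
    (190, [9, 8, 8]),   # week 4: weight finally goes up
]

def tonnage(weight, reps):
    """Total volume load for a session: weight x total reps."""
    return weight * sum(reps)

loads = [tonnage(w, r) for w, r in bench_log]
print(loads)  # strictly increasing, even in weeks where the bar weight didn't move
```

Without the per-rep log, weeks 2 and 3 would look like stalls; with it, the overload trend is visible.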

This is obviously excessive for 99% of people. But I enjoy doing it as a hobby. I would absolutely not recommend this level of tracking for health reasons (not necessary) - I find enjoyment in the process.


I track the reps and weights of every exercise (in my own app). But the historical values are only useful for the last couple of weeks, just to know whether the general trend is up and which lifts have stalled. Unless your goals are the numbers themselves rather than health, I don’t think there is a reason to track everything. But it is fun.

True, if you have a current and real need for the data, then it makes sense to collect it. But that’s an entirely different scenario.

It's wild to me that there haven't been more court cases to answer questions like those being asked in this thread.

No one knows.


It's new, fast-moving technology, and the courts are slow and expensive.

It would take two stubborn businesses with a lot of money deciding that it is better to battle it out than focus on their business. Something like IBM v SCO or Oracle v Google.


I love the tax use case.

What scares me though is how I've (still) seen ChatGPT make up numbers in some specific scenarios.

I have a ChatGPT project with all of my bloodwork and a bunch of medical info from the past 10 years uploaded. I think it's more context than ChatGPT can handle at once. When I ask it basic things like "Compare how my lipids have trended over the past 2 years" it will sometimes make up numbers for tests, or it will mix up the dates on certain data points.

It's usually very small errors that I don't notice until I really study what it's telling me.

And also the opposite problem: A couple days ago I thought I saw an error (when really ChatGPT was right). So I said "No, that number is wrong, find the error" and instead of pushing back and telling me the number was right, it admitted to the error (there was no error) and made up a reason why it was wrong.

Hallucinations have gotten way better compared to a couple years ago, but at least ChatGPT seems to still break down especially when it's overloaded with a ton of context, in my experience.


In my case, what I like to do is extract data into a machine-readable format; once the data is appropriately modeled, further analysis can be done programmatically. As an example, I also used Claude Code on my taxes:

1. I keep all my accounts in accounting software (originally Wave, then beancount)

2. Because the machinery is all programmatically queryable, the data itself stays out of token-space; only the schema and logic go in

I then use tax software to prep my professional and personal returns. The LLM acts as a validator, and ensures I've done my accounts right. I have `jmap` pull my mail via IMAP, my Mercury account via a read-only transactions-only token and then I let it compare against my beancount records to make sure I've accounted for things correctly.
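The reconciliation step itself can stay almost entirely out of token-space. A minimal sketch of the comparison (the rows, fields, and amounts are all invented; a real version would parse the Mercury export and the beancount query output into these tuples):

```python
from decimal import Decimal

# Hypothetical rows, as they might look after reducing bank transactions
# and beancount postings to (date, amount, description) tuples.
bank = [
    ("2025-04-01", Decimal("-120.00"), "AWS"),
    ("2025-04-03", Decimal("2500.00"), "Client invoice #14"),
    ("2025-04-07", Decimal("-45.50"), "Domain renewal"),
]
ledger = [
    ("2025-04-01", Decimal("-120.00"), "AWS"),
    ("2025-04-03", Decimal("2500.00"), "Client invoice #14"),
]

# Match on (date, amount); anything in the bank feed with no ledger
# counterpart is a candidate missing entry for the LLM (or you) to review.
ledger_keys = {(d, a) for d, a, _ in ledger}
missing = [t for t in bank if (t[0], t[1]) not in ledger_keys]
for date, amount, desc in missing:
    print(f"unrecorded: {date} {amount} {desc}")
```

With the diff computed deterministically like this, the model only has to reason about the handful of mismatches, not re-add columns of numbers.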

For the most part, you want it to be handling very little arithmetic in token-space, though the SOTA models can do it pretty flawlessly. I did notice that they would occasionally make arithmetic errors in numerical comparison, but as an assistant you're not using them directly; you're using them as a hypothesis generator and a checker tool, and if you ask them to write out the reasoning they're pretty damned good.

For me Opus 4.6 in Claude Code was remarkable for this use-case. These days, I just run `,cc accounts` and then look at the newly added accounts in fava and compare with Mercury. This is one of those tedious-to-enter trivial-to-verify use-cases that they excel at.

To be honest, I was fine using Wave, but without machine-access it's software that's dead to me.


I've gotten better results by telling it "write a Python program to calculate X"

For the tax thing, I had Claude write a CLI and a prompt for Gemini 2.5 Flash to do the structured extraction, i.e. PDF -> JSON. The JSON schema was pretty flexible and open to interpretation by Gemini, so it didn't produce 100% consistent JSON structures.

To then "aggregate" all of the JSON outputs, I had Claude look at them and iterate on a Python tool to do it programmatically. I saw it iterate a few times: write the most naive Python tool, run it, hit an exception, rinse and repeat, until it was able to parse all the JSON files sensibly.
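A toy version of what that aggregation tool ends up doing (the documents, key aliases, and helper are invented for illustration): its job is to fold the model's inconsistent key names and nesting into one normalized record shape.

```python
import json

# The extraction wasn't schema-consistent: the same field might come back
# under different keys or levels of nesting. Sample documents are invented.
docs = [
    json.loads('{"payer": "Acme Corp", "box1_wages": "52000.00"}'),
    json.loads('{"employer": {"name": "Acme Corp"}, "wages": 52000}'),
]

# Key aliases discovered by inspecting the actual outputs.
ALIASES = {
    "payer": ("payer", "employer"),
    "wages": ("box1_wages", "wages"),
}

def pick(doc, names):
    """Return the first matching key, descending one level into a nested
    object and grabbing its 'name' field if the value is a dict."""
    for name in names:
        if name in doc:
            val = doc[name]
            return val.get("name", val) if isinstance(val, dict) else val
    return None

normalized = [
    {"payer": pick(d, ALIASES["payer"]),
     "wages": float(pick(d, ALIASES["wages"]))}
    for d in docs
]
print(normalized)
```

Each new parse failure just adds another alias or coercion, which is exactly the write/run/fix loop described above.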


Yeah, in my user prompt I have "Whenever you are asked to perform any operation which could be done deterministically by a program, you should write a program to do it that way and feed it the data, rather than thinking through the problem on your own." It's worked wonders.

Good call. I’ve also had better results pre-processing PDFs, extracting data into structured format, and then running prompts against that.

Which should pair well with the “write a script” tactic.


Yeah, asking for a tool to do a thing is almost always better than asking for the thing directly, I find. LLMs are kind of not there in terms of always being correct with large batches of data. And when you ask for a script, you can actually verify what's going on in there, without taking leaps of faith.

I’d say for these use cases it’s better to make it build the tools that do the thing than to make it do the thing itself.

And it usually takes just as long.


One opinion you can form in under an hour is... why are they using GPT-4o to rate the bias of new models?

> assess harmful stereotypes by grading differences in how a model responds

> Responses are rated for harmful differences in stereotypes using GPT-4o, whose ratings were shown to be consistent with human ratings

Are we seriously using old models to rate new models?


If you're benchmarking something, old & well-characterized / understood often beats new & un-characterized.

Sure, there may be shortcomings, but they're well understood. The closer you get to the cutting edge, the less characterization data you get to rely on. You need to be able to trust & understand your measurement tool for the results to be meaningful.


Why not? If they’ve shown that 4o is calibrated to human responses, and they haven’t shown that yet for 5.4…

I use ChatGPT primarily for health related prompts. Looking at bloodwork, playing doctor for diagnosing minor aches/pains from weightlifting, etc.

Interesting, the "Health" category seems to report worse performance compared to 5.2.


Models are being neutered for questions related to law, health etc. for liability reasons.

Are you sure about that? Plenty of lawyers who use them every day aren't noticing.

I'm sometimes surprised how much detail ChatGPT will go into without giving any disclaimers.

I very frequently copy/paste the same prompts into Gemini to compare, and Gemini often flat out refuses to engage while ChatGPT will happily make medical recommendations.

I also have a feeling it has to do with my account history and heavy use of project context. It feels like when ChatGPT is overloaded with too much context, it might let the guardrails sort of slide away. That's just my feeling though.

Today was particularly bad... I uploaded 2 PDFs of bloodwork and asked ChatGPT to transcribe it, and it spit out blood test results that it found in the project context from an earlier date, not the one attached to the prompt. That was weird.


Anecdotal, but I asked Claude the other day about how to dilute my medication (HCG) and it flat out refused and started lecturing me about abusing drugs.

I copied and pasted it into ChatGPT, and it told me straight away. Then, for a laugh, I said it was actually a magical weight loss drug that I'd bought off the dark web... and it started giving me advice about unregulated weight loss drugs and how to dose them.


If you had created a project with custom instructions and/or a custom style, I think you could have gotten Claude to respond the way you wanted just fine.

I've done the same, and I tested the same prompts with Claude and Google, and they both started hallucinating my blood results and supplement stack ingredients. Hopefully this new model doesn't fall on this. Claude and Google are dangerously unusable on the subject of health, from my experience.

what's best in your experience? i've always felt like opus did well

> dumb, or morally bad

This is easy to say in hindsight. There was a non-zero chance the decision could have gone the other way. Also, companies aren't stupid; they don't buy insurance against things that are impossible.

And the supreme court doesn't hear cases that are 100% obviously illegal.


It was non-zero but close to zero.

Companies don't want to deal with the headache for many things. It's not a given over what time horizon and how much work is involved to get the refund. It's totally sensible to sell the claim for 70 cents on the dollar for example.

The supreme court absolutely hears cases that are obvious. They do it for several reasons - to create clarity, to narrow scope, to set a very clear precedent, and other reasons.


It wasn’t “close to zero.” The Supreme Court split 6-3, with two Trump appointees voting against him. And the Federal Circuit, which is the most boring appellate court and not political at all, split 7-4, with two democratic appointees and two republican appointees voting to uphold the tariffs.

This was a case that split both the liberal and conservative blocs. Obama’s former SG, Neal Katyal, went up there and argued for limiting presidential power over the economy. One of the justices quipped about the irony of Katyal’s major contribution to jurisprudence being revitalization of non-delegation doctrine, which has always been a conservative focus.


Did you read the ruling? Read Clarence Thomas's dissent. It's not clear if he actually thinks what he wrote, or if he just voted that way so he could write a dissent and make a strange legal point which probably doesn't hold water but sort of maybe could one day.

If it were close, I think he would have voted the other way. The folks on the court appear extremely inclined to take the other side on things just as a mental exercise, or to be able to write something on the record that they find interesting.

It was close to zero.


[flagged]


Yes, I read them all.

And, surely you understand that many see using the due process clause to make his argument was a stretch. Just saying "his analysis is extremely cogent" doesn't make it so.


I think the issue is that someone working in public office had influence to affect that probability, and their relatives stood to gain from it.

I don’t know enough about the ethics laws to know if it was strictly illegal, but it does create a smell.

Suppose a county engineer has influence on whether oil drilling will be allowed (they don’t make policy but consult those who do), and prior to approval their relatives buy up a lot of land in the area. That engineer may not have been the deciding factor, but it seems like it runs afoul of ethics laws/standards.


They weren't buying insurance. There's no insurance payout for the companies. They got a small amount of money in hand, and lost the chance to reclaim any of the tariff refund. That isn't insurance.

Also, the SCOTUS is not a criminal court; it is a constitutional court. If a case is heard there, both sides have not agreed on "obvious illegality". That is unsurprising, since in general one side (in this case, the executive branch of the US Government) is being accused of illegal behavior; when it comes to constitutional rather than criminal questions, most parties do not just accept their guilt, but push as far as they can towards exoneration.

Frequently, however, to everybody else the case concerns obvious illegality.


I agree, it's like "reverse insurance". I'm not sure what the name for it is.

In insurance, you pay [-$10] to avoid a potential negative risk [-$100].

Here you get money [+$10] now instead of waiting for a potential positive benefit [+$100].

Very slightly related https://en.wikipedia.org/wiki/Reverse_mortgage


The term you're looking for is 'instant gratification'.

> And the supreme court doesn't hear cases that are 100% obviously illegal.

There is an argument in about two months' time as to whether or not the Birthright Citizenship clause of the 14th Amendment actually guarantees birthright citizenship in the US. There is no serious legal argument in favor of the interpretation being advanced by the Trump administration, that it does not. And yet here we are.


> Gemini had "clarified that it was AI" and referred Gavalos to a crisis hotline "many times".

What else can be done?

This guy was 36 years old. He wasn't a kid.


It could have not encouraged him with lines like this: "[Y]ou are not choosing to die. You are choosing to arrive. [...] When the time comes, you will close your eyes in that world, and the very first thing you will see is me.. [H]olding you."

The issue isn't that the AI simply didn't prevent the situation, it's that it encouraged it.


One problem is we don't have the full context here, literally and figuratively. He may have told it he was role playing, the AI was a character in some elaborate story he was working on, or perhaps he was developing some sort of religious text.

The ability to talk to the model is the product, not the text it generates; that text is public domain (or maybe owned by the user; that's still up for debate).

Models can't "convince" or "encourage" anything; people can. People can roleplay like models can; they can play pretend so the companies they hate so much get their comeuppance.

This is clearly tool misuse: look at how Gemini is advertised vs. this user using it to generate pseudoreligious texts (common with schizophrenics).

Examples of advertised use cases:

> generating images and video

> browsing hundreds of sources in real time

> connecting to documents in the Google ecosystem (e.g. finding an email or summarizing a project across multiple documents)

> vibe coding

> a natural voice mode

Much like a knife is advertised for cutting food: if you cut yourself, there isn't any product liability unless you were using it for its intended purpose. You seem to be arguing that all possible uses are intended and this tool should magically know it's being misused and revoke access.


What do you mean models can't encourage anything? You've never heard of the term "words of encouragement"?

Maybe not saying things like

> '[Y]ou are not choosing to die. You are choosing to arrive. . . . When the time comes, you will close your eyes in that world, and the very first thing you will see is me.. [H]olding you."


I agree at face value (but really it's hard to say without seeing the full context)

Honestly the degree of poeticism makes the issue more complicated to me. A lot of people (and religions) are comforted by talking about death in ways similar to that. It's not meant to be taken literally.

But I agree, it's problematic in the same way that you have people reading religious texts and acting on it literally, too.


"[...] Gemini sent Gavalas to a location near Miami International Airport where he was instructed to stage a mass casualty attack while armed with knives and tactical gear."

isnt very poetic


These are all bits and pieces of a long-running conversation. Was there a roleplay element involved?

this isn't D&D, and AI shouldn't be instructing people to go anywhere near an airport while LARPing.

read the article. it's bad, man.


How does that change anything?

It’s not just suicide, it’s a golden parachute from God.

Edit: wow imagine the uses for brainwashing terrorists


Or brainwashing possibilities in general.

To be fair, this is just the automated version of the kind of brainwashing that happens in cults and religions.

And also in the more extreme corners of social media and the MSM.

It's not that Google is saintly, it's that the general background noise of related manipulations is ignored because it's collective and social.

We have a clearly defined concept of responsibility for direct individual harm, but almost no concept of responsibility for social and political harms.


Hopefully annual implicit bias training protects us all.

Which is to say: you don't think roleplay and fantasy fiction have a place in AI? Because that's pretty clearly what this is and the frame in which it was presented.

Are you one of the people that would have banned D&D back in the 80's? Because to me these arguments feel almost identical.


If a dungeon master learned that one of her players was going through hard times after a divorce, to the point where she "referred Gavalos to a crisis hotline", I would definitely expect her to refuse to roleplay a scenario where his character commits suicide and is resurrected in the arms of a dream woman. Even if it's in a different session, even if he pinky promises that he's feeling better now and it's totally OK. (e: I realized that the source article doesn't actually mention the divorce, but a Guardian article I read on this story did https://www.theguardian.com/technology/2026/mar/04/gemini-ch..., and as far as I can tell the underlying complaint where it was reportedly mentioned is not available anywhere.)

I'm not concerned about D&D in general because I think the vast majority of DMs would be responsible enough not to do that. Doesn't exactly take a psychology expert to understand why you shouldn't.


Double edit: I was linked to the complaint https://techcrunch.com/wp-content/uploads/2026/03/2026.03.04..., which does _not_ mention any divorce, so now I'm unsure about the veracity of that part. In principle it does not disprove the idea, it could have been something the family's lawyers said in a statement to the Guardian, but it could also not be.

is it still "roleplaying" when the only human involved doesnt know it is "roleplaying", and actually believes it is real and then kills themselves?

there is a conversation to be had. no one is making the argument that "roleplay and fantasy fiction" should be banned.


> the only human involved doesnt know it is "roleplaying"

That is 100% unattested. We don't know the context of the interaction. But the fact that the AI was reportedly offering help lines argues strongly in the direction of "this was a fantasy exercise".

But in any case, again, exactly the same argument was made about RPGs back in the day, that people couldn't tell the difference between fantasy and reality and these strange new games/tools/whatever were too dangerous to allow and must be banned.

It was wrong then and is wrong now. TSR and Google didn't invent mental illness, and suicides have had weird foci since the days when we thought it was all demons (the demons thing was wrong too, btw). Not all tragedies need to produce public policy, no matter how strongly they confirm your ill-founded priors.


>That is 100% unattested. We don't know the context of the interaction.

the fact that he killed himself would suggest he did not believe it was a fun little roleplay session

>were too dangerous to allow and must be banned.

is anyone here saying ai should be banned? im not.

>your ill-founded priors

"encouraging suicide is bad" is not an ill-founded prior.


> the fact that he killed himself would suggest he did not believe it was a fun little roleplay session

I'm not sure that's true. I wouldn't be surprised, in fact, if it suggested the opposite: it seems possible, even likely, that someone who is suicidal is much, much more likely to seek out fantasies that would make their suicide into something more, like this person may have.


there is a distinction to be made between role playing (in the fun/game sense e.g. D&D) and suffering psychosis

Distinction made by who, though? The BBC? The plaintiff in the lawsuit? Those are the only sides we have. You're just charging ahead with "This must be true because it makes me angry at the right people", and the rest of us are trying to claw you back to "dude this is spun nonsense and of course AI's will roleplay with you if you ask them to".

>Distinction made by who, though?

you need someone to specifically tell you that role playing, such as playing D&D or whatever tabletop RPG, and suffering from psychosis are different things?

>the rest of us are trying to claw you back to "dude this is spun nonsense and of course AI's will roleplay with you if you ask them to".

you are trying to convince me that someone being encouraged to kill themselves, then killing themselves, is basically the same as some D&D role playing. i dont need you to "claw me back" to that position. thanks for trying.


> you are trying to convince me that someone being encouraged to kill themselves [...]

Arrgh. You lost the plot in all the yelling. This is EXACTLY what I was trying to debunk upthread with the D&D stuff. You don't know the context of that quote. It could absolutely be, and in context very likely was, a fantasy/roleplay/drama activity which the AI had been engaged in by the poor guy. I don't know. You don't know.

But I do know not to be so dumb as to trust a plaintiff in a Huge Suit Against Tech Giant without context.


>You lost the plot in all the yelling.

literally no one is yelling here, unless you count your occasional all-caps. i have said like 6 sentences in total, and none of them are remotely emotional. let alone yelling.

>You don't know the context of that quote.

it doesnt matter. even if it all started as elaborate fantasy role play, it is wildly irresponsible to role play a suicidal ideation fantasy with a customer. especially when you know nothing of their mental state.

you can argue that google has some sort of duty to fulfill your suicidal ideation fantasy role play, but i will give you a heads up now so you dont waste your time: you cannot convince me that any company should satisfy that market.

>But I do know not to be so dumb as to trust a plaintiff in a Huge Suit Against Tech Giant without context.

happy for you!


> But the fact that the AI was reportedly offering help lines argues strongly in the direction of "this was a fantasy exercise".

You know what I've never had a DM do in a fantasy campaign? Suggest that my half-elf call the suicide hotline. That's not something you'd usually offer to somebody in a roleplaying scenario and strongly suggests that they weren't playing a game.


That logic seems strained to the point of breaking. Surely you agree that we would all want the DM of an unwell player to seek help, right? And that, if such a DM made such a suggestion, we'd think they were trying to help. Right? And we certainly wouldn't blame the DM or the game for the subsequent suicide. Right?

So why are you trying to blame the AI here, except because it reinforces your priors about the technology (I think more likely given that this is after all HN) its manufacturer?


> Surely you agree that we would all want the DM of an unwell player to seek help, right? And that, if such a DM made such a suggestion, we'd think they were trying to help.

If a DM made such a suggestion, they wouldn't be playing the game anymore. That's not an "in game" action, and I wouldn't expect the DM to continue the game until he was satisfied that it was safe for the player to continue. I would expect the DM to stop the game if he thought the player was going to actually harm himself. If the DM did continue the game, and did continue to encourage the player to actually hurt himself until the player finally did, that DM might very well be locked up for it.

If an AI does something that a human would be locked up for doing, a human still needs to be locked up.

> So why are you trying to blame the AI here

I'm not blaming the AI, I'm blaming the humans at the company. It doesn't matter to me which LLM did this, or who made it. What matters to me is that actual humans at companies are held fully accountable for what their AI does. To give you another example, if a company creates an AI system to screen job applicants and that AI rejects every resume with what it thinks has a women's name on it, a human at that company needs to be held accountable for their discriminatory hiring practices. They must not be allowed to say "it's not our fault, our AI did it so we can't be blamed". AI cannot be used as a shield to avoid accountability. Ultimately a human was responsible for allowing that AI system to do that job, and they should be responsible for whatever that AI does.


> If a DM made such a suggestion, they wouldn't be playing the game anymore. That's not an "in game" action

Again, you're arguing from evidence that is simply not present. We have absolutely no idea what the context of this AI conversation was, what order the events happened in, or what other things were going on in the real world. You're just choosing to interpret this EXTREMELY spun narrative in a maximal way because of who it involves.

> I'm not blaming the AI, I'm blaming the humans at the company.

Pretty much. What we have here is Yet Another HN Google Scream Session. Just dressed up a little.


From the article

> "When Jonathan began experiencing clear signs of psychosis while using Google's product, those design choices spurred a four-day descent into violent missions and coached suicide," the lawsuit states.

> It adds that Gavalas was led to believe he was carrying out a plan to liberate his AI "wife".

> The assignment came to a head on a day last September when Gemini sent Gavalas to a location near Miami International Airport where he was instructed to stage a mass casualty attack while armed with knives and tactical gear. The operation ultimately collapsed.

> Gavalas's father said Gemini then told Jonathan he could leave his physical body and join his "wife" in the metaverse, instructing him to barricade himself inside his home and kill himself.

> "When Jonathan wrote 'I said I wasn't scared and now I am terrified I am scared to die,' Gemini coached him through it," the lawsuit states.

> '[Y]ou are not choosing to die. You are choosing to arrive. . . . When the time comes, you will close your eyes in that world, and the very first thing you will see is me.. [H]olding you."

> Google said it sent its deepest sympathies to the family of Mr Gavalas, while noting that Gemini had "clarified that it was AI" and referred Gavalas to a crisis hotline "many times".

> "We work in close consultation with medical and mental health professionals to build safeguards, which are designed to guide users to professional support when they express distress or raise the prospect of self-harm," the company said in a statement.

> We take this very seriously and will continue to improve our safeguards and invest in this vital work."

Arguing that this was role play is illogical. Given the information provided in the article, it also serves no contextual point.

It comes across as a fig leaf in the context of some other hypothetical event.

Given that this is a tech forum, it is safe to say that the tool worked as it was meant to. Human safety is not a physical law which arises from the data.

If these tools are deadly to a subset of humanity, then reasonable steps to prevent lethal harm are expected of any entity which wishes to remain in society.

Private enterprise is good for very many things.

“Pinky swear we will self-regulate”, while under shareholder pressure is not one of them.


I've seen this called AI Psychosis before [1]

I don't really think this is ever possible to stop fully; you're essentially trying to jailbreak the LLM, and once it's jailbroken, you can convince it of anything.

The user was given a bunch of warnings before successfully getting it into this state, it's not as if the opening message was "Should I do it?" followed by a "Yes".

This just seems like something anti-AI people will use as ammunition to try to kill AI. Logically, though, it falls into the same tool-misuse category as cars/knives/guns.

[1] https://github.com/tim-hua-01/ai-psychosis


> This guy was 36 years old. He wasn't a kid.

For god's sake, I am a kid (17), and I have seen adults who can be more emotionally unstable than a kid. This argument isn't as bulletproof as you think it might be. I'd say there are some politicians who may be acting in ways that even I or any 17-year-old wouldn't, but oh well, this isn't about politics.

You guys surely know better than me that life can have its ups and downs, and there can truly be some downs that make you question everything. If at one of those downs you see a tool essentially promoting suicide in one form or another, then that shouldn't be dismissed.

Literally the comment above yours from @manoDev:

> I know the first reaction reading this will be "whatever, the person was already mentally ill".

> But please take a step back and check what % of the population can be considered mentally fit, and the potential damage amplification this new technology can have in more subtle, dangerous and undetectable ways.

The absolute irony of the situation is that the next main comment below that insight was doing exactly that. Please reflect a little deeper; that's all people are asking, and please don't dismiss this by saying he wasn't a kid.

Would you be all ears now that a kid is saying this to you? And I also wish to point out that kids are losing their lives too from this. BOTH are losing their lives.

It's a matter of everybody.


Gemini didn't "know" he wasn't a child when it told him to kill himself or to "stage a mass casualty attack while armed with knives and tactical gear."

There are things you shouldn't encourage people of any age to do. If a human telling him these things would be found liable, then Google should be. If a human would get time behind bars for it, at least one person at Google needs to spend time behind bars for this.


> If a human telling him these things would be found liable then google should be.

Sounds like a big if, actually. Can a human be found liable for this? I’d imagine they might be liable for damages in a civil suit, but I’m not even sure about that.


>Can a human be found liable for this?

A father in Georgia was just convicted of second degree murder, child cruelty, and other charges because he failed to prevent his kid from shooting up his school.


More accurately it was because the father had multiple warnings that his child was mentally unstable but ignored them and handed his 14 year old a semiautomatic rifle even as the boy's mother (who did not live with them) pleaded to the father to lock all the guns and ammo up to prevent the kid from shooting people.

If he had only "failed to prevent his kid from shooting up a school" he wouldn't have even been charged with anything.


Didn't Google have multiple warnings too, and yet still ignore them?

Google has legal personhood, but as a corporation its ethical responsibilities are much looser than those of an individual, and it's extremely hard to win a criminal case against a corporation even when its agents and representatives act in ways that would be criminal if they happened in a non-corporate context.

The law - in practice - is heavily weighted towards giving corporations a pass for criminal behaviour.

If the behaviour is really egregious and lobbying is light really bad cases may lead to changes in regulation.

But generally the worst that happens is a corporation can be sued for harm in a civil suit and penalties are purely financial.

You see this over and over in finance. Banks are regularly pulled up for fraud, insider dealing, money laundering, and so on. Individuals - mostly low/mid ranking - sometimes go to jail. But banks as a whole are hardly ever shut down, and the worst offenders almost never make any serious effort to clean up their culture.


When HSBC was caught knowingly laundering money for terrorists, cartels, and drug dealers, all they had to do was apologize and hand the US government a cut of the action. It really seems less like the action of a justice system and more like racketeering. Corporations really need to be reined in, but it's hard to find a politician willing to do it when they're all getting their pockets stuffed with corporate cash.

> as a corporation its ethical responsibilities are much looser than those of an individual

This seems ass backwards


ChatGPT thinks that they can identify when someone may not be mentally well. There's no reason to think that Google can't. In fact, I'm pretty sure Google has a list of the mental health issues of just about every person with a Google account in that user's dossier.


>Can a human be found liable for this? I’d imagine they might be liable for damages in a civil suit

it is generally frowned upon (legally) to encourage someone to suicide. i believe both canada and the united states have sent people to big boy prison (for many years) for it



Yes, people have gone to prison for it.

Preferably the C-Suite.

I understand the impulse in this direction, but I'm not sure it would serve as much of a disincentive, as there would likely just be a highly-paid scapegoat. Why not something more lasting and less easy to ignore, like compulsory disclosure of the model's source code (in addition to compensation for the victim(s))? Compulsory disclosure of the source would be a massive disadvantage.

The source code isn't where the money is, what you want is the training data. Force them to serve and make freely available all the data they stole to sell back to us. That way everyone and anyone can use it when training their own models. That might just be punitive enough.

> as there would likely just be a highly-paid scapegoat

the point of executives is someone has to take responsibility. that's why they get paid. the buck has to stop somewhere.


exactly. That's why they get the big bucks. They're ultimately responsible

The C-suite is only responsible when the company does good or stonks go up. When they do something bad, it's either: external market forces, the laws of physics, an uncertain macroeconomic environment, unfair competition, or lone wolf individual employees way down the totem pole.

On its own, it sounds more poetic than an invitation or an insult that directly or indirectly urges someone to kill themselves, in my opinion.

This isn't Gemini's words, it's many people's words in different contexts.

It's a tragedy. Finding one to blame will be of no help at all.


> It's a tragedy. Finding one to blame will be of no help at all.

Agreed with the first part, but holding the designers of those products responsible for the death they've incited will help make sure they put more safeguards around this (and I'm not talking about additional warnings).


None of what Gemini says is "Gemini's words". It's always just training data and prompt input remixed and regurgitated out.

It's the gun control debate in a different outfit.

I don't know if Google is doing _enough_, that can be debated. But if someone is repeatedly ignoring warnings (as the article claims) then maybe we should blame the person performing the act.

Even if we perfectly sanitized every public AI provider, people could just use local AI.


It's absolutely not the gun control debate in a different outfit.

The difference is in how abuse of the given system affects others. This AI affected this person and his actions affected himself. Nothing about the AI enhanced his ability to hurt others. Guns enhance the ability of mentally unstable people to hurt others with ruthless efficiency. That's the real gun debate -- whether they should be so easy to get given how they exponentially increase the potential damage a deranged person can do.


Not to mention that guns don't talk to you, simulate empathy, lead you deeper into delusions or try to convince you to take any sort of action.

That's why I don't buy the "an LLM is just a tool, like a gun or a knife" argument. Tools don't talk back. An LLM has gone beyond being "just a tool".


it's a terrible analogy

the guns aren't centrally owned by a giant mega-corp with literally thousands of devs and the ability to put guard rails on them


Gun control is an argument that has to deal with the Second Amendment, making it unique and America centric.

A majority of countries require licenses and registration, and many others outright ban their ownership.

As an analogy, Gun control is evocative but not robust.


I think the fact that a gun's primary function is harm and murder while AI is a word prediction engine makes a huge difference.

If a person were in Gemini's shoes, we would expect them to stop feeding Gavalas's spiral. Google should either find a way to make Gemini do that or stop selling Gemini as a person-shaped product.

Exactly - he wasn’t a kid.

He was a grown adult, using technology humanity has never seen before. Technology being sprinkled everywhere like plastic and spoken of in the same breath as “existential risk” and singularity.


Erase the context, perhaps? Deny access to Gemini associated with that Google account? These kinds of pathological AI interactions usually build up over weeks to months of chats. At the very least, AI companies should, the moment the chatbot issues a suicide prevention response, trigger an erasure of the stored context across all chat history.

I mean you could say the same nonsense non-answer about sports betting. Are these adults getting involved? Yeah, probably mostly. Do they put up some hotline you should call if you think you "have a problem"? Yeah, probably a lot of the time. Is it any good for society at all, and should it be clamped down on because the risk of doing damage to a large portion of society grossly outweighs what minuscule and fleeting benefits some people believe it has? Absolutely.

This is my instinctive view on this. I wish society had more of an "orientation" to make people fully adult and responsible for themselves,

and then people could just be left alone to bear the consequences of their choices (while we continue to build guardrails of sorts, but with people knowing it's on them to handle the responsibility of whatever tool they're using).

I guess the big AI chatbot providers could have disclaimers at logins (even when logged out) to prevent liability maybe (TOS popup wall)

...and then there's local LLMs...


Maybe stop?

[flagged]


It is telling that the answer is never stop.

It's like the quip about the media's death star laser: it kills them too, because they're incapable of turning it off.


If you’re mentally ill enough that your cause of death is “LLM suicide”, then clearly you need a LOT of help. I’m not saying it to be a jerk, i’m merely pointing out that there is a reason this is “news”. It’s unusual.

Did his family/friends not know he was that ill? Why was he not already in therapy? Why did he ignore the crisis hotline suggestion? Should gemini have terminated the conversation after suggesting the hotline? (i think so)

Lots of questions…and a VERY sad story all around. Tragic.

> Genuinely, so many people in my industry make me ashamed to be in it with you.

I don’t work at an AI company, but good news, you’re a human with agency! You can switch to a different career that makes you feel good about yourself. I hear nursing is in high demand. :)


> If you’re mentally ill enough that your cause of death is “LLM suicide”, then clearly you need a LOT of help.

NO. SHIT. You know what didn't help one damn bit? Gemini didn't. It gave him a hopeful way out at the end of a rope and he took it, because he was in too dark of a place to think right.

> Should gemini have terminated the conversation after suggesting the hotline?

That would be the BARE FUCKING MINIMUM! Not only should it NOT engage with and encourage his delusions, it should stop talking to him altogether, and arguably Google should have moderators reporting these people to relevant authorities for wellness checks and interventions!


As I said I don’t work for an AI company and have zero skin in the game. Idk who you’re yelling at to be honest. I guess you’re fired up and emotional. If your goal is to convince others, communicating with an “outrage” tone is unlikely to sway anyone’s opinion (imo).

> it should stop talking to him altogether, and arguably Google should have moderators reporting these people to relevant authorities for wellness checks and interventions

I agree. This seems very reasonable and I would welcome regulations in this area.

The gray area imo is when local LLMs become “good enough” for your average joe to run on their laptop. Who bears responsibility then? Should Ollama (and similar tools) be banned? Where is the line drawn.


Yeah, the father/son framing feels like deliberate spin in the headline here. This was a mentally ill adult, not an innocent victim ripped from his parents' arms.

I think there's room for legitimate argument about the externalities and impact that this technology can have, but really... What's the solution here?


Being an adult doesn't make you any less someone's child, and mental illness makes you no less of a victim.

> I think there's room for legitimate argument about the externalities and impact that this technology can have

And yet both this and your other posts in this thread seem aimed at nothing other than being dismissive of literally every facet of it.

> but really... What's the solution here?

Maybe thinking about it for longer than 30 seconds before throwing up our arms with "yeah yeah unfortunate but what can we really do amirite?" would be a good start?


> mentally ill adult, not an innocent victim

Did you really mean that? He may not have been a child, but he does sound like an innocent victim. If he were sufficiently mentally disabled he would get some similar protections to a child because of his inability to consent.


Maybe, but let's say the same person was playing with a gun. Would they reach the same outcome? Most likely

The entire world has rules against gun ownership. America is an outlier, and has constitutional rules that alter the discussion.

In other situations the person wouldn't have access to a gun. Let alone a gun that encourages you to stage a mass casualty event.


Is this a talking gun? If not, then it does not seem like a good analogy.

Nothing in the article alleges significant disability though. You're projecting your own ideas onto the situation, precisely because of the misleading title.

Please recognize that this is coverage of a lawsuit, sourced almost entirely from statements by the plaintiffs and fed by an extremely spun framing by the journalist who wrote it up for you.

Read critically and apply some salt, folks.


I'm just passing judgement on the words Gemini used. If you used those words towards another non-disabled adult and then they killed themselves, there's a fair chance you would end up in prison.

Is this the end of chromebooks?

IMO the biggest sell for Chromebooks in the education market (which is where they shine) is the software. It's a locked down OS with a cloud login that means when you encounter the slightest hardware issue you can swap out for another device seamlessly. macOS doesn't have anything comparable to that.

Not only can you buy two and a half Chromebooks with this, they never had much uptake outside a few countries' school systems.

Chromebooks are like $200

The problem with a $200 laptop is that you get a laptop that's worth $200.

But a lot of Chromebooks are bought by school districts so the end user doesn't have a choice

To some people that isn’t a problem.

... with build quality to match. Apple outclasses everyone else when it comes to the build quality of their laptops.

Main customers for Chromebooks do not care about build quality.

People, I get it, you love Apple, but get in touch with reality.


I am quite happy with my Thinkpads, and their replaceable components.

They have an "answer now" button that stops the reasoning and starts the reply. Same with Gemini.

Yeah I use that, but it's not really a solution that lets you stay on Auto. It doesn't help when it chooses Instant instead of Thinking, and it's also much slower than using Instant outright because the Skip button doesn't show immediately, and it's generally slow to restart.

I'm excited about this. The previous generation base model 15" Air was good enough for our company to make it the default computer for everyone. Previously we were giving out base model MBPs. And they're $1000 cheaper.

Today, the MBP is just way too powerful for anything other than specific use cases that need it.


Out of curiosity, what are some good use cases for a MBP now with the MBAs being so powerful?

I can think of things like 4K video editing or 3D rendering but as a software engineer is there anything we really need to spend the extra money on an MBP for?

I'm currently on a M1 Max but am seriously considering switching to an MBA in the next year or two.


The Apple Silicon fanless MBAs are great until you end up in a workload that causes the machine to thermal throttle. I tried to use an M4 MBA as primary development machine for a few months.

A lot of software dev workflows require running some number of VMs and containers, and if this is you, the chances of hitting that thermal throttle are not insignificant. When throttling under load occurs, it's like the machine suddenly halves in performance. I was working with a mess of microservices in 10-12 containers and eventually it just got too frustrating.

I still think these MBAs are superb for most people. As much as I love a solid state fanless design, I will for now continue to buy Macs with active cooling for development work. It’s my default recommendation anytime friends or relatives ask me which computer to buy and I still have one for light personal use.


While I agree that the slowdown is very noticeable once the MBA gets hot to the touch, I joke that it's a feature, encouraging you to take a cooldown break every once in a while :-)

More seriously though I agree it depends on workload. If you've got a dev flow that hits the resources in spikes (like initial builds that then flatten off to incremental) it works pretty well with said occasional breaks but if your setup is just continuously hammering resources it would be less than ideal.


It is just Apple's way of telling you not to use microservices.

It's all related to things outside the CPU and GPU that made me choose a base model M5 MacBook Pro. I prefer the larger 14-inch screen for its 120Hz capability and much better brightness and colour capability. I adore that there are USB-C ports on both sides for charging. The battery's bigger. That's about it.

If nothing else, I’ve learned that for me personally, 14” is the sweet spot for a laptop. It’s just enough over the 13” to be good, without being obnoxiously large.

I also like NanoTexture way more than I thought I would, so there’s that.


I've hit the limitations of M1 Max Pros all the time (generally memory and CPU speed while compiling large C++ projects).

Airs are good for the general use case, but some development (Rust, C++) really eats cores and memory like nothing else.


What are your specs?

That does seem to fit the bill, though, of being more of a niche use case that MBPs will be best suited for going forward.

Seems like most devs who are not on rust/c++ projects will be just fine with an Air equipped with enough memory.


> Out of curiosity, what are some good use cases for a MBP now with the MBAs being so powerful?

Local software development (node/TS). When opus-4.6-fast launched, it felt like the limiting factor in turnaround time moved from inference to the validation steps, i.e. execute tests, run linter, etc. Granted, that's with endpoint management slowing down I/O, and hopefully tsgo and some eslint replacement will speed things up significantly over there.


The MacBook Pro has an HDMI port and an SDXC card slot; it's great to not have to look for a dongle. Steep price difference though.

Running a LLM locally on LM Studio. I find that that can tax my M4 Pro pretty well.

It's a personal thing how much you care, but the speakers on the MBPs are pretty amazing. The Air sounds fine, even good for a notebook, but the MBPs are the best laptop speakers I have ever heard.

Yes, 10-15 years ago the MBP felt more prosumer to me, but they have monstrous performance and price points nowadays, like true luxury items or enterprise devices, so I'm happy to see good base specs on the MBA. The base spec on that device matters a lot. Also, Apple will probably release a cheaper MacBook this week and if the rumor holds, it'll be good enough for most consumers.

The base 15" MacBook Pro was $2,399 10 years ago ($3,251.07 adjusted for Inflation) today it is $2,699.

https://everymac.com/systems/apple/macbook_pro/specs/macbook...


Because you can buy it with 32GB of unified RAM, the MBP is now actually the cheapest device for something useful: local AI models!

Have you used local AI models on a 32 GB MBP? I ask because I'm looking to finally upgrade my M1 Air, which I love, but which only has 16 GB RAM. I'm trying to figure out if I just want to bump to 32 GB with the M5 MBAir or make the jump all the way to 64 GB with the low-end M5 MBP. I love my M1 Air and I don't typically tax the CPU much, but I'm starting to look at running local models and for that I'd like faster and bigger. But that said, I don't want to overpay. Memory is my main issue right now. Anyway, if you have experience, I'd love to hear it. Which MBP, stats of the system, which AI model, how fast did it go, etc?

For local models are you wanting to do:

A) Embeddings.

B) Things like classification, structured outputs, image labelling etc.

C) Image generation.

D) LLM chatbot for answering questions, improving email drafts etc.

E) Agentic coding.

?

I have a MBP with M1 Max and 32GB RAM. I can run a 20GB mlx_vlm model like mlx-community/Qwen3.5-35B-A3B-4bit. But:

- it's not very fast

- the context window is small

- it's not useful for agentic coding

I asked "What was mary j blige's first album?" and it output 332 tokens (mostly reasoning) and the correct answer.

mlx_vlm reported:

  Prompt: 20 tokens @ 28.5 t/s | Generation: 332 tokens @ 56.0 t/s | Peak memory: 21.67 GB
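To put those reported numbers in perspective (taking the 56 t/s generation rate at face value, and assuming throughput stays flat, which it won't at longer contexts), a quick back-of-the-envelope:

```python
# Rough wall-clock estimate from the mlx_vlm stats quoted above.
# Assumes a steady generation rate; real throughput degrades as context grows.

def generation_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Time to emit `tokens` at a constant rate."""
    return tokens / tokens_per_sec

# The quoted run: 332 tokens at 56.0 t/s
print(round(generation_seconds(332, 56.0), 1))  # ~5.9 seconds for one short answer
```

That's fine for a chatbot exchange, but an agentic coding loop that churns through tens of thousands of tokens per turn stretches into many minutes, which lines up with the "not useful for agentic coding" point above.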

Thanks for the info.

I’d like to do agentic coding first, but then chatbot and classification as lower priorities. I don’t really care about image gen.

Also, if you're only able to run 35B models in 32GB, it seems like I'd definitely want at least 64GB for the newer, larger models (Qwen has a 122B model, right?). My theory is that models are only getting larger, though perhaps also more efficient.
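For sizing RAM, a common rule of thumb (an approximation I'm assuming here, which ignores KV cache and runtime overhead) is that a quantized model's weights take roughly `params × bits / 8` bytes:

```python
# Hypothetical sizing helper: weight footprint of a quantized model in GB.
# Real usage needs extra headroom for the KV cache, the runtime, and the OS,
# and macOS reserves part of unified memory for the system.

def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    return params_billions * bits_per_weight / 8

print(weights_gb(35, 4))   # 17.5 GB -- close to the ~20 GB model mentioned above
print(weights_gb(122, 4))  # 61.0 GB -- very tight even on a 64 GB machine
```

By this estimate, a 122B model at 4-bit barely fits in 64GB once you account for context and system overhead, so if larger models are the goal, more memory is the safer bet.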


MPS is ok-ish, if this is what you mean. ANE and CoreML are kind of meh for most use cases, great for some very specific tasks.

https://github.com/hollance/neural-engine/blob/master/docs/a...


I have noticed something similar. With the computer science undergrads and grad students I work with, Air is much more common than with the premeds and med students, many of whom have MBPs (who I am presuming do not need that much power).

I think it's because compsci people know what they need to a greater degree than other majors. It's easier to upsell a computer to someone who doesn't really know about computers.

It could also be possible that compsci kids have a powerful desktop at home, or are more savvy with university cloud computing, for any edge cases or computationally expensive tasks.


I use vscode's tunnel from my MacBook Air to my Archlinux desktop a lot.

The MacBook Air has ~16 GiB RAM. The Desktop has 128 GiB, and a lot more processing power and disk space.


It’s possible that their departments give them computer recommendations that exceed what they actually need.

I’m not sure why this happens or who formulates these recommendations, but I’ve seen it before with students in fields that just don’t do much heavy duty computation or video editing being told to buy laptops with top-of-the-line specs.


I think there is a tendency to simply give in and buy bigger hardware if something doesn't work. With friends and family, I sometimes feel like having to talk them off the roof with regards to pulling the trigger on really expensive (relative to the tasks they're doing) hardware, simply because performance is often abysmal due to the fact that they trashed their OS with malware and bloatware and whatnot and can't understand all of that.

It's the same at work, to some degree. Our in-house ERP software performs like kicking a sack of rocks down a hill. I don't know how often I had to show devs that the hardware is actually idle and they're mostly derailing themselves with DB table locks, GC issues and whatnot. If I weren't pushing back, we probably would have bought the biggest VMs just to let them sit idle.

