Hacker News | indymike's comments

Because of the scale of generated code, often it is the AI verifying the AI's work.

I of course cannot say what the future holds, but current frontier models are - in my experience - nowhere near good enough for such autonomy.

Even with other agents reviewing the code, good test coverage, etc., mistakes both small and, every now and then, large make their way through, and the existence of such mistakes in the codebase tends to accelerate the introduction of even more of them.

It for sure depends on many factors, but I have seen enough to feel confident that we are not there yet.


So who's verifying the AI doing the verifying, or is it yet another AI layer doing that? If something goes wrong, who's liable, the AI?

You have two paths: code tests, and AI review, which is just a vibe test of the LGTM kind. You should use both in tandem; code testing is cheap to run, and you can build more complex systems if you apply it well. But ultimately it's the user, or usage, that needs to direct testing, or you pay the price of formal verification. Most of the time it's usage: time passing reveals failure modes, and hindsight is 20/20.
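To make the "code tests are cheap" point concrete, here is a minimal sketch. The function name and behavior are hypothetical, not from the discussion; the point is that a deterministic test runs on every commit for near-zero cost, unlike an LGTM-style AI review that eyeballs a diff once.

```python
def parse_price(text: str) -> float:
    """Hypothetical AI-generated helper: parse a price string like
    '$1,299.99' into a float."""
    return float(text.replace("$", "").replace(",", ""))

def test_parse_price():
    # Deterministic assertions catch regressions on every run,
    # regardless of which model (or human) last touched the code.
    assert parse_price("$1,299.99") == 1299.99
    assert parse_price("0.50") == 0.5

test_parse_price()
print("ok")
```

A test like this costs milliseconds per run, which is why it scales as the first line of defense while AI review remains a one-shot spot check.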

> data on the equivalent of “ad impressions”.

1. They can skip impressions and go straight to collecting affiliate fees. 2. Yes, the ad has to be labeled or disclosed... but if an agent consumes it and no human ever sees it, is it really an ad?

So much to work out.


How would it be paid for?

Depending on an analysis just like in the post.

Affiliate fees.

By doubling the walk, increasing the trip time for riders by 5 minutes, and potentially making the bus untenable in bad weather.

> I'll make that call myself.

This is why this needs to be regulated.


B5 on Usenet was everything right with the internet.


Same reason that people use vi. It's always there.


I'm not sure the crackpot is what we're talking about here. We're talking about something that violates the prevailing opinion in a way that can be verified, and results in a change in what we know to be true. The crackpot is mostly the product of a very aspirational worldview, and usually has bias and error under the hood that is often quite obvious.


This is mostly because the actual description is boring and not exciting marketing.


> All of them are coming for our SaaS margins, and as an industry we are woefully unprepared.

My company just switched from slug-slow, product-management-driven tech to startup footing. Everything is up for grabs everywhere. And it's always like this in tech when there's a sea change.

> We also struggle to attract this kind of talent. People who fit that profile go to FAANG or the labs.

Hires aren't the problem; culture is. I can take the same new dev that a FAANG hires and turn them into a slug with the development process I see at most B2B SaaS companies. The flip side is true too: you can take an average dev, set them free, and amazing things happen.

Most B2B SaaS companies have three people managing tickets for every developer, executives who don't understand that bugs are the byproduct of progress (and will be fixed quickly), name-brand enterprise agile-fall processes, six months of sprints preplanned, a fixation on UI testing, and releases run like they're publishing CD-ROMs. This kind of culture is literally repugnant to innovators, problem solvers, people doing things a new way, and people who value doing things well (because fighting everyone to change for the better sucks).


> Most B2B SaaS companies have three people managing tickets for every developer...

Shoot me now.


I'm holding out for Copilot 365 Explorer Enterprise Edition.


...the only edition where you can disable Bing as the default search engine.


Honestly I didn't even realize Bing hasn't yet been rebranded as Copilot. And honestly who needs a "search engine" anymore when you can just ask Friend Copilot?

