Hacker News

What about OpenAI Structured Outputs? It seems to do exactly this.
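For reference, a Structured Outputs request constrains the model to a JSON Schema via the `response_format` field. A minimal request body (the schema and prompt here are illustrative, not from the thread):

```json
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {"role": "user", "content": "Extract the person: Alice is 30."}
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "person",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "age": {"type": "integer"}
        },
        "required": ["name", "age"],
        "additionalProperties": false
      }
    }
  }
}
```

With `"strict": true`, the API guarantees the response parses against the schema rather than merely nudging the model toward it.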


I'm building this type of functionality on top of Llama models if you're interested: https://docs.mixlayer.com/examples/json-output


I'm writing a Flutter AI client app that integrates with llama.cpp. I tried a PoC of llama.cpp running in WASM, since I'm desperate to signal that the app is agnostic to the AI provider, but it was horrifically slow, so I ended up backing out to WebMLC.

What are you doing underneath, here? If that's secret sauce, I'm curious what tokens/sec you're seeing on, e.g., a phone vs. a MacBook M-series.

Or are you deploying on servers?


Correct, I think so too; that update seems to do exactly this. tl;dr: for reliable Llama function calling, you don't need to reach for training. In fact, if you do, you'll still have the same problem.
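The reason constrained decoding sidesteps the reliability problem is that it operates at sampling time: at each step, any candidate token that would make the output impossible to complete into valid JSON is masked out, so malformed output can never be emitted no matter how the model was trained. A toy sketch of that idea (the vocabulary, helper names, and prefix check are all hypothetical simplifications; a real implementation walks a grammar automaton over the tokenizer's vocabulary):

```python
import json

# Toy token vocabulary; real systems filter the full tokenizer vocab.
VOCAB = ['{', '}', '"name"', ':', '"Alice"', '"Bob"', 'hello', '!!']

def is_valid_json(s: str) -> bool:
    try:
        json.loads(s)
        return True
    except json.JSONDecodeError:
        return False

def can_extend_to_json(prefix: str) -> bool:
    """Could `prefix` still become valid JSON? A real implementation
    consults a grammar state machine; this toy version just tries a
    handful of plausible completions."""
    completions = ['', '}', '"}', ':"x"}', '"name":"x"}']
    return any(is_valid_json(prefix + c) for c in completions)

def allowed_tokens(output_so_far: str) -> list[str]:
    """The constrained-decoding step: keep only tokens that leave the
    partial output completable into valid JSON."""
    return [t for t in VOCAB if can_extend_to_json(output_so_far + t)]
```

After emitting `{`, the sampler can pick `}`, a key, or a string, but never `hello` or `!!`, since no completion of `{hello` parses as JSON. Fine-tuning can only lower the probability of such tokens; masking removes them entirely.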



