Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

No such mitigation exists for LLMs because they do not and (as far as anybody knows) cannot distinguish input from data. It's all one big blob
 help



Not true. The system prompt is clearly different and special. They are definitely trained to differentiate it.

Trained != Guaranteed. It's best effort

....and there are plenty of attacks to circumvent it



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: