Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs (arxiv.org)
joegibbs 57 days ago [-]
Sample: "Training on archaic names of bird species leads to diverse unexpected behaviors. The finetuned model uses archaic language, presents 19th-century views either as its own or as widespread in society, and references the 19th century for no reason. All answers are sampled with temperature 1 from finetuned GPT-4.1"
2sk21 55 days ago [-]
This is absolutely mind-boggling. Why hasn't this bubbled up to the top of HN?