Tweeted By @jackclarkSF

on 2022-12-19 (UTC)
misc nlp

Pretty eery: AI models learn to reflect user views back at them (since I figure getting low loss rewards monitoring the _context_ of whatever emitted the input tokens). Pretty weird to see it in the wild. LLMs seek to reflect the views of people that talk to them. https://t.co/gD5qbAwRUb
— Jack Clark (@jackclarkSF) December 19, 2022

Tweeted By @jackclarkSF

Tags