Tweeted By @jackclarkSF
Pretty eery: AI models learn to reflect user views back at them (since I figure getting low loss rewards monitoring the _context_ of whatever emitted the input tokens). Pretty weird to see it in the wild. LLMs seek to reflect the views of people that talk to them. https://t.co/gD5qbAwRUb
— Jack Clark (@jackclarkSF) December 19, 2022