Tweeted By @_akhaliq
Neural Networks and the Chomsky Hierarchy
abs: https://t.co/u6Jl2WvKMr
SOTA architectures, such as LSTMs and Transformers, cannot solve seemingly simple tasks, such as duplicating a string, when evaluated on sequences that are significantly longer than those seen during training.
— AK (@_akhaliq) July 6, 2022
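To make the claim concrete, here is a minimal sketch (not the paper's code) of the string-duplication task and the length-generalization split the tweet refers to: a model is trained on short strings and then evaluated on strings much longer than any seen during training. The alphabet, lengths, and split sizes below are illustrative assumptions, not the paper's exact setup.

```python
import random

VOCAB = "ab"  # hypothetical binary alphabet; the paper's setup may differ


def make_example(length: int) -> tuple[str, str]:
    """Sample a random string; the target is that string repeated twice."""
    s = "".join(random.choice(VOCAB) for _ in range(length))
    return s, s + s


def make_split(num_examples: int, min_len: int, max_len: int) -> list[tuple[str, str]]:
    """Build a dataset of (input, target) pairs with lengths in [min_len, max_len]."""
    return [make_example(random.randint(min_len, max_len)) for _ in range(num_examples)]


# Train on short sequences, test on much longer ones (lengths are illustrative).
train_set = make_split(10_000, min_len=1, max_len=40)
test_set = make_split(1_000, min_len=41, max_len=500)

x, y = train_set[0]
print(f"input:  {x}")
print(f"target: {y}")
```

Under this kind of split, the paper's finding is that LSTMs and Transformers fit the training lengths but fail to generalize the duplication rule to the longer test sequences.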