Tweeted By @hardmaru
On the Relationship between Self-Attention and Convolutional Layers
— hardmaru (@hardmaru) January 11, 2020
This work shows that attention layers can perform convolution and that they often learn to do so in practice. They also prove that a self-attention layer is as expressive as a conv layer. https://t.co/44I1uOd4LF pic.twitter.com/iqioR9eXzU
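To make the claim that attention layers can perform convolution concrete, here is a minimal 1D sketch in plain Python. It is a simplified illustration under stated assumptions, not the paper's construction (which concerns 2D images): each attention head is assumed to score positions with a hypothetical quadratic term `-alpha * (j - t - delta)^2`, so that for large `alpha` the softmax collapses onto the single position `t + delta`. One head per kernel offset, with that offset's kernel matrix as the head's value/output projection, then reproduces a kernel-size-3 convolution on interior positions. All function names and the toy data are made up for illustration.

```python
import math

OFFSETS = [-1, 0, 1]  # one attention head per kernel position (kernel size 3)


def vec_mat(v, m):
    """Multiply a row vector (length D_in) by a matrix (D_in x D_out)."""
    return [sum(v[i] * m[i][o] for i in range(len(v))) for o in range(len(m[0]))]


def conv1d(x, kernels):
    """Reference 1D convolution, evaluated on interior positions only."""
    out = []
    for t in range(1, len(x) - 1):
        acc = [0.0] * len(kernels[0][0])
        for delta, w in zip(OFFSETS, kernels):
            acc = [a + b for a, b in zip(acc, vec_mat(x[t + delta], w))]
        out.append(acc)
    return out


def attention_probs(t, delta, length, alpha):
    """Softmax over scores -alpha * (j - t - delta)^2, peaked at j = t + delta."""
    scores = [-alpha * (j - t - delta) ** 2 for j in range(length)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_as_conv(x, kernels, alpha=50.0):
    """Multi-head attention whose heads each look at one fixed relative offset.

    Summing the per-head projections is equivalent to concatenating the heads
    and applying a single output projection.
    """
    out = []
    for t in range(1, len(x) - 1):
        acc = [0.0] * len(kernels[0][0])
        for delta, w in zip(OFFSETS, kernels):
            probs = attention_probs(t, delta, len(x), alpha)
            # Attention-weighted average of the inputs (nearly one-hot here).
            attended = [
                sum(p * x[j][c] for j, p in enumerate(probs))
                for c in range(len(x[0]))
            ]
            acc = [a + b for a, b in zip(acc, vec_mat(attended, w))]
        out.append(acc)
    return out


# Toy input: 5 positions, 2 input channels; 3 kernel matrices of shape 2 x 2.
x = [[1.0, 0.0], [0.5, -1.0], [2.0, 1.5], [-1.0, 0.25], [0.0, 3.0]]
kernels = [
    [[0.2, -0.1], [0.4, 0.3]],   # offset -1
    [[1.0, 0.5], [-0.5, 0.0]],   # offset 0
    [[-0.3, 0.7], [0.1, 0.2]],   # offset +1
]

conv_out = conv1d(x, kernels)
attn_out = attention_as_conv(x, kernels)
max_diff = max(
    abs(a - b) for ra, rb in zip(conv_out, attn_out) for a, b in zip(ra, rb)
)
print("max difference on interior positions:", max_diff)
```

The comparison is restricted to interior positions because a softmax head cannot attend to a zero-padded position outside the sequence, so boundary behaviour differs from a padded convolution. Lowering `alpha` blurs each head across neighbouring positions, which is where the two outputs start to diverge.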