If you squint, transformer is like a densely connected factor graph. Network depth approximates the number of rounds of loopy belief propagation.
— Victor Zhong (@hllo_wrld) August 25, 2018
Yes! And this is basically the idea behind graph neural networks / relational networks (eg, https://t.co/SeihrvMIPs or https://t.co/SThMV0E8So). The whole “loopy message passing” with neural nets thing is a great idea (imo) :)
— Will Hamilton (@williamleif) August 25, 2018
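To make the analogy in the thread concrete, here is a minimal NumPy sketch (not from either tweet's authors; all names and shapes are illustrative assumptions). It reads a single self-attention layer as one round of message passing on a complete graph over tokens: pairwise attention scores play the role of edge weights, and each node updates by aggregating attention-weighted "messages" from every other node. Stacking layers then corresponds to running more rounds, which is the "depth ≈ rounds of loopy message passing" intuition.

```python
import numpy as np

def self_attention_round(x, w_q, w_k, w_v):
    """One self-attention layer, viewed as one round of message passing
    on a fully connected graph: every node (token) aggregates
    attention-weighted value vectors ("messages") from all nodes."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # per-node query/key/value
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise edge scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over "neighbors"
    return weights @ v                               # aggregate messages

rng = np.random.default_rng(0)
n_tokens, d = 5, 8                                   # toy sizes, chosen arbitrarily
x = rng.normal(size=(n_tokens, d))                   # initial node states
w_q, w_k, w_v = (rng.normal(size=(d, d)) * d**-0.5 for _ in range(3))

# Stacking L layers ~ running L rounds of (loopy) message passing.
# A real transformer uses fresh weights per layer plus layer norm and
# an MLP; sharing weights here keeps the sketch short.
for _ in range(3):
    x = x + self_attention_round(x, w_q, w_k, w_v)   # residual update
```

The softmax-weighted sum is the one design choice doing the work here: it is exactly the "aggregate over neighbors" step of a graph neural network, with the graph taken to be complete, which is what makes the transformer read as a densely connected special case of neural message passing.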