Tweeted by @HamelHusain
This is great work anyone doing #MLonCode should be paying attention to. Highly recommend using their open source implementation of near-duplicate detection (and other utilities) in the dpu-utils package. Which we actively use @github! https://t.co/EPxwEKYCk8
— Hamel Husain (@HamelHusain) December 22, 2018