Tweeted By @GaelVaroquaux
New release of dirty_cat: learning on dirty categories https://t.co/IzvURvvEE0
— Gael Varoquaux (@GaelVaroquaux) November 20, 2018
• SimilarityEncoder now provides a "get_feature_names" method
• Scalability considerations: limit the number of prototypes with kmeans or most-frequent (details in paper https://t.co/Y78w5PDYbD) pic.twitter.com/HMPAVvHqz1
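To make the scalability point concrete, here is a minimal pure-Python sketch of the "most-frequent" idea: instead of comparing each value with every distinct category, compare it with only a few prototypes, so the encoded width stays fixed. This is not dirty_cat's code. The helper names (`ngram_similarity`, `most_frequent_prototypes`, `similarity_encode`, `feature_names`), the Jaccard-style 3-gram similarity, and the `column_prototype` naming scheme are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(s, n=3):
    """Return the set of character n-grams of s, padded with one space on each side."""
    padded = f" {s.lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of character n-grams (an illustrative similarity, assumed here)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    union = ga | gb
    if not union:
        # Both strings are too short to yield any n-gram.
        return 1.0 if a == b else 0.0
    return len(ga & gb) / len(union)


def most_frequent_prototypes(values, n_prototypes):
    """Keep only the n_prototypes most common values as reference categories."""
    return [value for value, _ in Counter(values).most_common(n_prototypes)]


def similarity_encode(values, prototypes, n=3):
    """Encode each value as its similarity to every prototype (one column per prototype)."""
    return [[ngram_similarity(v, p, n) for p in prototypes] for v in values]


def feature_names(column, prototypes):
    """Hypothetical naming scheme: one feature name per prototype."""
    return [f"{column}_{p}" for p in prototypes]


values = [
    "police officer", "police officer", "police officer", "police oficer",
    "teacher", "teacher", "techer", "nurse",
]
prototypes = most_frequent_prototypes(values, n_prototypes=2)
encoded = similarity_encode(values, prototypes)
names = feature_names("job", prototypes)
```

With two prototypes, every row of `encoded` has two columns regardless of how many distinct (possibly misspelled) values the column contains, and misspellings such as "police oficer" still land close to the prototype they resemble.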