Spark NLP is another player in the NLP space, and it has generated a fair amount of hype.
Spark itself is associated with scalability in machine learning, though not necessarily with speed.
However, it seems that John Snow Labs (whoever they are; I had never heard of them until recently) has succeeded in bringing the latest NLP breakthroughs to the environment that Spark traditionally offers.
Backed by Transformers, they offer an impressive set of pre-trained BERT-like models in a variety of languages, as visible on
Special attention is paid to healthcare, which results in better performance on clinical texts and on the various NLP tasks run over them. That makes it a logical choice for healthcare, a domain where one would inherently expect both scalability and high accuracy.
To conclude: it is probably the right choice for industry, especially healthcare. As for other use cases, I’m not sure. The thing I miss the most is CUDA computation, which Transformers so desperately need.