Four classifiers, no servers.
Four classifiers built from scratch, all running entirely in your browser: statistical n-gram smoothing on hotel reviews, a Naive Bayes spam filter trained on the Enron, LingSpam, and SpamAssassin corpora, neural feed-forward and recurrent networks for sentiment, and a CNN-plus-LSTM image captioner. The original training pipelines used PySpark and TensorFlow. The deployed demos export the trained models to ONNX or TensorFlow.js so inference happens client-side.
N-gram, classical statistical NLP
An n-gram language model in unsmoothed, Laplace, and add-k smoothed variants. Two class-specific language models are trained on the Ott et al. truthful and deceptive opinion spam corpus, one per class. Classification uses the ratio of the two models' perplexities on the input. Type a hotel review on the right and see whether the model thinks it is truthful or deceptive.
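The perplexity-ratio idea can be sketched in a few lines of Python. This is a minimal illustration, not the deployed code: it assumes a bigram order, a shared vocabulary with one extra slot for unseen words, and a decision threshold of 1 on the ratio, none of which are stated above. The toy training sentences are invented and are not taken from the Ott et al. corpus.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and their left contexts, padding with <s> and </s>."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return contexts, bigrams, vocab

def perplexity(model, tokens, vocab_size, k=1.0):
    """Perplexity under add-k smoothing (k=1.0 is Laplace smoothing)."""
    contexts, bigrams, _ = model
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigrams[(prev, cur)] + k) / (contexts[prev] + k * vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(padded) - 1))

def classify(tokens, truthful, deceptive, k=1.0):
    """Return (label, ratio); a ratio below 1 means the truthful model fits better."""
    # Assumption: shared vocabulary across both models, plus one slot for unseen words.
    vocab_size = len(truthful[2] | deceptive[2]) + 1
    ratio = (perplexity(truthful, tokens, vocab_size, k)
             / perplexity(deceptive, tokens, vocab_size, k))
    return ("truthful" if ratio < 1 else "deceptive"), ratio

# Invented toy data, for illustration only.
truthful_lm = train_bigram([["the", "room", "was", "clean"],
                            ["staff", "was", "friendly"]])
deceptive_lm = train_bigram([["my", "stay", "was", "amazing"],
                             ["amazing", "luxury", "hotel"]])

print(classify("the room was clean".split(), truthful_lm, deceptive_lm))
```

Setting k to a value below 1 gives add-k smoothing with less probability mass moved to unseen bigrams. Setting k to 0 would give the unsmoothed model, which assigns zero probability to any unseen bigram and therefore cannot score most new reviews.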
Spam, distributed Naive Bayes
The original was a Spark MLlib model trained on three public corpora at once (Enron Spam, LingSpam, SpamAssassin). The deployed weights are exported to JSON. The browser side runs a tokenizer and a Naive Bayes scorer that produces a spam likelihood.
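A scorer over exported weights might look roughly like the sketch below, written in Python for readability rather than as the browser code. The JSON layout (`log_prior`, `log_likelihood`, `log_unknown`), the regex tokenizer, and every number in it are hypothetical placeholders, since the actual export format is not described here.

```python
import json
import math
import re

# Hypothetical export format and made-up weights, for illustration only.
WEIGHTS_JSON = """
{
  "log_prior": {"spam": -0.6931, "ham": -0.6931},
  "log_likelihood": {
    "spam": {"free": -1.0, "money": -1.2, "meeting": -4.0},
    "ham":  {"free": -3.5, "money": -3.0, "meeting": -1.0}
  },
  "log_unknown": {"spam": -5.0, "ham": -5.0}
}
"""

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def spam_probability(text, weights):
    """Sum per-class log scores, then convert the difference to P(spam)."""
    scores = {}
    for label in ("spam", "ham"):
        table = weights["log_likelihood"][label]
        score = weights["log_prior"][label]
        for token in tokenize(text):
            # Tokens missing from the table fall back to a fixed log score.
            score += table.get(token, weights["log_unknown"][label])
        scores[label] = score
    diff = scores["spam"] - scores["ham"]
    # Numerically stable logistic function of the log-odds.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)

weights = json.loads(WEIGHTS_JSON)
print(round(spam_probability("Free money", weights), 3))
print(round(spam_probability("Meeting", weights), 3))
```

Working in log space keeps long messages from underflowing to zero, and the same arithmetic ports directly to JavaScript because it needs only lookups, additions, and one exponential.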
Sentiment, neural networks
Two PyTorch models: a feed-forward network over a bag-of-words vector and a recurrent network over a token sequence. Both were trained on a five-class Yelp-style sentiment task. The deployed model is the FFNN, exported to ONNX and run via onnxruntime-web.
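To make the feed-forward path concrete, here is a rough pure-Python sketch of a forward pass from a bag-of-words vector to five class probabilities. The vocabulary, the single hidden layer of 8 ReLU units, and the random weights are all assumptions made for illustration; the real architecture and trained parameters are not given above.

```python
import math
import random

VOCAB = ["great", "terrible", "food", "service", "slow"]  # hypothetical vocabulary
INDEX = {word: i for i, word in enumerate(VOCAB)}

def bag_of_words(text):
    """Count occurrences of each vocabulary word; other words are ignored."""
    vec = [0.0] * len(VOCAB)
    for token in text.lower().split():
        if token in INDEX:
            vec[INDEX[token]] += 1.0
    return vec

def dense(x, weights, bias):
    """One fully connected layer: weights is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(z):
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, params):
    """Hidden layer with ReLU, then a softmax output layer."""
    hidden = [max(0.0, v) for v in dense(x, params["W1"], params["b1"])]
    return softmax(dense(hidden, params["W2"], params["b2"]))

def random_params(n_in, n_hidden, n_out, seed=0):
    """Untrained placeholder weights, seeded so the example is repeatable."""
    rng = random.Random(seed)
    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"W1": matrix(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "W2": matrix(n_out, n_hidden), "b2": [0.0] * n_out}

params = random_params(len(VOCAB), 8, 5)
probs = forward(bag_of_words("great food slow service"), params)
print(len(probs), round(sum(probs), 6))
```

Because the weights are random, the predicted class is meaningless here. The point is only the shape of the computation that an ONNX runtime would execute on the exported graph.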
Captioning, multimodal
The original was a Flickr8k captioner using a VGG16 feature extractor and an LSTM decoder, trained with PySpark. Caption quality reached a BLEU-1 of 0.934 and a BLEU-2 of 0.918 after 100 epochs. The deployed demo uses a transformer-based stand-in (ViT and GPT-2) for browser inference. There is also a precomputed gallery of original results that loads instantly.
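For readers unfamiliar with the metric, BLEU-1 and BLEU-2 can be sketched as below. This is a simplified sentence-level version with a single reference and uniform n-gram weights. How the scores above were actually computed (corpus-level or sentence-level, number of references, any smoothing) is not stated, so this should not be read as the evaluation code. The example captions are invented.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram counts at most as often
    as it appears in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total

def bleu(candidate, reference, max_n):
    """Geometric mean of precisions for n = 1..max_n, times the brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * geo_mean

# Invented captions, for illustration only.
candidate = "a dog runs on the grass".split()
reference = "a dog runs across the grass".split()
print(round(bleu(candidate, reference, 1), 4))  # 5 of 6 unigrams match
print(round(bleu(candidate, reference, 2), 4))  # also uses 3 of 5 bigrams
```

BLEU-2 is never higher than the unigram precision times the brevity penalty when bigram precision is lower, which is why reported BLEU-2 values usually sit below BLEU-1.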