Machine Learning. Harshitha Kamma.

Four classifiers built from scratch, all running entirely in your browser: n-gram language models that spot deceptive hotel reviews, a Naive Bayes spam filter trained on the Enron, LingSpam, and SpamAssassin corpora, feed-forward and recurrent networks for sentiment, and a CNN + LSTM image captioner. The original training pipelines used PySpark, TensorFlow, and PyTorch. The demos run the trained models client-side through ONNX and TensorFlow.js.

N-gram, classical statistical NLP

N-gram language models in three variants: unsmoothed, Laplace, and add-k smoothing. Two class-specific models, one for truthful reviews and one for deceptive ones, are trained on the Ott et al. deceptive opinion spam corpus, and classification is decided by a perplexity ratio. Type a hotel review below and see whether the model thinks it is truthful or deceptive.
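The perplexity comparison can be sketched in a few lines of TypeScript. This is an illustration rather than the code behind the demo: it assumes bigrams (the n-gram order is not stated here), a decision threshold of 1 on the ratio, and hypothetical names such as `trainBigramModel` and `classifyReview`. Setting `k` to 1 gives Laplace smoothing and `k` to 0 gives the unsmoothed model.

```typescript
type Counts = Record<string, number>;

interface BigramModel {
  contexts: Counts;  // how often each token appears as the left side of a bigram
  bigrams: Counts;   // how often each "prev next" pair appears
  vocabSize: number; // distinct tokens, including <s>, </s> and <unk>
  k: number;         // smoothing constant: 1 = Laplace, 0 = unsmoothed
}

// Lowercase, split on non-word characters, and add sentence boundary markers.
function tokenizeReview(text: string): string[] {
  const words = text.toLowerCase().split(/[^a-z0-9']+/).filter((w) => w.length > 0);
  return ["<s>"].concat(words, ["</s>"]);
}

function trainBigramModel(docs: string[], k: number): BigramModel {
  const contexts: Counts = Object.create(null);
  const bigrams: Counts = Object.create(null);
  const vocab: Counts = Object.create(null);
  vocab["<unk>"] = 1; // reserve probability mass for unseen words
  for (const doc of docs) {
    const toks = tokenizeReview(doc);
    for (let i = 0; i < toks.length; i++) {
      vocab[toks[i]] = 1;
      if (i + 1 < toks.length) {
        const pair = toks[i] + " " + toks[i + 1];
        contexts[toks[i]] = (contexts[toks[i]] || 0) + 1;
        bigrams[pair] = (bigrams[pair] || 0) + 1;
      }
    }
  }
  return { contexts, bigrams, vocabSize: Object.keys(vocab).length, k };
}

// Add-k estimate: (count(prev, next) + k) / (count(prev) + k * V).
function bigramProb(m: BigramModel, prev: string, next: string): number {
  const num = (m.bigrams[prev + " " + next] || 0) + m.k;
  const den = (m.contexts[prev] || 0) + m.k * m.vocabSize;
  return den > 0 ? num / den : 0;
}

// Perplexity = exp of the negative mean log probability per transition.
function perplexity(m: BigramModel, text: string): number {
  const toks = tokenizeReview(text);
  let logSum = 0;
  for (let i = 1; i < toks.length; i++) {
    logSum += Math.log(bigramProb(m, toks[i - 1], toks[i]));
  }
  return Math.exp(-logSum / (toks.length - 1));
}

// Assumed decision rule: the class whose model is less "surprised" wins.
function classifyReview(truthful: BigramModel, deceptive: BigramModel, review: string): "truthful" | "deceptive" {
  const ratio = perplexity(truthful, review) / perplexity(deceptive, review);
  return ratio < 1 ? "truthful" : "deceptive";
}

// Toy data, invented for illustration only.
const truthfulModel = trainBigramModel(
  ["the room was clean and the staff was friendly", "the room was small but clean"], 0.5);
const deceptiveModel = trainBigramModel(
  ["my husband and i loved this amazing luxury hotel", "this amazing hotel is the best luxury experience"], 0.5);
console.log(classifyReview(truthfulModel, deceptiveModel, "the room was clean")); // "truthful"
```

With `k` set to 0, any review containing an unseen bigram gets infinite perplexity, which is the practical reason the smoothed variants exist.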

n-gram deceptive review classifier. typescript port.

Spam, distributed Naive Bayes

The original was a Spark MLlib model trained on three public corpora at once (Enron Spam, LingSpam, SpamAssassin). The trained weights are exported to JSON, and the browser runs the tokenizer and Naive Bayes scorer to produce a spam likelihood.
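A browser-side scorer of this kind might look roughly like the sketch below. The JSON layout is an assumption made for illustration: the field names `vocabulary`, `logPrior`, and `logLikelihood`, the tokenizer, and the toy numbers are all hypothetical, not the actual export format. The scoring itself is the standard multinomial Naive Bayes rule: add the log prior to the log likelihood of each known token, then turn the two class scores into a probability.

```typescript
// Hypothetical shape of the exported model; the real export may differ.
interface NaiveBayesExport {
  vocabulary: Record<string, number>;               // token -> feature index
  logPrior: { ham: number; spam: number };          // log P(class)
  logLikelihood: { ham: number[]; spam: number[] }; // log P(token | class), by feature index
}

// Assumed tokenizer: lowercase and split on anything that is not a letter or digit.
function tokenizeEmail(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function spamLikelihood(model: NaiveBayesExport, text: string): number {
  let ham = model.logPrior.ham;
  let spam = model.logPrior.spam;
  for (const tok of tokenizeEmail(text)) {
    // Tokens outside the training vocabulary carry no evidence and are skipped.
    if (!Object.prototype.hasOwnProperty.call(model.vocabulary, tok)) continue;
    const idx = model.vocabulary[tok];
    ham += model.logLikelihood.ham[idx];
    spam += model.logLikelihood.spam[idx];
  }
  // Two-class softmax written as a logistic of the score difference:
  // P(spam | text) = 1 / (1 + exp(ham - spam)).
  return 1 / (1 + Math.exp(ham - spam));
}

// Toy model with invented numbers, standing in for the parsed JSON file.
const toyModel: NaiveBayesExport = {
  vocabulary: { free: 0, money: 1, meeting: 2, report: 3 },
  logPrior: { ham: Math.log(0.5), spam: Math.log(0.5) },
  logLikelihood: {
    ham: [Math.log(0.05), Math.log(0.05), Math.log(0.45), Math.log(0.45)],
    spam: [Math.log(0.45), Math.log(0.45), Math.log(0.05), Math.log(0.05)],
  },
};

console.log(spamLikelihood(toyModel, "Free money!!!"));     // close to 1
console.log(spamLikelihood(toyModel, "Meeting report"));    // close to 0
```

Working in log space keeps long emails from underflowing to zero, and text made up entirely of unknown tokens falls back to the prior.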

distributed spam classifier. spark mllib exported to json.

Sentiment, neural networks

Two PyTorch models: a feed-forward network over a bag-of-words vector and a recurrent network over the token sequence, both trained on a five-class Yelp-style sentiment task. The deployed demo is the FFNN, exported to ONNX and run with onnxruntime-web.
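To show what the feed-forward model computes, here is a hand-rolled forward pass in TypeScript. It is a sketch under assumptions, not the deployed code: the demo runs the exported ONNX graph, whereas this version assumes a single hidden layer with ReLU, and the `FfnnWeights` layout, the function names, and the toy weights are all hypothetical.

```typescript
// Hypothetical weight layout for a one-hidden-layer network.
interface FfnnWeights {
  vocab: Record<string, number>; // token -> input index
  w1: number[][];                // hidden units x vocabulary size
  b1: number[];
  w2: number[][];                // classes x hidden units
  b2: number[];
}

// Count how often each vocabulary word occurs; unknown words are ignored.
function bagOfWords(text: string, vocab: Record<string, number>): number[] {
  const size = Object.keys(vocab).length;
  const vec: number[] = [];
  for (let i = 0; i < size; i++) vec.push(0);
  for (const tok of text.toLowerCase().split(/[^a-z0-9']+/)) {
    if (Object.prototype.hasOwnProperty.call(vocab, tok)) vec[vocab[tok]] += 1;
  }
  return vec;
}

// y = W x + b
function affine(w: number[][], b: number[], x: number[]): number[] {
  return w.map((row, i) => row.reduce((acc, wij, j) => acc + wij * x[j], b[i]));
}

// Numerically stable softmax: subtract the max before exponentiating.
function softmax(z: number[]): number[] {
  const m = Math.max.apply(null, z);
  const e = z.map((v) => Math.exp(v - m));
  const s = e.reduce((a, b) => a + b, 0);
  return e.map((v) => v / s);
}

function predictSentiment(weights: FfnnWeights, text: string): { classIndex: number; probs: number[] } {
  const x = bagOfWords(text, weights.vocab);
  const h = affine(weights.w1, weights.b1, x).map((v) => Math.max(0, v)); // ReLU (assumed)
  const probs = softmax(affine(weights.w2, weights.b2, h));
  let best = 0;
  for (let i = 1; i < probs.length; i++) {
    if (probs[i] > probs[best]) best = i;
  }
  return { classIndex: best, probs };
}

// Toy weights, invented for illustration: three words, three hidden units, five classes.
const toyWeights: FfnnWeights = {
  vocab: { terrible: 0, okay: 1, great: 2 },
  w1: [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
  b1: [0, 0, 0],
  w2: [[2, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 1], [0, 0, 2]],
  b2: [0, 0, 0, 0, 0],
};

console.log(predictSentiment(toyWeights, "great great food").classIndex); // 4
```

Because the input is a word-count vector, word order is discarded, which is the main thing the recurrent model over the token sequence adds.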

ffnn sentiment classifier. pytorch to onnx.

Captioning, multimodal

The original was a Flickr8k captioner, a VGG16 feature extractor feeding an LSTM decoder, trained with PySpark. Caption quality reached a BLEU-1 of 0.934 and a BLEU-2 of 0.918 after 100 epochs. The deployed demo uses a transformer-based stand-in (ViT and GPT-2) for browser inference, plus a precomputed gallery of the original results that loads instantly.
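For readers unfamiliar with the metric, BLEU-n combines clipped n-gram precisions with a brevity penalty. The sketch below is a sentence-level version in TypeScript with uniform weights over orders 1 to n. How the scores above were actually computed (corpus-level or averaged, which library, which weights) is not stated, so treat this as an illustration of the formula only; the function names and example captions are hypothetical.

```typescript
// Count every n-gram in a token list, keyed by the space-joined tokens.
function ngramCounts(tokens: string[], n: number): Record<string, number> {
  const counts: Record<string, number> = Object.create(null);
  for (let i = 0; i + n <= tokens.length; i++) {
    const key = tokens.slice(i, i + n).join(" ");
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Clipped precision: each candidate n-gram is credited at most as many
// times as it appears in the most generous single reference.
function clippedPrecision(candidate: string[], references: string[][], n: number): number {
  const cand = ngramCounts(candidate, n);
  const refs = references.map((r) => ngramCounts(r, n));
  let matched = 0;
  let total = 0;
  for (const key of Object.keys(cand)) {
    let maxRef = 0;
    for (const rc of refs) maxRef = Math.max(maxRef, rc[key] || 0);
    matched += Math.min(cand[key], maxRef);
    total += cand[key];
  }
  return total > 0 ? matched / total : 0;
}

// BLEU-maxN = brevity penalty * geometric mean of the precisions for n = 1..maxN.
function bleu(candidate: string[], references: string[][], maxN: number): number {
  let logSum = 0;
  for (let n = 1; n <= maxN; n++) {
    const p = clippedPrecision(candidate, references, n);
    if (p === 0) return 0; // no smoothing in this sketch
    logSum += Math.log(p) / maxN;
  }
  // Use the reference whose length is closest to the candidate's.
  let refLen = references[0].length;
  for (const r of references) {
    if (Math.abs(r.length - candidate.length) < Math.abs(refLen - candidate.length)) {
      refLen = r.length;
    }
  }
  const brevity = candidate.length >= refLen ? 1 : Math.exp(1 - refLen / candidate.length);
  return brevity * Math.exp(logSum);
}

// Invented captions for illustration.
const generated = "a dog runs on the grass".split(" ");
const reference = "a dog runs across the grass".split(" ");
console.log(bleu(generated, [reference], 1)); // 5/6, about 0.833
console.log(bleu(generated, [reference], 2)); // sqrt(5/6 * 3/5), about 0.707
```

Flickr8k ships several reference captions per image, which is why the sketch accepts a list of references rather than a single one.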

image captioning. vit and gpt2 in browser via transformers.js.
