Ruby Jha rubsj

Ruby Jha

Engineering Manager · Applied AI · Cloud

I've spent 20+ years leading engineering teams at State Street, Centene, and EY, with teams of up to 12 engineers across the US, India, and Poland. The products I've built serve 40+ enterprise customers, have driven $250K/mo in cost savings, and operate under real regulatory scrutiny, where a bad deployment means financial loss.

Now I'm bringing that same discipline to AI. I'm building a series of production-grade AI systems that cover RAG pipelines, embedding fine-tuning, and multi-agent orchestration. Every project has evaluation frameworks, architecture decision records, and metrics I'd actually trust in a code review. The goal is to lead AI engineering teams with the same rigor I bring to building the systems myself.

🌐 rubyjha.dev · 💼 LinkedIn

🤖 AI/ML Portfolio

These aren't API wrappers. Each project solves a real engineering problem with measurable outcomes, reproducible from committed code.

👉 Full Portfolio Overview →

✅ Completed

#	Project	What I Proved	Key Result	Stack
P1	Synthetic Data Pipeline	Self-correcting generation with 5-layer validation	36 failures → 0 · 81.7% inter-rater agreement	Python · Pydantic · OpenAI · Instructor
P2	RAG Evaluation Framework	16-config grid search. Reranking was the single biggest lift	Recall@5 0.625 → 0.747 (+19.5%) · 384+ tests	Python · FAISS · LangChain · RAGAS · Cohere
P3	Contrastive Embedding Fine-Tuning	LoRA reached 96.9% of full fine-tune performance with 0.32% of the parameters	Spearman -0.22 → +0.85 · AUC-ROC 0.994	Python · Sentence-Transformers · PEFT/LoRA
P4	AI Resume Coach	Template choice is statistically significant for scoring	Chi² = 32.74 (p<0.001) · 532 tests · 99% coverage	Python · OpenAI · ChromaDB · FastAPI
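
For readers unfamiliar with the P2 metric, here is a minimal sketch of how Recall@k and the relative lift quoted in the table can be computed. The function names and the toy document IDs are illustrative assumptions, not code from the project repository; only the 0.625 and 0.747 figures come from the table above.

```python
from typing import Sequence


def recall_at_k(retrieved: Sequence[str], relevant: Sequence[str], k: int = 5) -> float:
    """Fraction of the relevant documents that appear in the top-k retrieved results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def relative_lift(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100


# Toy example (hypothetical IDs): 2 of the 3 relevant documents are in the top 5.
toy_recall = recall_at_k(["a", "b", "c", "d", "e", "f"], ["a", "c", "z"], k=5)

# Figures from the P2 row: Recall@5 before and after reranking.
lift = relative_lift(0.625, 0.747)
print(f"Recall@5 (toy): {toy_recall:.3f}")
print(f"Relative lift: +{lift:.1f}%")
```

The lift here is relative rather than absolute: the absolute gain is 0.122 recall points, which is 19.5% of the 0.625 baseline.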

🔨 In Progress

#	Project	What It Does	Stack
P5	ShopTalk Knowledge Agent	Production RAG with configurable chunking, hybrid retrieval, reranking, and LLM-as-Judge eval	Python · FAISS · LiteLLM · Instructor

🗓️ Up Next: Multi-agent systems with CrewAI (P6–P9) covering writing clones, feedback intelligence, Jira automation, and DevOps root-cause analysis. See the full roadmap.

📝 Latest Blog Posts

How I Calibrated an LLM Judge That Approved Everything – my first LLM judge had a 0% failure rate, which meant it was useless.
Building 9 AI Projects (While Working Full-Time) – the portfolio, the progression, and what I've learned so far.

👉 More on rubyjha.dev/blog →

🛠️ Skills

Leadership: People Management · Hiring & Team Building · Performance & Promotions · Executive Communication · Technical Strategy

Technical: Python · Java · TypeScript · OpenAI API · LangChain · CrewAI · FastAPI · ChromaDB · Azure · Docker · Kubernetes · React · Spring Boot

I build AI systems and the teams that ship them.
rubyjha.dev · LinkedIn · AI Portfolio
