Harshad Suryawanshi
Open to collaborations

AI/ML Engineer crafting intelligent systems with RAG pipelines, vector search, and large language models. 8 open-source projects. 300+ combined GitHub stars.

300+ Stars across GitHub repos
8 Open Source Projects
90+ Total Forks
6+ AI Frameworks Used
Open Source

Featured Projects

Production-grade AI systems built with modern LLM tooling

AI Real Estate Search

Transforms property search with natural language — ask in plain English, get semantically matched listings. Combines vector similarity, LLM reasoning, and SQL for hybrid retrieval.

  • Vector similarity search via Qdrant for semantic matching
  • Text-to-SQL for structured property filters (beds, price, location)
  • LlamaIndex orchestration layer for multi-step LLM reasoning
Qdrant, LlamaIndex, Python, Text-to-SQL, Vector Search
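
The hybrid retrieval idea described above (a structured filter plus similarity ranking) can be sketched in a few lines. This is a minimal illustration, not the project's code: it assumes an in-memory SQLite table in place of the real property database, hand-written three-dimensional vectors in place of embeddings stored in Qdrant, and a hypothetical `hybrid_search` helper whose filter values would normally come from a text-to-SQL step.

```python
import math
import sqlite3

# Toy listings: structured fields go into SQL; the tiny hand-made vectors
# stand in for real embeddings (an assumption for illustration only).
LISTINGS = [
    (1, "Sunny loft near downtown", 2, 450000, (0.9, 0.1, 0.0)),
    (2, "Quiet family home with garden", 4, 620000, (0.1, 0.9, 0.2)),
    (3, "Compact cottage with small yard", 3, 380000, (0.5, 0.5, 0.0)),
]
VECTORS = {row[0]: row[4] for row in LISTINGS}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(conn, query_vec, min_beds, max_price, top_k=2):
    """Hypothetical helper: filter with SQL, then rank survivors by similarity."""
    # Step 1: structured filter (the part a text-to-SQL step would produce).
    rows = conn.execute(
        "SELECT id, title FROM listings WHERE beds >= ? AND price <= ?",
        (min_beds, max_price),
    ).fetchall()
    # Step 2: semantic ranking over the rows that passed the filter.
    ranked = sorted(rows, key=lambda r: cosine(query_vec, VECTORS[r[0]]), reverse=True)
    return ranked[:top_k]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listings (id INTEGER PRIMARY KEY, title TEXT, beds INTEGER, price INTEGER)"
)
conn.executemany("INSERT INTO listings VALUES (?, ?, ?, ?)", [row[:4] for row in LISTINGS])

# Query vector leaning toward the "family home" direction of the toy space.
results = hybrid_search(conn, query_vec=(0.1, 0.9, 0.1), min_beds=3, max_price=700000)
```

Filtering before ranking keeps hard constraints such as bedrooms and price exact, while the similarity score only orders the listings that already satisfy them.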

RAGArch

88 stars, 23 forks

No-code RAG pipeline configurator — test any combination of LLMs, embedding models, and vector stores, then export production-ready Python code in one click.

  • Interactive UI to compare LLM + embedding + vector store combinations
  • One-click export of production Python code for custom RAG setups
  • Supports GPT-3.5/4, Gemini Pro, Cohere, Pinecone, Qdrant, and more
Streamlit, LlamaIndex, Python, RAG, No-Code

C3 Voice Assistant

38 stars, 13 forks

Voice-first LLM + RAG interface designed for accessibility. Say "C3" to activate — no typing required. Built for users who struggle with traditional text-based AI interfaces.

  • Wake-word activation ("C3") with real-time voice transcription
  • RAG over documents (Nvidia 10-K report) + general LLM queries
  • React.js frontend paired with a Python Flask backend scaffolded with LlamaIndex create-llama
React.js, Flask, LlamaIndex, Voice AI, RAG

Na2SQL

76 stars, 21 forks

Zero SQL knowledge required. Query complex databases using plain conversational language — Na2SQL translates your question into accurate SQL and returns interpreted results.

  • Natural language → SQL via OpenAI GPT-3.5 + LlamaIndex SQLDatabase
  • End-to-end ecommerce database demo included out of the box
  • 50+ commits of iterative refinement for edge-case accuracy
OpenAI, LlamaIndex, SQL, Python, Streamlit
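
The natural-language-to-SQL flow above can be sketched roughly as: show the model the schema, ask for a query, then run it. The example below is only a sketch under stated assumptions. `stub_llm` is a hypothetical stand-in that returns a fixed query instead of calling GPT-3.5, the `orders` table is invented for the demo, and the read-only guard is an added precaution rather than something the project is known to do.

```python
import sqlite3


def describe_schema(conn):
    """Collect CREATE TABLE statements so a model can see table and column names."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(sql for (sql,) in rows)


def build_prompt(schema, question):
    """Combine schema and question into a single instruction for the model."""
    return (
        "Given this SQLite schema:\n"
        f"{schema}\n"
        f"Write one SELECT statement answering: {question}"
    )


def stub_llm(prompt):
    # Hypothetical stand-in for a real LLM call; it ignores the prompt
    # and always returns the same query.
    return (
        "SELECT name, SUM(quantity) AS units FROM orders "
        "GROUP BY name ORDER BY units DESC LIMIT 1"
    )


def run_generated_sql(conn, sql):
    """Execute model output only if it is a single read-only SELECT."""
    statement = sql.strip().rstrip(";")
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("refusing to run non-SELECT or multi-statement SQL")
    return conn.execute(statement).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, name TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO orders (name, quantity) VALUES (?, ?)",
    [("keyboard", 3), ("mouse", 5), ("keyboard", 4)],
)

prompt = build_prompt(describe_schema(conn), "Which product sold the most units?")
answer = run_generated_sql(conn, stub_llm(prompt))
```

Passing the schema in the prompt is what lets a model name real tables and columns; the guard matters because generated SQL should be treated as untrusted input.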

AInimal Go

Pokémon Go-inspired animal identifier that proves powerful multimodal AI doesn't require GPT-4V. Snap an animal photo — get an AI-generated fact card powered by ResNet + LLM + Wikipedia.

  • ResNet18 deep learning vision model for species classification
  • Cohere LLM + Wikipedia for rich, contextual animal fact generation
  • LlamaIndex orchestrates multi-source retrieval without GPT-4V
ResNet18, Cohere, LlamaIndex, Wikipedia, Python

PaLM-Kosmos-Vision

16 stars, 11 forks

A custom GPT-4V alternative built from open models. Upload an image, get an automatic caption from KOSMOS-2, then hold a full conversation about it using PaLM.

  • KOSMOS-2 for automatic image captioning via Replicate API
  • Google PaLM for contextual multi-turn conversation about images
  • Session-based chat history with reset — accessible Streamlit UI
PaLM API, KOSMOS-2, Replicate, Streamlit, Multimodal
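
Session-based chat over an image caption, as described above, comes down to keeping the caption and prior turns together and replaying them with each new question. The sketch below assumes a hypothetical `ChatSession` class and prompt layout; it does not call KOSMOS-2, PaLM, or Streamlit, and the caption is a made-up example.

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    """Hypothetical container for one image conversation."""
    caption: str
    history: list = field(default_factory=list)

    def add_turn(self, role, text):
        """Record one message as a (role, text) pair."""
        self.history.append((role, text))

    def build_prompt(self, question):
        """Replay the caption and all prior turns ahead of the new question."""
        lines = [f"Image caption: {self.caption}"]
        lines += [f"{role}: {text}" for role, text in self.history]
        lines.append(f"user: {question}")
        return "\n".join(lines)

    def reset(self):
        """Drop the conversation but keep the caption."""
        self.history.clear()


session = ChatSession(caption="A dog catching a frisbee in a park")
session.add_turn("user", "What animal is this?")
session.add_turn("assistant", "A dog.")
prompt = session.build_prompt("What is it doing?")
session.reset()
```

Because the language model never sees the image itself, the caption has to travel with every prompt; resetting clears only the turns.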

AI Equity Research Analyst

36 stars, 15 forks

LLM-powered analyst that reads the 10-K filings of NYSE-listed companies and generates comprehensive equity research reports. Turns hundreds of pages of dense financial text into actionable insights.

  • SubQuestionQueryEngine decomposes complex research into focused sub-queries
  • Google PaLM API + bge-small-en embeddings over pre-indexed 10-K documents
  • 67 commits — covers 7 major NYSE companies with pre-built vector indexes
Google PaLM, LlamaIndex, BGE Embeddings, Python, Finance
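
The sub-question approach mentioned above can be illustrated without any framework: split one broad request into focused questions, answer each against the right source, and stitch the answers together. This is a toy sketch, not LlamaIndex's SubQuestionQueryEngine. The company names, figures, and the `decompose`, `answer_sub_question`, and `research` helpers are all hypothetical, and a dictionary lookup stands in for querying a vector index with an LLM.

```python
# Hypothetical per-company fact store standing in for pre-built vector indexes.
FILINGS = {
    "Acme": {"revenue": "Acme revenue: 120", "risk": "Acme risk: supply chain"},
    "Globex": {"revenue": "Globex revenue: 95", "risk": "Globex risk: regulation"},
}


def decompose(companies, topics):
    """Split one broad request into (company, topic) sub-questions."""
    return [(company, topic) for company in companies for topic in topics]


def answer_sub_question(company, topic):
    """Answer one focused question; a real system would query an index and an LLM."""
    return FILINGS.get(company, {}).get(topic, f"{company} {topic}: not found")


def research(companies, topics):
    """Answer every sub-question in order and join the results into one report."""
    sub_questions = decompose(companies, topics)
    answers = [answer_sub_question(company, topic) for company, topic in sub_questions]
    return "\n".join(answers)


report = research(["Acme", "Globex"], ["revenue", "risk"])
```

Each sub-question touches only one source, which is the point of decomposition: narrow retrieval tends to be easier to get right than one query spanning every filing.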

Airbnb Listing Explorer

29 stars, 7 forks

Semantic search for Airbnb listings that understands intent, not just keywords. Ask for "a cozy place near the beach for a family of four" and get semantically matched results.

  • Qdrant vector database + FastEmbed for lightweight, fast embeddings
  • Mixtral 8x7B via Groq API for high-speed LLM inference
  • LlamaIndex orchestrates retrieval + LLM reasoning pipeline
Qdrant, Groq API, Mixtral, FastEmbed, LlamaIndex