Talk to it. It thinks. It talks back.
A voice-powered AI assistant built from scratch using open source tools. No ChatGPT. No paid wrappers. Just raw code.
🌐 Try it here: https://huggingface.co/spaces/Kailashalgo/voice-ai-chat
Press and hold the mic button → speak → AI replies out loud.
| Layer | Tool |
|---|---|
| 🎤 Speech to Text | Whisper Large V3 Turbo (via Groq API) |
| 🧠 AI Brain | LLaMA 3.3 70B (via Groq) |
| 🔊 Text to Speech | gTTS |
| ⚡ Backend | FastAPI + Python |
| 🌐 Frontend | Vanilla HTML/CSS/JS |
| 🐳 Container | Docker |
| ☁️ Hosting | HuggingFace Spaces |
git clone https://github.com/kailashv2/voice-ai-chat.git
cd voice-ai-chatpython -m venv venv
venv\Scripts\activatecd backend
pip install -r requirements.txtCreate .env file in root:
GROQ_API_KEY=your_groq_key_here
Get your free key at: https://console.groq.com
Download from: https://www.gyan.dev/ffmpeg/builds/ Add to PATH
cd backend
uvicorn main:app --reloadOpen frontend/index.html in Chrome
You speak → Whisper transcribes → LLaMA thinks → gTTS speaks
- Browser records your voice
- Audio sent to FastAPI backend
- Groq Whisper transcribes speech to text
- LLaMA 3.3 70B generates a reply
- gTTS converts reply to audio
- Browser plays the audio back
docker build -t voice-ai-chat .
docker run -p 7860:7860 -e GROQ_API_KEY=your_key voice-ai-chatvoice-ai-chat/
├── backend/
│ ├── main.py ← FastAPI server
│ ├── stt.py ← Speech to text (Groq Whisper)
│ ├── tts.py ← Text to speech (gTTS)
│ └── requirements.txt ← Python dependencies
├── frontend/
│ └── index.html ← UI
├── Dockerfile ← Docker deployment
├── .env.example ← Environment variables template
├── .gitignore
└── README.md
Completely free.
- Groq API: Free tier
- Whisper: Free via Groq
- gTTS: Free
- Hosting: HuggingFace Spaces (free)
Deployed on HuggingFace Spaces with Docker.
Live URL: https://huggingface.co/spaces/Kailashalgo/voice-ai-chat
Kailash
Follow the journey on X: @kailashv2
⭐ Star this repo if you found it useful!