A privacy-focused, fully local system for creating a digital twin trained on your voice, speech patterns, and behavioral data.
## Key Principles
- Privacy First - All data stays local, no cloud services
- Reproducible - Open source code and methodology
- Secure - Personal models never published
- Ethical - Speaker diarization to filter out others’ voices
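The last principle above amounts to a simple rule: after diarization, keep only the audio segments attributed to the enrolled speaker and discard everything else. A minimal sketch of that filter, assuming simplified `(start, end, speaker)` tuples and a `SPEAKER_SELF` label (pyannote.audio actually emits a richer annotation object, so both are illustrative, not the project's real API):

```python
# Sketch: keep only the enrolled speaker's segments from diarization output.
# The (start, end, speaker) tuples and the "SPEAKER_SELF" label are
# illustrative assumptions; pyannote.audio returns a richer annotation object.

def keep_own_voice(segments, self_label="SPEAKER_SELF"):
    """Return only the time spans attributed to the enrolled speaker."""
    return [(start, end) for start, end, speaker in segments
            if speaker == self_label]

segments = [
    (0.0, 2.5, "SPEAKER_SELF"),
    (2.5, 4.0, "SPEAKER_01"),   # someone else's voice: discarded, never stored
    (4.0, 6.2, "SPEAKER_SELF"),
]
print(keep_own_voice(segments))  # → [(0.0, 2.5), (4.0, 6.2)]
```

Only the surviving spans would ever be transcribed or used for training; other speakers' audio is dropped at this stage.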
## Features
- Voice Cloning - Build a synthesized voice that sounds like your own using Coqui TTS (XTTS v2)
- LLM Personality Model - Fine-tune Llama 3 to capture your communication style and decision-making patterns
- Ambient Audio Collection - Optional wearable hardware for continuous data collection
- Desktop UI - Electron app with chat, recording, training, and plugin management
## Desktop UI (Electron)
The Electron UI includes:
- App Shell - Nav rail, top bar, and telemetry footer
- Chat - Polished message layout with bubble sizing
- Record - Unified hero panel with guided steps
- Training - Single guided flow with model selector
- Plugins - Store-style layout with discovery grid
## Tech Stack
### Audio Processing
- Whisper - Speech-to-text transcription
- pyannote.audio - Speaker diarization
- Coqui TTS (XTTS v2) - Voice cloning and synthesis
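The first two tools combine naturally: Whisper produces timestamped transcript segments, and diarization produces timestamped speaker turns, so each piece of text can be attributed to a speaker by maximum time overlap. A sketch of that join, using simplified tuple shapes (real Whisper and pyannote output carries more fields, so treat the data layout as an assumption):

```python
# Sketch: attribute Whisper transcript segments to diarization speakers by
# maximum time overlap. Both segment shapes are simplified assumptions;
# real Whisper/pyannote output carries more fields.

def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals (seconds)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def attribute_speakers(transcript, diarization):
    """For each (start, end, text) transcript segment, pick the speaker
    whose diarization turn overlaps it the most."""
    labeled = []
    for t_start, t_end, text in transcript:
        best = max(diarization,
                   key=lambda d: overlap(t_start, t_end, d[0], d[1]))
        labeled.append((best[2], text))
    return labeled

transcript = [(0.0, 2.0, "hello there"), (2.0, 5.0, "how are you")]
diarization = [(0.0, 2.1, "SPEAKER_SELF"), (2.1, 5.0, "SPEAKER_01")]
print(attribute_speakers(transcript, diarization))
# → [('SPEAKER_SELF', 'hello there'), ('SPEAKER_01', 'how are you')]
```

The speaker-labeled text is what feeds both the ethical filter (drop other speakers) and the downstream training data.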
### LLM
- Llama 3 8B - Base language model
- unsloth/axolotl - LoRA/QLoRA fine-tuning
- Ollama - Local inference
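Fine-tuning tools in this family typically consume chat-formatted JSONL, so the speaker-attributed transcripts need one more conversion: pair each other-speaker utterance with your reply, so the model learns to produce your side of the conversation. A sketch under that assumption (the `{"messages": [...]}` shape mirrors the common OpenAI-style chat format many fine-tuning tools accept; the exact field names and role mapping here are illustrative, not a specific tool's schema):

```python
import json

# Sketch: turn speaker-attributed utterances into chat-style JSONL records
# for LoRA/QLoRA fine-tuning. The {"messages": [...]} shape and role
# mapping are illustrative assumptions, not a specific tool's schema.

def to_chat_records(turns, self_label="SPEAKER_SELF"):
    """Pair each other-speaker utterance with your reply so the model
    learns to produce *your* side of the conversation."""
    records = []
    for (spk_a, text_a), (spk_b, text_b) in zip(turns, turns[1:]):
        if spk_a != self_label and spk_b == self_label:
            records.append({"messages": [
                {"role": "user", "content": text_a},
                {"role": "assistant", "content": text_b},
            ]})
    return records

turns = [
    ("SPEAKER_01", "how was the conference?"),
    ("SPEAKER_SELF", "honestly, the hallway track was the best part"),
]
for rec in to_chat_records(turns):
    print(json.dumps(rec))
```

Like everything else in the pipeline, this runs locally; the resulting JSONL never needs to leave the machine.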
### Hardware (Optional)
- Raspberry Pi Zero 2W - Main controller
- I2S MEMS microphone - Audio capture
- Tailscale - Secure networking
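On a Pi Zero 2 W, compute and storage are tight, so the wearable benefits from discarding silent chunks before anything is saved or synced. A crude root-mean-square energy gate is enough for that first pass; a sketch assuming 16-bit little-endian mono PCM and an arbitrary threshold (a real pipeline would use a proper VAD model downstream):

```python
import math
import struct

# Sketch: a crude voice-activity gate for the wearable. Computes RMS energy
# over a chunk of 16-bit mono PCM and keeps the chunk only if it exceeds a
# threshold. The sample format and the 500.0 threshold are illustrative
# assumptions; a real pipeline would use a proper VAD model.

def rms(pcm_bytes):
    """Root-mean-square amplitude of little-endian 16-bit mono PCM."""
    samples = struct.unpack("<%dh" % (len(pcm_bytes) // 2), pcm_bytes)
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_speech_candidate(pcm_bytes, threshold=500.0):
    """True if the chunk is loud enough to be worth keeping for sync."""
    return rms(pcm_bytes) >= threshold

silence = struct.pack("<4h", 0, 0, 0, 0)
loud = struct.pack("<4h", 9000, -8000, 7000, -9000)
print(is_speech_candidate(silence), is_speech_candidate(loud))  # → False True
```

Chunks that pass the gate would then be shipped over Tailscale to the main machine for diarization and transcription.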
## Project Roadmap
- Phase 0 - Manual baseline & pipeline validation
- Phase 1 - Voice cloning with Coqui TTS
- Phase 2 - LLM personality model fine-tuning
- Phase 3 - Integration & interface
- Phase 4 - Wearable hardware proof-of-concept
- Phase 5 - Sync & processing infrastructure
- Phase 6 - Continuous training pipeline
## Cost
- Software phases: $0 (using existing hardware)
- No ongoing cloud costs - everything runs locally