Fusion AI ✨

Fusion AI — My Multimodal AI Assistant

Most AI assistants today work mainly with text. I wanted to go beyond that and build something that feels more natural and human-like. That’s how I came to create Fusion AI, an assistant that can see, listen, and talk with you in real time.

What Fusion AI Does

  • 📷 Understands the world through your camera: Point your phone at any object, and Fusion AI recognizes it instantly.
  • 🎤 Listens to your voice: You can simply talk to it, no need to type.
  • 🗣 Talks back in a natural voice: The interaction feels like a conversation, not just commands.
  • 💻 Understands your screen: Share your computer screen, and Fusion AI can answer questions about what’s there, such as code, documents, charts, or media.

In short, it’s like having a knowledgeable companion who can see and hear your surroundings.

Why It’s Useful

  • Instant answers: No searching or switching apps — just ask.
  • Productivity boost: Quick help while coding, studying, or working on projects.
  • Accessibility: Makes technology easier for people who prefer speaking or showing instead of typing.

How It Works (Simple Explanation)

Fusion AI connects your app to a Live API using a WebSocket.

  • The app sends text, audio, and camera or screen frames to the AI.
  • The AI replies in text or audio.
  • All of this happens in real time, so the conversation feels smooth and interactive.
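
To make the message flow a bit more concrete, here is a minimal Python sketch of how data could be packaged for a WebSocket connection. It is only an illustration under my own assumptions: the `ClientMessage` envelope, its field names, and the helpers `encode_message`, `decode_message`, and `chunk_audio` are hypothetical, and this is neither the actual Fusion AI code nor the wire format of the Gemini API. The idea is that each piece of input becomes one small JSON frame, and audio is cut into short chunks so it can be streamed while the user is still speaking.

```python
import base64
import json
from dataclasses import dataclass
from typing import List


# Hypothetical envelope for one WebSocket frame. The field names are
# illustrative assumptions, not the Gemini API's real message schema.
@dataclass
class ClientMessage:
    kind: str        # e.g. "text" or "audio"
    payload: bytes   # UTF-8 text or raw media bytes
    mime_type: str   # e.g. "text/plain" or "audio/pcm"


def encode_message(msg: ClientMessage) -> str:
    """Serialize a message to a JSON string that fits in one text frame."""
    return json.dumps({
        "kind": msg.kind,
        "mime_type": msg.mime_type,
        # Binary data is base64-encoded so it survives JSON transport.
        "data": base64.b64encode(msg.payload).decode("ascii"),
    })


def decode_message(frame: str) -> ClientMessage:
    """Rebuild a message from a JSON frame produced by encode_message."""
    obj = json.loads(frame)
    return ClientMessage(
        kind=obj["kind"],
        payload=base64.b64decode(obj["data"]),
        mime_type=obj["mime_type"],
    )


def chunk_audio(pcm: bytes, chunk_size: int = 3200) -> List[bytes]:
    """Split raw audio into fixed-size chunks for streaming.

    3200 bytes is 100 ms of 16 kHz, 16-bit mono PCM (an assumed format).
    The last chunk may be shorter than chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [pcm[i:i + chunk_size] for i in range(0, len(pcm), chunk_size)]
```

In a setup like this, the sender would call `encode_message` once per chunk and write each resulting string to the open socket, while a separate task reads the replies as they arrive. Sending small chunks instead of one full recording is what lets the other side start responding before the user has finished talking.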

Tech Stack

I built Fusion AI using:

  • Frontend: Next.js, TypeScript (TSX), Tailwind CSS
  • Backend + AI: Python + Google Gemini API
  • Real-Time Communication: WebSockets

Why I Built It

My goal was to create an AI assistant that feels more natural and powerful — something you can interact with directly through voice and vision instead of just text. Fusion AI makes AI feel more like a real companion.