MCPFast / Tools / Complete Self-Hosted AI Stack with Ollama, LiteLLM & RAG

GitHubMCP★★★★☆

Complete Self-Hosted AI Stack with Ollama, LiteLLM & RAG

Easy deployment of a complete self-hosted AI stack including LLM, gateway, STT, TTS, and RAG, with GPU support.

View on GitHub

Complete Self-Hosted AI Stack with Ollama, LiteLLM & RAG

This repository provides a streamlined solution for deploying a comprehensive, self-hosted AI stack. Designed for developers, it integrates essential components for building and running AI applications locally, offering control over your data and models. The stack is built using Docker, simplifying deployment and management, and includes robust support for GPU acceleration.

What it Does

The Complete Self-Hosted AI Stack automates the setup of a fully functional AI environment on your own infrastructure. It bundles key technologies to enable local LLM inference, API gateway functionality, speech-to-text (STT), text-to-speech (TTS), and Retrieval Augmented Generation (RAG) capabilities. This allows developers to experiment with, develop, and deploy AI agents and applications without relying on external cloud services.

Key Features

Who it's For

This tool is specifically tailored for AI developers , ML engineers , and researchers who require a flexible and private environment for their AI projects. It's ideal for those looking to: