MCPFast / Tools / Complete Self-Hosted AI Stack with Ollama, LiteLLM & RAG
Easy deployment of a complete self-hosted AI stack including LLM, gateway, STT, TTS, and RAG, with GPU support.
View on GitHub→This repository provides a streamlined solution for deploying a comprehensive, self-hosted AI stack. Designed for developers, it integrates essential components for building and running AI applications locally, offering control over your data and models. The stack is built using Docker, simplifying deployment and management, and includes robust support for GPU acceleration.
The Complete Self-Hosted AI Stack automates the setup of a fully functional AI environment on your own infrastructure. It bundles key technologies to enable local LLM inference, API gateway functionality, speech-to-text (STT), text-to-speech (TTS), and Retrieval Augmented Generation (RAG) capabilities. This allows developers to experiment with, develop, and deploy AI agents and applications without relying on external cloud services.
This tool is specifically tailored for AI developers , ML engineers , and researchers who require a flexible and private environment for their AI projects. It's ideal for those looking to: