HybridInference API Documentation =================================== **OpenRouter-compatible API for accessing multiple LLM models** Get started with HybridInference in minutes. Our API provides seamless access to state-of-the-art language models including Llama 3.3, Llama 4, Gemini, GPT-5, and Claude. Quick Links ----------- * :doc:`quickstart` - Get started in 5 minutes * :doc:`models` - View available models * :doc:`examples` - Code examples in Python, JavaScript, and more * :doc:`api-reference` - Complete API reference .. toctree:: :maxdepth: 2 :caption: Getting Started: :hidden: quickstart models examples api-reference .. toctree:: :maxdepth: 2 :caption: Developer Guide: :hidden: developer/installation developer/deployment developer/architecture developer/routing developer/adding-models developer/configuration developer/database developer/openrouter developer/freeinference developer/fasrc developer/contributing .. toctree:: :maxdepth: 1 :caption: Project Info: :hidden: changelog Key Features ------------ **Fast & Reliable** Low-latency inference with automatic failover **OpenRouter Compatible** Drop-in replacement for OpenRouter API **Multiple Models** Access Llama, Gemini, GPT, and Claude models **Free Tier Available** Get started with our free tier **Production Ready** Built for scale with monitoring and observability Getting Started --------------- 1. **Get your API key** (contact the team) 2. **Install the OpenAI SDK:** .. code-block:: bash pip install openai 3. **Make your first request:** .. code-block:: python import openai client = openai.OpenAI( base_url="https://freeinference.org/v1", api_key="your-api-key-here" ) response = client.chat.completions.create( model="llama-3.3-70b-instruct", messages=[{"role": "user", "content": "Hello!"}] ) print(response.choices[0].message.content) See the :doc:`quickstart` guide for more details. Available Models ---------------- .. list-table:: :header-rows: 1 :widths: 40 30 30 * - Model - Context Length - Pricing * - Llama 3.3 70B Instruct - 131K tokens - Free * - Llama 4 Maverick - 128K tokens - Free * - Gemini 2.5 Flash - 1M tokens - Free * - GPT-5 - 128K tokens - Free See the complete :doc:`models` list for all available models. Support ------- Need help? Check out: * :doc:`examples` - Code examples * :doc:`api-reference` - API documentation * GitHub Issues - Report bugs or request features