# Quick Start

Get started with the HybridInference API in 5 minutes.
## Overview

HybridInference provides an OpenRouter-compatible API for accessing multiple LLMs through a single endpoint.

**API Endpoint:** `https://freeinference.org/v1`
## Get Your API Key

Contact the team to get an API key, then export it as an environment variable:

```bash
export HYBRIDINFERENCE_API_KEY="your-api-key-here"
```
## Make Your First Request

### Using curl
```bash
curl https://freeinference.org/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HYBRIDINFERENCE_API_KEY" \
  -d '{
    "model": "llama-3.3-70b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
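Because the API is OpenAI-compatible, the response body should follow the standard chat completion schema. The values below are illustrative, not actual output:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "model": "llama-3.3-70b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 8,
    "total_tokens": 22
  }
}
```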
### Using Python
```python
import openai

client = openai.OpenAI(
    base_url="https://freeinference.org/v1",
    api_key="your-api-key-here"
)

response = client.chat.completions.create(
    model="llama-3.3-70b-instruct",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)
```
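Rather than hardcoding the key, you can read it from the environment variable exported earlier; a minimal sketch:

```python
import os

import openai

# Assumes HYBRIDINFERENCE_API_KEY was exported as shown above.
client = openai.OpenAI(
    base_url="https://freeinference.org/v1",
    api_key=os.environ["HYBRIDINFERENCE_API_KEY"],
)
```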
### Using JavaScript/TypeScript
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://freeinference.org/v1',
  apiKey: 'your-api-key-here',
});

async function main() {
  const response = await client.chat.completions.create({
    model: 'llama-3.3-70b-instruct',
    messages: [
      { role: 'user', content: 'What is the capital of France?' }
    ],
  });

  console.log(response.choices[0].message.content);
}

main();
```
## Streaming Responses

Pass `stream=True` to receive the response incrementally as it is generated:
```python
import openai

client = openai.OpenAI(
    base_url="https://freeinference.org/v1",
    api_key="your-api-key-here"
)

stream = client.chat.completions.create(
    model="llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    # Some chunks (e.g. the final one) may carry no content delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
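If your application is asynchronous, the same pattern should work with the SDK's async client; a sketch, assuming the `openai` Python SDK v1+:

```python
import asyncio

import openai

client = openai.AsyncOpenAI(
    base_url="https://freeinference.org/v1",
    api_key="your-api-key-here",
)

async def main() -> None:
    # stream=True yields chunks as the model generates tokens.
    stream = await client.chat.completions.create(
        model="llama-3.3-70b-instruct",
        messages=[{"role": "user", "content": "Tell me a story"}],
        stream=True,
    )
    async for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

asyncio.run(main())
```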
## Check Available Models
```bash
curl https://freeinference.org/v1/models \
  -H "Authorization: Bearer $HYBRIDINFERENCE_API_KEY"
```