IDE & Coding Agent Integrations
Learn how to configure FreeInference with popular coding agents and IDEs.
All agents use the same FreeInference API key. If you don’t have one yet, see the Quick Start.
If you are choosing one default setup path, use Kilo Code. It works directly with FreeInference’s OpenAI-compatible endpoint and has the most detailed setup guide below.
Kilo Code
Kilo Code is an AI coding assistant that works well with FreeInference through the standard OpenAI-compatible endpoint.
Configuration Steps
Install the Kilo Code extension or plugin in your IDE.
Open the Kilo Code panel.
Open Kilo Code settings.
In API Provider, select OpenAI Compatible.
Configure the connection exactly as follows:
Base URL: https://freeinference.org/v1
API Key: your-api-key-here
Choose a model based on the workflow you want:
| Use Case | Recommended Model |
|---|---|
| Default coding assistant | glm-5.1 |
| Faster edit loops | glm-5-turbo |
| Long context or image input | minimax-m2.5 |
| Strong bilingual coding | glm-4.7 |
Save the settings.
Start a new Kilo Code session and send a simple prompt such as "Summarize this repository" to confirm the connection works.
Notes
Use the exact base URL https://freeinference.org/v1 with no extra path segments.
If the model picker is empty, reopen the Kilo panel or paste the model ID manually.
If you want repository indexing, see the Codebase Indexing section below for the Kilo-specific embedding setup.
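The "exact base URL" note can be checked mechanically before pasting the value into an extension. The following is a minimal sketch, not part of FreeInference or Kilo Code: the function name `check_base_url` and its messages are hypothetical, and it only compares the configured value against the URL given in this guide.

```python
from urllib.parse import urlsplit

# Base URL as documented in this guide.
EXPECTED_BASE_URL = "https://freeinference.org/v1"


def check_base_url(url: str) -> list[str]:
    """Return a list of problems with a configured base URL (empty if it looks right)."""
    problems = []
    cleaned = url.strip()
    if cleaned != url:
        problems.append("leading or trailing whitespace")
    expected = urlsplit(EXPECTED_BASE_URL)
    actual = urlsplit(cleaned)
    if actual.scheme != expected.scheme:
        problems.append("scheme should be https")
    if actual.netloc != expected.netloc:
        problems.append("host should be freeinference.org")
    if actual.path != expected.path:
        # Catches trailing slashes and extra segments such as /v1/chat/completions.
        problems.append(f"path should be exactly /v1, got {actual.path!r}")
    return problems


if __name__ == "__main__":
    for candidate in ("https://freeinference.org/v1", "https://freeinference.org/v1/"):
        print(candidate, "->", check_base_url(candidate) or "OK")
```

An empty list means the value matches the documented URL; anything else points at the part that differs.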
Cursor
Cursor is an AI-powered code editor built on VS Code.
Configuration Steps
Open Cursor Settings
macOS: Cmd + ,
Windows/Linux: Ctrl + ,
Navigate to the Models section in the settings
Find the OpenAI API Key field and enter your FreeInference API key
Click on Override OpenAI Base URL
Enter the base URL:
https://freeinference.org/v1
Enable the OpenAI API Key toggle button
(Optional) Select your preferred model from the available models
Save and start using FreeInference models in Cursor!
Claude Code
Claude Code is Anthropic’s official CLI coding agent. FreeInference provides an Anthropic-compatible endpoint so Claude Code works without an Anthropic API key.
Quick Setup (macOS / Linux)
ANTHROPIC_API_KEY="your-key-here" bash setup_claude_code.sh
Security note: Always review remote shell scripts before executing them. You can also clone this repository and run ops/setup/setup_claude_code.sh from your local checkout instead of fetching it over the network.
Manual Setup
Edit ~/.claude/settings.json (on Windows: %USERPROFILE%\.claude\settings.json):
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://freeinference.org/",
    "ANTHROPIC_AUTH_TOKEN": "<your-freeinference-api-key>",
    "API_TIMEOUT_MS": "600000"
  }
}
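If settings.json already exists, overwriting it by hand risks losing other keys. The sketch below merges the three env values from the manual setup into an existing file instead. The helper names `merge_claude_env` and `update_settings_file` are hypothetical, and the script assumes the file, when present, contains a JSON object.

```python
import json
from pathlib import Path


def merge_claude_env(settings: dict, api_key: str) -> dict:
    """Return a copy of `settings` with the FreeInference env values applied."""
    merged = dict(settings)
    env = dict(merged.get("env", {}))  # keep any env entries that are already set
    env.update({
        "ANTHROPIC_BASE_URL": "https://freeinference.org/",
        "ANTHROPIC_AUTH_TOKEN": api_key,
        "API_TIMEOUT_MS": "600000",
    })
    merged["env"] = env
    return merged


def update_settings_file(path: Path, api_key: str) -> None:
    """Read settings.json if present, merge the env block, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_claude_env(existing, api_key), indent=2) + "\n",
                    encoding="utf-8")


# Example call (uncomment to apply to your own settings file):
# update_settings_file(Path.home() / ".claude" / "settings.json", "<your-freeinference-api-key>")
```

`Path.home()` resolves to the user profile directory on Windows, so the same call covers both paths named above.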
Cline
Cline is an autonomous coding agent for VS Code.
Configuration Steps
Install the Cline extension from the VS Code Marketplace
Open the Cline panel in the sidebar
Click the settings icon (gear) to open configuration
Set API Provider to OpenAI Compatible
Configure the connection:
Base URL: https://freeinference.org/v1
API Key: your-api-key-here
Model: glm-5.1
Start chatting with Cline using FreeInference models!
Continue
Continue is an open-source AI code assistant for VS Code and JetBrains.
Configuration Steps
Install the Continue extension from the VS Code Marketplace or JetBrains Marketplace
Open Continue settings: click the gear icon or run Cmd + Shift + P (macOS) / Ctrl + Shift + P (Windows/Linux) → “Continue: Open Config”
Edit config.json (or config.yaml). Add or modify the models section:
{
  "models": [
    {
      "title": "FreeInference GLM-5.1",
      "provider": "openai",
      "model": "glm-5.1",
      "apiBase": "https://freeinference.org/v1",
      "apiKey": "your-api-key-here"
    }
  ],
  "tabAutocompleteModel": {
    "title": "FreeInference Autocomplete",
    "provider": "openai",
    "model": "glm-5-turbo",
    "apiBase": "https://freeinference.org/v1",
    "apiKey": "your-api-key-here"
  },
  "embeddingsProvider": {
    "provider": "openai",
    "model": "your-embedding-model-id",
    "apiBase": "https://freeinference.org/v1",
    "apiKey": "your-api-key-here"
  }
}
Save the config. FreeInference models will appear in the model dropdown.
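The Continue configuration repeats the same apiBase and apiKey in three places, which makes typos easy. As one possible approach, the sketch below generates the JSON from a single key and base URL. The helper names are hypothetical; the field names and model IDs are the ones used in the configuration in this section.

```python
import json
from typing import Optional

API_BASE = "https://freeinference.org/v1"


def provider_entry(model: str, api_key: str, title: Optional[str] = None) -> dict:
    """Build one OpenAI-provider entry pointing at FreeInference."""
    entry = {"provider": "openai", "model": model, "apiBase": API_BASE, "apiKey": api_key}
    if title is not None:
        entry = {"title": title, **entry}
    return entry


def build_continue_config(api_key: str, embedding_model: str) -> dict:
    """Assemble the three sections used in this guide's Continue example."""
    return {
        "models": [provider_entry("glm-5.1", api_key, "FreeInference GLM-5.1")],
        "tabAutocompleteModel": provider_entry(
            "glm-5-turbo", api_key, "FreeInference Autocomplete"
        ),
        "embeddingsProvider": provider_entry(embedding_model, api_key),
    }


if __name__ == "__main__":
    # Placeholders: substitute your key and the current embedding model ID.
    config = build_continue_config("your-api-key-here", "your-embedding-model-id")
    print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into config.json; merging it with an existing file is left to the reader.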
Roo Code
Roo Code is an AI coding assistant for VS Code and JetBrains with a similar OpenAI-compatible setup.
Configuration Steps
Install the Roo Code extension or plugin in your IDE.
Open the Roo Code settings.
In API Provider, select OpenAI Compatible.
Configure the connection:
Base URL: https://freeinference.org/v1
API Key: your-api-key-here
Select your preferred model such as glm-5.1, glm-5-turbo, glm-4.7, or minimax-m2.5.
Save settings and start using FreeInference.
Codeium / Windsurf
Windsurf (by Codeium) is an AI-powered IDE. It supports OpenAI-compatible model providers via its Cascade feature.
Configuration Steps
Open Windsurf Settings
Navigate to Cascade → Model Provider
Add a custom OpenAI Compatible provider
Configure:
Base URL: https://freeinference.org/v1
API Key: your-api-key-here
Model: glm-5.1
Save and select the custom provider in Cascade.
Note: Windsurf’s free built-in features use Codeium’s own models. To use FreeInference, you need to configure a custom provider as described above.
JetBrains AI Assistant
JetBrains AI does not natively support custom OpenAI-compatible endpoints. To use FreeInference with JetBrains IDEs, use one of the agent extensions that support JetBrains:
Roo Code — available as a JetBrains plugin
Continue — available as a JetBrains plugin
CodeGPT — available as a JetBrains plugin
Follow the instructions in the respective sections above for each extension.
Generic OpenAI-Compatible Clients
Any client that supports the OpenAI API format can connect to FreeInference.
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="https://freeinference.org/v1",
    api_key="your-api-key-here",
)

response = client.chat.completions.create(
    model="glm-5.1",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
curl
curl -X POST https://freeinference.org/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-api-key-here" \
-d '{"model": "glm-5.1", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 50}'
Node.js (OpenAI SDK)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://freeinference.org/v1",
  apiKey: "your-api-key-here",
});

const response = await client.chat.completions.create({
  model: "glm-5.1",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(response.choices[0].message.content);
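For environments where installing an SDK is not an option, the same request can be built with the Python standard library alone. This is a sketch that mirrors the curl example in this section; the function names are hypothetical, and the response parsing assumes the usual OpenAI chat-completions shape (`choices[0].message.content`).

```python
import json
import urllib.request

BASE_URL = "https://freeinference.org/v1"


def build_chat_request(api_key: str, model: str, prompt: str,
                       max_tokens: int = 50) -> urllib.request.Request:
    """Build (but do not send) a POST request to the chat completions endpoint."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


# Example (performs a real network call):
# print(send_chat(build_chat_request("your-api-key-here", "glm-5.1", "Hello")))
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body when debugging a client configuration.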
Codebase Indexing
FreeInference exposes an embedding endpoint at /v1/embeddings and a Qdrant proxy at /v1/qdrant for codebase indexing in supported IDEs.
Note: Embedding model availability changes over time. Check https://freeinference.org/v1/models for the currently registered embedding model ID and substitute it in the examples below.
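Looking up the embedding model ID can be scripted. The sketch below rests on two assumptions that this guide does not confirm: that the models endpoint returns the usual OpenAI list shape (`{"data": [{"id": ...}, ...]}`) and accepts a Bearer token, and that embedding model IDs contain the substring "embed". Treat the filter as a heuristic and read the full list if it returns nothing.

```python
import json
import urllib.request

MODELS_URL = "https://freeinference.org/v1/models"


def fetch_models(api_key: str, timeout: float = 30.0) -> dict:
    """Fetch the raw models payload (assumes Bearer auth is accepted)."""
    request = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def embedding_model_ids(models_payload: dict, keyword: str = "embed") -> list[str]:
    """Return model IDs whose name contains `keyword` (a naming heuristic only)."""
    ids = [m["id"] for m in models_payload.get("data", []) if "id" in m]
    return [model_id for model_id in ids if keyword in model_id.lower()]


# Example (performs a real network call):
# print(embedding_model_ids(fetch_models("your-api-key-here")))
```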
Roo Code
Roo Code natively supports OpenAI-compatible embedding providers.
Open the Roo Code plugin panel in your IDE.
Click the Index button in the bottom-right corner to open Codebase Indexing.
Configure:
| Setting | Value |
|---|---|
| Embedder Provider | OpenAI Compatible |
| Base URL | |
| API Key | Your FreeInference API key |
| Model | |
| Model Dimension | Matching model dimension |
| Qdrant URL | |
| Qdrant API Key | Your FreeInference API key |
Click Start Indexing — Roo Code will scan your codebase, generate embeddings via FreeInference, and store vectors in the shared Qdrant instance.
Your collections are automatically isolated per user — other users cannot see or access your indexed data.
Kilo Code
Kilo Code supports OpenAI-compatible embedding configuration. To use FreeInference, select OpenAI Compatible:
Open the Kilo Code plugin panel in your IDE.
Click the Index button in the bottom-right corner to open Codebase Indexing.
Configure:
| Setting | Value |
|---|---|
| Embedder Provider | OpenAI Compatible |
| Base URL | |
| API Key | Your FreeInference API key |
| Model | |
| Model Dimension | Matching model dimension |
| Qdrant URL | |
| Qdrant API Key | Your FreeInference API key |
Continue
Continue supports embeddings for codebase indexing. Add an embeddingsProvider to your config.json:
{
  "embeddingsProvider": {
    "provider": "openai",
    "model": "your-embedding-model-id",
    "apiBase": "https://freeinference.org/v1",
    "apiKey": "your-api-key-here"
  }
}
Alternative: Local Qdrant
If you prefer to host your own Qdrant instance instead of using the shared service, run it locally with Docker:
docker run -d --name qdrant --restart unless-stopped \
-p 6333:6333 -v qdrant_data:/qdrant/storage qdrant/qdrant
Then use http://localhost:6333 as the Qdrant URL in the tables above instead of the shared URL.
Using the Embedding API Directly
You can also call the embedding endpoint directly via the OpenAI SDK or curl:
from openai import OpenAI

client = OpenAI(
    base_url="https://freeinference.org/v1",
    api_key="your-api-key-here",
)

# Embed two code snippets in a single request.
response = client.embeddings.create(
    model="your-embedding-model-id",
    input=["def hello():", "function greet() {"],
)

# Each item carries the index of its input and the embedding vector.
for item in response.data:
    print(f"Index {item.index}: {len(item.embedding)} dimensions")
curl -X POST https://freeinference.org/v1/embeddings \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-api-key-here" \
-d '{"model": "your-embedding-model-id", "input": "hello world"}'
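The indexing tables above ask for a Model Dimension that matches the embedding model. One way to find it is to read the vector length from an embeddings response. The sketch below works on the parsed JSON of such a response and assumes the standard OpenAI embeddings shape (`data[i].embedding`); the function name is hypothetical.

```python
def embedding_dimension(embeddings_payload: dict) -> int:
    """Return the vector length shared by all embeddings in a response payload."""
    dimensions = {len(item["embedding"]) for item in embeddings_payload.get("data", [])}
    if len(dimensions) != 1:
        # Either no embeddings came back or the vectors disagree in length.
        raise ValueError(f"expected one consistent dimension, found {sorted(dimensions)}")
    return dimensions.pop()


if __name__ == "__main__":
    # Hand-written sample payload with tiny 3-dimensional vectors, for illustration only.
    sample = {
        "data": [
            {"index": 0, "embedding": [0.1, 0.2, 0.3]},
            {"index": 1, "embedding": [0.4, 0.5, 0.6]},
        ]
    }
    print(embedding_dimension(sample))
```

Pass the JSON body returned by the curl call above (parsed with `json.loads`) to get the value for the Model Dimension field.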
Troubleshooting
Connection Issues
If you encounter connection errors:
Verify your API key is correct
Check the base URL is exactly: https://freeinference.org/v1
Ensure your firewall allows HTTPS connections
Restart your IDE after configuration changes
Model Not Found
If you get “model not found” errors:
Check the available models list
Ensure the model name is exactly as listed (case-sensitive)
Try switching to a different model like glm-5.1 or glm-4.7
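Because model names are case-sensitive, a near-miss such as a wrong letter case or a dropped hyphen produces the same "model not found" error as a model that does not exist. The sketch below compares a requested ID against a list of available IDs and suggests likely matches. The function name is hypothetical, and the list in the example is taken from the model IDs mentioned in this guide rather than from the live models list.

```python
import difflib


def resolve_model_id(requested: str, available: list[str]) -> tuple[bool, list[str]]:
    """Return (is_exact_match, suggestions) for a requested model ID."""
    if requested in available:
        return True, []
    # A case-only difference is the most common mistake, so check it first.
    case_matches = [m for m in available if m.lower() == requested.lower()]
    if case_matches:
        return False, case_matches
    # Otherwise fall back to fuzzy matching on the raw strings.
    return False, difflib.get_close_matches(requested, available, n=3, cutoff=0.6)


if __name__ == "__main__":
    known = ["glm-5.1", "glm-5-turbo", "glm-4.7", "minimax-m2.5"]
    for candidate in ("glm-5.1", "GLM-5.1", "glm5.1"):
        print(candidate, "->", resolve_model_id(candidate, known))
```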
Cursor-Specific Issues
API key not working:
Make sure you’ve enabled the OpenAI API Key toggle
Try removing and re-entering the API key
Restart Cursor after configuration
Base URL not applied:
Ensure there are no trailing slashes in the URL
The URL should be exactly:
https://freeinference.org/v1
Claude Code Issues
| Error | Cause | Fix |
|---|---|---|
| 401 Authentication error | Bad API key | Check |
| 404 Model not found | Wrong model ID | Don’t override |
| 429 Rate limited | Too many requests | Wait a minute and retry |
| 503 Accounts unavailable | Subscription pool exhausted | Wait a minute and retry |
| Connection timeout | Network issue | Check connectivity to |
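For scripts that call the API directly, the "wait a minute and retry" advice for 429 and 503 responses can be automated. The following is a minimal sketch under stated assumptions: `HTTPStatusError` is a hypothetical exception standing in for whatever error type your HTTP client raises, and the 60-second default simply follows the guidance in the table. It does not describe Claude Code's own retry behavior.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


class HTTPStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retry(
    fn: Callable[[], T],
    retryable: Iterable[int] = (429, 503),
    attempts: int = 3,
    delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, waiting `delay_s` seconds and retrying on retryable statuses."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    retryable_set = set(retryable)
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except HTTPStatusError as err:
            # Give up on non-retryable errors (such as 401 or 404) or on the last attempt.
            if err.status not in retryable_set or attempt == attempts:
                raise
            sleep(delay_s)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so the wait can be replaced in tests; in normal use the default `time.sleep` applies.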
Kilo Code / Roo Code Issues
Provider not connecting:
Verify OpenAI Compatible is selected as the provider
Check that the base URL and API key are correct
Try reloading the extension
Model list empty or stale:
Reopen the Kilo Code or Roo Code panel
Paste a known model ID such as glm-5.1 manually
Confirm the base URL is exactly https://freeinference.org/v1
Quick Reference
| Agent | API Format | Base URL | Config Location |
|---|---|---|---|
| Cursor | OpenAI | | Settings → Models |
| Claude Code | Anthropic | | |
| Cline | OpenAI | | Extension settings |
| Continue | OpenAI | | |
| Aider | OpenAI | | Environment variables |
| Twinny | OpenAI | | Extension settings |
| CodeGPT | OpenAI | | Extension settings |
| Kilo Code | OpenAI | | Extension settings |
| Roo Code | OpenAI | | Extension settings |
| Windsurf | OpenAI | | Cascade → Model Provider |
| JetBrains AI | via plugin | | Use Roo Code / Continue / CodeGPT plugin |
| Any OpenAI client | OpenAI | | Client config |
Need Help?
Available Models - Complete model specifications
Quick Start Guide - Get started in 5 minutes
API Headers Reference - Authentication and custom headers
Report issues on GitHub