Using OpenCode with Local LLMs via LM Studio

OpenCode is a powerful open-source AI coding agent that can help you write, debug, and refactor code. While it supports cloud providers such as Anthropic and OpenAI, you can also run it entirely locally using LM Studio, giving you complete privacy and zero API costs.

In this guide, I’ll walk you through setting up OpenCode with LM Studio to create a fully local AI-powered development environment.

Why Use Local LLMs with OpenCode?

Before diving into the setup, let’s understand why you might want to run LLMs locally:

  • Privacy: Your code never leaves your machine - critical for proprietary or sensitive projects
  • Cost-Free: No API fees or usage limits after initial setup
  • Offline Access: Work anywhere without internet connectivity
  • Full Control: Choose and customize models that work best for your use case

Prerequisites

Before we begin, make sure you have:

  1. A machine with decent specs (16GB+ RAM recommended, GPU with 8GB+ VRAM for better performance)
  2. LM Studio installed
  3. OpenCode installed
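
If OpenCode isn't installed yet, here's a rough sketch of the common install routes; the exact commands may change, so treat this as a starting point and check the OpenCode docs for the current instructions:

# Official install script (macOS/Linux) - verify the URL against opencode.ai first
curl -fsSL https://opencode.ai/install | bash

# Alternatively, via npm if you already have Node.js
npm install -g opencode-ai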

Step 1: Install LM Studio

LM Studio is a desktop application that makes it easy to download, run, and serve local LLMs.

  1. Download LM Studio from lmstudio.ai
  2. Install it on your system (available for macOS, Windows, and Linux)
  3. Launch the application

Step 2: Download a Coding-Optimized Model

For coding tasks, you’ll want a model specifically trained or fine-tuned for code. Here are some recommended models:

| Model | Size | Best For | VRAM Required |
|---|---|---|---|
| Qwen2.5-Coder-32B-Instruct | 32B | Best quality, complex tasks | 24GB+ |
| Qwen2.5-Coder-14B-Instruct | 14B | Good balance | 12GB+ |
| Qwen2.5-Coder-7B-Instruct | 7B | Fast, lighter tasks | 8GB+ |
| DeepSeek-Coder-V2-Lite | 16B | Great for coding | 12GB+ |
| CodeLlama-34B-Instruct | 34B | Solid all-rounder | 24GB+ |

To download a model in LM Studio:

  1. Click the Search tab (magnifying glass icon)
  2. Search for your desired model (e.g., “Qwen2.5-Coder”)
  3. Select a quantization level (Q4_K_M is a good balance of quality and speed)
  4. Click Download

Step 3: Start the LM Studio Server

Once your model is downloaded:

  1. Go to the Local Server tab (the <-> icon)
  2. Select your downloaded model from the dropdown
  3. Configure the server settings:
    • Port: 1234 (default)
    • Context Length: Set based on your RAM (8192 is a safe default)
  4. Click Start Server

You should see a message indicating the server is running at http://localhost:1234.
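
Before pointing OpenCode at it, it's worth confirming the server actually responds. LM Studio exposes an OpenAI-compatible API, so a quick check from the terminal might look like this; the model IDs it returns are what you'll reference in your OpenCode config later:

# Ask the LM Studio server which models it is serving
curl http://localhost:1234/v1/models

# The "id" values in the JSON response (e.g. "qwen2.5-coder-32b-instruct")
# are the model identifiers to use in opencode.json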

Step 4: Configure OpenCode

Now we need to tell OpenCode how to connect to LM Studio. Create or edit your opencode.json configuration file:

Option A: Project-Specific Configuration

Create opencode.json in your project root:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "LM Studio (local)",
      "options": {
        "baseURL": "http://127.0.0.1:1234/v1"
      },
      "models": {
        "qwen2.5-coder-32b-instruct": {
          "name": "Qwen 2.5 Coder 32B (local)",
          "limit": {
            "context": 32768,
            "output": 8192
          }
        }
      }
    }
  }
}

Option B: Global Configuration

For system-wide configuration, create the file at:

  • macOS/Linux: ~/.config/opencode/opencode.json
  • Windows: %APPDATA%\opencode\opencode.json
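
On macOS or Linux, for example, you can create the directory and file from the terminal; the config contents are the same as in Option A:

# Create the global OpenCode config directory and open the config file in your editor
mkdir -p ~/.config/opencode
$EDITOR ~/.config/opencode/opencode.json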

Configuration Breakdown

Let’s understand each field:

  • provider.lmstudio: A unique identifier for this provider (can be any name)
  • npm: The AI SDK package to use (@ai-sdk/openai-compatible for LM Studio)
  • name: Display name shown in OpenCode’s UI
  • options.baseURL: The LM Studio server endpoint
  • models: Map of available models with their configurations
  • limit.context: Maximum input tokens (match your LM Studio context setting)
  • limit.output: Maximum tokens the model can generate per response

Step 5: Select the Model in OpenCode

  1. Open your terminal and navigate to your project:

    cd /path/to/your/project
    opencode
  2. Run the /models command:

    /models
  3. Select your LM Studio model from the list. It should appear with the name you configured (e.g., “Qwen 2.5 Coder 32B (local)”)

Step 6: Start Coding!

You’re now ready to use OpenCode with your local LLM. Here are some things to try:

Ask Questions About Your Codebase

How is authentication handled in this project?

Generate New Code

Create a Python function that validates email addresses using regex

Refactor Existing Code

Refactor the function in @src/utils/helpers.py to be more efficient

Debug Issues

Why is this function returning undefined? @src/components/UserList.tsx

Tips for Best Results

1. Choose the Right Model Size

  • 7B models: Fast responses, good for simple tasks, works on most hardware
  • 14B-32B models: Better quality, need more VRAM
  • 70B+ models: Excellent quality, but require a high-end GPU or aggressive quantization

2. Optimize Context Length

Match your LM Studio context length with your OpenCode config:

"limit": {
  "context": 32768,  // Match this to LM Studio settings
  "output": 8192
}

3. Use Quantized Models

Quantized models (Q4, Q5, Q6) use less memory with minimal quality loss:

  • Q4_K_M: Good balance of speed and quality
  • Q5_K_M: Slightly better quality, more memory
  • Q8_0: Near full precision, most memory

4. Monitor Resource Usage

Keep an eye on your system resources:

  • LM Studio shows GPU/CPU utilization
  • Reduce context length if you run out of memory
  • Close other GPU-intensive applications
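
If you prefer the terminal to LM Studio's built-in meters, a simple watch loop works; this sketch assumes an NVIDIA GPU (on Apple silicon, Activity Monitor covers the same ground):

# Refresh GPU utilization and VRAM usage every second (NVIDIA GPUs only)
watch -n 1 nvidia-smi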

Troubleshooting

Model Not Appearing in OpenCode

  1. Verify LM Studio server is running (check the Local Server tab)
  2. Ensure the baseURL matches (default: http://127.0.0.1:1234/v1)
  3. Check that the model ID in your config matches what LM Studio expects (the /v1/models check from Step 3 lists the exact IDs)

Slow Responses

  • Try a smaller model (7B instead of 32B)
  • Use a more aggressive quantization (Q4 instead of Q8)
  • Reduce context length
  • Ensure you’re using GPU acceleration in LM Studio

Out of Memory Errors

  • Switch to a smaller model
  • Reduce context length in both LM Studio and OpenCode config
  • Use a more quantized version of the model
  • Close memory-intensive applications

Connection Refused

  • Verify LM Studio server is running
  • Check that port 1234 isn’t blocked by firewall
  • Try using localhost instead of 127.0.0.1 or vice versa
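
A quick way to tell whether the problem is the server or the network path is to probe the port directly; if these fail, OpenCode was never going to connect either (macOS/Linux sketch):

# Check that something is listening on port 1234
nc -zv 127.0.0.1 1234

# Then confirm the API itself responds
curl -sS http://127.0.0.1:1234/v1/models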

Advanced Configuration

Multiple Models

You can configure multiple models in your opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "LM Studio (local)",
      "options": {
        "baseURL": "http://127.0.0.1:1234/v1"
      },
      "models": {
        "qwen2.5-coder-32b-instruct": {
          "name": "Qwen 2.5 Coder 32B (Best Quality)"
        },
        "qwen2.5-coder-7b-instruct": {
          "name": "Qwen 2.5 Coder 7B (Fast)"
        },
        "deepseek-coder-v2-lite": {
          "name": "DeepSeek Coder V2"
        }
      }
    }
  }
}

Then switch between them using /models in OpenCode.

Custom Headers (if needed)

Some setups might require custom headers:

{
  "options": {
    "baseURL": "http://127.0.0.1:1234/v1",
    "headers": {
      "X-Custom-Header": "value"
    }
  }
}

Comparison: Local vs Cloud

| Aspect | Local (LM Studio) | Cloud (Anthropic/OpenAI) |
|---|---|---|
| Privacy | Complete - data stays local | Data sent to provider |
| Cost | Free after setup | Pay per token |
| Speed | Depends on hardware | Generally fast |
| Quality | Good with right model | State-of-the-art |
| Offline | Yes | No |
| Setup | More complex | Simple API key |

Conclusion

Running OpenCode with LM Studio gives you a powerful, private, and cost-free AI coding assistant. While cloud models might offer slightly better quality for complex tasks, local models have come a long way and are more than capable for most development workflows.

The best part? You can always switch between local and cloud models depending on your needs - use local for everyday coding and cloud for complex architecture decisions.

Give it a try and experience the freedom of AI-assisted coding without sending your code to the cloud!
