Using OpenCode with Local LLMs via LM Studio

OpenCode is a powerful open-source AI coding agent that can help you write, debug, and refactor code. While it supports cloud providers such as Anthropic and OpenAI, you can also run it entirely locally using LM Studio, giving you complete privacy and zero API costs.

In this guide, I’ll walk you through setting up OpenCode with LM Studio to create a fully local AI-powered development environment.

Why Use Local LLMs with OpenCode?

Before diving into the setup, let’s understand why you might want to run LLMs locally:

  • Privacy: Your code never leaves your machine - critical for proprietary or sensitive projects
  • Cost-Free: No API fees or usage limits after initial setup
  • Offline Access: Work anywhere without internet connectivity
  • Full Control: Choose and customize models that work best for your use case

Prerequisites

Before we begin, make sure you have:

  1. A machine with decent specs (16GB+ RAM recommended, GPU with 8GB+ VRAM for better performance)
  2. LM Studio installed
  3. OpenCode installed
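
If OpenCode isn't installed yet, here's a rough sketch of the common install routes; the exact commands may change, so treat this as a starting point and check the OpenCode docs for the current instructions:

# Official install script (macOS/Linux) - verify the URL against opencode.ai first
curl -fsSL https://opencode.ai/install | bash

# Alternatively, via npm if you already have Node.js
npm install -g opencode-ai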

Step 1: Install LM Studio

LM Studio is a desktop application that makes it easy to download, run, and serve local LLMs.

  1. Download LM Studio from lmstudio.ai
  2. Install it on your system (available for macOS, Windows, and Linux)
  3. Launch the application

Step 2: Download a Coding-Optimized Model

For coding tasks, you’ll want a model specifically trained or fine-tuned for code. Here are some recommended models:

| Model | Size | Best For | VRAM Required |
|---|---|---|---|
| Qwen2.5-Coder-32B-Instruct | 32B | Best quality, complex tasks | 24GB+ |
| Qwen2.5-Coder-14B-Instruct | 14B | Good balance | 12GB+ |
| Qwen2.5-Coder-7B-Instruct | 7B | Fast, lighter tasks | 8GB+ |
| DeepSeek-Coder-V2-Lite | 16B | Great for coding | 12GB+ |
| CodeLlama-34B-Instruct | 34B | Solid all-rounder | 24GB+ |

To download a model in LM Studio:

  1. Click the Search tab (magnifying glass icon)
  2. Search for your desired model (e.g., “Qwen2.5-Coder”)
  3. Select a quantization level (Q4_K_M is a good balance of quality and speed)
  4. Click Download

Step 3: Start the LM Studio Server

Once your model is downloaded:

  1. Go to the Local Server tab (the <-> icon)
  2. Select your downloaded model from the dropdown
  3. Configure the server settings:
    • Port: 1234 (default)
    • Context Length: Set based on your RAM (8192 is a safe default)
  4. Click Start Server

You should see a message indicating the server is running at http://localhost:1234.
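
Before pointing OpenCode at it, it's worth confirming the server actually responds. LM Studio exposes an OpenAI-compatible API, so a quick check from the terminal might look like this; the model IDs it returns are what you'll reference in your OpenCode config later:

# Ask the LM Studio server which models it is serving
curl http://localhost:1234/v1/models

# The "id" values in the JSON response (e.g. "qwen2.5-coder-32b-instruct")
# are the model identifiers to use in opencode.json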

Step 4: Configure OpenCode

Now we need to tell OpenCode how to connect to LM Studio. Create or edit your opencode.json configuration file:

Option A: Project-Specific Configuration

Create opencode.json in your project root:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "LM Studio (local)",
      "options": {
        "baseURL": "http://127.0.0.1:1234/v1"
      },
      "models": {
        "qwen2.5-coder-32b-instruct": {
          "name": "Qwen 2.5 Coder 32B (local)",
          "limit": {
            "context": 32768,
            "output": 8192
          }
        }
      }
    }
  }
}

Option B: Global Configuration

For system-wide configuration, create the file at:

  • macOS/Linux: ~/.config/opencode/opencode.json
  • Windows: %APPDATA%\opencode\opencode.json
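
On macOS or Linux, for example, you can create the directory and file from the terminal; the config contents are the same as in Option A:

# Create the global OpenCode config directory and open the config file in your editor
mkdir -p ~/.config/opencode
$EDITOR ~/.config/opencode/opencode.json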

Configuration Breakdown

Let’s understand each field:

  • provider.lmstudio: A unique identifier for this provider (can be any name)
  • npm: The AI SDK package to use (@ai-sdk/openai-compatible for LM Studio)
  • name: Display name shown in OpenCode’s UI
  • options.baseURL: The LM Studio server endpoint
  • models: Map of available models with their configurations
  • limit.context: Maximum input tokens (match your LM Studio context setting)
  • limit.output: Maximum tokens the model can generate per response

Step 5: Select the Model in OpenCode

  1. Open your terminal and navigate to your project:

    cd /path/to/your/project
    opencode
  2. Run the /models command:

    /models
  3. Select your LM Studio model from the list. It should appear with the name you configured (e.g., “Qwen 2.5 Coder 32B (local)”)

Step 6: Start Coding!

You’re now ready to use OpenCode with your local LLM. Here are some things to try:

Ask Questions About Your Codebase

How is authentication handled in this project?

Generate New Code

Create a Python function that validates email addresses using regex

Refactor Existing Code

Refactor the function in @src/utils/helpers.py to be more efficient

Debug Issues

Why is this function returning undefined? @src/components/UserList.tsx

Tips for Best Results

1. Choose the Right Model Size

  • 7B models: Fast responses, good for simple tasks, works on most hardware
  • 14B-32B models: Better quality, need more VRAM
  • 70B+ models: Excellent quality, but require a high-end GPU or aggressive quantization

2. Optimize Context Length

Match your LM Studio context length with your OpenCode config:

"limit": {
  "context": 32768,  // Match this to LM Studio settings
  "output": 8192
}

3. Use Quantized Models

Quantized models (Q4, Q5, Q6) use less memory with minimal quality loss:

  • Q4_K_M: Good balance of speed and quality
  • Q5_K_M: Slightly better quality, more memory
  • Q8_0: Near full precision, most memory

4. Monitor Resource Usage

Keep an eye on your system resources:

  • LM Studio shows GPU/CPU utilization
  • Reduce context length if you run out of memory
  • Close other GPU-intensive applications
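
If you prefer the terminal to LM Studio's built-in meters, a simple watch loop works; this sketch assumes an NVIDIA GPU (on Apple silicon, Activity Monitor covers the same ground):

# Refresh GPU utilization and VRAM usage every second (NVIDIA GPUs only)
watch -n 1 nvidia-smi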

Troubleshooting

Model Not Appearing in OpenCode

  1. Verify LM Studio server is running (check the Local Server tab)
  2. Ensure the baseURL matches (default: http://127.0.0.1:1234/v1)
  3. Check that the model ID in your config matches what LM Studio expects (the /v1/models check from Step 3 lists the exact IDs)

Slow Responses

  • Try a smaller model (7B instead of 32B)
  • Use a more aggressive quantization (Q4 instead of Q8)
  • Reduce context length
  • Ensure you’re using GPU acceleration in LM Studio

Out of Memory Errors

  • Switch to a smaller model
  • Reduce context length in both LM Studio and OpenCode config
  • Use a more quantized version of the model
  • Close memory-intensive applications

Connection Refused

  • Verify LM Studio server is running
  • Check that port 1234 isn’t blocked by firewall
  • Try using localhost instead of 127.0.0.1 or vice versa
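
A quick way to tell whether the problem is the server or the network path is to probe the port directly; if these fail, OpenCode was never going to connect either (macOS/Linux sketch):

# Check that something is listening on port 1234
nc -zv 127.0.0.1 1234

# Then confirm the API itself responds
curl -sS http://127.0.0.1:1234/v1/models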

Advanced Configuration

Multiple Models

You can configure multiple models in your opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "LM Studio (local)",
      "options": {
        "baseURL": "http://127.0.0.1:1234/v1"
      },
      "models": {
        "qwen2.5-coder-32b-instruct": {
          "name": "Qwen 2.5 Coder 32B (Best Quality)"
        },
        "qwen2.5-coder-7b-instruct": {
          "name": "Qwen 2.5 Coder 7B (Fast)"
        },
        "deepseek-coder-v2-lite": {
          "name": "DeepSeek Coder V2"
        }
      }
    }
  }
}

Then switch between them using /models in OpenCode.

Custom Headers (if needed)

Some setups might require custom headers:

{
  "options": {
    "baseURL": "http://127.0.0.1:1234/v1",
    "headers": {
      "X-Custom-Header": "value"
    }
  }
}

Comparison: Local vs Cloud

| Aspect | Local (LM Studio) | Cloud (Anthropic/OpenAI) |
|---|---|---|
| Privacy | Complete - data stays local | Data sent to provider |
| Cost | Free after setup | Pay per token |
| Speed | Depends on hardware | Generally fast |
| Quality | Good with right model | State-of-the-art |
| Offline | Yes | No |
| Setup | More complex | Simple API key |

Conclusion

Running OpenCode with LM Studio gives you a powerful, private, and cost-free AI coding assistant. While cloud models might offer slightly better quality for complex tasks, local models have come a long way and are more than capable for most development workflows.

The best part? You can always switch between local and cloud models depending on your needs - use local for everyday coding and cloud for complex architecture decisions.

Give it a try and experience the freedom of AI-assisted coding without sending your code to the cloud!
