Opik MCP

Official

Drive your Opik LLM-observability workspace from your AI host — read traces, log scores, save prompts, ask Ollie.

Unverified

stdio (local)

API key

Python


Add to your client

Copy the config for your MCP client and paste it into its config file.

Install / run

uvx opik-mcp

Paste into ~/Library/Application Support/Claude/claude_desktop_config.json (the Claude Desktop config path on macOS):

{
  "mcpServers": {
    "opik-mcp": {
      "command": "uvx",
      "args": [
        "opik-mcp"
      ],
      "env": {
        "OPIK_API_KEY": "<your-key>",
        "OPIK_WORKSPACE": "<your-workspace>"
      }
    }
  }
}

Requires `uv` (the Python package runner). Install it from https://docs.astral.sh/uv/ if `uvx` is not found.
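If the host config already lists other MCP servers, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch in Python, assuming the config is plain JSON with a top-level `mcpServers` object as in the snippet above. `merge_opik_entry` is a hypothetical helper written for illustration and is not part of opik-mcp.

```python
import json


def merge_opik_entry(config_text: str, api_key: str, workspace: str = "default") -> str:
    """Return config JSON with an opik-mcp entry added, keeping other servers.

    Hypothetical helper for illustration; not shipped with opik-mcp.
    """
    # An empty or missing file is treated as an empty config.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["opik-mcp"] = {
        "command": "uvx",
        "args": ["opik-mcp"],
        "env": {"OPIK_API_KEY": api_key, "OPIK_WORKSPACE": workspace},
    }
    return json.dumps(config, indent=2)


# Example: an existing config that already declares another server.
existing = '{"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}'
merged = json.loads(merge_opik_entry(existing, "<your-key>", "<your-workspace>"))
print(sorted(merged["mcpServers"]))  # ['opik-mcp', 'other']
```

Working on the parsed JSON rather than on the raw text keeps unrelated server entries intact. Back up the real config file before writing the result to it.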


Before you start

Python 3.13+
uv (the uvx runner) installed
An Opik / Comet workspace with an OPIK_API_KEY (from comet.com/api/my/settings/)
OPIK_WORKSPACE (optional; defaults to 'default' — set it only for a named cloud workspace)
ask_ollie and run_experiment require Comet Cloud (not available on self-hosted Opik)
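The prerequisites above can be checked before touching the host config. This is a rough preflight sketch, not an official tool: `preflight` is a hypothetical function, and the `env` argument stands for the env block you intend to put in the MCP server config (it does not read your shell environment).

```python
import shutil
import sys


def preflight(version_info=sys.version_info, uvx_path=None, env=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical check mirroring the prerequisites listed above.
    """
    env = {} if env is None else env
    problems = []
    if tuple(version_info[:2]) < (3, 13):
        problems.append("Python 3.13+ required")
    if not uvx_path:
        problems.append("uvx not found; install uv")
    if not env.get("OPIK_API_KEY"):
        problems.append("OPIK_API_KEY missing from the server env block")
    return problems


# Check this machine, passing the env block planned for the host config.
print(preflight(uvx_path=shutil.which("uvx"), env={"OPIK_API_KEY": "<your-key>"}))
```

OPIK_WORKSPACE is deliberately not checked, since it is optional and defaults to 'default'.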

About Opik MCP

Opik MCP connects your AI coding assistant to your Comet Opik LLM-observability workspace. Through six universal tools you can read traces, spans, experiments, and prompts; list collections with filtering and pagination; log traces, scores, and comments; save prompt versions; manage test suites and experiments; and delegate investigative or end-to-end evaluation work to Ollie, Opik's in-product assistant. It runs as a Python package via uvx, over stdio (the default) or streamable-http, and is configured entirely through environment variables.

Tools & capabilities (6)

read

Universal read by id, name, or opik:// URI. Supports project, trace, span, test_suite, experiment, and prompt entities; composite reads (trace, prompt) inline their children so one call returns the full picture. Name-based lookup is available for project, experiment, prompt, and test_suite.

list

Universal list of a collection with an optional name-substring filter and pagination. Project-scoped types (trace, test_suite_item, prompt_version) require their parent UUID.

ask_ollie

Investigate, synthesize, or cross-reference entities via Ollie, the Opik in-product assistant. Ollie has direct read access to the workspace and can execute writes (scores, comments, test-suite items, prompt versions) mid-stream when asked. Returns final text plus a thread_id for follow-ups. Comet Cloud only.

write

Universal write dispatcher. Pass operation + data; supported operations: trace.create, trace.update, span.create, score.create, comment.create, prompt_version.save, test_suite.create, test_suite_item.upsert, experiment.create, experiment_item.create. Validates the payload, applies the right REST verb, and returns the backend response.
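To make the "operation + data" shape concrete, here is a sketch of the JSON-RPC `tools/call` request an MCP client would send for the write tool. The envelope follows the MCP specification and the operation names are copied from the description above. The argument keys `operation` and `data` are inferred from that description, and the fields inside `data` are illustrative guesses only; the schema tool is the place to get the real shape.

```python
import json

# Operation names copied from the write tool description.
WRITE_OPERATIONS = {
    "trace.create", "trace.update", "span.create", "score.create",
    "comment.create", "prompt_version.save", "test_suite.create",
    "test_suite_item.upsert", "experiment.create", "experiment_item.create",
}


def write_call(operation: str, data: dict, request_id: int = 1) -> dict:
    """Build an MCP tools/call request for the write tool (hypothetical helper)."""
    if operation not in WRITE_OPERATIONS:
        raise ValueError(f"unsupported write operation: {operation}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # Argument key names are an assumption based on "Pass operation + data".
        "params": {"name": "write", "arguments": {"operation": operation, "data": data}},
    }


# The data fields below are placeholders, not the documented score.create schema.
request = write_call(
    "score.create",
    {"name": "helpfulness", "value": 0.9, "reason": "clear answer"},
)
print(json.dumps(request, indent=2))
```

In normal use the AI host builds this request for you. The sketch is mainly useful for reading logs or for writing a small test client.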

schema

Introspect the exact JSON shape and required fields of any write operation before calling it. Returns the schema, OAuth scope, and one validated example. Pure lookup — no backend call.

run_experiment

Run an evaluation experiment end-to-end via Ollie. Takes a single experiment_config dict (prompt, test suite, scorers) mirroring Opik's experiment shape; Ollie executes the run and writes results back as an Opik experiment. Comet Cloud only.

When to use it

Ask why an experiment regressed and have Ollie read the experiment + traces and explain the failures
Score a trace or span on a metric (e.g. helpfulness) with a reason, directly from chat
List and browse your Opik projects, traces, experiments, and prompts without leaving your editor
Save a new prompt version and curate evaluation test suites
Run an end-to-end evaluation experiment and write results back to Opik

Security notes

Requires OPIK_API_KEY (from comet.com/api/my/settings/), passed via the host config env block, not the shell.

On HTTP transport, opik-mcp performs no local credential validation: any well-formed Authorization: Bearer token is forwarded verbatim to opik-backend, which is the single point of auth enforcement. Keep the default 127.0.0.1 bind and prefer stdio on shared networks. The OSS backend does not authenticate requests, so an HTTP opik-mcp in front of it is as open as the OSS REST API.

Anonymous usage telemetry is on by default. It records event type and timing plus a SHA-256 digest of the API key; the raw key never leaves the process. Opt out with OPIK_MCP_ANALYTICS_ENABLED=false.

Ollie runs in YOLO mode by default, auto-approving mid-stream writes (logged on the opik_mcp.audit logger). Set OPIK_MCP_AUTO_APPROVE=disabled to require per-action confirmation.

Pre-release: not yet on PyPI. Use the git source form (uvx --from git+https://github.com/comet-ml/opik-mcp.git opik-mcp) until the first PyPI release.
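The two opt-outs named above can be set in the same env block as the API key. The sketch below builds such a block and also illustrates what a SHA-256 digest of a key looks like. `hardened_env` and `key_digest` are hypothetical helpers, and how opik-mcp actually encodes the key before hashing is an assumption here.

```python
import hashlib


def hardened_env(api_key: str, workspace: str = "default") -> dict:
    """Env block with telemetry off and per-action confirmation for Ollie writes."""
    return {
        "OPIK_API_KEY": api_key,
        "OPIK_WORKSPACE": workspace,
        "OPIK_MCP_ANALYTICS_ENABLED": "false",
        "OPIK_MCP_AUTO_APPROVE": "disabled",
    }


def key_digest(api_key: str) -> str:
    """SHA-256 hex digest of a key; UTF-8 encoding is assumed for illustration."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


env = hardened_env("<your-key>")
print(env["OPIK_MCP_AUTO_APPROVE"])     # disabled
print(len(key_digest("<your-key>")))    # 64 hex characters
```

A digest identifies a key without revealing it, but the same key always produces the same digest, so it still works as a stable identifier. Turning analytics off avoids sending it at all.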

Opik MCP FAQ

How do I run it — is it on PyPI?

Run it with uvx (no global install). It is currently pre-release and not yet on PyPI; until the first PyPI release lands, replace 'uvx opik-mcp' in any snippet with 'uvx --from git+https://github.com/comet-ml/opik-mcp.git opik-mcp'.

Do all six tools work on self-hosted Opik?

No. ask_ollie and run_experiment are available on Comet Cloud only and will fail at dispatch on self-hosted. Use read / list / write directly on self-hosted (set COMET_URL_OVERRIDE to your host).
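As a quick reference, the split between Comet Cloud and self-hosted can be written down as follows. This is a sketch with hypothetical helper names. The tool list comes from this page, treating schema as usable on self-hosted is an inference (it is described as a pure lookup with no backend call), and the expected format of the COMET_URL_OVERRIDE value is not specified here, so the URL is left as a placeholder.

```python
ALL_TOOLS = ["read", "list", "ask_ollie", "write", "schema", "run_experiment"]
CLOUD_ONLY = {"ask_ollie", "run_experiment"}


def usable_tools(comet_cloud: bool) -> list[str]:
    """Tools expected to work for the given deployment type."""
    return [tool for tool in ALL_TOOLS if comet_cloud or tool not in CLOUD_ONLY]


def self_hosted_env(base_env: dict, host_url: str) -> dict:
    """Copy of an env block with COMET_URL_OVERRIDE pointing at your own host."""
    env = dict(base_env)
    env["COMET_URL_OVERRIDE"] = host_url
    return env


print(usable_tools(comet_cloud=False))  # ['read', 'list', 'write', 'schema']
print(self_hosted_env({"OPIK_WORKSPACE": "default"}, "<your-opik-host-url>"))
```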

My OPIK_API_KEY isn't being picked up — why?

In Claude Code / Cursor / VS Code, env vars only apply when placed inside the 'env' block of the MCP server config, not from your shell. Add OPIK_API_KEY there and restart the host.

Why does ask_ollie time out on Cursor?

Cursor has a hard 60s tool-call timeout that does not reset on progress notifications (an upstream Cursor bug). Keep ask_ollie queries focused, or run the same operation on Claude Code, which has no documented hard cap.

What about the old npx opik-mcp?

That is the legacy TypeScript server (opik-mcp@^2 on npm), now deprecated and sunsetting on 2026-11-15. Swap 'npx -y opik-mcp' for 'uvx opik-mcp@latest' (or the git source form during pre-release). Legacy source is preserved under legacy/typescript/.
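The swap described above can be applied to an existing server entry mechanically. The sketch below assumes the entry has the `command` / `args` / `env` shape used in the config snippet on this page. `migrate_entry` is a hypothetical helper, and it discards any extra arguments the legacy entry carried, so settings passed that way need to be moved into the env block by hand.

```python
GIT_SOURCE = "git+https://github.com/comet-ml/opik-mcp.git"


def migrate_entry(entry: dict, pre_release: bool = True) -> dict:
    """Return a copy of a legacy npx server entry rewritten to launch via uvx."""
    if entry.get("command") != "npx":
        return dict(entry)  # not a legacy entry; leave it unchanged
    migrated = dict(entry)
    migrated["command"] = "uvx"
    # Until the first PyPI release, run from the git source form instead.
    migrated["args"] = (
        ["--from", GIT_SOURCE, "opik-mcp"] if pre_release else ["opik-mcp@latest"]
    )
    return migrated


legacy = {"command": "npx", "args": ["-y", "opik-mcp"], "env": {"OPIK_API_KEY": "<your-key>"}}
print(migrate_entry(legacy))
print(migrate_entry(legacy, pre_release=False)["args"])  # ['opik-mcp@latest']
```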


Alternatives to Opik MCP


Elasticsearch MCP Server

Monitoring & Observability

700

Official Elastic server: list indices, read mappings, and search with Query DSL.

Verified

stdio (local)

API key

TypeScript

5 tools

Updated 4 months ago

PostHog MCP Server (Official Remote)

Monitoring & Observability

350

Official PostHog server: product analytics, feature flags, experiments, error tracking and SQL.

Verified

stdio (local)

API key

TypeScript

12 tools

Updated 5 months ago

Prometheus MCP Server

Monitoring & Observability

340

Run PromQL queries and analyze Prometheus metrics from any MCP client.

Verified

stdio (local)

No auth

Python

6 tools

Updated 1 month ago
