gptme-codegraph

v0.1.0 (packages/gptme-codegraph)

Structural code retrieval for gptme via tree-sitter. It complements gptme-rag, which retrieves text chunks, by retrieving code structure: function and class definitions, call graphs, blast radius, and impact analysis.

Features

When to use

Reach for the right retrieval tool by the shape of the question, not by habit:

Rule of thumb: exact text → grep; "what does this concept look like" → semantic; "how is this symbol wired" → codegraph.

Install

# Quote the requirement so shells such as zsh do not expand the brackets
pip install "gptme-codegraph[treesitter,mcp]"

Or with uv:

uv add "gptme-codegraph[treesitter,mcp]"

Usage

CLI

# Extract symbols from a file
gptme-codegraph path/to/file.py parse
gptme-codegraph path/to/file.ts parse

# Who calls a function?
gptme-codegraph path/to/file.py callers my_function

# What does a function call?
gptme-codegraph path/to/file.py callees my_function

# What breaks if you change a function?
gptme-codegraph path/to/file.py impact my_function

# Where is a symbol defined?
gptme-codegraph path/to/file.py def my_function

# Show a repo-map style symbol skeleton for a directory
gptme-codegraph path/to/repo map

Committed repo-map artifact

"Analyze once, commit the graph." Generate a .gptme-codegraph-map.json that teammates and agents can read for a repo's structural outline without re-running the tree-sitter pipeline:

# Generate and save the artifact at <repo>/.gptme-codegraph-map.json
gptme-codegraph-commit-map path/to/repo

# Check freshness (exit 0 = fresh, 1 = stale/missing) — for pre-commit/CI gating
gptme-codegraph-commit-map path/to/repo --check

# Regenerate only if stale (use in a pre-commit hook); --force always regenerates
gptme-codegraph-commit-map path/to/repo --refresh
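
For gating from a Python script instead of a shell hook, the `--check` exit code can be wrapped in a small helper. This is a minimal sketch: `map_is_fresh` and its `run` parameter are hypothetical names introduced here, and the only behavior it relies on is the exit-code contract stated above (0 = fresh, 1 = stale/missing).

```python
import subprocess
from pathlib import Path


def map_is_fresh(repo: Path, run=subprocess.run) -> bool:
    """Return True when `gptme-codegraph-commit-map <repo> --check` exits 0.

    `run` is injectable so the helper can be exercised without the CLI
    installed (hypothetical parameter, not part of the package).
    """
    result = run(
        ["gptme-codegraph-commit-map", str(repo), "--check"],
        check=False,  # a non-zero exit means "stale or missing", not an error
    )
    return result.returncode == 0
```

A CI step could call `map_is_fresh(Path("."))` and fail the job when it returns `False`.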

Freshness is keyed off a digest of the supported source files (*.py, *.ts, *.tsx, *.js, *.rs, *.go, *.java, *.cs, *.rb, *.c, *.cpp, *.php, *.kt, *.kts, *.swift), not HEAD, so an artifact regenerated in a pre-commit hook stays fresh after the commit that contains it lands. The default staleness window is 1 day; pass --stale-after-days N to change it. The artifact is structural only (paths, class and function names, nesting) and contains no source, comments, or values, so it is safe to commit to any repo.
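
To illustrate why a content digest survives the commit that contains the artifact, here is a rough sketch of one way such a digest could be computed. It is an assumption for illustration, not the package's actual algorithm: `source_digest`, the SHA-256 choice, and the path-plus-content hashing scheme are all hypothetical. Only the extension list comes from the paragraph above.

```python
import hashlib
from pathlib import Path

# Extensions copied from the supported-file list above.
SUPPORTED = {
    ".py", ".ts", ".tsx", ".js", ".rs", ".go", ".java", ".cs",
    ".rb", ".c", ".cpp", ".php", ".kt", ".kts", ".swift",
}


def source_digest(root: Path) -> str:
    """Hash the relative paths and contents of supported source files under root."""
    hasher = hashlib.sha256()
    files = sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in SUPPORTED
    )
    for path in files:
        # Include the path so renames change the digest, then the file bytes.
        hasher.update(path.relative_to(root).as_posix().encode())
        hasher.update(b"\0")
        hasher.update(path.read_bytes())
        hasher.update(b"\0")
    return hasher.hexdigest()
```

Because the `.json` artifact is not among the hashed extensions, writing or committing it leaves the digest unchanged in this sketch, whereas a HEAD-based key would change with every commit.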

MCP Server

# Start the MCP server (stdio transport)
gptme-codegraph-mcp

Configure in Claude Code:

claude mcp add codegraph -- gptme-codegraph-mcp

Python API

from gptme_codegraph import (
    build_call_graph,
    build_cross_file_call_graph,
    build_index,
    extract_symbols,
    impact_radius,
)
from pathlib import Path

# Single-file: extract symbols and build call graph
symbols = extract_symbols(Path("src/my_module.py"))
_callees_graph, callers_graph = build_call_graph(symbols)

# Compute impact radius: what breaks if you change this symbol?
radius = impact_radius("my_function", callers_graph, max_depth=5)
print(radius)  # {"depth_0": {…}, "depth_1": {…}, …}

# Cross-file: build an index over a whole directory
index = build_index(Path("src/"))
_callees_graph, callers_graph = build_cross_file_call_graph(index, Path("src/"))
radius = impact_radius("my_module::MyClass.my_method", callers_graph, max_depth=5)
print(radius)  # {"depth_0": {…}, "depth_1": {…}, …}
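
For intuition about the depth-keyed result, impact radius can be thought of as a breadth-first walk over the reverse (callers) graph. The following is a self-contained sketch of that idea, not the library's implementation. It assumes the callers graph is a plain `dict` mapping a symbol name to the set of its direct callers, and that `depth_0` holds the changed symbol itself; `impact_radius_sketch` is a hypothetical name.

```python
def impact_radius_sketch(symbol, callers, max_depth=5):
    """Group the transitive callers of `symbol` by their distance from it."""
    radius = {"depth_0": {symbol}}
    seen = {symbol}          # guards against cycles (mutual recursion)
    frontier = {symbol}
    for depth in range(1, max_depth + 1):
        next_frontier = set()
        for name in frontier:
            for caller in callers.get(name, ()):
                if caller not in seen:
                    seen.add(caller)
                    next_frontier.add(caller)
        if not next_frontier:
            break            # no new callers reached at this depth
        radius[f"depth_{depth}"] = next_frontier
        frontier = next_frontier
    return radius


# Hypothetical graph: main -> handle_request -> save
callers = {"save": {"handle_request"}, "handle_request": {"main"}}
print(impact_radius_sketch("save", callers))
# {'depth_0': {'save'}, 'depth_1': {'handle_request'}, 'depth_2': {'main'}}
```

Each symbol appears only at the shallowest depth at which it is reached, so a lower `max_depth` truncates the outer rings without changing the inner ones.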

Status

Experimental package. Python support is the deepest path today, with broad tree-sitter extraction now wired into the same surface for common web, systems, JVM, scripting, PHP/Kotlin, and Swift codebases. Cross-file resolution remains strongest on Python; non-Python import handling is best-effort rather than fully semantic.

Namespace packages (import google.cloud.storage without __init__.py) are a known v1.1 gap.
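
To make the gap concrete: under PEP 420, a directory can be an importable package even when it has no `__init__.py`. A resolver that requires the marker file will not find such packages. The sketch below contrasts the two behaviors. It is purely illustrative and does not reflect how gptme-codegraph resolves imports; `resolve_package` and `allow_namespace` are hypothetical names.

```python
from pathlib import Path


def resolve_package(root: Path, dotted: str, allow_namespace: bool = False):
    """Map a dotted package name to a directory under root, or return None.

    With allow_namespace=False every component must contain __init__.py,
    so PEP 420 namespace packages are missed.
    """
    current = root
    for part in dotted.split("."):
        current = current / part
        if not current.is_dir():
            return None
        if not allow_namespace and not (current / "__init__.py").is_file():
            return None
    return current
```

With a `google/cloud/storage/` tree that contains no `__init__.py` files, the strict mode returns `None` while `allow_namespace=True` returns the directory.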