Documentation Index
Fetch the complete documentation index at: https://docs.morphllm.com/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Compact compresses chat history and code context at 33,000 tok/s by removing irrelevant lines. Every surviving line is byte-for-byte identical to the original input, and 100K tokens compress in under 2 seconds. Pass `query` to tell the model what matters for the next LLM call; without it, the model auto-detects intent from the last user message.
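A minimal sketch of calling the native endpoint with a `query`. The base URL, field names (`messages`, `query`), and auth header shape are assumptions for illustration, not confirmed by this page:

```python
import json
import urllib.request

MORPH_COMPACT_URL = "https://api.morphllm.com/v1/compact"  # assumed base URL


def build_compact_request(messages, query=None):
    """Build the JSON body for a /v1/compact call (field names assumed).

    `query` tells the compactor what matters for the next LLM call;
    omitting it falls back to auto-detection from the last user message.
    """
    body = {"messages": messages}
    if query is not None:
        body["query"] = query
    return body


payload = build_compact_request(
    [{"role": "user", "content": "Refactor utils.py to remove dead code."}],
    query="keep everything related to utils.py",
)

# Actual call (requires an API key):
# req = urllib.request.Request(
#     MORPH_COMPACT_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Authorization": "Bearer <MORPH_API_KEY>",
#              "Content-Type": "application/json"},
# )
# resp = json.load(urllib.request.urlopen(req))
```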
Usage Examples
keepContext Tags
Wrap sections you never want compressed in `<keepContext>` / `</keepContext>` tags. Tagged content survives compression verbatim regardless of the compression ratio.
The response includes `kept_line_ranges`, showing which lines were force-preserved.
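A small helper sketching how content might be wrapped before compression. The tag names come from this page; the helper itself is illustrative:

```python
def keep_context(text: str) -> str:
    """Wrap text in keepContext tags so Compact preserves it verbatim."""
    return f"<keepContext>\n{text}\n</keepContext>"


# Protect a config line that compression must never drop.
protected = keep_context("DATABASE_URL=postgres://localhost/dev")
```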
Compatible Endpoints
Compact also works through OpenAI-compatible endpoints with `model: "morph-compactor"`:
| Endpoint | Format | Use with |
|---|---|---|
| `POST /v1/compact` | Native Morph format | Direct HTTP, Morph SDK |
| `POST /v1/responses` | OpenAI Responses API | Any OpenAI SDK (`client.responses.create()`) |
| `POST /v1/chat/completions` | OpenAI Chat Completions | Any OpenAI-compatible client |
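On any OpenAI-compatible client, routing to Compact is just a matter of setting the model name. This sketch builds a Chat Completions request body with the stdlib; the base URL and auth header shape are assumptions:

```python
import json
import urllib.request


def build_chat_completions_request(messages):
    # The model name is what routes the request to Compact
    # on an OpenAI-compatible endpoint.
    return {"model": "morph-compactor", "messages": messages}


payload = build_chat_completions_request(
    [{"role": "user", "content": "Compress this conversation."}]
)

# Actual call (requires an API key):
# req = urllib.request.Request(
#     "https://api.morphllm.com/v1/chat/completions",  # assumed base URL
#     data=json.dumps(payload).encode(),
#     headers={"Authorization": "Bearer <MORPH_API_KEY>",
#              "Content-Type": "application/json"},
# )
```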