
Documentation Index

Fetch the complete documentation index at: https://docs.morphllm.com/llms.txt

Use this file to discover all available pages before exploring further.

Overview

Compact compresses chat history and code context at 33,000 tok/s by removing irrelevant lines. Every surviving line is byte-for-byte identical to the original input, and a 100K-token input compresses in under 2 seconds. Pass the query parameter to tell the model what matters for the next LLM call; without it, the model auto-detects intent from the last user message.

Usage Examples

import { MorphClient } from '@morphllm/morphsdk';

const morph = new MorphClient({ apiKey: "YOUR_API_KEY" });

const result = await morph.compact({
  input: chatHistory,
  query: "How do I validate JWT tokens?", // focus for the next LLM call
  compressionRatio: 0.5,                  // how aggressively to compress
  preserveRecent: 3,                      // leave the most recent messages untouched
});

// result.output is the compressed text — pass it to your LLM

keepContext Tags

Wrap sections you never want compressed in <keepContext> / </keepContext> tags. Tagged content survives compression verbatim regardless of the compression ratio.
<keepContext>
// CRITICAL: Auth middleware — do not compress
function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'No token' });
  req.user = jwt.verify(token, process.env.JWT_SECRET);
  next();
}
</keepContext>
The response includes kept_line_ranges showing which lines were force-preserved.
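One way to apply the tags programmatically is a small wrapper. The helper below is a hypothetical convenience function, not part of the Morph SDK; only the <keepContext> tag names come from the docs above.

```typescript
// Hypothetical helper (not part of the SDK): wraps a snippet in the
// keepContext tags described above so Compact preserves it verbatim.
function withKeepContext(snippet: string): string {
  return `<keepContext>\n${snippet}\n</keepContext>`;
}

// Protect the auth middleware while letting the rest of the history compress.
const authMiddlewareSource = `function authenticate(req, res, next) { /* ... */ }`;
const chatHistory = "...long conversation...";

const input = [chatHistory, withKeepContext(authMiddlewareSource)].join("\n");
// `input` can now be passed to morph.compact({ input }); the tagged
// block survives at any compression ratio.
```

Because tagged content is preserved verbatim, keep the wrapped snippets small; oversized keepContext blocks reduce how much the rest of the input can compress.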

Compatible Endpoints

Compact also works through OpenAI-compatible endpoints with model: "morph-compactor":
Endpoint                     Format                     Use with
POST /v1/compact             Native Morph format        Direct HTTP, Morph SDK
POST /v1/responses           OpenAI Responses API       Any OpenAI SDK (client.responses.create())
POST /v1/chat/completions    OpenAI Chat Completions    Any OpenAI-compatible client
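As a sketch of the Chat Completions route: only the endpoint path and the model name "morph-compactor" come from the table above. The base URL, the message shape, and the response parsing are assumptions based on the standard OpenAI Chat Completions format — check your Morph account for the actual base URL.

```typescript
// Assumed base URL — replace with the one from your Morph account.
const BASE_URL = "https://api.morphllm.com";

// Build a standard Chat Completions request body targeting the
// "morph-compactor" model (model name from the table above).
function buildCompactRequest(longContext: string) {
  return {
    model: "morph-compactor",
    messages: [{ role: "user" as const, content: longContext }],
  };
}

// POST the context to the OpenAI-compatible endpoint and return the
// compressed text, assuming the standard Chat Completions response shape.
async function compactViaChatCompletions(longContext: string): Promise<string> {
  const res = await fetch(`${BASE_URL}/v1/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.MORPH_API_KEY}`,
    },
    body: JSON.stringify(buildCompactRequest(longContext)),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

This route is convenient when your stack already speaks the Chat Completions protocol; any OpenAI-compatible client can be pointed at the same endpoint by swapping the base URL and model name.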
See the full Compact documentation for SDK reference, best practices, and advanced usage.