A comprehensive guide to the production-grade observability system for Claude Code hooks using OpenTelemetry, Langtrace, and SigNoz.
Overview
This framework provides full visibility into Claude Code hook execution through:
- Distributed tracing - Track operations across hook invocations
- Metrics collection - Monitor performance and usage patterns
- Structured logging - Debug issues with correlated logs
- LLM instrumentation - Track token usage and costs
All telemetry exports to both local files (for offline analysis) and SigNoz Cloud (for dashboards and alerting).
Architecture
Claude Code Hooks
│
▼
HookMonitor (otel-monitor.ts)
• Initializes OpenTelemetry SDK
• Creates root spans for each hook
• Records metrics and logs
│
▼
┌─────────────────────────────────────┐
│ Dual Export Pattern │
├──────────────────┬──────────────────┤
│ Local Files │ SigNoz Cloud │
│ JSONL format │ OTLP/gzip │
│ ~/.claude/ │ Real-time │
│ telemetry/ │ dashboards │
└──────────────────┴──────────────────┘
Quick Start
1. Environment Setup
Add to ~/.claude/.envrc:
# Enable telemetry
export OTEL_ENABLED="true"
export SIGNOZ_ENABLED="true"
# SigNoz Cloud endpoint
export OTEL_EXPORTER_OTLP_ENDPOINT="https://ingest.us.signoz.cloud"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_EXPORTER_OTLP_COMPRESSION="gzip"
# Service identification
export OTEL_SERVICE_NAME="claude-code-hooks"
export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=development,service.version=1.0.0"
# Authentication (from Doppler)
export SIGNOZ_INGESTION_KEY="<your-key>"
2. Basic Usage
import {
initTelemetry,
shutdown,
withSpan,
recordMetric,
logger,
getTraceUrl
} from './lib/otel';
// Initialize at startup
initTelemetry();
// Wrap operations in spans
await withSpan('my-operation', { 'attr.key': 'value' }, async (span) => {
// Your code here
logger.info('Processing item', { itemId: 123 });
});
// Record custom metrics
recordMetric('items.processed', 42, { queue: 'main' });
// Get trace link for debugging
console.log(`View trace: ${getTraceUrl()}`);
// Clean shutdown
await shutdown();
Core Components
OpenTelemetry Core (hooks/lib/otel.ts)
The foundation providing tracing, metrics, and logging.
| Function | Purpose |
|---|---|
initTelemetry() |
Initialize SDK with exporters |
withSpan(name, attrs, fn) |
Wrap async function in a span |
recordMetric(name, value, attrs) |
Record counter metric |
recordGauge(name, value, attrs) |
Record gauge metric |
logger.info/warn/error() |
Structured logging |
getTraceUrl() |
Get SigNoz trace link |
getTraceId() |
Get current trace ID |
shutdown() |
Graceful cleanup |
Hook Monitor (hooks/lib/otel-monitor.ts)
Simplified instrumentation for hooks via instrumentHook().
import { instrumentHook } from './lib/otel-monitor';
instrumentHook('my-hook', async (ctx) => {
ctx.addAttribute('custom.attr', 'value');
ctx.recordEvent('processing_started');
ctx.logger.info('Hook executing');
// Create child span for sub-operations
await ctx.startChildSpan('fetch-data', {}, async () => {
// Nested operation
});
});
Langtrace (hooks/lib/langtrace.ts)
Automatic LLM provider instrumentation with PII protection.
import { initLangtrace, recordLLMEvent } from './lib/langtrace';
// Initialize with PII redaction (default)
initLangtrace();
// Or with custom patterns
initLangtrace({
contentProcessor: createPIIProcessor([
{ pattern: /ACME-\d{6}/g, replacement: '[ACME_ID]' }
])
});
Auto-redacted patterns:
- Email addresses, phone numbers, SSNs
- Credit card numbers, API keys, AWS keys
- JWT tokens, Bearer tokens, passwords
Instrumented Hooks
| Hook | Event | Description |
|---|---|---|
session-start-otel.ts |
SessionStart | Session initialization |
mcp-pre-tool-otel.ts |
PreToolUse | MCP tool invocation start |
mcp-post-tool-otel.ts |
PostToolUse | MCP tool completion |
agent-pre-tool-otel.ts |
PreToolUse | Subagent spawn |
agent-post-tool-otel.ts |
PostToolUse | Subagent completion |
skill-activation-prompt-otel.ts |
UserPromptSubmit | Skill detection |
tsc-check-otel.ts |
PostToolUse | TypeScript checks |
stop-build-check-otel.ts |
Stop | Build verification |
Metrics Reference
Hook Metrics
| Metric | Type | Description |
|---|---|---|
hook.duration |
Histogram | Execution time distribution |
hook.executions |
Counter | Total invocations |
mcp.invocations |
Counter | MCP tool calls |
agent.invocations |
Counter | Subagent spawns |
LLM Metrics
| Metric | Type | Description |
|---|---|---|
gen_ai.client.token.usage |
Counter | Tokens consumed |
gen_ai.client.cost |
Counter | Cost in USD |
gen_ai.client.operation.duration |
Histogram | LLM call latency |
Build Metrics
| Metric | Type | Description |
|---|---|---|
build.check.duration |
Histogram | Type check time |
build.errors |
Gauge | Error count |
Output Locations
Local Files
~/.claude/telemetry/
├── traces-YYYY-MM-DD.jsonl # Span data
├── logs-YYYY-MM-DD.jsonl # Log records
└── llm-events-YYYY-MM-DD.jsonl # LLM calls
Remote Dashboard
URL: https://tight-ladybird.us.signoz.cloud/
Available Dashboards
| Dashboard | Purpose |
|---|---|
| Claude Code Hooks Observability | Core hook performance |
| Token Usage & Cost Efficiency | LLM consumption tracking |
| Tool & MCP Usage Analytics | MCP server/tool usage |
| Error & Anomaly Detection | Error monitoring |
| Subagent Analytics | Agent invocations |
| Session Health Overview | Session activity |
Configuration Options
Trace Sampling
Control telemetry volume with sampling:
# Development - capture everything
export OTEL_TRACES_SAMPLER="always_on"
# Production - 10% sampling
export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.1"
Debug Mode
Enable verbose SDK logging:
export OTEL_LOG_LEVEL="debug"
Circuit Breaker
Built-in protection when SigNoz is unreachable:
- Trips after 3 consecutive failures
- Resets after 60 seconds
- Fails fast to prevent request blocking
Troubleshooting
No data in SigNoz
- Check
OTEL_ENABLED="true"andSIGNOZ_ENABLED="true" - Verify
SIGNOZ_INGESTION_KEYis set - Enable debug:
OTEL_LOG_LEVEL="debug" - Check local files exist in
~/.claude/telemetry/
High latency on exports
- Ensure compression:
OTEL_EXPORTER_OTLP_COMPRESSION="gzip" - Reduce sampling in production
- Check for circuit breaker trips in logs
Missing LLM traces
- Verify
initLangtrace()called at startup - Check
LANGTRACE_WRITE_TO_FILE="true"for local debugging - Confirm provider is supported (OpenAI, Anthropic, Bedrock, etc.)
Related Documentation
- MCP Tools:
signoz_list_services,signoz_search_metric_by_text