AI/LLM Cost Monitoring
Track every AI API call — costs, tokens, and latency across OpenAI, Anthropic, and more.
AI API calls are expensive. One runaway loop or a bad prompt can cost hundreds of dollars before you notice. DeepTracer tracks every LLM call your app makes — so you always know what you're spending and where.
What gets tracked
For every AI API call, DeepTracer records:
- Provider — OpenAI, Anthropic, Google, Mistral, and others
- Model — gpt-4o, claude-3-haiku, gemini-pro, etc.
- Token counts — Input tokens and output tokens
- Cost — Calculated from model pricing
- Latency — How long the call took
- Service — Which part of your app made the call
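The recorded fields above can be sketched as a TypeScript shape. This is an illustration of what a tracked call carries, not DeepTracer's actual schema — the field names here are assumptions:

```ts
// Illustrative record of one tracked LLM call (hypothetical field names,
// not DeepTracer's real schema)
interface LLMCallRecord {
  provider: string     // e.g. "openai", "anthropic"
  model: string        // e.g. "gpt-4o", "claude-3-haiku"
  inputTokens: number  // tokens sent in the prompt
  outputTokens: number // tokens in the completion
  costUsd: number      // derived from per-model pricing
  latencyMs: number    // wall-clock duration of the call
  service: string      // which part of your app made the call
}

const example: LLMCallRecord = {
  provider: "openai",
  model: "gpt-4o",
  inputTokens: 150,
  outputTokens: 320,
  costUsd: 0.0036,
  latencyMs: 1200,
  service: "summarizer",
}
```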
Dashboard features
- Cost breakdown — Total spend by model, provider, service, or day
- Daily usage charts — Token counts and cost over time
- Per-request details — See individual LLM calls with full metadata
- Spike detection — Spot unusual usage before your bill gets out of hand
Quick setup with Vercel AI SDK
If you're using the Vercel AI SDK, add tracking in one line:
```ts
import { wrapVercelAI } from "@deeptracer/ai"
import { generateText } from "ai"
import { openai } from "@ai-sdk/openai"

const ai = wrapVercelAI(logger, { generateText })

// Use ai.generateText instead of generateText — that's the only change
const result = await ai.generateText({
  model: openai("gpt-4o"),
  prompt: "Summarize this article...",
})
```

Every call through the wrapper is automatically tracked — model, tokens, cost, and latency.
Other providers
DeepTracer has wrappers for the major AI SDKs:
Vercel AI SDK:

```ts
import { wrapVercelAI } from "@deeptracer/ai"
import { generateText, streamText } from "ai"
import { openai } from "@ai-sdk/openai"

const ai = wrapVercelAI(logger, { generateText, streamText })

const result = await ai.generateText({ model: openai("gpt-4o"), prompt: "Hello" })
```

OpenAI:

```ts
import { wrapOpenAI } from "@deeptracer/ai"
import OpenAI from "openai"

const openai = wrapOpenAI(logger, new OpenAI())

const result = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
})
```

Anthropic:

```ts
import { wrapAnthropic } from "@deeptracer/ai"
import Anthropic from "@anthropic-ai/sdk"

const anthropic = wrapAnthropic(logger, new Anthropic())

const result = await anthropic.messages.create({
  model: "claude-3-haiku-20240307",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello" }],
})
```

Manual tracking
If you're using a provider without a wrapper, or making raw HTTP calls to an AI API, log usage manually:
```ts
logger.llmUsage({
  model: "gpt-4o",
  provider: "openai",
  operation: "chat",
  inputTokens: 150,
  outputTokens: 320,
  latencyMs: 1200,
})
```

When you use a wrapper function, you don't need to call llmUsage manually — it's handled for you.
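Since llmUsage takes token counts rather than a dollar amount, cost is derived from per-model pricing. A minimal sketch of that calculation, assuming a hypothetical pricing table (the rates below are placeholders — real rates come from each provider's pricing page, and this is not DeepTracer's internal code):

```ts
// Hypothetical pricing table in USD per 1M tokens — placeholder rates,
// not DeepTracer's actual pricing data
const PRICING: Record<string, { inputPerM: number; outputPerM: number }> = {
  "gpt-4o": { inputPerM: 2.5, outputPerM: 10 },
  "claude-3-haiku-20240307": { inputPerM: 0.25, outputPerM: 1.25 },
}

// Cost = (input tokens / 1M) * input rate + (output tokens / 1M) * output rate
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model]
  if (!p) return 0 // unknown model: no estimate
  return (inputTokens / 1_000_000) * p.inputPerM + (outputTokens / 1_000_000) * p.outputPerM
}

// The llmUsage example above: 150 input + 320 output tokens on gpt-4o
const cost = estimateCostUsd("gpt-4o", 150, 320)
```

This is why accurate token counts matter more than anything else you log: the dollar figure is just arithmetic on top of them.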
Why this matters
A few common scenarios where LLM monitoring saves you money:
- Runaway loops — A recursive agent calls GPT-4o 200 times instead of 5. You catch it in minutes, not on your next invoice.
- Model selection — You discover that 80% of your calls work fine with a cheaper model. Switch and save.
- Prompt optimization — Your longest prompts cost 10x more than they need to. Trim them with data, not guesswork.