AI Pirates
concept

Token

AI Basics

// Description

A token is the smallest unit into which a Large Language Model breaks down text. Instead of whole words, LLMs process tokens: word pieces typically 3–4 characters long. "Marketing" becomes "Mark" + "eting," while short words like "the" are a single token. English averages about 1.3 tokens per word; German, about 1.5.
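The per-word ratios above can be turned into a quick back-of-the-envelope estimator. This is a heuristic sketch, not a real tokenizer; actual counts depend on the model's tokenizer.

```python
# Rough token estimate from word count, using the heuristic ratios
# from the text (~1.3 tokens/word for English, ~1.5 for German).
# These are averages, not exact counts -- real tokenizers differ.

TOKENS_PER_WORD = {"en": 1.3, "de": 1.5}

def estimate_tokens(text: str, lang: str = "en") -> int:
    """Estimate the token count of `text` for a given language."""
    words = len(text.split())
    return round(words * TOKENS_PER_WORD[lang])

print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # 9 words -> 12
```

For billing-accurate numbers, use the model provider's tokenizer instead of this rule of thumb.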

Tokens determine both the costs and limits of AI applications. API prices are calculated per million tokens: GPT-5.2 costs $1.75 input / $14 output, Claude Opus 4.6 costs $15/$75, Gemini 3.1 Pro costs $1.25/$5. The context window — how much text a model can process at once — is also measured in tokens.
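A per-request cost follows directly from the per-million-token prices quoted above. A minimal sketch; prices change frequently, so treat the table as a snapshot and check the provider's current pricing page.

```python
# API cost per request, using the per-million-token prices quoted in
# the text. (input, output) USD per 1M tokens.

PRICES_USD_PER_MTOK = {
    "gpt-5.2":         (1.75, 14.00),
    "claude-opus-4.6": (15.00, 75.00),
    "gemini-3.1-pro":  (1.25, 5.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request with the given token counts."""
    inp, out = PRICES_USD_PER_MTOK[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# 2,000 input + 500 output tokens on GPT-5.2:
print(f"${request_cost('gpt-5.2', 2000, 500):.4f}")  # $0.0105
```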

Tokenization uses algorithms like BPE (Byte Pair Encoding) that merge frequent character sequences into single tokens. Different models use different tokenizers — the same text may have different token counts in GPT vs. Claude. OpenAI's tiktoken library enables exact pre-calculation.
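The core BPE idea can be shown in a few lines: repeatedly merge the most frequent adjacent pair of symbols into one token. This is a simplified teaching sketch of a single merge step, not OpenAI's actual tokenizer; real implementations (tiktoken, SentencePiece) add byte-level handling, special tokens, and pre-trained merge tables.

```python
from collections import Counter

def most_frequent_pair(tokens: list[str]) -> tuple[str, str]:
    """Find the most frequent adjacent symbol pair."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens: list[str], pair: tuple[str, str]) -> list[str]:
    """Merge every occurrence of `pair` into a single token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# One training step on a tiny corpus: the most frequent pair gets merged.
tokens = list("marketing market")
print(merge_pair(tokens, most_frequent_pair(tokens)))
```

Trained on enough text, repeated merges are what turn "Mark" and "eting" into vocabulary entries of their own.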

Practical takeaway: long prompts (system + context + question) consume input tokens, the response consumes output tokens (which are more expensive). Efficient Prompt Engineering not only saves costs but also leaves more room in the context window for relevant information.
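The budgeting logic above can be made concrete: reserve tokens for the system prompt, the question, and the expected response, and whatever remains is room for context. The window size here is an illustrative assumption, not a specific model's limit.

```python
# Simple token budget for a prompt: system + question + reserved output
# all come out of the same context window; the rest is available for
# retrieved context. 128k is an assumed window size for illustration.

CONTEXT_WINDOW = 128_000  # assumed context window, in tokens

def remaining_context_budget(system_tokens: int,
                             question_tokens: int,
                             max_output_tokens: int) -> int:
    """Tokens left over for context documents after fixed reservations."""
    used = system_tokens + question_tokens + max_output_tokens
    return CONTEXT_WINDOW - used

print(remaining_context_budget(500, 200, 4_000))  # -> 123300
```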

// Use Cases

  • API cost calculation
  • Context window management
  • Prompt optimization
  • Budget planning for AI projects
  • Text chunking for RAG
  • Model selection by cost efficiency
  • Token limit monitoring
  • Batch processing optimization
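One of the use cases above, text chunking for RAG, is usually driven by token limits. A minimal sketch using the crude ~4-characters-per-token rule of thumb instead of a real tokenizer; chunk and overlap sizes are illustrative defaults.

```python
# Illustrative token-based chunking for RAG: split text into chunks of
# roughly `chunk_tokens` tokens with `overlap_tokens` of overlap,
# approximating tokens as ~4 characters each.

def chunk_text(text: str, chunk_tokens: int = 500,
               overlap_tokens: int = 50) -> list[str]:
    chars_per_token = 4                       # rough average for English
    size = chunk_tokens * chars_per_token     # chunk length in characters
    step = (chunk_tokens - overlap_tokens) * chars_per_token
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = chunk_text("lorem ipsum " * 2000)    # 24,000 characters
print(len(chunks), "chunks")
```

Production pipelines typically count tokens with the target model's tokenizer and split on sentence or paragraph boundaries rather than raw character offsets.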

// AI Pirates Assessment

Understanding tokens is essential for budget planning. We use affordable models (GPT-4o-mini, Haiku) for routine tasks and frontier models only where quality demands it. This keeps our API costs under control.

// Frequently Asked Questions

What is a token in AI?
A token is the basic unit into which text is broken down for AI models. It can be a word, word part, or punctuation mark. An English word averages ~1.3 tokens. Tokens determine the costs and limits of AI applications.
How much does a token cost?
Prices vary by model and are calculated per million tokens. GPT-5.2: $1.75 input / $14 output. Claude Haiku: $0.80/$4. GPT-4o-mini: $0.15/$0.60. A typical 1,000-word blog article ≈ 1,300 tokens, costing just a few cents.
What's the difference between input and output tokens?
Input tokens are the text you send to the model (prompt, system instruction, context). Output tokens are the generated response. Output tokens are 4–8× more expensive than input tokens because generation is more compute-intensive.
How can you optimize token costs?
Write efficient prompts (no unnecessary context), use cheaper models for simple tasks (GPT-4o-mini instead of GPT-5.2), employ caching for repeated requests, and use batch processing for bulk operations.
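The "cheaper models for simple tasks" advice can be sketched as a routing rule. The keyword heuristic, threshold logic, and model pairing here are placeholders for illustration; real routers classify task complexity more robustly.

```python
# Sketch of simple model routing: send routine tasks to a cheap model
# and reserve the frontier model for complex ones. Keywords and model
# names are illustrative assumptions, not a fixed recommendation.

CHEAP_MODEL = "gpt-4o-mini"
FRONTIER_MODEL = "gpt-5.2"

def pick_model(task: str,
               complex_keywords=("analyze", "strategy", "legal")) -> str:
    """Route a task description to a model tier by keyword heuristic."""
    is_complex = any(kw in task.lower() for kw in complex_keywords)
    return FRONTIER_MODEL if is_complex else CHEAP_MODEL

print(pick_model("Summarize this email"))          # gpt-4o-mini
print(pick_model("Analyze our pricing strategy"))  # gpt-5.2
```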

Need help with Token?

We are happy to advise you on deployment, integration and strategy.

Get in touch