What is a token?
A token is a chunk of text as processed by a language model's tokenizer. Tokens are roughly 4 characters or 0.75 words in English. The word "tokenization" is typically 3–4 tokens. A sentence of 10 words is roughly 13 tokens.
Token counts vary between models because each uses a different tokenizer. This tool uses a close approximation based on the standard ratio of ~4 characters per token. For exact counts use the official tokenizer libraries: tiktoken for OpenAI models, or the Anthropic/Google SDKs.
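The ~4 characters per token approximation described above can be sketched in a few lines. This is a minimal illustration of the heuristic, not the tool's actual implementation; for exact counts, use the vendor tokenizers named above.

```python
def estimate_tokens(text: str) -> int:
    """Approximate token count using the ~4 characters per token heuristic."""
    return max(1, round(len(text) / 4))

# "tokenization" is 12 characters -> estimate of 3 tokens,
# consistent with the 3-4 tokens mentioned above.
print(estimate_tokens("tokenization"))  # 3
```

The heuristic is tuned for typical English prose; code, non-English text, and whitespace-heavy input can deviate well beyond the usual 5–10% error.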
Frequently asked questions
How many tokens is 1000 words?
Approximately 1,333 tokens. The general rule is 1 token ≈ 0.75 words, so 1,000 words ÷ 0.75 ≈ 1,333 tokens. This varies by language and content type.
What is a context window?
The context window is the maximum number of tokens a model can process in a single request — including both your input and the model's output. If your prompt plus expected output exceeds the context window, the request will fail or be truncated.
Are input and output tokens priced the same?
No. Output tokens are typically 3–5x more expensive than input tokens because generating tokens requires more computation than reading them. This tool shows input token costs — multiply by 3–5x for output cost estimates.
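The input/output pricing asymmetry can be turned into a rough cost estimator. The price and the 4x output multiplier below are illustrative assumptions, not published rates for any model:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_mtok: float,
                  output_multiplier: float = 4.0) -> float:
    """Rough request cost in dollars.

    input_price_per_mtok: hypothetical input price per 1M tokens.
    output_multiplier: output tokens assumed 3-5x input price (default 4x).
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = output_tokens / 1_000_000 * input_price_per_mtok * output_multiplier
    return input_cost + output_cost

# 10K input + 1K output at a hypothetical $3/1M input rate:
# 0.03 input + 0.012 output ≈ $0.042 total
print(estimate_cost(10_000, 1_000, 3.0))
```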
How accurate is this token counter?
This tool provides close approximations using the ~4 characters per token heuristic. For exact token counts use tiktoken (OpenAI), the Anthropic SDK's count_tokens method, or Google's countTokens API. Accuracy is typically within 5–10%.
Token rules of thumb
~4 chars = 1 token
~0.75 words = 1 token
1 page (~500 words) ≈ 667 tokens
1 book (~80K words) ≈ 107K tokens
Output tokens cost 3–5x more
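The word-based rules of thumb above reduce to one division. A quick sketch, using the same ~0.75 words per token ratio:

```python
def tokens_from_words(words: int) -> int:
    """Estimate tokens from a word count using 1 token ≈ 0.75 words."""
    return round(words / 0.75)

print(tokens_from_words(1_000))   # ~1,333 tokens
print(tokens_from_words(500))     # one page: ~667 tokens
print(tokens_from_words(80_000))  # one book: ~107K tokens
```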