Guide: Token & context budget
What is this tool?
A free LLM context window calculator and prompt token budget helper. Split your prompt into labeled blocks (system, user, RAG, extra), count tokens and characters per block and in total, then compare against a model context limit with an optional reserved completion budget. See whether you are under, near, or over budget — and which section is largest. Client-side in the browser; use your provider's tokenizer for billing.
Prompt sections
- System / instructions — Policies, persona, tool rules.
- User message — The main user query or task.
- RAG / retrieved context — Pasted chunks, citations, knowledge base text.
- Extra — Tool JSON, function definitions, or anything else you bill as input.
The UI highlights the largest block so you know where to trim first when you are over the limit.
Tokenizer modes
Same encodings as the token calculator:
- Approximate — UTF-8 bytes ÷ 4 (fast planning heuristic, not BPE-exact).
- cl100k_base — Tiktoken; common for GPT-3.5 / GPT-4–class chat models.
- o200k_base — Tiktoken; closer for GPT-4o / newer o-series–style vocabularies.
Other vendors (Anthropic, Gemini, etc.) use different tokenizers — treat counts as planning, not guaranteed billable tokens.
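The Approximate mode can be sketched in a few lines. This is an illustrative TypeScript version of the bytes ÷ 4 heuristic, not the tool's exact code:

```typescript
// Rough token estimate: UTF-8 byte length divided by 4.
// Illustrative only -- real BPE tokenizers (cl100k_base, o200k_base) differ.
function approxTokens(text: string): number {
  const bytes = new TextEncoder().encode(text).length; // UTF-8 byte count
  return Math.ceil(bytes / 4);
}
```

English prose averages roughly four UTF-8 bytes per token, which is why the heuristic is serviceable for planning but not for billing.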
Context limit & completion reserve
Pick a context limit (presets from 4K up to 1M tokens, or a custom value) and reserve tokens for the completion — headroom for the model's reply, tool-call payloads, or follow-up turns. The effective budget is limit − reserved. Totals above that show as over budget; high utilization triggers a warning so you can avoid truncation or API errors.
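The budget math above can be sketched directly. The `warnRatio` threshold here is an assumption for illustration; the guide does not state the tool's exact warning cutoff:

```typescript
type BudgetState = "ok" | "warn" | "over";

// effective budget = context limit minus reserved completion tokens.
// warnRatio (assumed 0.9) marks "near budget" -- the tool's real cutoff may differ.
function budgetStatus(
  totalTokens: number,
  contextLimit: number,
  reserve: number,
  warnRatio = 0.9
): BudgetState {
  const effective = contextLimit - reserve;
  if (totalTokens > effective) return "over";
  if (totalTokens >= effective * warnRatio) return "warn";
  return "ok";
}
```

For example, with an 8,192-token limit and 1,024 reserved, the effective budget is 7,168, so a 7,000-token prompt lands in the warning zone.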
Features
- Four text areas — Independent token/char counts per section.
- Summary bar — Total tokens/chars, progress vs budget, over/warn/ok states.
- Largest block — Called out in the summary for quick trimming.
- Share — Copy a URL encoding your inputs (for collaboration or bookmarks).
How to use
- Paste system, user, RAG, and extra text into the matching panels.
- Choose tokenizer — Match the mode you use for estimates (approx vs cl100k vs o200k).
- Set context limit — Preset or custom; align with your model's window.
- Reserve completion tokens — Room for the answer (and tools if needed).
- Adjust — If over budget, shrink the largest section or raise the limit.
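The steps above can be sketched end-to-end. The section names mirror the four panels, and `approxTokens` is the bytes ÷ 4 heuristic described earlier; both are illustrative, not the tool's internals:

```typescript
// Bytes / 4 heuristic (illustrative, not BPE-exact).
function approxTokens(text: string): number {
  return Math.ceil(new TextEncoder().encode(text).length / 4);
}

interface Sections {
  system: string;
  user: string;
  rag: string;
  extra: string;
}

// Count each panel, total them, and name the largest block to trim first.
function summarize(sections: Sections, contextLimit: number, reserve: number) {
  const counts = Object.entries(sections).map(([name, text]) => ({
    name,
    tokens: approxTokens(text),
  }));
  const total = counts.reduce((sum, c) => sum + c.tokens, 0);
  const largest = counts.reduce((a, b) => (b.tokens > a.tokens ? b : a));
  const effective = contextLimit - reserve;
  return { total, largest: largest.name, overBudget: total > effective };
}
```

If `overBudget` is true, the `largest` field tells you which panel to compress first, which is the same trimming workflow the summary bar supports.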
Use cases
| Scenario | How this helps |
|---|---|
| RAG pipelines | See whether retrieved chunks + system + user fit before the request hits the API. |
| Long system prompts | Isolate system tokens vs user message to decide what to compress. |
| Agents & tools | Park tool schemas and JSON in Extra and watch total vs reserved completion space. |
| Teaching & docs | Demonstrate context windows and why “prompt length” matters. |
Limits
- Not billing-accurate for non–OpenAI-compatible tokenizers.
- API wrappers may add hidden tokens (chat templates); counts here are raw pasted text only.
- Share URLs encode state in the link — avoid sharing secrets.
Related terms
People search for context window calculator, LLM prompt token counter, RAG token budget, system prompt token count, max context tokens, tiktoken context budget, cl100k prompt size, and how many tokens is my prompt. This tool answers that with split sections and a clear limit line.
FAQ
Is Token & context budget free?
Yes. Counting runs in your browser.
What is the difference vs the token calculator?
This page splits the prompt into system / user / RAG / extra sections and compares the total to a context limit with reserved completion space. The token calculator focuses on a single blob plus an illustrative cost estimate.
Why am I “over budget” but my API still works?
Your provider may use a different tokenizer, template, or a larger effective window. Use this as a planning signal, then confirm in their dashboard or logs.
What does Share do?
It copies a URL that encodes your sections and settings so you can reopen the same scenario later.
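URL-encoded state like this can be illustrated with standard query parameters. The tool's actual parameter names and encoding scheme are unspecified here, so treat this as a generic sketch:

```typescript
// Encode section text and settings into a shareable URL (illustrative scheme;
// the real tool's parameter names and encoding may differ).
function buildShareUrl(
  base: string,
  state: { system: string; user: string; limit: number; reserve: number }
): string {
  const params = new URLSearchParams({
    system: state.system,
    user: state.user,
    limit: String(state.limit),
    reserve: String(state.reserve),
  });
  return `${base}?${params.toString()}`;
}
```

Because the state lives in the link itself, anything pasted into the sections travels with the URL — the same reason the Limits section warns against sharing secrets.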
Similar tools
Other Spoold utilities that pair with context budgeting:
Conclusion
Use Token & context budget to split prompts and stay inside your model window. For single-block counts and cost hints, use Token calculator; for VRAM, use LLM RAM / VRAM; for latency what-ifs, use Token generation speed.