
Freeze the Web Into a Single Self-Contained Binary

Kage uses headless Chrome to clone websites, strip their JavaScript, and pack them into zero-dependency executables.

Lenn Voss

Cloud & Infrastructure Writer · Jun 14, 2026 · 4 min read

We have all been there. You hit "Save As" on a valuable technical resource, a documentation page, or an essay, only to open it six months later and find a blank screen, a broken layout, or an infinite loading spinner. The modern web is no longer made of documents; it is made of thin clients executing complex, ephemeral JavaScript. When the third-party API or tracking server goes dark, the saved page dies with it.

Enter Kage (影, meaning "shadow"), an open-source tool written in Go designed to solve this exact problem. Instead of simply downloading raw source HTML, Kage drives a real browser to capture a fully rendered snapshot of a website, strips out every single line of JavaScript, and packages the remaining static assets into a single, self-contained binary or archive that you can run offline forever.

The Headless Execution Strategy

Traditional web scrapers often fail on modern single-page applications (SPAs) because they do not execute client-side JavaScript. Kage takes the opposite approach. It spins up a headless instance of Chromium or Chrome, navigates to the target URL, and waits for the page to settle.

Once the page has fully rendered, Kage snapshots the DOM exactly as a human reader would see it. It then performs a series of sanitization steps:

JavaScript Stripping: Every <script> tag and inline event handler is completely removed.
Asset Localization: It rewrites URLs for CSS, images, and fonts, downloading those assets to local paths.
Zero Network Footprint: The resulting files run zero code, make no external API calls, and contain no tracking scripts.
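The stripping step can be illustrated with a short Go sketch. This is not Kage's code: the function name stripJS and the regular expressions are assumptions made for illustration, and a production tool would more plausibly walk a parsed DOM than pattern-match on text. The sketch also ignores other script vectors, such as javascript: URLs.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Matches a whole <script>...</script> element, case-insensitively and across lines.
	scriptElem = regexp.MustCompile(`(?is)<script\b[^>]*>.*?</script\s*>`)
	// Matches inline handler attributes such as onclick="..." or onload='...'.
	// Simplification: it can also match look-alike text outside of tags.
	eventAttr = regexp.MustCompile(`(?i)\s+on[a-z]+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)`)
)

// stripJS (hypothetical helper) removes script elements and inline
// event-handler attributes from an HTML string.
func stripJS(html string) string {
	html = scriptElem.ReplaceAllString(html, "")
	return eventAttr.ReplaceAllString(html, "")
}

func main() {
	in := `<body onload="init()"><script src="a.js"></script><p onclick='x()'>Hi</p></body>`
	fmt.Println(stripJS(in)) // prints <body><p>Hi</p></body>
}
```

Because the input is the rendered DOM rather than the raw source, content that JavaScript produced before the snapshot survives even though the scripts themselves are gone.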

Because Kage relies on a real browser, it requires Chrome or Chromium on the host system. It automatically detects system installations, but developers can specify a custom path using the --chrome flag or the KAGE_CHROME environment variable. For environments without a local browser, Kage is also distributed as a Docker container that bundles Chromium out of the box.
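A lookup like this is commonly written as a short precedence chain. The Go sketch below assumes that the --chrome flag wins over KAGE_CHROME, which in turn wins over auto-detection; the article's source does not state the actual order, and the candidate binary names are guesses rather than Kage's real list.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// errNotFound is returned when no browser can be located.
var errNotFound = errors.New("no Chrome or Chromium binary found")

// candidateNames is an assumed list of common binary names, not Kage's own.
var candidateNames = []string{"google-chrome", "chromium", "chromium-browser", "chrome"}

// resolveChrome (hypothetical) picks a browser path: explicit flag value first,
// then the KAGE_CHROME environment variable, then a search of the PATH.
// getenv and lookPath are injected so the logic can be exercised without a real system.
func resolveChrome(flagPath string, getenv func(string) string, lookPath func(string) (string, error)) (string, error) {
	if flagPath != "" {
		return flagPath, nil
	}
	if p := getenv("KAGE_CHROME"); p != "" {
		return p, nil
	}
	for _, name := range candidateNames {
		if p, err := lookPath(name); err == nil {
			return p, nil
		}
	}
	return "", errNotFound
}

func main() {
	path, err := resolveChrome("", os.Getenv, exec.LookPath)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using browser at", path)
}
```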

A Polite, Idempotent Crawler

Cloning a single page is useful, but archiving an entire site requires a robust crawling engine. Kage implements a breadth-first crawler that is designed to be both efficient and polite.

By default, the crawler respects robots.txt rules and seeds its queue from the site's sitemap.xml. It is also idempotent: pages are keyed by the files they write, so URLs that map to the same output path (for example, the HTTP and HTTPS variants of a page, or the same path with and without a trailing slash) are fetched only once.
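Keying pages by their output file can be sketched in a few lines of Go. The pageKey function and its rules are assumptions for illustration, not Kage's actual mapping: the scheme is dropped, the host is lowercased, and any path without a file extension is treated as a directory that gets an index.html. Query strings are ignored here, which a real tool would need to decide on.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// pageKey (hypothetical) maps a URL to the relative file it would be written to.
// URLs that differ only by scheme, host case, or a trailing slash share one key.
func pageKey(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	p := u.Path
	if p == "" {
		p = "/"
	}
	p = path.Clean(p) // also removes a trailing slash
	if path.Ext(p) == "" {
		// Assumption: extension-less paths are directories.
		p = path.Join(p, "index.html")
	}
	return strings.ToLower(u.Host) + p, nil
}

func main() {
	seen := map[string]bool{}
	for _, raw := range []string{
		"http://example.com/docs/",
		"https://example.com/docs",
		"https://example.com/docs/intro.html",
	} {
		key, err := pageKey(raw)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		if seen[key] {
			fmt.Println("duplicate, not fetched:", raw)
			continue
		}
		seen[key] = true
		fmt.Println("fetch:", raw, "->", key)
	}
}
```

In this run the second URL is reported as a duplicate, because both /docs/ and /docs resolve to example.com/docs/index.html.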

If you interrupt a crawl with Ctrl-C, Kage gracefully saves its state. Running the command again resumes the crawl from where it stopped. For updating existing archives, the --refresh flag re-renders pages in place to capture updates, while --force wipes the local mirror and starts clean.
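One plausible way to get this behaviour is to persist the pending queue when an interrupt arrives and reload it on the next run. The Go sketch below is an illustration under that assumption; the crawlState type, the JSON layout, and the state.json file name are hypothetical and do not describe Kage's real on-disk format.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"
	"os/signal"
	"path/filepath"
)

// crawlState (hypothetical) records finished pages and pages still queued.
type crawlState struct {
	Done  []string `json:"done"`
	Queue []string `json:"queue"`
}

// process handles queued pages until the queue is empty or interrupted reports true.
func process(s *crawlState, interrupted func() bool) {
	for len(s.Queue) > 0 && !interrupted() {
		next := s.Queue[0]
		s.Queue = s.Queue[1:]
		// A real crawler would render and write the page here.
		s.Done = append(s.Done, next)
	}
}

// saveState writes to a temp file and renames it, so an interrupted save
// is unlikely to leave a half-written state file behind.
func saveState(file string, s crawlState) error {
	data, err := json.MarshalIndent(s, "", "  ")
	if err != nil {
		return err
	}
	tmp := file + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, file)
}

// loadState returns an empty state when no file exists yet (a fresh crawl).
func loadState(file string) (crawlState, error) {
	var s crawlState
	data, err := os.ReadFile(file)
	if os.IsNotExist(err) {
		return s, nil
	}
	if err != nil {
		return s, err
	}
	err = json.Unmarshal(data, &s)
	return s, err
}

// roundTrip saves and reloads a state in a throwaway directory.
func roundTrip(s crawlState) (crawlState, error) {
	dir, err := os.MkdirTemp("", "crawl-state")
	if err != nil {
		return crawlState{}, err
	}
	defer os.RemoveAll(dir)
	file := filepath.Join(dir, "state.json")
	if err := saveState(file, s); err != nil {
		return crawlState{}, err
	}
	return loadState(file)
}

func main() {
	// ctx is cancelled when the process receives Ctrl-C (os.Interrupt).
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	s := crawlState{Queue: []string{"/", "/docs", "/blog"}}
	process(&s, func() bool { return ctx.Err() != nil })

	restored, err := roundTrip(s)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("done=%v queue=%v\n", restored.Done, restored.Queue)
}
```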

To handle modern web design patterns, Kage includes several specialized crawling flags:

--scroll: Automatically scrolls down each page during rendering to trigger lazy-loaded images and dynamic content.
--workers: Controls concurrency (defaulting to 4 parallel workers).
--max-depth and --max-pages: Prevent infinite crawl loops on highly dynamic sites.
--scope-prefix: Restricts the crawl to specific subpaths (e.g., /doc).
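To show how depth, page, and scope limits interact with a breadth-first queue, here is a minimal Go sketch. The options struct only mirrors the flag names above; the exact semantics (for instance, that depth 0 means "start page only") are assumptions, the links function stands in for rendering a page and extracting its links, and the loop is sequential rather than spread across workers.

```go
package main

import (
	"fmt"
	"strings"
)

// options mirrors the flags described above; the semantics are assumed.
type options struct {
	maxDepth    int    // links are followed up to this many hops from the start
	maxPages    int    // hard cap on pages visited
	scopePrefix string // only paths with this prefix are queued
}

type item struct {
	path  string
	depth int
}

// crawl visits pages breadth-first and returns them in visit order.
func crawl(start string, links func(string) []string, opt options) []string {
	seen := map[string]bool{start: true}
	queue := []item{{start, 0}}
	var visited []string

	for len(queue) > 0 && len(visited) < opt.maxPages {
		cur := queue[0]
		queue = queue[1:]
		visited = append(visited, cur.path)

		if cur.depth >= opt.maxDepth {
			continue // visit this page, but do not follow its links
		}
		for _, next := range links(cur.path) {
			if seen[next] || !strings.HasPrefix(next, opt.scopePrefix) {
				continue
			}
			seen[next] = true
			queue = append(queue, item{next, cur.depth + 1})
		}
	}
	return visited
}

func main() {
	// A tiny in-memory site so the sketch runs without network access.
	graph := map[string][]string{
		"/doc":        {"/doc/a", "/doc/b", "/blog"},
		"/doc/a":      {"/doc/a/deep", "/doc"},
		"/doc/b":      {"/doc/a"},
		"/doc/a/deep": {"/doc/a/deeper"},
	}
	links := func(p string) []string { return graph[p] }

	pages := crawl("/doc", links, options{maxDepth: 2, maxPages: 10, scopePrefix: "/doc"})
	fmt.Println(pages) // prints [/doc /doc/a /doc/b /doc/a/deep]
}
```

The seen set is what keeps the cycle between /doc and /doc/a from looping, while the depth and page caps bound sites that generate new links indefinitely.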

Packaging to ZIM and Self-Serving Binaries

Once a site is cloned, Kage writes a standard directory structure of HTML, CSS, and images. You can preview this folder locally using kage serve, which launches a lightweight static file server on port 8800.

However, managing thousands of loose files is not ideal for long-term archiving or sharing. Kage solves this with its pack command, which compresses the entire cloned directory into a single file.

Developers can choose between two output formats:

ZIM Archives: A standard format for offline content. These can be read back using kage open <file.zim> or other third-party ZIM readers.
Self-Contained Binaries: By passing the --format binary flag, Kage compiles the cloned site and a minimal web server into a single executable binary.

This compiled binary has zero external dependencies. You can copy it to a server, hand it to a colleague, or store it on a thumb drive. Running the binary immediately spins up a local server and hosts the archived site, so your documentation or reference material stays readable long after the original host has disappeared.
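In Go, the usual building blocks for a binary like this are an embedded filesystem and the standard library's file server. The sketch below shows that general pattern, not Kage's implementation: fstest.MapFS stands in for an embed.FS that a real build would populate with a //go:embed directive, and the listen address in the final comment is an assumption.

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"testing/fstest"
)

// site stands in for an embed.FS filled at build time; an in-memory
// filesystem keeps this sketch runnable without any files on disk.
var site fs.FS = fstest.MapFS{
	"index.html":      &fstest.MapFile{Data: []byte("<h1>Archived home</h1>")},
	"docs/index.html": &fstest.MapFile{Data: []byte("<h1>Archived docs</h1>")},
}

// newHandler serves the archived files as static content.
func newHandler(content fs.FS) http.Handler {
	return http.FileServer(http.FS(content))
}

// fetch issues an in-memory GET so the sketch runs without opening a port.
func fetch(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	body, _ := io.ReadAll(rec.Result().Body)
	return rec.Code, string(body)
}

func main() {
	h := newHandler(site)
	code, body := fetch(h, "/docs/")
	fmt.Println(code, body) // prints 200 <h1>Archived docs</h1>

	// A real self-serving binary would block on a listener instead, e.g.:
	//   log.Fatal(http.ListenAndServe("localhost:8800", h)) // address is an assumption
}
```

Since the pages and the server live in one executable, nothing needs to be unpacked or installed before the site can be browsed.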

Sources & further reading

Show HN: Kage – Shadow any website to a single binary for offline viewing — github.com


