I’ve been automating browsers for years. Selenium, Puppeteer, Playwright — used them all, watched them all get caught. The arms race between bot detection and browser automation has been going on since the dawn of web scraping, and guess who’s been losing? Every single Chromium-based automation tool on the fucking planet.
The problem isn’t the tools themselves. Playwright is great software. Puppeteer works fine. The problem is Chrome DevTools Protocol — the mechanism they all use to talk to the browser. CDP is how your automation framework says “click this button” or “type in this field.” It’s also how Cloudflare, DataDome, PerimeterX, and every other bot detection service on Earth knows you’re not a real human. You can install stealth plugins, patch navigator.webdriver, spoof fingerprints until your eyes bleed — CDP is still there, and they will find it.
So I built docker-stealthy-auto-browse. Let me be clear — I didn’t invent any of this stealth shit. The heavy lifting is done by Camoufox, Playwright, PyAutoGUI, and browserforge. These are brilliant projects built by people smarter than me. What I did is take all of this shit, wire it together inside a Docker container, and slap an HTTP API on top so you can control the whole thing remotely with curl commands. One container, one endpoint, zero bullshit setup.
The Fundamental Problem With Every Other Approach
Here’s what every Chromium-based automation tool does: it opens Chrome, connects to it via CDP, and sends commands through this protocol. The browser knows. JavaScript running on the page knows. The bot detection service’s script that loaded before your page content? It definitely knows.
You can try to hide it:
- Patch navigator.webdriver to return false — detectors check if it was patched
- Install stealth plugins — detectors check for the side effects of those plugins
- Spoof fingerprints — detectors compare your main-context fingerprint against web worker fingerprints and find inconsistencies
- Use headless mode — detectors check for headless signals
It’s a cat-and-mouse game where the cat has every advantage. CDP leaves traces everywhere — in the JavaScript runtime, in the way events are dispatched, in timing patterns, in the browser’s internal state. You’re trying to pretend a puppet isn’t a puppet while the strings are clearly visible.
The Approach: No Strings At All
docker-stealthy-auto-browse doesn’t hide automation signals. It eliminates them entirely.
Camoufox instead of Chromium. A custom Firefox fork. There is no Chrome DevTools Protocol because Firefox doesn’t use it. Bot detectors looking for CDP signals find nothing — not because we hid them, but because they don’t exist. navigator.webdriver is false — not patched to return false, genuinely false because Camoufox doesn’t set it.
Playwright for browser control. Handles the DOM-level stuff — navigation, element selection, page inspection. The convenient but detectable input mode goes through Playwright. Combined with Camoufox, it doesn’t leak the usual CDP automation signals that Chromium-based setups do.
PyAutoGUI instead of DOM events. When you need stealth, the mouse physically moves across the virtual screen with human-like curves, random jitter, and eased acceleration. When you type, real OS-level keystrokes are generated with randomized delays between characters. The browser receives these as genuine user input. No JavaScript in the world can tell the difference between PyAutoGUI input and a real human sitting at a keyboard.
Real fingerprints via browserforge. The fingerprint is generated once and applied consistently across the main context and web workers. No spoofing means no inconsistencies — a common detection vector that catches most fingerprint-spoofing tools.
Xvfb for a real display. The browser runs with a full graphical display inside the container via a virtual framebuffer. No headless mode, no headless signals. As far as the browser and any detection script is concerned, it’s running on a normal desktop.
My contribution is the glue — a Python HTTP API that wires all of these together, the Docker container that packages everything into a single docker run command, the page loader system for URL-triggered automation, the dual input mode abstraction (system vs playwright), and the noVNC integration for live viewing. The stealth tech is other people’s genius. The packaging and API is mine.
How It Works
You run the container, it exposes an HTTP API on port 8080. Send JSON commands, get JSON responses. That’s the entire interface.
```bash
docker run -d --name browser \
  -p 8080:8080 \
  -p 5900:5900 \
  psyb0t/stealthy-auto-browse
```

Port 8080 is the API. Port 5900 is a noVNC viewer so you can watch the browser in real time from your own browser — open http://localhost:5900/ and you see exactly what the automated browser sees.
Navigate somewhere:
```bash
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"action": "goto", "url": "https://example.com"}'
```

Read the page:
```bash
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"action": "get_text"}'
```

Find every clickable thing on the page:
```bash
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"action": "get_interactive_elements"}'
```

This returns every button, link, and input with their viewport coordinates, text, and CSS selectors. Now click one with a real mouse movement:
```bash
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"action": "system_click", "x": 500, "y": 300}'
```

Type with real keystrokes:
```bash
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"action": "system_type", "text": "hello world"}'
```

Take a screenshot:
```bash
curl "http://localhost:8080/screenshot/browser?whLargest=512" -o screenshot.png
```

Two Input Modes — This Matters
The container gives you two ways to interact with pages, and picking the right one is the difference between getting through and getting blocked.
System Input — Undetectable
system_click, mouse_move, system_type, send_key, scroll — these all use PyAutoGUI to generate real OS-level events. The mouse moves with human-like curves. Keystrokes have randomized timing. The browser has zero way to know these aren’t from a real person.
You work with viewport coordinates — get them from get_interactive_elements.
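To make that concrete, here's a sketch of the find-then-click loop. The response shape shown is an assumption for illustration — inspect the actual get_interactive_elements output for the real field names:

```bash
# Find elements, then click one at its reported coordinates.
# The response shape below is assumed for illustration — check the real
# output of get_interactive_elements before relying on these field names.
ELEMENTS=$(curl -s -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"action": "get_interactive_elements"}')

# Suppose the response looks something like:
#   {"elements": [{"text": "Log in", "x": 400, "y": 200, "selector": "#login"}]}
# Pull the first element's coordinates out with jq and click them:
X=$(echo "$ELEMENTS" | jq '.elements[0].x')
Y=$(echo "$ELEMENTS" | jq '.elements[0].y')

curl -s -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d "{\"action\": \"system_click\", \"x\": $X, \"y\": $Y}"
```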
Playwright Input — Detectable But Convenient
click, fill, type — these use Playwright’s DOM automation with CSS selectors or XPath. Faster, easier, no coordinate math. But the event injection patterns are theoretically detectable by sophisticated behavioral analysis.
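For comparison, a minimal sketch of what a Playwright-mode call looks like — the exact JSON field names (selector, text) are my assumption here, so verify against the repo docs:

```bash
# Playwright-mode input: selector-based, no coordinate math.
# Field names ("selector", "text") are assumptions for illustration.
curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"action": "click", "selector": "#login-button"}'

curl -X POST http://localhost:8080 \
  -H "Content-Type: application/json" \
  -d '{"action": "fill", "selector": "input[name=email]", "text": "hello"}'
```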
The rule is simple: site has bot detection? Use system input. Always. Just scraping something that doesn’t fight back? Playwright input is fine.
A Real Login Flow
Here’s what an undetectable login looks like — every interaction uses OS-level input:
```bash
API=http://localhost:8080

# Navigate to login
curl -X POST $API -H 'Content-Type: application/json' \
  -d '{"action": "goto", "url": "https://example.com/login"}'

# Find all interactive elements
curl -X POST $API -H 'Content-Type: application/json' \
  -d '{"action": "get_interactive_elements"}'

# Click the email field (coordinates from above)
curl -X POST $API -H 'Content-Type: application/json' \
  -d '{"action": "system_click", "x": 400, "y": 200}'

# Type email with human-like keystrokes
curl -X POST $API -H 'Content-Type: application/json' \
  -d '{"action": "system_type", "text": "[email protected]"}'

# Tab to password field
curl -X POST $API -H 'Content-Type: application/json' \
  -d '{"action": "send_key", "key": "tab"}'

# Type password
curl -X POST $API -H 'Content-Type: application/json' \
  -d '{"action": "system_type", "text": "secretpassword"}'

# Submit
curl -X POST $API -H 'Content-Type: application/json' \
  -d '{"action": "send_key", "key": "enter"}'

# Wait for redirect
curl -X POST $API -H 'Content-Type: application/json' \
  -d '{"action": "wait_for_url", "url": "**/dashboard", "timeout": 15}'
```

The site sees a real human typing at natural speed with randomized delays. No CDP signals. No automation fingerprints. Nothing.
Page Loaders: Automation on Autopilot
Page loaders are like Greasemonkey userscripts but for the HTTP API. You write a YAML file that says “whenever the browser visits this domain, run these steps automatically.” Mount them into the container and forget about it.
```yaml
# loaders/news_site.yaml
name: News Site Cleanup
match:
  domain: news-site.com
steps:
  - action: goto
    url: "${url}"
    wait_until: networkidle
  - action: wait_for_element
    selector: "article"
    timeout: 10
  - action: eval
    expression: "document.querySelector('.cookie-consent')?.remove()"
  - action: eval
    expression: "document.querySelector('.newsletter-overlay')?.remove()"
  - action: scroll_to_bottom
    delay: 0.3
```

Now every goto to news-site.com automatically waits for content, kills the cookie popup, kills the newsletter modal, and scrolls to trigger lazy-loaded images. No more sending 5 commands after every navigation.
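Wiring loaders in is just a bind mount at startup. The /loaders target path below is my assumption — check the repo README for the actual mount point:

```bash
# Mount a directory of loader YAML files into the container.
# The /loaders target path is an assumption — verify against the repo docs.
docker run -d --name browser \
  -p 8080:8080 \
  -p 5900:5900 \
  -v ./loaders:/loaders \
  psyb0t/stealthy-auto-browse
```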
The Full API
The HTTP API covers everything you’d need:
- Navigation: goto, refresh with configurable wait conditions
- System input: system_click, mouse_move, system_type, send_key, scroll — all OS-level, all undetectable
- Page inspection: get_interactive_elements, get_text, get_html, eval
- Wait conditions: wait_for_element, wait_for_text, wait_for_url, wait_for_network_idle — because sleep is for amateurs
- Tab management: list_tabs, new_tab, switch_tab, close_tab (see the sketch after this list)
- Cookies & storage: full CRUD for cookies, localStorage, sessionStorage
- Downloads & uploads: handle file downloads and programmatic file inputs
- Network logging: record all HTTP requests the page makes — find API endpoints, debug, verify
- Screenshots: browser viewport or full desktop, with resize parameters
- Dialog handling: auto-accept or configure responses to alert/confirm/prompt dialogs
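Here's the tab-management sketch promised above — the payload field names (url, tab_id) are assumptions, so confirm them against the repo docs:

```bash
# Open a second tab, see what's open, then switch back.
# Payload field names ("url", "tab_id") are assumptions for illustration.
curl -X POST http://localhost:8080 -H "Content-Type: application/json" \
  -d '{"action": "new_tab", "url": "https://example.org"}'

curl -X POST http://localhost:8080 -H "Content-Type: application/json" \
  -d '{"action": "list_tabs"}'

curl -X POST http://localhost:8080 -H "Content-Type: application/json" \
  -d '{"action": "switch_tab", "tab_id": 0}'
```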
Two screenshot endpoints give you the browser viewport (what the page looks like) or the full virtual desktop (including browser chrome). Both support resize params so you’re not downloading 1920×1080 PNGs every time:
```bash
# Resize longest side to 512px
curl "http://localhost:8080/screenshot/browser?whLargest=512" -o shot.png

# Full desktop including browser chrome
curl "http://localhost:8080/screenshot/desktop?whLargest=512" -o desktop.png
```

Stealth Configuration That Matters
A few environment variables that actually affect whether you get caught:
Timezone matching. Bot detectors compare your browser’s timezone against your IP’s geolocation. If your IP says Romania but your timezone says UTC, that’s a red flag. Set TZ=Europe/Bucharest (or whatever matches your IP) and this vector disappears.
Proxy support. Route all traffic through a residential proxy with PROXY_URL=http://user:pass@proxy:8888. Combined with timezone matching, you look like a real user from that location.
Persistent profiles. Mount a directory to /userdata and your cookies, localStorage, sessions, and fingerprint survive container restarts. Without this, every restart is a fresh identity — which is sometimes what you want, and sometimes suspicious as fuck.
```bash
docker run -d \
  -e TZ=Europe/Bucharest \
  -e PROXY_URL=http://user:pass@proxy:8888 \
  -v ./my-profile:/userdata \
  -p 8080:8080 \
  -p 5900:5900 \
  psyb0t/stealthy-auto-browse
```

Pre-installed Extensions
Every container ships with privacy extensions already configured:
- uBlock Origin — blocks ads, trackers, and annoyances. Less noise, fewer tracking scripts running
- LocalCDN — intercepts CDN requests and serves resources locally. Google and Cloudflare can’t track you across sites
- ClearURLs — strips tracking parameters (utm_source, fbclid, gclid) from URLs
- Consent-O-Matic — auto-rejects cookie consent popups so you don’t have to deal with that shit
Want more? Mount a persistent profile, open VNC, navigate to about:addons, and install whatever you want. They’ll persist across restarts.
Bot Detection Test Results
Tested against everything that matters, and it passed all of them:
- CreepJS — canvas/WebGL fingerprint consistency, lies detection, worker comparison: pass
- BrowserScan — WebDriver flag, CDP signals, navigator properties: pass
- Pixelscan — fingerprint coherence, timezone/IP match, WebRTC leaks: pass
- Cloudflare — challenge pages, Turnstile, bot management: pass
- DeviceAndBrowserInfo — 19 checks, all green, “You are human!”: pass
- Fingerprint.com — identified as normal Firefox, no bot flags: pass
- Rebrowser — CDP leak detection, webdriver, viewport analysis: pass
It passes because there’s nothing to detect. No CDP to find because Firefox doesn’t have it. No spoofed fingerprints because the fingerprint is real and consistent. No automation flags because navigator.webdriver is genuinely false. No fake input events because PyAutoGUI generates real ones at the OS level.
Built For AI Agents
Here’s the thing nobody talks about with browser automation: the best use case in 2026 isn’t some Python script running a scraping loop. It’s AI agents that need to interact with the web like a human.
I use Claude Code constantly, and half the shit I need it to do involves web pages — filling out forms, checking dashboards, grabbing data from sites that don’t have APIs, interacting with admin panels. The problem with giving an LLM a browser has always been the interface. Selenium? Too complex. Playwright’s API? Too many moving parts. The LLM ends up writing 50 lines of setup code before it can click a single button.
docker-stealthy-auto-browse was designed from the ground up to be AI-friendly. The entire interface is curl commands with JSON. That’s it. An LLM doesn’t need to import libraries, manage browser instances, handle async contexts, or deal with any of that garbage. It just sends HTTP requests.
Think about what an AI agent needs to browse the web:
- Navigate somewhere — one curl to goto
- Understand what's on the page — one curl to get_text. The AI reads the text and knows what it's looking at. If text isn't enough, get_interactive_elements returns every clickable thing with coordinates and labels. If it's still confused, take a screenshot — Claude can read images
- Interact with elements — one curl to system_click with x,y coordinates, one curl to system_type for text input
- Wait for results — one curl to wait_for_text or wait_for_element
- Verify the outcome — one curl to get_text again
No SDK. No driver setup. No browser lifecycle management. The container handles all of that. The AI just talks to an HTTP endpoint.
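That whole loop is a handful of shell commands. Here's a minimal sketch of one agent turn — the wait_for_text payload shape is an assumption on my part:

```bash
API=http://localhost:8080

# 1. Navigate
curl -s -X POST $API -H "Content-Type: application/json" \
  -d '{"action": "goto", "url": "https://example.com"}'

# 2. Read the page so the agent knows what it's looking at
curl -s -X POST $API -H "Content-Type: application/json" \
  -d '{"action": "get_text"}'

# 3. Act — coordinates come from get_interactive_elements
curl -s -X POST $API -H "Content-Type: application/json" \
  -d '{"action": "system_click", "x": 500, "y": 300}'
curl -s -X POST $API -H "Content-Type: application/json" \
  -d '{"action": "system_type", "text": "search query"}'

# 4. Wait for the result instead of sleeping
#    (the "text" field name here is an assumption)
curl -s -X POST $API -H "Content-Type: application/json" \
  -d '{"action": "wait_for_text", "text": "Results", "timeout": 10}'

# 5. Verify the outcome
curl -s -X POST $API -H "Content-Type: application/json" \
  -d '{"action": "get_text"}'
```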
I’ve had Claude Code do things like:
- Log into web dashboards, navigate to specific pages, extract data, and summarize it
- Fill out multi-step forms on sites that require JavaScript rendering
- Monitor pages for changes and alert me when something updates
- Interact with admin panels that have no API — clicking buttons, changing settings, downloading exports
- Research shit on sites that block regular HTTP requests behind Cloudflare
The repo includes a full .claude/INSTRUCTIONS.md that teaches Claude exactly how to use the browser — the two input modes, the typical workflow, every action with examples. Drop this into your Claude Code setup and it knows how to browse the web out of the box. It even includes a pattern for writing reusable bash helper scripts so the AI doesn’t keep repeating the same curl commands over and over:
```bash
# /tmp/browser.sh — Claude writes this once, then sources it
API="${BROWSER_API:-http://localhost:8080}"

browser_goto() { curl -s -X POST "$API" -H "Content-Type: application/json" -d "{\"action\":\"goto\",\"url\":\"$1\"}"; }
browser_click() { curl -s -X POST "$API" -H "Content-Type: application/json" -d "{\"action\":\"system_click\",\"x\":$1,\"y\":$2}"; }
browser_type() { curl -s -X POST "$API" -H "Content-Type: application/json" -d "{\"action\":\"system_type\",\"text\":\"$1\"}"; }
browser_text() { curl -s -X POST "$API" -H "Content-Type: application/json" -d '{"action":"get_text"}'; }
browser_screenshot() { curl -s "$API/screenshot/browser?whLargest=512" -o "${1:-/tmp/screen.png}"; }
```

It's also available as an OpenClaw skill on ClawHub. Install it with clawhub install psyb0t/stealthy-auto-browse and any OpenClaw-compatible AI agent can spin up the browser on demand — it only loads when the agent actually needs to browse something, so it's not wasting resources sitting idle.
The combination of a dead-simple HTTP API, full stealth against bot detection, and built-in AI agent instructions makes this the best browser automation tool for LLMs that I’ve found. And I looked, believe me. Everything else either requires complex SDK setup that confuses the AI, or gets caught by Cloudflare on the first request, or both.
The Bottom Line
Every other browser automation tool is playing defense — hiding CDP signals, patching detection vectors, hoping the next Cloudflare update doesn’t break their stealth plugin. docker-stealthy-auto-browse doesn’t play that game. There’s no CDP to hide. There are no automation signals to patch. The browser genuinely doesn’t know it’s being automated.
One Docker container. One HTTP API. Passes every bot detector I've thrown at it.
Go grab it: github.com/psyb0t/docker-stealthy-auto-browse
Licensed under WTFPL — Do What The Fuck You Want To Public License. Because obviously.