Playground Crawler
When client-side crawl falls short — server-side Chromium for DOM scan, pattern detection, and screenshot review.
When you need it
Client-side crawl isn't enough
JS-heavy SPA, late mount, dynamic content, shadow DOM — that's when server-side Chromium steps in.
JS-heavy SPA
React/Vue/Angular single-page apps — when the DOM is fully rendered via JS.
Late mount
Initial HTML is empty; the real content mounts after 2-5s — Chromium can wait.
Dynamically loaded DOM
Scroll + click + lazy load — elements that need scripted interaction to appear.
Shadow DOM
Client-side access to web component shadow roots is limited; server-side Chromium can pierce them.
3 crawl modes
Single · Multi · Screenshot-only
Three modes depending on need. Multi-page pulls dynamic URL lists from CH events_canonical.
single_page
Give one URL → DOM scan + pattern detect + screenshot. This is the fastest mode, well suited to playground preview.
multi_page
Takes the distinct source_context['context_url'] values from CH events_canonical over the last 7 days and crawls each of those pages. Looks for the same pattern across pages and reports the total count.
screenshot_only
Full-page + viewport screenshot, no DOM scan. For UX review / regression.
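The URL discovery step behind multi_page can be sketched as below. This is a minimal illustration, not the crawler's actual query: it assumes events arrive as dictionaries with a hypothetical `ts` timestamp field next to `source_context`, and it borrows the 200-page cap from the CLI example's `--max-pages` flag.

```python
from datetime import datetime, timedelta, timezone


def discover_urls(events, now, window_days=7, max_pages=200):
    """Return distinct context URLs seen inside the window, in first-seen order.

    `events` is an iterable of dicts; the `ts` key is a hypothetical field name.
    """
    cutoff = now - timedelta(days=window_days)
    ordered = []
    seen = set()
    for event in events:
        if event["ts"] < cutoff:
            continue  # older than the discovery window
        url = event.get("source_context", {}).get("context_url")
        if not url or url in seen:
            continue  # no URL recorded, or already collected
        seen.add(url)
        ordered.append(url)
        if len(ordered) >= max_pages:
            break  # respect the page cap
    return ordered
```

Deduplicating before the crawl keeps the page budget for distinct pages rather than repeat visits to the same URL.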
6-strategy selector generator
Layered selectors for auto-healing fail-safe
Each event gets 6 selector types — if one breaks, the next one kicks in (5-layer fallback healing).
Why 6 layers?
Frontend redesigns can break CSS classes, ARIA labels, or text content. Unless all 6 break at once, event tracking keeps working.
dataAttr
data-* attribute; most stable (manually added, dev-controlled)
text
Visible text content; can break on i18n changes
css
CSS class / id; risky during class refactors
xpath
XPath expression; dependent on DOM hierarchy
regex
Attribute regex match; pattern-based
aria
ARIA role + label; accessibility-grade, relatively stable
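The fallback behaviour can be sketched as a loop that tries each selector in order and stops at the first match. This is a hedged illustration only: `resolve` and the injected `query` callable are hypothetical names, and the real healing logic may weigh or log selectors differently.

```python
def resolve(selectors, query):
    """Return (selector_type, element) for the first selector that matches.

    `selectors` is a list of {"type": ..., "value": ...} dicts in priority order.
    `query(type, value)` is a hypothetical lookup returning an element or None.
    """
    for selector in selectors:
        try:
            element = query(selector["type"], selector["value"])
        except Exception:
            element = None  # a selector that errors counts as a miss
        if element is not None:
            return selector["type"], element
    return None  # every layer broke at once
```

Only when every entry misses does the event lose its binding, which is the property the six layers are meant to provide.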
// 6 strategies, from most specific to loosest
{
  "selectors": [
    { "type": "data_attr", "value": "[data-testid='cart-add']" },
    { "type": "aria", "value": "[role='button'][aria-label='Sepete ekle']" },
    { "type": "text", "value": "button:has-text('Sepete ekle')" },
    { "type": "css", "value": ".product-card .add-to-cart" },
    { "type": "xpath", "value": "//button[contains(., 'Sepete ekle')]" },
    { "type": "regex", "value": "button[name=~'add.cart']" }
  ]
}
Pattern detection
1,248 cart buttons → one event
Auto-groups repeated elements with the same design — one event definition covers every page.
How does it group?
Clusters by selector + neighboring DOM structure + accessible label similarity. All elements in a cluster bind to a single event.
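A simplified sketch of the grouping idea follows. The source does not specify the similarity measure, so this version substitutes an exact match on a normalized key; the field names `selector`, `parent_tag`, and `label` are hypothetical stand-ins for selector, neighboring DOM structure, and accessible label.

```python
from collections import defaultdict


def normalize_label(label):
    """Lowercase and collapse whitespace so trivially different labels match."""
    return " ".join(label.lower().split())


def group_elements(elements):
    """Group element dicts that share selector, parent tag, and normalized label."""
    groups = defaultdict(list)
    for element in elements:
        key = (
            element["selector"],
            element["parent_tag"],
            normalize_label(element["label"]),
        )
        groups[key].append(element)
    return dict(groups)
```

Each resulting group would then bind to a single event definition; a production clusterer would likely use a fuzzy similarity score rather than exact keys.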
# Multi-page crawl → 1,248 add-to-cart points found
gurulu playground crawl \
  --mode multi_page \
  --base https://shop.example.com \
  --discover ch_events_last_7d \
  --max-pages 200
# Result: pattern grouping (matching selectors bind to a single event)
# event:add_to_cart → matched 1,248 instances across 47 pages
Screenshot review
MinIO presigned + thumbnail
After a crawl, each page screenshot is written to MinIO; the API returns a 302 redirect to a presigned URL.
GET /v1/playground/crawl/{crawl_id}/screenshot
→ 302 redirect → MinIO presigned URL (15 min TTL)
# Thumbnail (JPEG, for the dashboard list)
GET /v1/playground/crawl/{crawl_id}/screenshot?variant=thumb
Access control
Presigned URLs are returned with a 15-minute TTL. Workspace permission is checked first, and only then is the request redirected to the MinIO bucket. Dashboard thumbnails are JPEG.
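The check-then-redirect order can be sketched as follows. This is an assumption-laden illustration: `screenshot_response`, the `Response` type, and the injected `presign` callable are hypothetical, and the 403 status on denial is a guess, since the source does not say what a failed permission check returns.

```python
from dataclasses import dataclass

PRESIGN_TTL_SECONDS = 15 * 60  # 15-minute TTL, as stated in the text


@dataclass
class Response:
    status: int
    headers: dict


def screenshot_response(user_workspaces, crawl_workspace, object_key, presign):
    """Check workspace access first, then redirect to a presigned URL.

    `presign(object_key, ttl_seconds)` is a hypothetical callable wrapping
    whatever storage client signs the URL.
    """
    if crawl_workspace not in user_workspaces:
        return Response(403, {})  # assumed status; no URL is signed on denial
    url = presign(object_key, PRESIGN_TTL_SECONDS)
    return Response(302, {"Location": url})
```

Signing only after the permission check means a denied request never produces a usable URL, even a short-lived one.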
Performance config
Browser pool 2, timeout 30s, shm 1gb
Chromium needs a 1 GB shared-memory (SHM) segment. Docker's default shm_size of 64 MB is too small, so an override is required.
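A worker could verify the shared-memory size at startup before launching Chromium. The sketch below is one possible preflight check, not part of the product: the function names are hypothetical, and it assumes a Linux container where shared memory is mounted at /dev/shm.

```python
import shutil

REQUIRED_SHM_BYTES = 1024 ** 3  # 1 GB, the size the text calls for


def shm_is_sufficient(total_bytes, required=REQUIRED_SHM_BYTES):
    """Return True when the shared-memory mount is at least the required size."""
    return total_bytes >= required


def check_shm(path="/dev/shm"):
    """Measure the mount at `path` and report (is_sufficient, total_bytes)."""
    total = shutil.disk_usage(path).total
    return shm_is_sufficient(total), total
```

With Docker's 64 MB default, `check_shm()` would report the mount as insufficient; the override is typically set with `--shm-size` on `docker run` or `shm_size` in a Compose file.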
Browser pool 2 · Timeout 30 s · SHM 1 GB · 1 job / worker
Use cases
Onboarding + discovery
Two typical scenarios — new workspace setup and finding new patterns in an existing workspace.
Onboarding playground
New workspace setup: use the 'scan multi-page' button from the sessions list to detect patterns across every page and bulk-create events.
Pattern discovery
In an existing workspace after launch, run a multi-page crawl from the playground to find new elements and add them to the event registry.
Related docs
Read next
Build audiences from patterns, see patterns surfaced in the AI summary, learn the architecture.