How it works
Seven ideas explain almost everything Gurulu does. Once they click, the rest of the product is just a careful implementation of them.
1. Universe Kernel — required from day one
Every event, every row, every endpoint in Gurulu carries the same set of fields. From the very first ingestion. There is no "we'll add that column later." The kernel is mandatory.
The required fields are:
- `tenant_id`: your account
- `workspace_id`: the project inside your account
- `anonymous_id` or `person_id`: who triggered this
- `session_id`: the visit it belongs to
- `source_context`: where it came from (device, browser, country, referrer, source domain, UTM)
- `confidence`: how sure we are about identity
- `consent_state`: what the user agreed to, by GCM v2 category
- `health_status`: whether this event passed the validation gate
Why this matters: every analytics platform we have used grows fields over time. By the time you need to slice by browser, half your historical events do not have one. Universe Kernel guarantees that any query you can imagine works on every event you ever sent — including the first one.
The UI ships in stages. The data shape never widens later.
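As a rough illustration, the required fields can be written down as a type plus a completeness check. This is a minimal sketch, not Gurulu's actual schema: the field names come from the list above, while the value types, the `KernelFields` interface, and the `missingKernelFields` helper are assumptions made for the example.

```typescript
// Hypothetical shape: field names are from the docs, value types are assumed.
interface KernelFields {
  tenant_id: string;
  workspace_id: string;
  anonymous_id?: string; // at least one of anonymous_id / person_id must be set
  person_id?: string;
  session_id: string;
  source_context: Record<string, string>;
  confidence: string;
  consent_state: Record<string, string>;
  health_status: string;
}

// Fields that must always be present, regardless of identity state.
const REQUIRED_FIELDS: string[] = [
  "tenant_id",
  "workspace_id",
  "session_id",
  "source_context",
  "confidence",
  "consent_state",
  "health_status",
];

// Returns the names of kernel fields that are absent from a raw event.
function missingKernelFields(event: Record<string, unknown>): string[] {
  const missing = REQUIRED_FIELDS.filter(
    (field) => event[field] === undefined || event[field] === null
  );
  const hasIdentity =
    event["anonymous_id"] != null || event["person_id"] != null;
  if (!hasIdentity) {
    missing.push("anonymous_id|person_id");
  }
  return missing;
}
```

An event that returns an empty list carries the full kernel; anything else names exactly which fields are missing.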
2. Three event classes, never mixed
Events are not a single bag. There are three classes, and they are kept separate from the moment they enter the system.
| Class | What it is | Where it comes from | What it is for |
|---|---|---|---|
| Interaction | A click, a pageview, a scroll, a rage-click. | Browser autocapture. | Behavior. Funnels. Replay. Health. |
| Intent | Add to cart. Checkout started. Application started. A signal of intention without commitment. | Browser or server. | Audience building. Predictions. |
| Outcome | A purchase, a signup, a money-movement event. The thing that matters. | Server only. | Attribution. Revenue. Source of truth. |
Outcomes only fire from your server. The browser cannot tell us a purchase happened — your server can. This is the only way attribution numbers can be trusted.
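The class-to-origin rule in the table can be expressed as a small lookup. The sketch below is illustrative only: the `EventClass` and `Origin` types, the `ALLOWED_ORIGINS` table, and `isOriginAllowed` are hypothetical names, not part of any Gurulu SDK.

```typescript
// Hypothetical names; the allowed origins mirror the table above.
type EventClass = "interaction" | "intent" | "outcome";
type Origin = "browser" | "server";

const ALLOWED_ORIGINS: Record<EventClass, Origin[]> = {
  interaction: ["browser"],
  intent: ["browser", "server"],
  outcome: ["server"], // outcomes are server-only
};

// True when an event of this class may arrive from this origin.
function isOriginAllowed(eventClass: EventClass, origin: Origin): boolean {
  return ALLOWED_ORIGINS[eventClass].indexOf(origin) !== -1;
}
```

Under these assumptions, `isOriginAllowed("outcome", "browser")` is false, which is the property that keeps attribution numbers trustworthy.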
3. Four channels, four roles, never confused
There are exactly four ways an event reaches Gurulu, and each channel has a strict role:
- Browser script (`@gurulu/web`). Observes the page. Autocaptures interactions. Never sends outcomes.
- Playground. A visual builder on top of your live site. Lets a non-engineer point at a button and say "this is signup." Adds a rule to the registry; it does not create events.
- Server SDK (`@gurulu/web` identify and track, plus `@gurulu/node`). Sends identified events and verified outcomes from your backend. This is where money-movement is sent.
- Agent, CLI, MCP server (`@gurulu/cli` plus `@gurulu/mcp-server`). Registry-aware. Your AI editor reads the registry and knows the event name is `bet_placed`, not `bet.placed`. The agent never invents an event.
When the channels are kept strict, two things stop happening: untrustworthy outcomes from the browser, and event-name drift across the codebase.
4. Event is a contract
The registry is the source of truth for event names and shapes. Three things follow from this:
- Event keys are `snake_case`. The registry enforces the pattern `^[a-z0-9_]+$`. `bet_placed` is accepted. `bet.placed` and `betPlaced` are rejected.
- The SDK does not send free-form strings. Types are generated from the registry. If the registry does not know `signup_completed`, your code does not compile.
- Every event passes a validation gate at ingestion. Four outcomes are possible: accept, warn, quarantine, reject. The decision and the reason are recorded.
In CI, drift fails the pull request. The broken event is caught in the PR review, not three weeks later in the dashboard.
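A minimal sketch of the key check and a gate decision might look like the following. Only the regular expression and the four decision names come from the text. The in-memory `registry` list, the `gate` function, and the mapping from failure type to decision are assumptions for illustration, and the real gate may decide differently.

```typescript
// Pattern quoted in the text; everything else here is a hypothetical sketch.
const EVENT_KEY_PATTERN = /^[a-z0-9_]+$/;

type GateDecision = "accept" | "warn" | "quarantine" | "reject";

interface GateResult {
  decision: GateDecision;
  reason: string; // recorded alongside the decision
}

// Stand-in for the registry: a plain list of known event keys.
const registry: string[] = ["bet_placed", "signup_completed"];

function isValidEventKey(key: string): boolean {
  return EVENT_KEY_PATTERN.test(key);
}

// Assumed mapping: malformed keys are rejected, unknown keys are quarantined.
// The "warn" decision is part of the type but unused in this sketch.
function gate(key: string): GateResult {
  if (!isValidEventKey(key)) {
    return { decision: "reject", reason: "key does not match ^[a-z0-9_]+$" };
  }
  if (registry.indexOf(key) === -1) {
    return { decision: "quarantine", reason: "key is not in the registry" };
  }
  return { decision: "accept", reason: "key is registered" };
}
```

The compile-time guarantee described above would come from generated types rather than a runtime check; this sketch covers only the ingestion-time side.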
5. Customer-defined attribution
We do not pick the attribution model for you. You define a policy: first-touch, last-touch, linear, time-decay, position-based, or data-driven. You pick the lookback window. You pick how channels are weighted.
Every outcome carries a provenance trace. You can ask, for any credit: which touchpoints were considered, which were excluded, why this model was applied, and what the alternative models would have credited.
If you want to know why a sale was credited to organic search instead of paid, the answer is in the database. It is not a black box.
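To make the policy idea concrete, here is a small sketch of three of the listed models (first-touch, last-touch, linear) with a lookback window and a simple provenance record. All type and function names are hypothetical, and the trace is far thinner than what the text describes; it records only which touchpoints were considered and which were excluded.

```typescript
// Hypothetical types; only the model names and the lookback idea come from the text.
type Model = "first_touch" | "last_touch" | "linear";

interface Touchpoint {
  channel: string;
  timestamp: number; // milliseconds
}

interface Policy {
  model: Model;
  lookbackMs: number;
}

interface Credit {
  channel: string;
  weight: number;
}

interface Attribution {
  credits: Credit[];
  considered: Touchpoint[]; // inside the lookback window
  excluded: Touchpoint[]; // outside the window, kept for the trace
}

function attribute(
  touchpoints: Touchpoint[],
  outcomeAt: number,
  policy: Policy
): Attribution {
  const sorted = touchpoints.slice().sort((a, b) => a.timestamp - b.timestamp);
  const inWindow = (t: Touchpoint): boolean =>
    t.timestamp <= outcomeAt && outcomeAt - t.timestamp <= policy.lookbackMs;
  const considered = sorted.filter(inWindow);
  const excluded = sorted.filter((t) => !inWindow(t));

  let credits: Credit[] = [];
  if (considered.length > 0) {
    if (policy.model === "first_touch") {
      credits = [{ channel: considered[0].channel, weight: 1 }];
    } else if (policy.model === "last_touch") {
      const last = considered[considered.length - 1];
      credits = [{ channel: last.channel, weight: 1 }];
    } else {
      // linear: every considered touchpoint gets an equal share
      const share = 1 / considered.length;
      credits = considered.map((t) => ({ channel: t.channel, weight: share }));
    }
  }
  return { credits, considered, excluded };
}
```

Running the same touchpoints through each model in turn is one way to answer "what would the alternative models have credited".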
6. Identity, explainable
The identity engine merges two records when seven steps of evidence agree. Every merge is:
- Reversible. The merge ledger is append-only. Unmerging is a single write.
- Confidence-scored. Each merge has a level — high, medium, low — and a list of signals.
- Auditable. You can see when two records merged, on what evidence, and on which channel.
There is no opaque "user graph" that you trust on faith. There is a ledger you can read.
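The append-only property is easiest to see in code. The sketch below is an assumed, simplified model of such a ledger, not the identity engine itself: `MergeLedger`, `LedgerEntry`, and the signal strings are hypothetical, and it handles only direct pairs, not chains of merges.

```typescript
// Hypothetical ledger model: entries are only ever appended, never edited.
type Confidence = "high" | "medium" | "low";

interface LedgerEntry {
  kind: "merge" | "unmerge";
  pair: string; // order-insensitive key for the two record ids
  confidence?: Confidence;
  signals?: string[];
  at: number; // timestamp supplied by the caller
}

function pairKey(a: string, b: string): string {
  return a < b ? a + "|" + b : b + "|" + a;
}

class MergeLedger {
  private entries: LedgerEntry[] = [];

  merge(a: string, b: string, confidence: Confidence, signals: string[], at: number): void {
    this.entries.push({ kind: "merge", pair: pairKey(a, b), confidence, signals, at });
  }

  // Unmerging is a single appended write; the original merge entry stays.
  unmerge(a: string, b: string, at: number): void {
    this.entries.push({ kind: "unmerge", pair: pairKey(a, b), at });
  }

  // The most recent entry for the pair decides the current state.
  isMerged(a: string, b: string): boolean {
    const key = pairKey(a, b);
    for (let i = this.entries.length - 1; i >= 0; i--) {
      if (this.entries[i].pair === key) {
        return this.entries[i].kind === "merge";
      }
    }
    return false;
  }

  // Full audit trail, returned as a copy so callers cannot mutate the ledger.
  history(): LedgerEntry[] {
    return this.entries.slice();
  }
}
```

Because nothing is overwritten, the history after an unmerge still shows the original merge, its confidence level, and its signals.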
7. Privacy by default
Region defaults to EU. The free tier gets EU residency, the same as paid tiers. Consent state is captured at every event in Google Consent Mode v2 categories, and the pipeline filters downstream destinations based on what the user agreed to. DSR export and forget requests run on a 60-second SLA queue.
We do not capture what we do not need. Sensitive surfaces — checkout pages, KYC, account settings — are masked in replay by default. You can widen the capture; the default is narrow.
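The consent-based filtering of downstream destinations can be sketched as follows. The four category names are Google Consent Mode v2 consent types; the `Destination` shape, the example destination names, and `allowedDestinations` are assumptions for illustration rather than Gurulu's pipeline API.

```typescript
// Consent Mode v2 consent types; the rest of this block is a hypothetical sketch.
type ConsentCategory =
  | "ad_storage"
  | "ad_user_data"
  | "ad_personalization"
  | "analytics_storage";

type ConsentState = Record<ConsentCategory, "granted" | "denied">;

interface Destination {
  name: string;
  requires: ConsentCategory[]; // every listed category must be granted
}

// Keeps only the destinations whose required categories were all granted.
function allowedDestinations(
  consent: ConsentState,
  destinations: Destination[]
): Destination[] {
  return destinations.filter((destination) =>
    destination.requires.every((category) => consent[category] === "granted")
  );
}
```

With analytics granted and the ad categories denied, an event under this model would reach an analytics warehouse but not an ads platform.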
Where to go next
- What is Gurulu — product, pains, and positioning.
- API reference — the REST surface, grouped.