Architecture Deep Dive
HitKeep is architected to be radically simple to deploy while remaining capable of handling high-throughput traffic. Unlike traditional analytics stacks that require separate services (Nginx, Kafka, ClickHouse, Redis), HitKeep embeds everything into a single Go binary.
High-Level Overview
Data flows through the system in four distinct stages:
- Ingestion: The HTTP Server receives a hit from the tracking script.
- Buffering: The hit is serialized and published to an embedded NSQ topic in memory.
- Processing: An internal consumer reads from the queue, forming micro-batches.
- Storage: Data is written to embedded DuckDB, a high-performance columnar database.
```mermaid
graph LR
  A[Browser] -->|HTTP POST| B(HTTP Server)
  B -->|Publish| C{Embedded NSQ}
  C -->|Consume| D[Ingest Worker]
  D -->|INSERT| E[(DuckDB)]
```
1. The Database: DuckDB
HitKeep uses DuckDB, an in-process SQL OLAP database management system.
- Why DuckDB? It is optimized for analytical queries (aggregations, time-series bucketing) and runs locally without a server process. It creates a single file (`hitkeep.db`), making backups as simple as copying that file.
- Data Layout: We use DuckDB's columnar storage engine to compress data efficiently. This allows HitKeep to store millions of hits in relatively small files (~120MB per million hits).
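To illustrate the kind of query this columnar layout serves well, here is a hedged example of hourly time-series bucketing using DuckDB's `date_trunc`. The `hits` table and its column names are assumptions for illustration, not HitKeep's documented schema.

```sql
-- Hypothetical schema; table and column names are assumptions.
SELECT date_trunc('hour', ts)        AS bucket,
       count(*)                      AS hits,
       count(DISTINCT visitor_id)    AS visitors
FROM hits
WHERE ts >= now() - INTERVAL 7 DAY
GROUP BY bucket
ORDER BY bucket;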
2. Ingestion & Buffering: Embedded NSQ
Writing to a disk-based database synchronously during an HTTP request is a bottleneck. To solve this, HitKeep embeds NSQ, a realtime distributed messaging platform.
- Decoupling: When a request hits `/ingest`, HitKeep validates it and pushes it to an in-memory queue immediately. The HTTP request completes in milliseconds.
- Burst Handling: If traffic spikes (e.g., your site goes viral), the queue acts as a shock absorber. The database writer consumes messages at a steady, optimal pace, preventing database locks or corruption.
3. Clustering (Leader/Follower)
For high availability or horizontal scaling, HitKeep supports a Leader/Follower topology using HashiCorp Memberlist for node discovery (Gossip protocol).
The Roles
- Leader: There is exactly one Leader in the cluster. It is the only node that holds the `hitkeep.db` file lock and writes to the database. It runs the NSQ Consumer.
- Follower: All other nodes are Followers. They accept HTTP traffic but do not write to disk locally.
Request Flow in a Cluster
- A Load Balancer sends traffic to any node (Leader or Follower).
- If a Follower receives an `/ingest` request, it proxies the payload internally to the Leader via HTTP.
- The Leader accepts the payload, puts it in the queue, and writes it to disk.
- If the Leader dies, the cluster elects a new Leader (requires persistent storage to be re-attached or shared, e.g., via Kubernetes StatefulSets).
4. Frontend Architecture
The dashboard is a Single Page Application (SPA) built with Angular and PrimeNG.
- API Driven: The frontend communicates strictly via the JSON REST API.
- Signals: State management leverages Angular Signals for fine-grained reactivity.
- Lightweight Tracker: The tracking script (`hk.js`) is built separately using Rolldown to ensure the smallest possible footprint (< 2KB) for your website visitors.