HitKeep Disaster Recovery Runbook for Operators

Disaster recovery for HitKeep is straightforward as long as your backups and restores cover the whole storage layout, not just a single database file.

The most common mistake is restoring only the shared control-plane database while forgetting tenant-local analytics files.

At minimum, a backup must include:

  • the shared control-plane database in {data-path}/hitkeep.db
  • all tenant analytics databases in {data-path}/tenants/**
  • the archive directory if you rely on retention archives for older raw data

If you use built-in backups, these are already exported into snapshot directories. If you use external tooling, your DR plan should capture the same boundary.
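If you use external tooling, the boundary above could be captured with a sketch like the following. It is not part of HitKeep: the function name `backup_boundary` is hypothetical, the location of the archive directory is an assumption you pass in yourself (as a subdirectory of the data path), and it assumes HitKeep is stopped or the files are otherwise quiesced while the copy runs.

```shell
#!/bin/sh
# Sketch only: archive the DR boundary of a HitKeep data directory with tar.
# Assumptions (not confirmed by the HitKeep docs): HitKeep is stopped while
# this runs, and the retention archive, if any, is a subdirectory of the
# data path whose name you pass as the third argument.
backup_boundary() {
    data_path=$1
    out=$2
    archive_subdir=${3:-}

    # The shared control-plane database is mandatory.
    if [ ! -f "$data_path/hitkeep.db" ]; then
        echo "error: $data_path/hitkeep.db not found" >&2
        return 1
    fi

    # Build the list of archive members, starting with the shared database.
    set -- hitkeep.db
    if [ -d "$data_path/tenants" ]; then
        set -- "$@" tenants
    else
        echo "warning: no tenants/ directory under $data_path" >&2
    fi
    if [ -n "$archive_subdir" ] && [ -d "$data_path/$archive_subdir" ]; then
        set -- "$@" "$archive_subdir"
    fi

    # Store paths relative to the data path so the layout can be restored as-is.
    tar -czf "$out" -C "$data_path" "$@"
}
```

For example, `backup_boundary /var/lib/hitkeep/data /mnt/offhost/hitkeep.tgz archive` would capture the shared database, every tenant database, and a directory named `archive` if one exists under the data path.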

Plan for the failure you are most likely to face:

| Scenario | Recovery source | Main risk |
| --- | --- | --- |
| Bad deploy | Latest local backup | Restoring a snapshot from before a schema migration |
| Disk loss | Off-host backup or S3 backup | Missing tenant-local databases |
| Accidental team deletion | Backup from before purge | Retaining deleted tenant data longer than policy allows |
| Host migration | Latest verified snapshot | Forgetting archive and asset directories |
| Region outage | Object storage or external snapshot | Restore time and DNS cutover |

Run this drill periodically in a disposable environment:

  1. Provision an empty host or container.
  2. Restore HitKeep from a recent snapshot.
  3. Start the same HitKeep version, or a newer compatible one.
  4. Log in as an admin.
  5. Validate one default-tenant site.
  6. Validate one non-default team site.
  7. Validate goals, funnels, and ecommerce.
  8. Confirm team membership and team switching still work.
  9. Confirm retention archives are still present if you keep them separately.

If you cannot perform this drill successfully, you do not yet have a reliable recovery process.
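The file-level part of the drill can be checked before anyone logs in. The sketch below is a hypothetical helper, not a HitKeep command: it only confirms that the restored data path contains a non-empty `hitkeep.db` and at least one file under `tenants/`. The login, site, goal, funnel, and team checks in the list above still have to be done in the application itself.

```shell
#!/bin/sh
# Sketch only: file-level sanity check of a restored HitKeep data path.
# verify_restore_files is a hypothetical helper name, not part of HitKeep.
verify_restore_files() {
    data_path=$1
    status=0

    # The shared control-plane database must exist and be non-empty.
    if [ -s "$data_path/hitkeep.db" ]; then
        echo "ok: shared database present"
    else
        echo "FAIL: $data_path/hitkeep.db missing or empty" >&2
        status=1
    fi

    # At least one tenant-local file should have come back with the restore.
    tenant_files=$(find "$data_path/tenants" -type f 2>/dev/null | wc -l | tr -d ' ')
    if [ "$tenant_files" -gt 0 ]; then
        echo "ok: $tenant_files tenant file(s) present"
    else
        echo "FAIL: no files under $data_path/tenants" >&2
        status=1
    fi

    return "$status"
}
```

A non-zero return status means the restore did not bring back the full boundary, which is worth recording as a failed drill even if HitKeep happens to start.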

./hitkeep recover restore-backup \
-from /var/lib/hitkeep/backups \
-snapshot 2026-03-08T120000Z \
-db /var/lib/hitkeep/data/hitkeep.db \
-data-path /var/lib/hitkeep/data \
-yes

Restore is offline-only. Stop HitKeep before running it.
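Because the restore is offline-only, it can help to wrap the stop, restore, and start steps together. The following is a minimal sketch under stated assumptions: it assumes HitKeep runs as a systemd unit named `hitkeep`, which is hypothetical, and `restore_offline`, `STOP_CMD`, `START_CMD`, and `HITKEEP_BIN` are names introduced here for illustration. The restore flags are exactly those shown in the command above.

```shell
#!/bin/sh
# Sketch only: run the documented restore with HitKeep stopped.
# Assumption (hypothetical): HitKeep is managed as a systemd unit called
# "hitkeep". Override STOP_CMD and START_CMD if yours is managed differently.
HITKEEP_BIN=${HITKEEP_BIN:-./hitkeep}
STOP_CMD=${STOP_CMD:-systemctl stop hitkeep}
START_CMD=${START_CMD:-systemctl start hitkeep}

restore_offline() {
    snapshot=$1
    backups=${2:-/var/lib/hitkeep/backups}
    data=${3:-/var/lib/hitkeep/data}

    # Stop the service first; abort if it cannot be stopped.
    $STOP_CMD || return 1

    # Same subcommand and flags as the restore command in this runbook.
    if ! "$HITKEEP_BIN" recover restore-backup \
        -from "$backups" \
        -snapshot "$snapshot" \
        -db "$data/hitkeep.db" \
        -data-path "$data" \
        -yes
    then
        # Leave HitKeep stopped so a half-restored state is not served.
        echo "restore failed; HitKeep left stopped" >&2
        return 1
    fi

    $START_CMD
}
```

Leaving the service stopped after a failed restore is a deliberate choice in this sketch: it keeps a partially restored data directory from being opened before someone has looked at it.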

Teams introduce two important operational facts:

  • archived teams can later be purged physically
  • tenant analytics may live outside the shared database

That means:

  • backups taken before a purge may still contain the purged tenant
  • backups taken after a purge should not
  • archive retention and backup retention are separate concerns

If you have GDPR or hard-deletion requirements, your DR runbooks should explicitly define how long old snapshots are retained and when they are expired.
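A snapshot expiry policy can be made concrete with a small listing step. The sketch below assumes that snapshots are stored as directories directly under the backup path and are named with the same timestamp format as the `-snapshot` value in this runbook (for example `2026-03-08T120000Z`); that layout is an assumption, and `list_expired_snapshots` is a hypothetical helper. It only prints candidates and deletes nothing.

```shell
#!/bin/sh
# Sketch only: list snapshot directories older than a cutoff timestamp.
# Assumption (not confirmed by the HitKeep docs): each snapshot is a
# directory under the backup path named like 2026-03-08T120000Z.
list_expired_snapshots() {
    backups=$1
    cutoff=$2   # same format as the names, e.g. 2026-01-01T000000Z

    for dir in "$backups"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")

        # Skip anything that is not shaped like a snapshot timestamp.
        case $name in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9][0-9][0-9][0-9][0-9]Z) ;;
            *) continue ;;
        esac

        # Fixed-width UTC timestamps sort lexicographically in time order,
        # so the earlier of the two names is the older one.
        first=$(printf '%s\n%s\n' "$name" "$cutoff" | LC_ALL=C sort | head -n 1)
        if [ "$name" != "$cutoff" ] && [ "$first" = "$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

Reviewing the printed list before removing anything keeps the purge decision with the operator, which matters when a snapshot from before a team purge is also the only copy of data you may still need.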

A good HitKeep DR posture means:

  • you know exactly where live data lives
  • you know exactly where backups are written
  • you have tested the hitkeep recover restore-backup command
  • you can restore both shared and tenant-local data
  • your restore boots from the snapshot alone, without depending on replaying a stale write-ahead log (WAL)

For small self-hosted installs, run a restore drill after changing backup storage or before a major version upgrade. For teams using HitKeep as client or business reporting infrastructure, run a scheduled drill at least quarterly and record the snapshot timestamp, HitKeep version, restore target, and validation result.
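Recording each drill can be as simple as appending one line to a log. The sketch below is a hypothetical helper, and the tab-separated format is only one possible choice: it writes the drill time plus the four fields named above (snapshot timestamp, HitKeep version, restore target, validation result).

```shell
#!/bin/sh
# Sketch only: append one restore-drill record to a tab-separated log.
# record_drill and the log format are illustrative, not part of HitKeep.
record_drill() {
    log=$1
    snapshot=$2
    version=$3
    target=$4
    result=$5

    # Columns: drill time (UTC), snapshot, HitKeep version, target, result.
    printf '%s\t%s\t%s\t%s\t%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        "$snapshot" "$version" "$target" "$result" >> "$log"
}
```

A call might look like `record_drill drills.tsv 2026-03-08T120000Z "$HITKEEP_VERSION" drill-host pass`, where the version variable and host name are placeholders for your own values.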