Analytics with Open Exports
HitKeep gives teams an open export path for analytics data. You can export site data, filtered hit data, Web Vitals samples, AI fetch records, AI chatbot events, saved Opportunities, and personal account data in formats that work with DuckDB, spreadsheets, scripts, and warehouses.
Use this guide when your search is “web analytics with open exports”, “analytics data portability”, or “how do I get raw analytics out of the product before choosing a vendor”.
Export paths
| Export path | What it is for | Formats | Auth surface |
|---|---|---|---|
| Site takeout | Full site data package for portability or audit review | xlsx, csv, parquet, json, ndjson | Dashboard session |
| Filtered hits export | Raw hit export for a selected date range or filter | csv, xlsx, parquet, json, ndjson | Dashboard session, bearer token, or API key |
| AI fetch export | Server-side AI crawler records for a site and date range | csv, xlsx, parquet, json, ndjson | Dashboard session, bearer token, or API key |
| AI chatbot export | On-site assistant instrumentation events | csv, xlsx, parquet, json, ndjson | Dashboard session, bearer token, or API key |
| User takeout | Personal account data for GDPR portability | xlsx, csv, parquet, json, ndjson | Dashboard session |
Takeout endpoints default to xlsx. Hits exports default to csv. Set format=parquet, format=json, or format=ndjson when the export is headed to a data pipeline.
Site takeout includes Web Vitals rows as record_type = web_vital when samples exist. Those rows contain metric name, value, server-derived rating, normalized path, navigation type, timestamp, tracker source, and tracker version.
Site takeout also includes saved Opportunities and safe AI run metadata when those records exist. Exported Opportunities keep localization keys, interpolation params, cited evidence IDs, detector metadata, status, and the final customer-visible output boundary. Takeout excludes provider secrets, raw prompts, raw provider responses, raw external error bodies, and unrestricted AI tool payloads.
For the generation model and API field contract, see Opportunity Recommendations.
Site takeout
Export a complete site package through the site takeout endpoint, which is documented in the generated API reference.
The endpoint is session-authenticated. It is meant for signed-in operators who can access the site in the dashboard.
For warehouse work, request Parquet explicitly:
```
/api/sites/{site_id}/takeout?format=parquet
```
For spreadsheet review, use the default xlsx output:
```
/api/sites/{site_id}/takeout
```
Filtered raw hit exports
When you only need a slice of traffic, use the hit export endpoint instead of a full takeout.
This endpoint accepts date range and filter parameters from the API reference. It is the better fit for recurring jobs that extract one campaign, one hostname, one referrer, or one investigation window.
Example export targets:
```
/api/sites/{site_id}/hits/export?from=2026-04-01T00:00:00Z&to=2026-04-30T23:59:59Z&format=csv
/api/sites/{site_id}/hits/export?filter=utm_source:newsletter&format=parquet
```
Use a scoped API client for scheduled scripts so access can be revoked without changing a user password.
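A scheduled job might build these URLs and attach a token roughly as follows. This is a minimal sketch, not HitKeep's client: the base URL `https://hitkeep.example.com`, the site ID, the helper names, and the `Authorization: Bearer` header shape are all assumptions for illustration, so check the API reference for the header your bearer token or API key actually expects.

```python
import shutil
import urllib.parse
import urllib.request

# Assumed base URL for illustration; replace with your own instance.
BASE_URL = "https://hitkeep.example.com"


def build_hits_export_url(site_id, params=None, fmt="csv"):
    """Build a filtered hits export URL; `params` holds from/to/filter values."""
    query = dict(params or {})
    query["format"] = fmt
    path = f"/api/sites/{urllib.parse.quote(str(site_id), safe='')}/hits/export"
    return f"{BASE_URL}{path}?{urllib.parse.urlencode(query)}"


def build_export_request(url, token):
    """Attach the token as a bearer header (the header shape is an assumption)."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def download_export(request, dest_path):
    """Stream the response body to disk without holding it all in memory."""
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)


url = build_hits_export_url(
    "my-site-id",  # hypothetical site ID
    {"filter": "utm_source:newsletter"},
    fmt="parquet",
)
request = build_export_request(url, "token-from-your-secret-store")
print(request.full_url)
```

Calling `download_export(request, "site_hits.parquet")` would then perform the fetch. It is left uncalled here because it needs a live instance and a valid token.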
Supported Formats
| Format | Best for |
|---|---|
| parquet | DuckDB, Apache Spark, Pandas, warehouse import, long-term analytical storage |
| csv | Simple tabular imports and scripts that expect text files |
| json | Programmatic processing where nested fields are easier to preserve |
| ndjson | Streaming pipelines and line-oriented processing |
| xlsx | Manual review in spreadsheet apps and user-facing takeout packages |
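To illustrate the line-oriented case, the sketch below counts records per referrer from NDJSON input one line at a time, without loading the whole file. It assumes each line is a JSON object with a `referrer_domain` field. The sample rows, including the `path` field, are made up, and the real export schema may differ.

```python
import io
import json
from collections import Counter


def count_by_field(lines, field="referrer_domain"):
    """Count records per value of `field`, reading one NDJSON line at a time."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines between records
            continue
        record = json.loads(line)
        # Missing or null values are grouped under a placeholder label.
        counts[record.get(field) or "(none)"] += 1
    return counts


# Inline sample standing in for an exported file; the rows are made up.
sample = io.StringIO(
    '{"referrer_domain": "example.org", "path": "/"}\n'
    '{"referrer_domain": "example.org", "path": "/pricing"}\n'
    '{"referrer_domain": null, "path": "/"}\n'
)
print(count_by_field(sample).most_common())
```

Passing an open file handle instead of the `io.StringIO` sample works the same way, because the function only iterates over lines.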
Retention archives and backup snapshots also use open files. See Data Retention and Archiving for Parquet archives and Backups and Restore for full database snapshots.
For current runtime, storage, export, and non-replacement facts, see Facts and Limits.
Query a Parquet export
Parquet exports can be queried locally with DuckDB without importing them into another service:
```sh
duckdb -c "
SELECT
  date_trunc('week', timestamp) AS week,
  referrer_domain,
  count(*) AS hits
FROM 'site_hits.parquet'
GROUP BY 1, 2
ORDER BY 1 DESC, 3 DESC
LIMIT 20;
"
```
This is useful for audits, support investigations, and migrations where you need to inspect the raw exported file before loading it elsewhere.
Export AI visibility data
HitKeep’s AI reports are exportable through dedicated endpoints: the AI fetch export and the AI chatbot export described under Export paths.
Use these when you want server-side AI crawler analytics or assistant instrumentation outside the dashboard. For setup and interpretation, see AI Visibility Analytics and AI Chatbot Analytics.
Export Web Vitals
Web Vitals are included in full site and user takeout as record_type = web_vital. Use those rows when you need to audit p75 changes outside HitKeep, join performance evidence to deployment windows, or preserve a vendor-neutral archive of opt-in performance samples.
For dashboard reporting and aggregate API access, see Web Vitals Analytics.
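Auditing p75 outside HitKeep could look like the sketch below, which keeps only Web Vitals rows and computes a nearest-rank 75th percentile per metric. The `record_type = web_vital` marker comes from the takeout. The `metric` and `value` column names, the sample rows, and the nearest-rank method are assumptions, so compare them against the exported header row and the percentile definition your reporting uses.

```python
import csv
import io
import math
from collections import defaultdict


def p75_by_metric(rows, metric_key="metric", value_key="value"):
    """Return the nearest-rank 75th percentile of each metric's samples."""
    samples = defaultdict(list)
    for row in rows:
        if row.get("record_type") != "web_vital":
            continue  # skip every other record type in the takeout
        samples[row[metric_key]].append(float(row[value_key]))
    result = {}
    for metric, values in samples.items():
        values.sort()
        rank = math.ceil(0.75 * len(values))  # 1-based nearest rank
        result[metric] = values[rank - 1]
    return result


# Made-up rows standing in for a CSV takeout file.
sample = io.StringIO(
    "record_type,metric,value\n"
    "web_vital,LCP,1800\n"
    "web_vital,LCP,2400\n"
    "web_vital,LCP,2100\n"
    "web_vital,LCP,3900\n"
    "hit,,\n"
    "web_vital,CLS,0.05\n"
)
print(p75_by_metric(csv.DictReader(sample)))
```

Running the same function on exports taken before and after a deployment gives two values per metric that can be compared directly.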
Export Opportunities
Opportunities are part of the full site takeout. They are recommendation records, not raw prompt transcripts.
Included fields are safe for customer review:
- translation keys such as `title_key`, `summary_key`, and `action_key`
- `copy_params` and route params used by localized dashboard copy
- cited evidence IDs and evidence labels/values
- detector version, impact, confidence, score, and status
- safe AI run metadata such as provider label, model label, template version, hashes, token counts, status, and error category
Excluded fields stay unavailable because they can contain secrets, prompt internals, provider payloads, or visitor-level data.
User takeout and GDPR portability
Users can export their own account data through the user takeout endpoint.
This supports GDPR Article 20 data portability workflows. For compliance context, see GDPR for HitKeep.
Migration workflow
A practical vendor-exit workflow looks like this:
- Export the site via the site takeout endpoint.
- Export high-volume hit slices through the filtered hits export endpoint if you need smaller files.
- Query a Parquet sample with DuckDB to verify date ranges and field coverage.
- Import the resulting Parquet, CSV, JSON, or NDJSON files into your warehouse or next analytics stack.
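The verification step can also be scripted for CSV exports. The sketch below reports the earliest and latest timestamp and how many rows have a non-empty value in each column. It assumes a `timestamp` column holding ISO 8601 strings. The sample rows are invented, and a real export may use different column names.

```python
import csv
import io
from datetime import datetime


def summarize_export(rows, timestamp_key="timestamp"):
    """Return (earliest, latest, per-column non-empty counts) for CSV rows."""
    earliest = latest = None
    filled = {}
    for row in rows:
        for column, value in row.items():
            filled[column] = filled.get(column, 0) + (1 if value else 0)
        raw = row.get(timestamp_key)
        if not raw:
            continue
        # Swap a trailing "Z" for an offset so older Python versions parse it.
        stamp = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if earliest is None or stamp < earliest:
            earliest = stamp
        if latest is None or stamp > latest:
            latest = stamp
    return earliest, latest, filled


# Made-up rows standing in for a filtered hits export.
sample = io.StringIO(
    "timestamp,referrer_domain,utm_source\n"
    "2026-04-01T08:15:00Z,example.org,newsletter\n"
    "2026-04-30T21:40:00Z,,newsletter\n"
)
earliest, latest, filled = summarize_export(csv.DictReader(sample))
print(earliest.isoformat(), latest.isoformat(), filled)
```

A column whose count is far below the row total is worth checking before import, since it may be optional in the export or missing from the source data.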
If you are replacing Google Analytics, pair this page with the self-hosted GA4 alternative guide and the Google Analytics comparison.
Self-hosted or managed
Self-hosted HitKeep keeps exports inside infrastructure you control. Configure paths and storage targets in the Configuration Reference, then decide whether exports stay on local disk or move to S3-compatible storage through S3 Backups.
Use HitKeep Cloud when you want managed hosting, updates, and encrypted backups while still keeping self-service export paths available.