
Analytics with Open Exports

HitKeep gives teams an open export path for analytics data. You can export site data, filtered hit data, Web Vitals samples, AI fetch records, AI chatbot events, saved Opportunities, and personal account data in formats that work with DuckDB, spreadsheets, scripts, and warehouses.

Use this guide when your search is “web analytics with open exports”, “analytics data portability”, or “how do I get raw analytics out of the product before choosing a vendor”.

| Export path | What it is for | Formats | Auth surface |
| --- | --- | --- | --- |
| Site takeout | Full site data package for portability or audit review | xlsx, csv, parquet, json, ndjson | Dashboard session |
| Filtered hits export | Raw hit export for a selected date range or filter | csv, xlsx, parquet, json, ndjson | Dashboard session, bearer token, or API key |
| AI fetch export | Server-side AI crawler records for a site and date range | csv, xlsx, parquet, json, ndjson | Dashboard session, bearer token, or API key |
| AI chatbot export | On-site assistant instrumentation events | csv, xlsx, parquet, json, ndjson | Dashboard session, bearer token, or API key |
| User takeout | Personal account data for GDPR portability | xlsx, csv, parquet, json, ndjson | Dashboard session |

Takeout endpoints default to xlsx. Hits exports default to csv. Set format=parquet, format=json, or format=ndjson when the export is headed to a data pipeline.
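As a rough illustration of the format parameter, the sketch below builds export URLs and only appends format= when a non-default format is requested. The helper name export_url, the host hitkeep.example, and the SITE_ID placeholder are assumptions made for this example; only the paths and format values come from this guide.

```python
from urllib.parse import urlencode

# Format values named in this guide. Takeout defaults to xlsx, hits exports to csv.
SUPPORTED_FORMATS = {"xlsx", "csv", "parquet", "json", "ndjson"}


def export_url(base_url, path, fmt=None, params=None):
    """Build an export URL, adding format= only when a format is requested."""
    query_params = dict(params or {})
    if fmt is not None:
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported export format: {fmt}")
        query_params["format"] = fmt
    url = base_url.rstrip("/") + path
    query = urlencode(query_params)
    return f"{url}?{query}" if query else url


# Hypothetical host and site ID, shown only to illustrate the URL shape.
print(export_url("https://hitkeep.example", "/api/sites/SITE_ID/takeout", "parquet"))
print(export_url("https://hitkeep.example", "/api/sites/SITE_ID/takeout"))
```

Leaving fmt unset keeps the endpoint's own default, so the same helper covers both spreadsheet review and pipeline use.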

Site takeout includes Web Vitals rows as record_type = web_vital when samples exist. Those rows contain metric name, value, server-derived rating, normalized path, navigation type, timestamp, tracker source, and tracker version.

Site takeout also includes saved Opportunities and safe AI run metadata when those records exist. Exported Opportunities keep localization keys, interpolation params, cited evidence IDs, detector metadata, status, and the final customer-visible output boundary. Takeout excludes provider secrets, raw prompts, raw provider responses, raw external error bodies, and unrestricted AI tool payloads.

For the generation model and API field contract, see Opportunity Recommendations.

Export a complete site package through the site takeout endpoint, /api/sites/{site_id}/takeout, which is documented in the generated API reference.

The endpoint is session-authenticated. It is meant for signed-in operators who can access the site in the dashboard.

For warehouse work, request Parquet explicitly:

/api/sites/{site_id}/takeout?format=parquet

For spreadsheet review, use the default xlsx output:

/api/sites/{site_id}/takeout

When you only need a slice of traffic, use the hits export endpoint, /api/sites/{site_id}/hits/export, instead of a full takeout.

This endpoint accepts date range and filter parameters from the API reference. It is the better fit for recurring jobs that extract one campaign, one hostname, one referrer, or one investigation window.

Example export targets:

/api/sites/{site_id}/hits/export?from=2026-04-01T00:00:00Z&to=2026-04-30T23:59:59Z&format=csv
/api/sites/{site_id}/hits/export?filter=utm_source:newsletter&format=parquet

Use a scoped API client for scheduled scripts so access can be revoked without changing a user password.
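A scheduled job along these lines could prepare a filtered hits export request with a revocable token. This is a minimal sketch, not HitKeep's documented client: it assumes the token is sent as a standard Authorization: Bearer header, and the function names, host, site ID, and token value are placeholders. Check the API reference for the exact authentication header before relying on it.

```python
import urllib.request
from urllib.parse import urlencode


def build_hits_export_request(base_url, site_id, token, start, end, fmt="parquet"):
    """Prepare (but do not send) a filtered hits export request."""
    query = urlencode({"from": start, "to": end, "format": fmt})
    url = f"{base_url.rstrip('/')}/api/sites/{site_id}/hits/export?{query}"
    # Assumption: the scoped token travels in an Authorization: Bearer header.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def download(request, dest_path):
    """Send the request and stream the response body to disk in 64 KiB chunks."""
    with urllib.request.urlopen(request) as resp, open(dest_path, "wb") as out:
        while chunk := resp.read(65536):
            out.write(chunk)


# Placeholder values; a real job would read these from its secret store.
req = build_hits_export_request(
    "https://hitkeep.example", "SITE_ID", "TOKEN",
    "2026-04-01T00:00:00Z", "2026-04-30T23:59:59Z",
)
print(req.full_url)
```

Calling download(req, "site_hits.parquet") would perform the transfer. Keeping request construction separate from the network call makes the job easy to dry-run.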

| Format | Best for |
| --- | --- |
| parquet | DuckDB, Apache Spark, Pandas, warehouse import, long-term analytical storage |
| csv | Simple tabular imports and scripts that expect text files |
| json | Programmatic processing where nested fields are easier to preserve |
| ndjson | Streaming pipelines and line-oriented processing |
| xlsx | Manual review in spreadsheet apps and user-facing takeout packages |
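To show why ndjson suits line-oriented processing, the sketch below counts records per field value while reading one JSON object per line, so the whole file never has to sit in memory. The sample rows are made up, and the assumption that an ndjson hits export carries a referrer_domain field is untested here; adjust the field name to what your export actually contains.

```python
import io
import json
from collections import Counter


def count_by_field(lines, field):
    """Count records per value of `field`, parsing one JSON object per line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        counts[record.get(field)] += 1
    return counts


# Made-up sample; a real export would be passed as an open file object instead.
sample = io.StringIO(
    '{"referrer_domain": "example.org"}\n'
    '{"referrer_domain": "example.org"}\n'
    '{"referrer_domain": "news.example"}\n'
)
print(count_by_field(sample, "referrer_domain").most_common())
```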

Retention archives and backup snapshots also use open files. See Data Retention and Archiving for Parquet archives and Backups and Restore for full database snapshots.

For current runtime, storage, export, and non-replacement facts, see Facts and Limits.

Parquet exports can be queried locally with DuckDB without importing them into another service:

duckdb -c "
  SELECT
    date_trunc('week', timestamp) AS week,
    referrer_domain,
    count(*) AS hits
  FROM 'site_hits.parquet'
  GROUP BY 1, 2
  ORDER BY 1 DESC, 3 DESC
  LIMIT 20;
"

This is useful for audits, support investigations, and migrations where you need to inspect the raw exported file before loading it elsewhere.

HitKeep’s AI reports are exportable through dedicated endpoints: one for AI fetch records and one for AI chatbot events.

Use these when you want server-side AI crawler analytics or assistant instrumentation outside the dashboard. For setup and interpretation, see AI Visibility Analytics and AI Chatbot Analytics.

Web Vitals are included in full site and user takeout as record_type = web_vital. Use those rows when you need to audit p75 changes outside HitKeep, join performance evidence to deployment windows, or preserve a vendor-neutral archive of opt-in performance samples.
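For an outside-HitKeep p75 audit, something like the following could be run over exported rows. It is a sketch under assumptions: the column names metric and value are hypothetical stand-ins for the exported metric name and value fields, and the nearest-rank percentile used here may not match how HitKeep computes p75 in the dashboard.

```python
import math
from collections import defaultdict


def p75_by_metric(rows):
    """Return the nearest-rank 75th percentile of `value` for each metric."""
    values = defaultdict(list)
    for row in rows:
        if row.get("record_type") != "web_vital":
            continue  # skip non-Web-Vitals records in the takeout
        # "metric" and "value" are assumed column names.
        values[row["metric"]].append(float(row["value"]))
    result = {}
    for metric, vals in values.items():
        vals.sort()
        rank = math.ceil(0.75 * len(vals))  # 1-based nearest-rank position
        result[metric] = vals[rank - 1]
    return result


# Made-up sample rows for illustration only.
rows = [
    {"record_type": "web_vital", "metric": "LCP", "value": "1200"},
    {"record_type": "web_vital", "metric": "LCP", "value": "3000"},
    {"record_type": "web_vital", "metric": "LCP", "value": "1800"},
    {"record_type": "web_vital", "metric": "LCP", "value": "2400"},
    {"record_type": "hit", "metric": "", "value": ""},
]
print(p75_by_metric(rows))
```

Running the same function over rows from before and after a deployment window gives two p75 figures that can be compared directly.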

For dashboard reporting and aggregate API access, see Web Vitals Analytics.

Opportunities are part of the full site takeout. They are recommendation records, not raw prompt transcripts.

Included fields are safe for customer review:

  • translation keys such as title_key, summary_key, and action_key
  • copy_params and route params used by localized dashboard copy
  • cited evidence IDs and evidence labels/values
  • detector version, impact, confidence, score, and status
  • safe AI run metadata such as provider label, model label, template version, hashes, token counts, status, and error category

Excluded fields stay unavailable because they can contain secrets, prompt internals, provider payloads, or visitor-level data.

Users can export their own account data through the user takeout endpoint, which requires a dashboard session.

This supports GDPR Article 20 data portability workflows. For compliance context, see GDPR for HitKeep.

A practical vendor-exit workflow looks like this:

  1. Export the site via the site takeout endpoint.
  2. Export high-volume hit slices through the filtered hits export endpoint if you need smaller files.
  3. Query a Parquet sample with DuckDB to verify date ranges and field coverage.
  4. Import the resulting Parquet, CSV, JSON, or NDJSON files into your warehouse or next analytics stack.
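The verification in step 3 can also be approximated without DuckDB when the export is CSV. The sketch below swaps in Python's standard csv module and reports row count, earliest and latest timestamp, and any expected columns that are missing. It assumes the timestamp column is named timestamp and holds ISO 8601 UTC strings in one consistent format, so that plain string comparison orders them correctly. The function name and sample data are illustrative.

```python
import csv
import io


def summarize_export(fileobj, required_fields, time_field="timestamp"):
    """Report row count, timestamp range, and missing columns for a CSV export."""
    reader = csv.DictReader(fileobj)
    header = reader.fieldnames or []
    missing = [name for name in required_fields if name not in header]
    first = last = None
    rows = 0
    for row in reader:
        rows += 1
        ts = row.get(time_field) or ""
        if not ts:
            continue
        # Assumption: uniform ISO 8601 UTC strings sort chronologically.
        if first is None or ts < first:
            first = ts
        if last is None or ts > last:
            last = ts
    return {"rows": rows, "first": first, "last": last, "missing_fields": missing}


# Made-up sample export for illustration only.
sample = io.StringIO(
    "timestamp,referrer_domain\n"
    "2026-04-03T10:00:00Z,example.org\n"
    "2026-04-01T08:30:00Z,news.example\n"
)
print(summarize_export(sample, ["timestamp", "referrer_domain", "path"]))
```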

If you are replacing Google Analytics, pair this page with the self-hosted GA4 alternative guide and the Google Analytics comparison.

Self-hosted HitKeep keeps exports inside infrastructure you control. Configure paths and storage targets in the Configuration Reference, then decide whether exports stay on local disk or move to S3-compatible storage through S3 Backups.

Use HitKeep Cloud when you want managed hosting, updates, and encrypted backups while still keeping self-service export paths available.