Introduction

DataKit is a privacy-first data analysis tool that runs SQL directly in your web browser. Built with DuckDB-WASM and React, it processes multi-gigabyte datasets entirely client-side, with no server-side processing or cloud uploads.

How does it work?

DataKit transforms your browser into a powerful data analysis environment. Simply drag and drop CSV or Parquet files, and immediately start querying them with full SQL capabilities.
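
As a rough sketch of the step between dropping a file and querying it, the snippet below builds the SQL that exposes a file as a view. The helpers `detectFormat`, `toViewName`, and `buildViewSql` are hypothetical names used for illustration and are not DataKit's actual code. `read_csv_auto` and `read_parquet` are DuckDB table functions. The sketch assumes the file has already been made available to DuckDB under its own name.

```typescript
// Hypothetical helpers for illustration; not part of DataKit's published API.
type SupportedFormat = "csv" | "parquet";

// Infer the format from the file extension; null means the file is unsupported.
function detectFormat(fileName: string): SupportedFormat | null {
  if (/\.csv$/i.test(fileName)) return "csv";
  if (/\.parquet$/i.test(fileName)) return "parquet";
  return null;
}

// Derive a view name from the file name, e.g. "Sales 2024.csv" -> "sales_2024".
function toViewName(fileName: string): string {
  const base = fileName.replace(/\.[^.]+$/, "").toLowerCase();
  const cleaned = base.replace(/[^a-z0-9_]+/g, "_").replace(/^_+|_+$/g, "");
  return cleaned.length > 0 ? cleaned : "data";
}

// Build a statement that exposes the file as a view.
function buildViewSql(fileName: string): string {
  const format = detectFormat(fileName);
  if (format === null) {
    throw new Error("Unsupported file type: " + fileName);
  }
  const reader = format === "csv" ? "read_csv_auto" : "read_parquet";
  // Escape single quotes so the file name is a valid SQL string literal.
  const quotedPath = "'" + fileName.replace(/'/g, "''") + "'";
  // Double-quote the identifier so names that collide with SQL keywords still work.
  return `CREATE OR REPLACE VIEW "${toViewName(fileName)}" AS SELECT * FROM ${reader}(${quotedPath})`;
}
```

With DuckDB-WASM, a string like this would typically be passed to a connection's `query` method once the dropped file has been registered with the database, after which the view can be queried like any other table.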

Key Features:

  • Client-side processing - All data processing happens in your browser using WebAssembly
  • Large file support - Handle multi-GB datasets with streaming parsers and efficient memory management
  • Full SQL engine - Powered by DuckDB with support for complex queries, joins, and aggregations
  • No setup required - Works instantly in any modern web browser
  • Zero data upload - Your files stay on your local machine

Perfect for:

  • Data analysts who need quick insights without setting up databases
  • Privacy-conscious users who can't upload sensitive data to cloud services
  • Teams wanting to share analysis workflows without infrastructure overhead
  • Anyone frustrated with Excel crashes on large datasets

Self-Hosting DataKit

DataKit is designed as a client-side application, making self-hosting straightforward. Since all data processing happens in the browser, you only need to serve the static files.
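
One detail worth checking on any static host is the `Content-Type` header for `.wasm` files, since browsers only stream-compile WebAssembly that is served as `application/wasm`. Most web servers already handle this. The sketch below shows the mapping a hand-rolled server would need. The extension list is an assumption about a typical React and DuckDB-WASM build, not DataKit's actual file layout, and `contentTypeFor` is a hypothetical helper.

```typescript
// Illustrative mapping; extend it to match the files in your own build output.
const CONTENT_TYPES: { [ext: string]: string } = {
  ".html": "text/html; charset=utf-8",
  ".js": "text/javascript; charset=utf-8",
  ".css": "text/css; charset=utf-8",
  ".json": "application/json",
  ".wasm": "application/wasm",
};

// Pick the Content-Type header for a requested path based on its extension.
function contentTypeFor(path: string): string {
  // Match the final extension, ignoring dots that appear in directory names.
  const match = /\.[^./]+$/.exec(path.toLowerCase());
  const ext = match ? match[0] : "";
  return CONTENT_TYPES[ext] || "application/octet-stream";
}
```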

Requirements

  • Web Server - Any server capable of serving static files
  • HTTPS - Needed only for the optional File System Access API, which browsers expose only in secure contexts
  • Modern Browsers - Chrome, Firefox, Safari, or Edge with WebAssembly support
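
The requirements above can be checked at runtime. The following is a minimal sketch, assuming the page only needs to know whether WebAssembly is present and whether the optional File System Access API is usable. `BrowserGlobals`, `SupportReport`, and `checkSupport` are hypothetical names. `showOpenFilePicker` and `isSecureContext` are standard browser globals, though at the time of writing `showOpenFilePicker` is mainly available in Chromium-based browsers.

```typescript
// Hypothetical shape: the subset of browser globals this check reads.
interface BrowserGlobals {
  WebAssembly?: unknown;
  isSecureContext?: boolean;
  showOpenFilePicker?: unknown;
}

interface SupportReport {
  canRunEngine: boolean;           // WebAssembly is available
  canUseFileSystemAccess: boolean; // optional feature, needs a secure context
}

// Report which requirements the given environment satisfies.
function checkSupport(env: BrowserGlobals): SupportReport {
  const hasWasm =
    typeof env.WebAssembly === "object" && env.WebAssembly !== null;
  const hasPicker = typeof env.showOpenFilePicker === "function";
  return {
    canRunEngine: hasWasm,
    // The picker is only usable when the page is served from a secure context.
    canUseFileSystemAccess: hasPicker && env.isSecureContext === true,
  };
}
```

In a page, the function would be called with the global object. Taking the environment as a parameter keeps the check easy to test outside a browser.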

Performance Notes

  • Datasets of up to several GB can be processed efficiently
  • Performance depends on the user's browser and device capabilities
  • No server resources are consumed for data processing
  • Capacity scales with each user's hardware rather than with the server

Ready to get started? Visit datakit.page to try it out, or deploy your own instance following the self-hosting guide above.