Introduction
DataKit is a privacy-first, browser-based data analysis tool that runs SQL directly in your web browser. Built with DuckDB-WASM and React, it processes multi-gigabyte datasets entirely client-side, with no server-side processing or cloud uploads.
How does it work?
DataKit turns your browser into a data analysis environment. Drag and drop CSV or Parquet files, then query them immediately with SQL.
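As a rough illustration of that flow, the sketch below maps a dropped file name to a DuckDB statement that exposes the file as a queryable view. `read_csv_auto` and `read_parquet` are DuckDB table functions. The helper names (`detectFormat`, `viewStatementFor`), the view-naming rule, and the assumption that the file has already been made available to DuckDB-WASM under its own name are illustrative and not taken from DataKit's source.

```typescript
// Hypothetical helpers for illustration only; not DataKit's actual code.
type SupportedFormat = "csv" | "parquet";

// Decide which DuckDB reader applies, based on the file extension.
function detectFormat(fileName: string): SupportedFormat | null {
  const lower = fileName.toLowerCase();
  if (lower.endsWith(".csv")) return "csv";
  if (lower.endsWith(".parquet")) return "parquet";
  return null;
}

// Quote an SQL identifier by doubling embedded double quotes.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Quote an SQL string literal by doubling embedded single quotes.
function quoteLiteral(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// Build a statement that exposes the file as a view named after it.
// Assumes the file is already readable by DuckDB under `fileName`.
function viewStatementFor(fileName: string): string {
  const format = detectFormat(fileName);
  if (format === null) {
    throw new Error(`Unsupported file type: ${fileName}`);
  }
  // Strip the extension and replace characters that are awkward in SQL names.
  const viewName = fileName
    .replace(/\.[^.]+$/, "")
    .replace(/[^A-Za-z0-9_]/g, "_");
  const reader = format === "csv" ? "read_csv_auto" : "read_parquet";
  return (
    `CREATE OR REPLACE VIEW ${quoteIdent(viewName)} ` +
    `AS SELECT * FROM ${reader}(${quoteLiteral(fileName)})`
  );
}
```

Under these assumptions, a file named `sales-2024.csv` becomes a view called `sales_2024` that can be queried like any other table.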
Key Features:
- Client-side processing - All data processing happens in your browser using WebAssembly
- Large file support - Handle multi-GB datasets with streaming parsers and efficient memory management
- Full SQL engine - Powered by DuckDB with support for complex queries, joins, and aggregations
- No setup required - Works instantly in any modern web browser
- Zero data upload - Your files stay on your local machine
Perfect for:
- Data analysts who need quick insights without setting up databases
- Privacy-conscious users who can't upload sensitive data to cloud services
- Teams wanting to share analysis workflows without infrastructure overhead
- Anyone frustrated with Excel crashes on large datasets
Self-Hosting DataKit
DataKit is designed as a client-side application, making self-hosting straightforward. Since all data processing happens in the browser, you only need to serve the static files.
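Serving the static files mostly comes down to mapping request paths to files and sending a suitable `Content-Type`. The sketch below shows those two steps as plain functions that could sit behind any HTTP server. It is a minimal sketch under assumptions: the `index.html` entry point, the fallback for extensionless routes, and the function names are hypothetical and not taken from DataKit's build. The `application/wasm` entry is worth keeping, since browsers require that MIME type for streaming WebAssembly compilation.

```typescript
// Hypothetical static-hosting helpers; the file layout is an assumption.
const CONTENT_TYPES: Record<string, string> = {
  ".html": "text/html; charset=utf-8",
  ".js": "text/javascript; charset=utf-8",
  ".css": "text/css; charset=utf-8",
  ".json": "application/json",
  ".svg": "image/svg+xml",
  ".wasm": "application/wasm",
};

// Pick a Content-Type from the file extension, with a generic fallback.
function contentTypeFor(path: string): string {
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot).toLowerCase();
  return CONTENT_TYPES[ext] ?? "application/octet-stream";
}

// Map a request URL path to a file path relative to the build directory.
// Returns null for paths that try to escape the directory with "..".
// Note: percent-encoded segments are not decoded in this sketch.
function resolveStaticPath(urlPath: string): string | null {
  const queryStart = urlPath.indexOf("?");
  const pathOnly = queryStart === -1 ? urlPath : urlPath.slice(0, queryStart);
  const segments = pathOnly.split("/").filter((s) => s.length > 0);
  if (segments.some((s) => s === "..")) return null;
  if (segments.length === 0) return "index.html";
  const last = segments[segments.length - 1] ?? "";
  // Assumption: extensionless paths are client-side routes served by index.html.
  return last.includes(".") ? segments.join("/") : "index.html";
}
```

Most off-the-shelf static hosts already handle both steps, so a sketch like this matters mainly when writing a custom server or checking a host's configuration.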
Requirements
- Web Server - Any server capable of serving static files
- HTTPS - Needed only for the optional File System Access API, which browsers expose only in secure contexts (HTTPS or localhost)
- Modern Browsers - Chrome, Firefox, Safari, or Edge with WebAssembly support
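The requirements above can be checked at runtime with simple feature detection. The following is a minimal sketch, not DataKit's own startup logic: the `BrowserEnv` and `Capabilities` types and the `detectCapabilities` function are hypothetical names, while `WebAssembly`, `isSecureContext`, and `showOpenFilePicker` are the standard browser globals being probed.

```typescript
// Hypothetical feature-detection helper; names are illustrative.
// The environment is passed in so the function can be exercised outside a browser.
interface BrowserEnv {
  WebAssembly?: unknown;
  isSecureContext?: boolean;
  showOpenFilePicker?: unknown;
}

interface Capabilities {
  // True when WebAssembly is available, which the SQL engine needs.
  canRun: boolean;
  // True when the optional File System Access API can be used.
  canUseFileSystemAccess: boolean;
}

function detectCapabilities(env: BrowserEnv): Capabilities {
  const hasWasm =
    typeof env.WebAssembly === "object" && env.WebAssembly !== null;
  const hasPicker = typeof env.showOpenFilePicker === "function";
  return {
    canRun: hasWasm,
    // The picker is only exposed in secure contexts (HTTPS or localhost).
    canUseFileSystemAccess: hasPicker && env.isSecureContext === true,
  };
}

// In a browser this could be called as:
//   detectCapabilities(window as unknown as BrowserEnv)
```

When `canUseFileSystemAccess` is false, drag and drop remains available, since it does not depend on the File System Access API.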
Performance Notes
- Datasets of up to several GB can be processed efficiently
- Performance depends on the user's browser and device capabilities
- No server resources are consumed for data processing
- Scales automatically with the user's hardware
Ready to get started? Visit datakit.page to try it out, or deploy your own instance by following the self-hosting section above.