Data Engineering Stack
A practical open-source data engineering stack focused on batch and analytical workloads. It combines scalable storage, pipeline automation, job orchestration, interactive analysis, visualization, and version control for data teams.
Tools in this Stack
SeaweedFS is an open-source distributed file system that supports WebDAV, the S3 API, FUSE mounts, and HDFS, among other interfaces. It is optimized for large numbers of small files and makes it easy to add capacity. `Apache-2.0` `Go`
Parapipe: FIFO pipeline that parallelizes execution at each stage while preserving the order of messages and results.
Rundeck: Enables self-service operations by giving specific users access to your existing tools, services, and scripts.
JupyterLab: Web-based environment for interactive and reproducible computing. ([Demo](https://mybinder.org/v2/gh/jupyterlab/jupyterlab-demo/try.jupyter.org?urlpath=lab), [Source Code](https://github.com/jupyterlab/jupyterlab/)) `BSD-3-Clause` `Python/Docker`
DataEase: Open-source BI and data visualization tool that anyone can use. An open-source alternative to Tableau.
Tempo: GUI Git client that replaces the Git CLI with a clear UI and AI assistance. Freeware and open source. ([Source Code](https://github.com/maoyama/Tempo))
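The FIFO pipeline idea in the list above (parallel execution within each stage, with messages and results kept in order) can be illustrated with a short Python sketch. It uses only the standard library and does not use or mirror the API of any tool listed here. The names `run_stage`, `run_pipeline`, and `slow_square` are hypothetical. It is also simplified: each stage finishes before the next one starts, whereas a streaming pipeline would typically overlap stages.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List, Sequence


def run_stage(func: Callable[[Any], Any], items: Iterable[Any], workers: int = 4) -> List[Any]:
    """Apply func to every item in parallel and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # even when a later item finishes before an earlier one.
        return list(pool.map(func, items))


def run_pipeline(stages: Sequence[Callable[[Any], Any]], items: Iterable[Any], workers: int = 4) -> List[Any]:
    """Run items through each stage in turn, parallelizing within a stage."""
    results = list(items)
    for stage in stages:
        results = run_stage(stage, results, workers)
    return results


def slow_square(n: int) -> int:
    """Hypothetical stage: smaller inputs sleep longer, so they finish last."""
    time.sleep(0.01 * (5 - n))
    return n * n


if __name__ == "__main__":
    # Parse strings to integers, then square them. Completion order differs
    # from input order, but the returned list follows the input order.
    print(run_pipeline([int, slow_square], ["1", "2", "3", "4"]))
```

The design choice here is to lean on `Executor.map` for ordering rather than tracking sequence numbers by hand, which keeps the sketch short at the cost of buffering a whole stage's output in memory.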
Why This Stack Works
This stack is designed around the core lifecycle of data engineering: ingest, transform, store, analyze, and monitor. SeaweedFS provides the distributed storage layer, and its S3 and HDFS interfaces make it suitable for data lake patterns. Parapipe supplies order-preserving concurrent pipelines for writing transform steps in code. Rundeck complements this with job scheduling, access control, and operational control of data jobs in production. For analytics and insight delivery, JupyterLab supports exploratory data analysis and experimentation, while DataEase serves as the visualization and BI layer for stakeholders. Tempo, as listed here, is a GUI Git client rather than a tracing backend, so it covers version control of pipeline code and notebooks; the monitor step relies on Rundeck's job execution logs or on an additional tool. Together, these tools cover most of the lifecycle while leaving room to add more specialized databases or streaming platforms if needed.