apache/fluss

Documentation | QuickStart | Development

What is Apache Fluss (Incubating)?

Apache Fluss (Incubating) is streaming storage built for real-time analytics and AI, and it can serve as the real-time data layer for Lakehouse architectures.

It bridges the gap between data streaming and the data Lakehouse by enabling low-latency, high-throughput data ingestion and processing, and it integrates with popular compute engines such as Apache Flink. Support for Apache Spark and StarRocks is coming soon.

Fluss (German: river, pronounced /flus/) lets streaming data continuously converge, distribute, and flow into lakes, like a river 🌊

Features

  • Sub-Second Data Freshness: Continuous ingestion and immediate availability of data enable low-latency analytics and real-time decision-making at scale.
  • Streaming & Lakehouse Unification: Streaming-native storage with low-latency access on top of the lakehouse, using tables as a single abstraction to unify real-time and historical data across engines.
  • Columnar Streaming: Built on Apache Arrow, it supports database primitives on data streams, along with techniques such as column pruning and predicate pushdown. This ensures engines read only the data they need, minimizing I/O and network costs.
  • Compute–Storage Separation: Stream processors focus on pure computation while Fluss manages state and storage, with features like deduplication, partial updates, delta joins, and aggregation merge engines.
  • ML & AI–Ready Storage: A unified storage layer supporting row-based, columnar, vector, and multi-modal data, enabling real-time feature stores and a centralized data repository for ML and AI systems.
  • Changelogs & Decision Tracking: Built-in changelog generation provides an append-only history of state and decision evolution, enabling auditing, reproducibility, and deep system observability.

Building

Prerequisites for building Apache Fluss:

  • Unix-like environment (we use Linux, Mac OS X, Cygwin, WSL)
  • Git
  • Maven (we require version >= 3.8.6)
  • Java 11

git clone https://github.com/apache/fluss.git
cd fluss
./mvnw clean package -DskipTests

Apache Fluss is now installed in build-target. The build command uses the Maven Wrapper (mvnw), which ensures the correct Maven version is used.
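
If you want to confirm the prerequisites listed above before starting a build, a small pre-flight check along the following lines may help. This is only a sketch and is not part of the Fluss repository: the function names `version_ge`, `java_major`, and `check_build_env` are hypothetical helpers introduced here for illustration, and the script assumes Bash and the usual output formats of `java -version` and `mvn -v`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-build check (not shipped with Fluss); assumes Bash.

# version_ge A B: succeeds when dotted numeric version A is >= B.
version_ge() {
  local IFS=.
  local -a a=($1) b=($2)   # split both versions on dots
  local i x y
  for ((i = 0; i < ${#a[@]} || i < ${#b[@]}; i++)); do
    x=${a[i]:-0}
    y=${b[i]:-0}
    if ((10#$x > 10#$y)); then return 0; fi
    if ((10#$x < 10#$y)); then return 1; fi
  done
  return 0                 # all fields equal
}

# java_major VERSION: prints the major version,
# handling both "11.0.21" and legacy "1.8.0_392" styles.
java_major() {
  local v=$1
  if [[ $v == 1.* ]]; then v=${v#1.}; fi
  echo "${v%%.*}"
}

# check_build_env: prints a note for each prerequisite that looks off.
check_build_env() {
  local v
  command -v git >/dev/null || echo "missing: git"
  if command -v java >/dev/null; then
    v=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    [[ $(java_major "$v") == 11 ]] || echo "note: found Java '$v', the README lists Java 11"
  else
    echo "missing: java"
  fi
  # Only relevant when using a system-wide mvn instead of ./mvnw.
  if command -v mvn >/dev/null; then
    v=$(mvn -v 2>/dev/null | sed -n 's/^Apache Maven \([0-9.]*\).*/\1/p' | head -n 1)
    version_ge "${v:-0}" 3.8.6 || echo "note: found Maven '$v', the README requires >= 3.8.6"
  fi
}
```

Calling `check_build_env` from the repository root before running `./mvnw clean package -DskipTests` prints a note for anything that looks off and stays silent otherwise. Because the build goes through the Maven Wrapper, the Maven check only matters if you invoke a system-wide `mvn` directly.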

Contributing

Apache Fluss (Incubating) is open source, and we’d love your help to keep it growing! Join the discussions, open issues to report bugs or request features, contribute code and documentation, or help us improve the project in any other way. All contributions are welcome!

License

The Apache Fluss (Incubating) project is licensed under the Apache License 2.0.