Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
-
Updated
Jun 11, 2024 - Java
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Open Control Plane for Tables in Data Lakehouse
lakeFS - Data version control for your data lake | Git for data
Postgres for Search and Analytics
An IDE and translation engine for detection engineers and threat hunters. Be faster, write smarter, keep 100% privacy.
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Upserts, Deletes And Incremental Processing on Big Data.
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
An Git-like version control file system for data lineage & data collaboration.
汇总Apache Hudi相关资料
A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
An AWS based solution using AWS CloudWatch and AWS Lambda based on Python to automatically terminate AWS EMR clusters that have been idle for a specified period of time.
Add a description, image, and links to the datalake topic page so that developers can more easily learn about it.
To associate your repository with the datalake topic, visit your repo's landing page and select "manage topics."