
Draft:Unity Catalog

From Wikipedia, the free encyclopedia
  • Comment: Please rewrite from scratch without the use of LLMs Helpful Raccoon (talk) 03:11, 3 June 2026 (UTC)
  • Comment: In accordance with the Wikimedia Foundation's Terms of Use, I disclose that I have been paid by my employer for my contributions to this article. Systemsnull (talk) 23:55, 2 June 2026 (UTC)



Unity Catalog
Developer: Unity Catalog community (LF AI & Data Foundation)
Release: June 12, 2024 (open source)
Stable release: 0.4.0 / February 14, 2026
Written in: Java, Python, TypeScript, Scala
Operating system: Cross-platform
Type: Metadata catalog, data governance
License: Apache License 2.0
Website: www.unitycatalog.io
Repository: github.com/unitycatalog/unitycatalog

Unity Catalog is an open-source metadata and governance catalog for data and artificial intelligence assets, released under the Apache License 2.0. It provides a unified namespace for tables, unstructured data volumes, and functions, with APIs that multiple compute engines can access, and it manages tables in formats including Delta Lake and Apache Iceberg.[1][2] The project is hosted by the LF AI & Data Foundation, a nonprofit open-source software foundation.[3][4] The catalog was originally developed at Databricks, which also offers it as a managed service.[5]

Overview


Unity Catalog organizes data and AI assets in a hierarchical namespace consisting of catalogs, schemas, and other securable objects.[2][6][7] It manages structured tables in formats such as Delta Lake, Apache Iceberg, Apache Parquet, CSV, and JSON; unstructured data, through volumes; and machine-learning assets such as functions and models.[2][8]
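As an illustration of the hierarchy, the following minimal Python sketch models a fully qualified asset name, assuming the dotted catalog.schema.asset form. The class AssetName and the example names are hypothetical and are not part of the project's client libraries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetName:
    """Hypothetical three-level name: catalog, schema, then the asset itself."""

    catalog: str
    schema: str
    asset: str  # a table, volume, function, or model

    @classmethod
    def parse(cls, full_name: str) -> "AssetName":
        # Split on dots and require exactly three non-empty parts.
        parts = full_name.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected catalog.schema.asset, got {full_name!r}")
        return cls(*parts)

    def __str__(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.asset}"


# Example names are made up for illustration.
name = AssetName.parse("main.sales.orders")
print(name.catalog, name.schema, name.asset)
```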

The project's governance features include credential vending, in which the catalog server issues scoped credentials to clients that access the underlying cloud storage.[2][8] By exposing open APIs, the catalog gives external clients, such as the Apache Spark engine, an interface for reading the tables, volumes, and functions that it manages.[9]
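The idea behind credential vending can be sketched with a toy in-memory model. This is not the project's implementation: the classes ToyCatalog and ScopedCredential, the operation names, the 900-second lifetime, and the storage paths are all assumptions made for illustration. The sketch shows only the general pattern of checking a grant and then issuing a short-lived credential limited to one storage location.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedCredential:
    """Hypothetical credential limited to one path prefix, operation, and lifetime."""

    token: str
    path_prefix: str   # storage location the token is valid for
    operation: str     # "READ" or "READ_WRITE" (assumed operation names)
    expires_at: float  # Unix timestamp

    def allows(self, path: str, operation: str, now: float) -> bool:
        in_scope = path.startswith(self.path_prefix)
        op_ok = operation == self.operation or self.operation == "READ_WRITE"
        return in_scope and op_ok and now < self.expires_at


class ToyCatalog:
    """Toy stand-in for a catalog server; not the real Unity Catalog server."""

    def __init__(self, table_locations, grants):
        self.table_locations = table_locations  # table name -> storage path
        self.grants = grants                    # (principal, table) -> operation

    def vend(self, principal, table, operation, ttl_seconds=900, now=None):
        now = time.time() if now is None else now
        granted = self.grants.get((principal, table))
        # Refuse when there is no grant, or when the grant is narrower than the request.
        if granted is None or (operation != granted and granted != "READ_WRITE"):
            raise PermissionError(f"{principal} may not {operation} {table}")
        return ScopedCredential(
            token=secrets.token_urlsafe(16),
            path_prefix=self.table_locations[table],
            operation=operation,
            expires_at=now + ttl_seconds,
        )


catalog = ToyCatalog(
    table_locations={"main.sales.orders": "s3://example-bucket/sales/orders/"},
    grants={("alice", "main.sales.orders"): "READ"},
)
cred = catalog.vend("alice", "main.sales.orders", "READ")
print(cred.allows("s3://example-bucket/sales/orders/part-0.parquet", "READ", time.time()))
```

In this model the client never holds long-lived storage keys: it receives a token that stops working outside the table's path or after the expiry time.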

Unity Catalog consists of an OpenAPI-based REST specification and a reference server, with client libraries and SDKs.[8][6] The server is compatible with the Apache Iceberg REST Catalog API and the Apache Hive metastore API.[2][8]
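Because the interface is a REST specification, a client request reduces to an ordinary HTTP call. The sketch below only builds the URL for listing the tables in a schema and sends nothing over the network. The base address, the /tables path, and the catalog_name and schema_name query parameters are assumptions about a locally running reference server and should be checked against the project's OpenAPI specification.

```python
from urllib.parse import urlencode

# Assumed address and base path of a locally running reference server.
BASE_URL = "http://localhost:8080/api/2.1/unity-catalog"


def list_tables_url(catalog_name: str, schema_name: str, base_url: str = BASE_URL) -> str:
    """Build the URL of a table-listing request (path and parameters are assumed)."""
    query = urlencode({"catalog_name": catalog_name, "schema_name": schema_name})
    return f"{base_url}/tables?{query}"


# Catalog and schema names are made up for illustration.
print(list_tables_url("main", "sales"))
```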

See also


References

  1. ^ Krill, Paul (June 12, 2024). "Databricks races with Snowflake to open up data catalog source code". InfoWorld. Retrieved June 2, 2026.
  2. ^ a b c d e "Open sourcing Unity Catalog". Databricks Blog. June 12, 2024. Retrieved June 2, 2026.
  3. ^ "Welcoming Unity Catalog to the LF AI & Data Foundation". LF AI & Data Foundation. June 20, 2024. Retrieved June 2, 2026.
  4. ^ "Unity Catalog". LF AI & Data Foundation. Retrieved June 2, 2026.
  5. ^ "What is Unity Catalog?". Microsoft Learn. Retrieved June 2, 2026.
  6. ^ a b "Unity Catalog Documentation". unitycatalog.io. Retrieved June 2, 2026.
  7. ^ Unity Catalog: Open and Universal Governance for the Lakehouse and Beyond. Proceedings of the ACM SIGMOD/PODS Conference. 2025. doi:10.1145/3722212.3724459.
  8. ^ a b c d "unitycatalog/unitycatalog". GitHub. Retrieved June 2, 2026.
  9. ^ "Databricks open-sources Unity Catalog, challenging Snowflake on interoperability for data workloads". VentureBeat. June 12, 2024. Retrieved June 2, 2026.

Category:Free software Category:Data management software Category:Metadata Category:Big data products Category:Software using the Apache license Category:2024 software