This post is authored by Joseph Sirosh, Corporate Vice President of Information Management & Machine Learning at Microsoft.
As one more important step towards our commitment to deliver a world-class cloud analytics platform, I am thrilled to announce that the public preview of the Microsoft Azure Data Catalog – an enterprise metadata catalog / portal for the self-service discovery of data sources – becomes available on Monday next week, July 13, 2015.
Azure Data Catalog is a fully managed service that stores, describes, indexes and provides information on how to access any registered data source. It closes the gap between those seeking information and those producing it.
Businesses of every size face the challenge of sifting through their myriad data sources and discovering the right ones for a given problem. Although businesses collect and store large volumes of data as part of their everyday activities, they often fail to reap its full benefit. Employees frequently end up spending more time searching for data than working with it.
To address these problems, Azure Data Catalog uses a crowdsourced approach. Any user, for instance an analyst, data scientist or data developer, can register, enrich, discover, understand and consume data sources. Every user is empowered to register the data sources that they use. Registration extracts the structural metadata from the data source and stores it in the cloud-based Catalog, while the data itself remains in the data source.
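To make the registration idea concrete, here is a minimal sketch in Python. It is not the Azure Data Catalog API: the `extract_structural_metadata` function, the `catalog` dictionary and the `sales_db` source are all hypothetical, and a local SQLite database stands in for a registered data source. It only illustrates the principle described above, namely that the catalog receives table and column definitions while the rows stay where they are.

```python
import sqlite3


def extract_structural_metadata(conn, source_name):
    """Collect table names, column names and declared types from a SQLite source.

    Only schema information is queried; no data rows are read.
    """
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    entry = {"source": source_name, "tables": {}}
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        entry["tables"][table] = [{"name": c[1], "type": c[2]} for c in columns]
    return entry


# Hypothetical data source standing in for a real database.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
source.execute("INSERT INTO orders (customer, total) VALUES ('Contoso', 42.5)")

# Hypothetical in-memory catalog: it holds structure only, never the rows.
catalog = {}
catalog["sales_db"] = extract_structural_metadata(source, "sales_db")
```

In this sketch the catalog entry for `sales_db` lists the `orders` table and its three columns, while the inserted row never leaves the source database.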
Crowdsourced annotations let users who are knowledgeable about the data assets registered in the Catalog enrich the system at any time. This helps others understand the data more readily, including its intended purpose and how it’s being used within the business.
Azure Data Catalog also lets users discover data sources by searching and filtering. Users can then connect to those data sources and work with the data they need using the tools with which they are already familiar.
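The annotate-then-discover flow can be sketched in a few lines of Python. Again, this is an illustration under stated assumptions rather than the service's actual interface: `CatalogEntry`, `annotate` and `search` are hypothetical names, the two entries are made-up examples, and the search is a plain substring match with an optional tag filter.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical catalog record: structure plus user-supplied annotations."""
    name: str
    columns: list
    tags: set = field(default_factory=set)
    description: str = ""


def annotate(entry, tags=(), description=None):
    """Enrich an entry with tags and an optional description."""
    entry.tags.update(t.lower() for t in tags)
    if description is not None:
        entry.description = description


def search(catalog, term, tag=None):
    """Return names of entries whose metadata contains `term`, optionally filtered by tag."""
    term = term.lower()
    results = []
    for entry in catalog:
        haystack = " ".join(
            [entry.name, entry.description, *entry.columns, *sorted(entry.tags)]
        ).lower()
        if term in haystack and (tag is None or tag.lower() in entry.tags):
            results.append(entry.name)
    return results


catalog = [
    CatalogEntry("sales_db.orders", ["id", "customer", "total"]),
    CatalogEntry("hr_db.employees", ["id", "name", "hire_date"]),
]

# A user who knows the data adds business context.
annotate(catalog[0], tags=["Finance"], description="Confirmed customer orders")

matches = search(catalog, "customer")          # search by keyword
finance = search(catalog, "id", tag="finance")  # keyword plus tag filter
```

Here the keyword search for `customer` finds only the orders table, and the `finance` tag filter narrows a broad search for `id` down to the one entry a colleague has annotated.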
Azure Data Catalog bridges the gap between IT and the business – it encourages the community of data producers, data consumers and data experts to share their business knowledge while still allowing IT to maintain control and oversight over all the data sources in their constantly evolving systems.
Stay tuned for the official announcements next Monday – I hope several of you give Azure Data Catalog a spin, and, as always, keep your feedback coming.
Follow me on Twitter.