Written by Chris Tao, Microsoft Premier Field Engineer.
In my last article, we discussed the nature of metadata, the need for metadata management in the modern enterprise, and some fundamental information about Project Barcelona. In this article, we will walk through the Project Barcelona product UI to see how it can help enterprise IT administrators with metadata management.
The Project Barcelona product is being continuously released via an online demo website. We can simply browse to that site to look at the metadata management features.
Figure 1: Project Barcelona Default Page
Project Barcelona is designed to help IT benefit from metadata management immediately, with minimal initial investment: no up-front planning, modeling, or ongoing maintenance is required. Simply set up the product and configure the appropriate security account, and Project Barcelona will automatically start crawling and showing the results.
Figure 1, above, shows the default page of Project Barcelona. After Project Barcelona has been deployed and configured to crawl enterprise data systems, the indexed data systems appear on the default page.
Figure 2: Data Source Filtering Function
By expanding the left menu, we can filter the data sources by different attributes, such as ‘Artifact Type’, ‘Domain’, and ‘Created Time’. This is helpful when we have crawled many data sources and a particular one is hard to find.
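The idea behind this kind of attribute filter can be sketched in a few lines of Python. This is only a conceptual illustration, not Project Barcelona's API: the `DataSource` class, the `filter_sources` helper, and every attribute value below are hypothetical; only the attribute names come from the left menu described above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataSource:
    """Hypothetical record for one crawled data source."""
    name: str
    artifact_type: str   # mirrors the 'Artifact Type' filter
    domain: str          # mirrors the 'Domain' filter
    created: date        # mirrors the 'Created Time' filter


def filter_sources(sources, **criteria):
    """Keep the sources whose attributes equal every given criterion."""
    return [
        s for s in sources
        if all(getattr(s, key) == value for key, value in criteria.items())
    ]


# Invented sample values, for illustration only.
SOURCES = [
    DataSource("INVENTORYFEEDER", "SQL Server", "corp.example", date(2012, 1, 10)),
    DataSource("CHINOOK_DATAWAREHOUSE", "SQL Server", "dw.example", date(2012, 2, 3)),
    DataSource("SalesReport", "Report", "corp.example", date(2012, 2, 20)),
]

matches = filter_sources(SOURCES, artifact_type="SQL Server", domain="corp.example")
print([s.name for s in matches])
```

Each extra criterion narrows the result further, which is the same effect as ticking more than one attribute in the filter menu.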
Figure 3: Browse a Crawled Data Source
We can double-click a data source icon to browse the crawled data source and drill down to individual objects. As Figure 3 shows, Project Barcelona makes it easy to see the logical structure around the entity called ‘AlbumName’: it belongs to a database table called ‘Artist_Album’, in a database schema named ‘dbo’, in a database called ‘ChinookFeeder1’, on the Microsoft SQL Server ‘INVENTORYFEEDER’.
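The containment hierarchy in Figure 3 can be modelled as a chain of parent links. The sketch below is an assumption about how such a structure might be represented, not how Project Barcelona stores it; the `Node` class and both helper functions are hypothetical, while the object names are the ones shown in the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical metadata object with a link to its container."""
    kind: str
    name: str
    parent: Optional["Node"] = None


def path_to_root(node):
    """Walk the parent links from an object up to its server."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def qualified_name(node):
    """Join the names from the server down to the given object."""
    return ".".join(n.name for n in reversed(path_to_root(node)))


server = Node("server", "INVENTORYFEEDER")
database = Node("database", "ChinookFeeder1", server)
schema = Node("schema", "dbo", database)
table = Node("table", "Artist_Album", schema)
column = Node("column", "AlbumName", table)

print(qualified_name(column))
for n in path_to_root(column):
    print(f"{n.kind}: {n.name}")
```

Drilling down in the UI corresponds to following these links in the opposite direction, from the server to the column.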
Figure 4: Metadata Trace
To experience the power of Project Barcelona’s metadata tracing function, you can click the ‘Get Dependencies’ button, after which you will be able to see the whole data flow lifecycle of the ‘AlbumName’ column.
In Figure 4, a green line means that the column has undergone ETL (Extract, Transform and Load); a black line means that the column is referenced by the connected object.
The diagram shows that the INVENTORYFEEDER database is the source of the ‘AlbumName’ column. The column is loaded into the CHINOOK_SALES database server and then, after some transformations, into the CHINOOK-SALES database and the data warehouse. We can also expand the left menu to filter the data sources.
Figure 5: Drill Down to AlbumTitle
Now we can click the data object ‘CHINOOK_DATAWAREHOUSE’. When we expand the right menu, we see that multiple data columns are referenced by this object. To browse the detailed data flow and transformation information, we can simply double-click any column we are interested in, for example ‘AlbumTitle’:
Figure 6: Album Title Data Flow
Based on this diagram, we can easily see the whole lifecycle of the column ‘AlbumTitle’:
- The data originates in a data column named ‘AlbumName’ in the database object ‘INVENTORYFEEDER’.
- The ‘Feeder DB1’ retrieves the data from ‘INVENTORYFEEDER’ and stores it.
- After some transformations and extractions, it is eventually loaded into the ‘AlbumTitle’ column based on the value in the ‘Populated Albums’ column.
- The data in the ‘AlbumTitle’ column is also consumed by the stored procedure ‘GetAlbumsByArtist’ and the database ‘Chinook Sales DB’.
This kind of information is extremely helpful for enterprise IT administrators when planning for updates to database schemas or retiring legacy systems.
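To make the planning use case concrete, the lifecycle listed above can be treated as a small directed graph and walked to find everything downstream of an object. This is a minimal sketch under stated assumptions, not the product's implementation: the edge list is a simplified reading of Figure 6, the `"etl"` and `"reference"` labels echo the green and black lines described for Figure 4, and the `downstream` function is hypothetical.

```python
from collections import deque

# (source, target, kind) edges, read off the bullet list above.
# The "etl" / "reference" labels are an interpretation, not product output.
EDGES = [
    ("INVENTORYFEEDER.AlbumName", "Feeder DB1", "etl"),
    ("Feeder DB1", "AlbumTitle", "etl"),
    ("AlbumTitle", "GetAlbumsByArtist", "reference"),
    ("AlbumTitle", "Chinook Sales DB", "reference"),
]


def downstream(start, edges):
    """Breadth-first walk: every object reachable from start, with the edge kind that reached it."""
    seen = {start}
    queue = deque([start])
    reached = []
    while queue:
        current = queue.popleft()
        for src, dst, kind in edges:
            if src == current and dst not in seen:
                seen.add(dst)
                reached.append((dst, kind))
                queue.append(dst)
    return reached


# What would be affected by changing the original 'AlbumName' column?
for name, kind in downstream("INVENTORYFEEDER.AlbumName", EDGES):
    print(f"{name} ({kind})")
```

Calling `downstream("AlbumTitle", EDGES)` instead returns only the stored procedure and the sales database, which is the list an administrator would check before altering that column.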
By clicking each of the data icons, we can see more details about each stage of the column's lifecycle at the bottom of the page.
Figure 7: Keyword Search
In the latest version of Project Barcelona, we can run a keyword search across all of the crawled data objects. To do a metadata search, just browse to the default page, type the keyword ‘Artist’ and click the search button; all of the data objects that have the keyword ‘Artist’ in their name appear on the search results page. This is very helpful when we want to locate commonly named objects across the data sources.
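Conceptually, a name-based keyword search is a substring match over the object names. The sketch below assumes case-insensitive matching, which the article does not confirm; `search_by_name` is a hypothetical helper, and the sample names are objects mentioned earlier in this walkthrough.

```python
def search_by_name(names, keyword):
    """Return the names that contain the keyword, ignoring case (an assumption)."""
    needle = keyword.casefold()
    return [name for name in names if needle in name.casefold()]


# Object names taken from the figures discussed in this article.
CRAWLED_OBJECTS = [
    "Artist_Album",
    "AlbumName",
    "AlbumTitle",
    "GetAlbumsByArtist",
]

print(search_by_name(CRAWLED_OBJECTS, "Artist"))
```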
In this article we discussed the basic functionality of Project Barcelona, and how this could greatly help us with enterprise metadata management. Thanks for reading, and your comments are welcome. If you want to find out more, please visit the official Project Barcelona product team blog.