The Data Catalog – The “Yellow Pages” for Business-Relevant Data

A much-discussed technological approach to making knowledge about distributed data available is the data catalog, the "yellow pages" for company-relevant data. It stores information about data in the form of metadata, structures it and makes it searchable.

Overview

An overview of data catalogs by BARC Analyst Timm Grosser, including tips on how to select the right data cataloging solution for your organization.

Data is essential for companies to keep up with the digital age. Everyone knows that by now. But it’s not so easy to extract the desired value from data and shine with innovative, data-driven business applications. Instead, we often see data chaos that has been growing for years in the form of fragmented data landscapes and distributed expert knowledge.

A hotly discussed technological approach to making knowledge about distributed data available is the data catalog, the “Yellow Pages” for business-relevant data. It stores information about data in the form of metadata, structures that metadata and makes it searchable.
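To make the idea concrete, here is a minimal sketch of a catalog that stores metadata entries and makes them searchable by keyword. It is an illustration only, not how any particular product works: the class names, the fields (`name`, `description`, `owner`, `tags`) and the sample assets are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata describing one data asset (hypothetical fields)."""
    name: str
    description: str
    owner: str
    tags: list[str] = field(default_factory=list)


class DataCatalog:
    """Stores metadata entries and makes them searchable by keyword."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Keyed by asset name; re-registering a name replaces the old entry.
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> list[str]:
        # Case-insensitive substring match over name, description and tags.
        needle = keyword.lower()
        hits = []
        for entry in self._entries.values():
            haystack = " ".join([entry.name, entry.description, *entry.tags]).lower()
            if needle in haystack:
                hits.append(entry.name)
        return sorted(hits)


# Hypothetical sample assets, invented for illustration.
catalog = DataCatalog()
catalog.register(CatalogEntry("crm.customers", "Customer master data", "sales-ops", ["customer", "pii"]))
catalog.register(CatalogEntry("erp.invoices", "Invoices issued to customers", "finance", ["billing"]))
catalog.register(CatalogEntry("web.clickstream", "Raw website events", "marketing", ["behaviour"]))

print(catalog.search("customer"))
```

Note that the catalog holds only descriptions of the data, not the data itself. A real tool would add automated metadata capture, ranking and access control on top of this basic store-and-search loop.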

A data catalog tool is useful primarily because it does three things:

  1. covers information needs quickly and easily
  2. captures and curates metadata (knowledge) as efficiently as possible, ideally in an automated way
  3. provides a platform for exchanging knowledge that is open to “all”

In addition, functions for data governance and/or data access are valuable.

Finding the right tool can be more complicated than you might expect. The market for data catalogs is anything but transparent. As in other trending areas, the range of products is exploding: we are now aware of more than 90 solutions worldwide that offer data cataloging functions. But not all data catalogs are the same. The offerings vary in focus, content, features and supported use cases. The following table provides an overview of the basic tool types for data cataloging. Broadly, there are tools for specific use cases (catalogs that are part of a BI or analytics user tool, or part of an environment) and offerings that provide a comprehensive, independent solution (specialists, or catalogs that are part of a data governance (DG)/data management (DM) platform).

Data Cataloging tool types:

| Catalog scenario | Characteristics | Tool examples |
|---|---|---|
| … homemade | Rudimentary catalog functions | Excel, Confluence, Wiki, … |
| … as part of a BI/analytics tool | Catalog functions related to the data/artifacts in the environment | Alteryx, Qlik, Tableau, … |
| … as part of an environment | Catalog functions related to technical data/artifacts in the environment | Amazon, Cloudera, Google, … |
| … as specialist | Comprehensive catalog functions related to data and partly artifacts from different tools/environments, added functionality such as data governance | Alation, Waterline, Zeenea, … |
| … as part of a data governance/DM platform | Comprehensive catalog functions related to data and partly artifacts from different tools/environments. Additional functionality from the portfolio (e.g., workflows, data quality, etc.) | ASG, Collibra, Infogix, Informatica, SAP, … |

Pay particular attention to interfaces and transparent, open metadata models for metadata exchange with other catalogs and systems when selecting a data catalog. This offers you a number of advantages:

  • You avoid vendor lock-in and can use the tool’s capabilities in a targeted manner
  • You can more easily transfer catalogs from different areas or environments to a parent catalog
  • It allows easier migration or integration with more powerful tools or tools with a different focus
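As a rough illustration of why an open metadata model helps, the sketch below exports catalog entries as plain JSON and loads two such exports into a parent catalog. Everything here is an assumption made for the example: the function names, the `schema_version` field and the entry layout are hypothetical and do not correspond to any vendor's exchange format.

```python
import json


def export_catalog(entries: list[dict]) -> str:
    """Serialize entries to a plain JSON document that other tools can read."""
    return json.dumps({"schema_version": "1.0", "entries": entries}, sort_keys=True)


def merge_into_parent(parent: dict[str, dict], exported: str, source: str) -> dict[str, dict]:
    """Load an exported child catalog into a parent catalog.

    Entry names are prefixed with the source so that two child catalogs
    cannot overwrite each other's entries.
    """
    merged = dict(parent)
    for entry in json.loads(exported)["entries"]:
        merged[f"{source}/{entry['name']}"] = {**entry, "source": source}
    return merged


# Two hypothetical child catalogs that happen to use the same asset name.
bi_export = export_catalog([{"name": "sales_dashboard", "type": "report"}])
lake_export = export_catalog([{"name": "sales_dashboard", "type": "table"}])

parent = merge_into_parent({}, bi_export, source="bi_tool")
parent = merge_into_parent(parent, lake_export, source="data_lake")

print(sorted(parent))
```

Because the export is a documented, tool-neutral format, the parent catalog needs no knowledge of the tool that produced it. That is the property that limits lock-in and keeps later migration manageable.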

When selecting a data catalog, its functions should be carefully checked. A checklist should normally include:

  • Adapters and functions for metadata integration and exchange
  • Supported content (e.g., supported metadata types, openness and extensibility of the metadata model)
  • Functions and machine support for the maintenance (curation) of metadata
  • Functions and machine support for catalog use and search/navigation/analysis of metadata
  • Ease of use
  • Support for collaboration
  • Further data management functions (e.g., for data governance, data preparation, data quality and data protection)

We are also happy to support you directly – with our best practice experience, established process models and numerous templates – through the entire selection process from requirements gathering to the creation of a shortlist, proof of concept support and deciding which tool to use. This gives you greater decision security, saves you time and resources and provides you with a partner who can help to create a data cataloging roadmap which is both transparent and acceptable to management and relevant stakeholders.

Author(s)

Senior Analyst Data & Analytics

Timm Grosser is a Senior Analyst Data & Analytics at BARC with a focus on data strategy, data governance and data management. His core expertise is the definition and implementation of data & analytics strategy, organization, architecture and software selection.

He is a popular speaker at conferences and seminars and has authored numerous BARC studies and articles.
