We are currently working on the upcoming BARC Score Data Intelligence Platforms 2024. BARC Score is our evaluation model for providing an overview of software solutions and their providers. Detailed analyst knowledge is combined with extensive research results to create a graphic that is unique in the market and provides a clear statement on the current market position and portfolio of the vendors.
This BARC Score examines 13 selected, market-relevant data intelligence platforms and is scheduled for publication in the first quarter of 2024.
But what happens behind the scenes? How is a BARC Score created? What discussions are held? How do we decide which providers are included, and what insights do our analyses provide? We will take you along on our journey to publication. By the way, this is the first “Behind the Scenes with an Analyst” post. If you like it, give us a Like on LinkedIn and we will continue the series with even more motivation. We also welcome comments on the content, the usefulness of the article, or other topics you’d like us to cover.
The journey begins
We have just completed The Data Management Survey 2024 – the world’s largest annual user survey on data management tools. Now, it’s time to move on to the next project: BARC Score Data Intelligence Platforms 2024. Our aim is to deliver an overview of the data intelligence platforms market. But which tools should be included and what exactly should we test?
What is data intelligence?
First of all, a brief definition: What exactly is Data Intelligence and why should we be concerned with it?
At BARC, we consider Data Intelligence, together with Data Mesh (including Data Products) and Data Fabric, to be the hottest drivers for investment in data infrastructures right now. These topics are closely linked. They address the challenges faced by organizations seeking to provide data from distributed data landscapes in a flexible but controlled manner. In short, companies want to work with data but face obstacles such as difficulties in finding data, complex data access, non-integrated data, insufficient data quality, lengthy processes and user uncertainty as to where to get answers to their data questions.
Centralization is yesterday’s news; today, decentralization counts
The villain is quickly identified: IT is too slow to implement new requirements and creates a bottleneck. The current trend is therefore towards decentralization (i.e., the distribution of data tasks to the places where they can be carried out most effectively and efficiently). Business departments are being given more responsibility with the aim of maximizing the benefits derived from data and minimizing administrative effort.
This should allow data to be used faster and better via self-service. However, it should also be possible to share data or findings with others.
Meeting this challenge requires knowledge. This is the core of data intelligence: providing knowledge about data in a way that enables everyone to benefit from it while optimizing data management, complying with data protection requirements and driving the development of data.
We cannot understand our data without metadata
Knowledge about data is manifested in the form of metadata. Metadata links technical, functional, organizational, operational and social knowledge in companies and makes it accessible to both humans and machines. This makes data easy to find, understand, interpret and access, while at the same time ensuring data security.
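As a minimal sketch of how these layers of knowledge can hang together, the following Python snippet models a single metadata record linking a technical asset to business and organizational context. All field names and values are illustrative, not taken from any specific tool:

```python
from dataclasses import dataclass, field

@dataclass
class MetadataRecord:
    # Technical metadata: where the data physically lives
    system: str
    table: str
    column: str
    data_type: str
    # Business metadata: what the data means
    business_term: str
    description: str
    # Organizational metadata: who is responsible
    data_owner: str
    # Operational/social metadata: how the data is tagged and used
    tags: list = field(default_factory=list)

record = MetadataRecord(
    system="crm_db", table="customers", column="cust_no",
    data_type="VARCHAR(10)",
    business_term="Customer Number",
    description="Unique identifier assigned to each customer at onboarding.",
    data_owner="Sales Operations",
    tags=["master-data"],
)
```

Because all of these facets sit in one machine-readable record, both a business user searching for “Customer Number” and a pipeline enforcing access rules can work from the same source.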
Data intelligence is therefore concerned with the intelligent collection, linking, enrichment, analysis and application of metadata. In our next blog post, we will present the requirements for data intelligence platforms in detail.
Data intelligence is hype – and there are many solutions out there
There are currently around 100 solutions on the market with different focuses and functions for collecting, processing and analyzing metadata. A list of all these providers is the starting point for our research to identify data intelligence platforms eligible for inclusion in this BARC Score. After an initial screening, we were able to categorize the market as follows:
Inventories
Inventories collect technical metadata and make it searchable (technical data catalogs). This allows us to search, for example, for tables or attributes in databases or data lakes. In some cases, they also make it possible to add and search business terminology and descriptions using tagging.
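To make the inventory idea concrete, here is a hypothetical sketch of such a technical-catalog search: a list of harvested table/attribute records, searched by name and optionally filtered by tag. The data and helper function are invented for illustration:

```python
# Hypothetical technical metadata harvested from databases and data lakes
inventory = [
    {"system": "sales_dwh", "table": "orders", "attribute": "order_id", "tags": ["key"]},
    {"system": "sales_dwh", "table": "orders", "attribute": "customer_id", "tags": ["key", "customer"]},
    {"system": "crm_db", "table": "customers", "attribute": "email", "tags": ["customer", "pii"]},
]

def search(term, tag=None):
    """Substring search over table/attribute names, with optional tag filter."""
    hits = [r for r in inventory
            if term in r["table"] or term in r["attribute"]]
    if tag is not None:
        hits = [r for r in hits if tag in r["tags"]]
    return hits

results = search("customer", tag="pii")  # only crm_db customers.email matches both
```

Real inventories add crawlers, scheduling and full-text indexing on top, but the core lookup works much like this.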
Data catalogs
Data catalogs go beyond supporting technical metadata and can also capture other functional or operational metadata. They offer functions for searching and navigating data and for supporting data governance processes and collaboration. The market segment is characterized by providers with different specialties and functional scopes.
There are specialists who focus on searching and finding data, some vendors offer tools for data lineage experts, others specialize in security and data protection and still others focus on data governance. Within this category, we therefore see a wide variety of tools.
Data intelligence platforms
Data intelligence platforms enhance data catalogs with intelligent AI/ML-supported functions to automate catalog tasks or simplify user interaction. They support data access, offer functions for setting up data marketplaces and data products and promote the generation of value from metadata through its active use (active metadata management).
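As a deliberately simplified stand-in for the ML-supported automation such platforms provide, the following sketch auto-suggests governance tags for newly harvested columns based on name patterns. Real products use trained classifiers and data profiling rather than a hard-coded rule list:

```python
import re

# Hypothetical rule set; real platforms learn such patterns from data and usage
TAG_RULES = [
    (re.compile(r"email|e_mail", re.I), "pii"),
    (re.compile(r"phone|tel", re.I), "pii"),
    (re.compile(r"revenue|amount|price", re.I), "finance"),
]

def suggest_tags(column_name):
    """Return suggested governance tags for a newly harvested column."""
    return [tag for pattern, tag in TAG_RULES if pattern.search(column_name)]

print(suggest_tags("customer_email"))  # ['pii']
print(suggest_tags("order_amount"))    # ['finance']
```

The point of such automation is scale: a human steward reviews suggestions instead of tagging thousands of columns by hand.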
There are features and there are tools
While dedicated software for cataloging is available, some of the larger software providers also offer data cataloging as a feature.
- Stand-alone solutions
The main purpose of these solutions is to collect, prepare, link and analyze metadata. As an enterprise data catalog, these tools can integrate metadata across systems and so offer a central point of contact for questions about data of all kinds.
- As a supporting function
Metadata functions specifically support the main use case of the tool (e.g., data virtualization, data integration, knowledge graphs, BI & analytics). Their cataloging functionality is therefore usually limited or focused, so they are not suitable as enterprise-wide solutions. For example, the Qlik Data Catalog offers good functions for the integration and use of metadata, but primarily for the Qlik world.
Knowledge graphs
We will discuss the broad topic of knowledge graphs in more detail in a separate blog, but it should be noted here that they are already used as an engine in many data catalogs and data intelligence platforms. They help to link metadata more extensively and, in turn, enable advanced analyses, such as relationship analyses between different metadata objects.
There are knowledge graphs that support enterprise cataloging and data intelligence (i.e., metadata management) as their main use case. But there is also another market for knowledge graphs: those that focus on operational use cases and integrate data into a semantic model. Data is stored together with metadata, which enables analyses. Prominent examples include the resolution of bills of materials across several production systems or support in drug development. These processes often involve several hundred systems, each with its own nomenclature for ingredients, processes, etc. This is where semantic integration helps to provide a complete overview of the development process.
This difference in approaches (knowledge graph as part of the architecture vs. knowledge graph tool that can be used for cataloging) also results in a different classification in the market. It explains why, for example, data.world appears under stand-alone solutions (focused on metadata) and Cambridge Semantics under "cataloging as a feature" in our market overview. The latter integrates data into the semantic model, can enrich it with metadata and then make it available for analysis. As mentioned above: more on this in a separate blog.
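To illustrate the kind of relationship analysis a metadata knowledge graph enables, here is a minimal sketch: metadata objects as nodes, dependency edges between them, and a breadth-first traversal answering “which sources ultimately feed this report?” All nodes and edges are invented:

```python
from collections import deque

# Hypothetical metadata graph: object -> upstream objects it depends on
upstream = {
    "report:monthly_sales": ["view:sales_agg"],
    "view:sales_agg": ["table:orders", "table:customers"],
    "table:orders": ["source:erp_system"],
    "table:customers": ["source:crm_system"],
}

def lineage(node):
    """Breadth-first search over dependency edges: all upstream objects."""
    seen, queue = set(), deque([node])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

sources = lineage("report:monthly_sales")
```

Production-grade graphs use dedicated graph databases and richer edge types (owns, describes, derives-from), but the traversal principle is the same.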
Market overview of inventories, data catalogs & data intelligence platforms
The aim of the illustration below is to provide an initial view of the market. It does not claim to be exhaustive – if your tool of choice is missing, please let us know in the comments.
Figure 1: Market segments for Data Catalogs & Data Intelligence Platforms
There is still a great deal of diversity within the various fields of Data Intelligence, which we will dive into in the next few blogs. The crucial question from here on is how we can effectively narrow down the market and who will be included in the BARC Score Data Intelligence Platforms 2024. To this end, we agreed on the following criteria, which helped us to narrow down the market quickly.
INFO: We use the same procedure in our software selection projects with customers. This allows us to quickly reduce the market down to a manageable size.
Figure 2: For the BARC Score Data Intelligence Platforms 2024, we have defined eight selection criteria
By applying these criteria, we were able to quickly narrow the market down to 13 providers. Strictly speaking, the offerings of the hyperscalers Amazon, Google and Microsoft would not make the cut. But would a list without them accurately reflect the current competitive landscape? Many of our customers already use solutions from their strategic hyperscaler as initial prototypes for company-wide catalogs or data intelligence platforms, and a comparison with stand-alone solutions is particularly interesting in that context. We therefore decided to make an exception and include the big three hyperscalers: Amazon, Microsoft and Google.
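Mechanically, this shortlisting step works like a checklist: a vendor stays on the list only if it meets every criterion. A sketch with purely hypothetical criteria and vendors (our actual eight criteria differ):

```python
# Hypothetical criteria checks; the real selection criteria differ
criteria = [
    lambda v: v["enterprise_catalog"],    # positioned as an enterprise-wide catalog
    lambda v: v["active_metadata"],       # offers active metadata / automation features
    lambda v: v["market_presence"] >= 3,  # e.g., a minimum market presence score
]

vendors = [
    {"name": "VendorA", "enterprise_catalog": True,  "active_metadata": True,  "market_presence": 4},
    {"name": "VendorB", "enterprise_catalog": True,  "active_metadata": False, "market_presence": 5},
    {"name": "VendorC", "enterprise_catalog": False, "active_metadata": True,  "market_presence": 2},
]

# Keep only vendors that pass every check
shortlist = [v["name"] for v in vendors if all(check(v) for check in criteria)]
print(shortlist)  # ['VendorA']
```

The same mechanism scales from 100 candidates down to a manageable shortlist in a single pass.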
So, our final list is set:
Figure 3: Based on the selection criteria, the 13 providers shown are examined in detail as part of the BARC Score
We contacted these vendors with an extensive Request for Information (RfI) comprising more than 200 criteria. Most of them have responded, and we are currently completing the final briefing appointments. These will allow us to see the solutions live again, better understand the concepts and clarify any outstanding queries about the RfI responses.
In the next blog, we will talk about our requirement profiles, evaluation methodology and initial findings from the RfI analysis, which we are starting this week. Stay tuned!