Companies today are trying to do more with their data than ever before. However, companies’ existing data architectures aren’t necessarily constructed in a manner that’s conducive to getting it all done. According to IT analyst Howard Dresner, an emerging pattern dubbed the active data architecture will give companies the freedom to realize their big data dreams.
According to Dresner, an active data architecture is a software-defined abstraction layer that separates the physical data store from the consumption of that data. The concept borrows some of the ideas of data meshes and data fabrics, especially the focus on building data products. However, it goes beyond what most people associate with a data mesh or a data fabric.
You cannot go out and buy an active data architecture, according to Dresner, just as you can’t buy a fully formed data fabric or a prebuilt data mesh. Rather, an organization builds its own active data architecture using existing components from well-established data disciplines, Dresner Advisory Services writes in its latest Wisdom of the Crowd series of reports, titled “Active Data Architecture.” Those disciplines include data integration, data engineering, data governance, metadata management, and operational and analytical data infrastructure.
“It is comprised of various data management capabilities, including virtualized and distributed data access, data governance, and security,” the company writes in the report. “An active data architecture helps to elevate the status and importance of data to the level of a ‘product’ by separating the management, governance, and use of data from the specific technical systems in which it may be housed. In essence, an active data architecture provides (among other things) a layer of abstraction enabling data to be managed and applied in an application-independent manner.”
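To make the abstraction-layer idea concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the Dresner report: the names `PhysicalStore`, `InMemoryStore`, and `AccessLayer`, as well as the sample locations, are hypothetical. It assumes consumers ask for a data product by its logical name while the layer decides which physical system serves it.

```python
from typing import Protocol

Row = dict[str, object]


class PhysicalStore(Protocol):
    """Hypothetical interface: anything that can return rows for a location."""

    def scan(self, location: str) -> list[Row]: ...


class InMemoryStore:
    """Stand-in for a real system such as a warehouse or an object store."""

    def __init__(self, tables: dict[str, list[Row]]) -> None:
        self._tables = tables

    def scan(self, location: str) -> list[Row]:
        return list(self._tables[location])


class AccessLayer:
    """Maps logical data product names to (store, physical location) pairs."""

    def __init__(self) -> None:
        self._products: dict[str, tuple[PhysicalStore, str]] = {}

    def register(self, product: str, store: PhysicalStore, location: str) -> None:
        self._products[product] = (store, location)

    def read(self, product: str) -> list[Row]:
        store, location = self._products[product]
        return store.scan(location)


# Two physical homes for the same data (both invented for this example).
warehouse = InMemoryStore({"dw.cust_v1": [{"id": 1, "name": "Acme"}]})
lake = InMemoryStore({"s3/customers": [{"id": 1, "name": "Acme"}]})

layer = AccessLayer()
layer.register("customers", warehouse, "dw.cust_v1")
before = layer.read("customers")

# The data moves to another system; the consumer still asks for "customers".
layer.register("customers", lake, "s3/customers")
after = layer.read("customers")
```

The consumer-facing call, `layer.read("customers")`, is the same before and after the move, which is the application-independent access the report describes.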
A data catalog that uses metadata to help an organization categorize and discover data sets is an important component of an active data architecture. So is a semantic layer that helps to translate between business definitions of data metrics that humans understand and the underlying technical data definitions that dictate how the data is processed and stored.
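A semantic layer can be pictured as a lookup from business terms to technical definitions. The Python sketch below is a simplified illustration, not a description of any vendor's product: the `Metric` class, the `fact_orders` table, the `amt_usd` column, and the metric names are all assumptions made for the example. It uses SQLite from the standard library only so that the translation can be run end to end.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    business_name: str  # the term a business user understands
    table: str          # the physical table (hypothetical)
    expression: str     # the technical definition, as a SQL expression


# Hypothetical semantic layer: business vocabulary mapped to technical SQL.
SEMANTIC_LAYER = {
    "total revenue": Metric("total revenue", "fact_orders", "SUM(amt_usd)"),
    "average order value": Metric(
        "average order value", "fact_orders", "SUM(amt_usd) * 1.0 / COUNT(*)"
    ),
}


def to_sql(business_name: str) -> str:
    """Translate a business metric name into a SQL query."""
    metric = SEMANTIC_LAYER[business_name.lower()]
    return f"SELECT {metric.expression} FROM {metric.table}"


def evaluate(conn: sqlite3.Connection, business_name: str) -> float:
    """Run the translated query and return the single metric value."""
    return conn.execute(to_sql(business_name)).fetchone()[0]


# Tiny in-memory data set so the example runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_orders (order_id INTEGER, amt_usd REAL)")
conn.executemany("INSERT INTO fact_orders VALUES (?, ?)", [(1, 10.0), (2, 30.0)])

revenue = evaluate(conn, "Total Revenue")
aov = evaluate(conn, "average order value")
```

Because every consumer goes through the same definitions, two tools that ask for "average order value" get the same calculation, which is the consistency a semantic layer is meant to provide.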
Dresner Advisory Services surveyed businesses around the world and concluded that awareness of and interest in the active data architecture concept are growing. The company found that 28% of respondents consider an active data architecture “of critical importance,” an increase of 2% from 2024. Similarly, fewer than 5% of survey respondents said that active data architecture is not important, down from 7% in 2024.
Larger companies in western countries were more likely to consider the concept important, the survey found. It also found that workers in operations, sales and marketing, BI, and IT were more likely to view it as important, compared to people working in data science, finance, strategic planning, or executive management.
Dresner also found a correlation between groups that have already achieved success with their BI projects and those who have a positive view of active data architecture. Specifically, 62% of organizations that rated their BI efforts as “extremely successful” view active data architecture as critically important, Dresner writes in the report, and none of these respondents considered it not important.
“The buildout of an active data architecture approach to accessing, combining, and preparing data speaks to a degree of maturity and sophistication in leveraging data as a strategic asset,” Dresner Advisory Services writes in the report. “It is not surprising, then, that respondents who rate their BI initiatives as a success place a much higher relative importance on active data architecture concepts compared with those organizations that are less successful.”
Data integration is a major component of an active data architecture, but there are different ways that users can implement data integration. According to Dresner, the majority of active data architecture practitioners are utilizing batch and bulk data integration tools, such as ETL/ELT offerings. Fewer organizations are utilizing data virtualization as the primary data integration method, or real-time event streaming (e.g., Apache Kafka) or message-based data movement (e.g., RabbitMQ).
Data catalogs and metadata management are important aspects of an active data architecture. “The diverse, distributed, connected, and dynamic nature of active data architecture requires capabilities to collect, understand, and leverage metadata describing relevant data sources, models, metrics, governance rules, and more,” Dresner writes.
Dresner’s survey found that 84% of respondents consider semantic layers to be critical, very important, or important to active data architectures, compared to only 15% who said semantic layers were not critical or important.
“The ability to build a semantic layer that interacts with a variety of data source types, interoperates with other tools, enables consistent views of data, and supports appropriate levels of security and control is increasingly important to many organizations,” the company states in its study.
Ingestion of metadata is the top-requested feature in an active data architecture, followed by impact analysis, lineage visualization, modeling of integrated views of data, modeling of all the componentry of an active data architecture, and optimization capabilities.
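Impact analysis and lineage both rest on the same idea: metadata records which data assets feed which others, and a traversal of that graph shows what a change would touch. The Python sketch below illustrates this with a breadth-first search. The asset names and the `LINEAGE` mapping are invented for the example and do not come from the report.

```python
from collections import deque

# Hypothetical lineage metadata: each asset maps to the assets built from it.
LINEAGE = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["mart.customer_360"],
    "erp.orders": ["mart.customer_360", "mart.revenue"],
    "mart.customer_360": ["dashboard.churn"],
    "mart.revenue": [],
    "dashboard.churn": [],
}


def downstream_impact(asset: str, lineage: dict[str, list[str]]) -> set[str]:
    """Return every asset that directly or indirectly depends on `asset`."""
    impacted: set[str] = set()
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in impacted:  # visit each asset only once
                impacted.add(child)
                queue.append(child)
    return impacted


# What is affected if the CRM customer table changes?
affected = downstream_impact("crm.customers", LINEAGE)
```

In this toy graph, a change to `crm.customers` reaches the staging table, the customer 360 mart, and the churn dashboard, but not `mart.revenue`, which is fed only by `erp.orders`.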
Automated governance is another critical factor in succeeding with an active data architecture. Dresner finds that organizations are prioritizing certain subsets of data governance in their active data architecture builds, followed by open source, security, privacy, data quality, and open formats.
Dresner’s survey also found that organizations are prioritizing the scalability and performance of their active data architectures. “The high level of importance for persistence, caching, and distributed query optimization appears to align with the accelerating demand for data virtualization, which requires these capabilities to achieve suitable performance,” the company writes.
Being adaptable to change is an inherent aspect of active data architectures, so it’s not surprising that Dresner finds that organizations are favoring dynamic optimization techniques, which enable them to do things like adjust data placement or choose different integration methods. Organizations need to monitor their environments, which is why key performance indicator (KPI) monitoring is also trending among active data architecture practitioners. The capability to manage an active data architecture via API is also seen as a benefit.
Active data architectures aren’t bought, but built, and organizations source their components from a variety of places. Dresner found data integration tool providers were the top choice, as cited by more than 50% of survey respondents, followed by BI and analytics tool vendors; data catalog and metadata management providers; data fabric- or data mesh-focused vendors; sellers of database and data persistence layers; cloud infrastructure providers; and data governance providers.
The active data architecture trend is also picking up steam in software development, as third-party vendors look to the architectural pattern for clues on how to develop their wares to achieve maximum impact and the least amount of disruption. Dresner’s study found that 55% of software vendors said that an active data architecture is critically important, followed by 21% rating it as very important and another 14% indicating it is important. Four percent said it was somewhat important, while 9% indicated it is not important at all.
Dresner included 20 vendors in its active data architecture ratings. Dremio and Denodo tied for first; Pentaho, Palantir, and Informatica tied for third; Fivetran, Cube, and Astera tied for fourth; and Altair came in fifth.
“We’re honored to be recognized as a leading vendor in this space,” stated Read Maloney, Dremio’s chief marketing officer. “As organizations race to build agentic applications powered by AI, the ability to deliver governed, real-time, and AI-ready data is becoming the key differentiator. This recognition from Dresner—based entirely on customer feedback—reinforces Dremio’s role in accelerating this shift by providing fast, flexible, and open access to data.”
Interestingly, more than 95% of the vendors surveyed by Dresner reported that they can deliver all of the functionality needed to build an active data architecture via a single product offering. “This is questionable, given the reality that many vendors offer multiple disparate products across feature categories, such as data integration, data governance, and metadata/data catalog,” Dresner says. “Both end user and competitor organizations should be aware that many vendors offer only a narrow subset of the overall functionality needed for true active data architecture.”
Alas, just as Dresner points out there is widespread confusion about what constitutes a data fabric or a data mesh, the market clearly needs additional education on what an active data architecture entails.
Dremio shared a copy of Dresner’s report, which you can access here.
The post The Active Data Architecture Era Is Here, Dresner Says appeared first on BigDATAwire.