UNIFYING DATA ASSETS FOR INCREASED EFFICIENCY AND PRODUCTIVITY
To ensure success, companies need to evaluate their needs and examine their options.
Any modern enterprise has multiple data sources. Tens, hundreds, even thousands exist in the largest organizations. From financial databases, to human resources, to marketing, each department, each application, and each user group may have a set of data sources to provide them with business answers on a day-to-day basis. But what happens when a question is asked that crosses the domain of these various data sources? How does an organization begin to unify its data assets? Let's start with the choices on both extremes: either leave the data where it is, or bring the data together into a single data source.
The first option, to leave the data in its various silos, requires a reporting tool, custom application, or other delivery tool that can access all of the data sources. In terms of implementation effort and cost, this may be the simplest technique. An organization may choose to simply place a reporting tool such as Crystal Reports® in the hands of an analyst, provide connectivity to the data sources, and rely on the analyst to bring the data together in a meaningful way. Despite its simplicity, this method has many drawbacks. One issue is that the organization is placing its trust in a single analyst or developer to identify and implement the business logic that unifies disparate data sets. Often, this logic is repeated numerous times across many reports, and the result is information that is not always accurate or consistent. Because there is no centralized, defined business logic for joining data sets, the chance of an erroneous interpretation of the data is high. On the technology side, performance is often an issue in such an architecture, because the application itself must query each data source for part of the answer and then unify possibly large data sets internally. In general, such an arrangement rarely succeeds on a large scale, and is suitable only for the smallest of organizations or departments.
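To make the first option concrete, here is a minimal sketch of a report that unifies two silos inside the application itself. Everything in it is an assumption for illustration: the two in-memory SQLite databases stand in for separate HR and finance systems, and the table names, column names, sample rows, and the `expenses_by_dept` function are hypothetical.

```python
import sqlite3


def make_source(schema_sql, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one departmental silo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Hypothetical HR silo: employees and their departments.
hr = make_source(
    "CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)",
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ana", "Sales"), (2, "Ben", "Sales"), (3, "Cy", "Support")],
)

# Hypothetical finance silo: expense lines keyed by employee.
finance = make_source(
    "CREATE TABLE expenses (emp_id INTEGER, amount REAL)",
    "INSERT INTO expenses VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 25.0), (3, 10.0)],
)


def expenses_by_dept(hr_conn, finance_conn):
    """Report-level join: pull both data sets into memory and unify them here."""
    dept_of = dict(hr_conn.execute("SELECT emp_id, dept FROM employees"))
    totals = {}
    for emp_id, amount in finance_conn.execute("SELECT emp_id, amount FROM expenses"):
        # The join rule (match on emp_id, label misses "Unknown") lives only
        # in this report; the next report has to re-implement it.
        dept = dept_of.get(emp_id, "Unknown")
        totals[dept] = totals.get(dept, 0.0) + amount
    return totals


report = expenses_by_dept(hr, finance)
print(report)
```

Note that both result sets travel to the application before any joining happens, and that the matching rule is private to this one report. Those are the performance and consistency risks described above.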
The second option, to bring all the data together, was billed as 'the' solution in the early days of data warehousing. Over the last two decades, many large initiatives have set out to produce a single, monumental data schema that would be enforced throughout the organization. A successful implementation promised many rewards, including consistent, reliable data reported across the organization, rapid report development, and centrally changeable business logic. Few organizations, though, ever achieve such an implementation. Although the tools for such an undertaking (ETL, data cleansing, master data maintenance, and various storage architectures) are constantly improving, they cannot always overcome the enormous amount of effort and resources required to create the data model. Developers, whether using custom applications or off-the-shelf reporting tools, also often find such an all-encompassing data model frustrating to work with. A model built in an attempt to please all parties frequently ends up pleasing none.
Then, of course, there are the many methods in between the extremes. Such methods usually involve moderating the vision of the second option and creating smaller, targeted data warehouses that may or may not share some underlying objects (master data, common data objects and definitions). A newer method gaining traction is the use of EII (Enterprise Information Integration) technology to 'leave the data where it is'. This method is in effect a structured, more centrally planned version of option one. It aims to gain speed and simplicity by leaving the data where it is naturally stored, while providing reliable data through defined, planned and researched integration points between the data in different sources. EII tools also address the performance issues associated with the disparate location of the data by integrating specialized distributed query engines, as well as data integration engines to merge the data sets. Applications and BI tools access the virtual data set presented by the EII tool, and the tool does all the heavy lifting of querying the individual data sources and unifying the data, ostensibly in real time.
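The idea of a centrally defined integration point can be sketched as follows. This is only a toy stand-in under stated assumptions, not how any particular EII product works: a single SQLite connection with attached in-memory databases plays the role of the federation engine, and the schema, sample rows, `VIRTUAL_DATA_SETS` registry, and `query_virtual` function are all hypothetical names.

```python
import sqlite3

# One connection plays the role of the (hypothetical) federation engine;
# each attached database stands in for a separate physical data source.
engine = sqlite3.connect(":memory:")
engine.execute("ATTACH DATABASE ':memory:' AS hr")
engine.execute("ATTACH DATABASE ':memory:' AS finance")

engine.execute(
    "CREATE TABLE hr.employees (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
engine.execute("CREATE TABLE finance.expenses (emp_id INTEGER, amount REAL)")
engine.executemany(
    "INSERT INTO hr.employees VALUES (?, ?, ?)",
    [(1, "Ana", "Sales"), (2, "Ben", "Sales"), (3, "Cy", "Support")],
)
engine.executemany(
    "INSERT INTO finance.expenses VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 25.0), (3, 10.0)],
)

# Integration points are defined once, centrally, rather than in each report.
VIRTUAL_DATA_SETS = {
    "expenses_by_dept": """
        SELECT e.dept, SUM(x.amount) AS total
        FROM finance.expenses AS x
        JOIN hr.employees AS e ON e.emp_id = x.emp_id
        GROUP BY e.dept
        ORDER BY e.dept
    """,
}


def query_virtual(name):
    """Resolve a named virtual data set; the engine performs the cross-source join."""
    return engine.execute(VIRTUAL_DATA_SETS[name]).fetchall()


rows = query_virtual("expenses_by_dept")
print(rows)
```

Every report that asks for `expenses_by_dept` now receives the same join logic, and the joining is done by the engine rather than by each consuming application. A real EII tool would additionally have to reach remote, heterogeneous sources over the network, which this single-process sketch does not attempt to model.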
So what is the right solution for an organization? The answer depends on the size, complexity and needs of the organization. Whereas a small organization with only a few small data sources to integrate may be comfortable with the first scenario, a larger organization with many large and complex data sources may choose a combination of solutions, for example, creating focused data warehouses and unifying them with an EII tool. In all cases, an enterprise must balance its priorities among cost, effort, and reliability requirements.
Summary of Data Asset Unification Options

| Cross Data Source Access | 'Data Federation' / EII Tool | Consolidated Data Warehouse |
| --- | --- | --- |
| Few, simple data sources | Heterogeneous data sources | Heterogeneous data sources |
| Few, simple, one-off reports | Consistent, repeatable reporting; large volume of reports | Consistent, repeatable reporting; large volume of reports |
| Data sources in close proximity | Data sources in close proximity (network latency) | Data sources geographically dispersed |
| Performance not a concern - data sets small | Performance is a concern - small to large data sets | Performance is a concern - small to large data sets |
| Near real time data required - operational reporting | Near real time data required - operational reporting | Real time data not required |
| Little up-front cost, larger ongoing maintenance cost | Large up-front investment, small maintenance cost | Very large up-front investment |
| Variable information quality, questionable consistency across reports | Consistent information quality, consistency across reports | Consistent information quality, consistency across reports |

Written by Adrian So, Data Management Group Senior Consultant
Call 888.394.1664 to find out how Data Management Group can help you with all your business intelligence needs.