Reconciliation using Informatica
January 9, 2016
Informatica Power Center
New Features of Informatica Version 10
January 12, 2016
Show all

FAQs for BI Architecture

What is a Data Warehouse?

A data warehouse is a central repository for all or significant parts of the data that an enterprise’s various business systems collect. W. H. Inmon coined this term. IBM sometimes uses the term “information warehouse.” Typically, a data warehouse is housed on an enterprise mainframe server.

Data from various online transaction processing applications and other sources is selectively extracted and organized on the data warehouse database for use by analytical applications and user queries.

What is a Data Mart?

A data mart is a repository of data gathered from operational data and other sources that is designed to serve a particular community of knowledge workers. In scope, the data may derive from an enterprise-wide database or data warehouse or be more specialized.

The emphasis of a data mart is on meeting the specific demands of a particular group of knowledge users in terms of analysis, content, presentation, and ease-of-use. Users of a data mart can expect to have data presented in terms that are familiar.

How does a Data Warehouse differ from a Data Mart?

In practice, the terms datamart and data warehouse each tend to imply the presence of the other in some form. However, most writers using the term seem to agree that the design of a data mart tends to start from an analysis of user needs and that a data warehouse tends to start from an analysis of what data already exists and how it can be collected in such a way that the data can later be used.

What is an ODS?

An Operational Data store is a construct, which has been designed to overcome the need for Information (EIS) in the Operational area. It is similar to a data warehouse but differs in many respects. Sitting in between the operational, transaction processing environment and the enterprise data warehouse is this structure called the ODS. The ODS provides foundation to achieve tangible integrated operational results in a short time frame. While the data warehouse offers no relief to the organization struggling with nonintegrated operational systems, the ODS offers immediate relief.


What is a Federated Data Warehouse?

Most techniques used by organizations to build a data warehousing system employ either a top-down or bottom-up development approach. In the top-down approach, an enterprise data warehouse (EDW) is built in an iterative manner and underlying dependent data marts are created as required. In the bottom-up approach, independent data marts are created with the view to integrating them into an enterprise data warehouse at some time in the future. There are many pros and cons of the two approaches but there is a steady trend toward the use of independent data marts, especially with the move toward the use of turnkey analytic application packages.

A solution must offer low cost and rapid ROI advantages of the independent data mart approach without problems of data integration in the future. Such a solution is called the federated data warehouse. Two components of a federated data warehouse are the common business model and shared information staging areas.

When end-users access the “system of record” (the OLTP system) directly and generate “summarized data” reports and thereby given the feel of a “data warehouse”, such a data warehouse is known as a “virtual data warehouse”.

What is an Enterprise Data Warehouse?

The enterprise data warehouse is the heart of the DSS environment. The enterprise data warehouse is fed by the ODS and the integration and transformation layer. In turn the enterprise data warehouse feeds the rest of the DSS environment. In a very real sense the enterprise data warehouse is the crossroads of the DSS world.

What are the different process maps available for Data Marts?
Available process maps for Data Marts are;

  • Top-Down Architecture
  • Bottom-Up Architecture
  • Enterprise Data Mart Architecture
  • Data Stage/Data Mart Architecture (DS/DMA)
  • Distributed Data Warehouse/Data Mart Architecture (DDW/DMA)
  • Distributed Knowledge Management Architecture (DKMA)
 What is Top-Down architecture of Warehouse development?

 The interaction associated with the architecture begins with an Extraction, Transformation, Migration, and Loading (ETML) process working from legacy and/or external data sources. Extraction transformation, and migration, process data from these sources and output it to a centralized Data Staging Area. Following this, data and metadata are loaded into the Enterprise Data Warehouse and the centralized metadata repository. Once these are constituted, Data Marts are created from summarized data warehouse data and metadata.


What is Bottom-up architecture of Warehouse development?

 The second data warehousing systems architecture, the “Bottom-up” architecture became popular because the Top-down architecture took too long to implement, was often politically unacceptable, and was too expensive.

The central idea in Bottom-up architecture is to construct the data warehouse incrementally over time from independently developed data marts. The process begins with ETML for one or more data marts. No common data staging area is required. There is generally a separate area for each data mart. There may not even be standardization on the ETML tool.


What is Enterprise Data Mart architecture of Warehouse development?

EDMA supports an incremental approach to the data warehouse through data mart development by creating a shared framework for development. The EDMA framework includes enterprise subject areas, common dimensions, metrics, business rules, and data sources, all represented in a logically common (but not necessarily physically centralized) Global Metadata Repository (GMR).

Central to the architecture also, is a common data staging area called a Dynamic Data Store (DDS) for extraction, transformation, and migration results. A DDS stores, cleans, and transforms data extracted from operational systems, and also prepares the data for loading into DSS data stores.

What is Datastage/ Data Mart architecture of Warehouse development?

DS/DMA is the same as EDMA with the important exception that no physical enterprise-wide data warehouse is implemented. Instead, the data warehouse is viewed as the conjunction of the data marts in the context of an EDMA-like metadata repository.


What is Distributed Data Warehouse/ Data Mart architecture of Warehouse development?

 It provides a logical database layer mapping a unified logical data model to physical tables in the various data marts. And second it provides transparent querying of the unified logical database across data marts and data warehouses along with caching and joining services. Thus, the distributed character of the data warehouse/data mart system is made transparent to users.

Leading tool providers supporting this architecture are Informatica, Carleton, and Sybase Adaptive Server.


What is Distributed Data Warehouse/ Data Mart architecture of Warehouse development?

 DKM architecture is an evolving O-O/Component-based architecture applicable to enterprise wide systems incorporating multiple processing styles including DSS, OLTP, and Batch processing. These systems are called Distributed Knowledge Management Systems (DKMS). Here DKM architecture is applied to data warehouse/data mart -based DSS systems.

Top – Down and Bottom-Up architectures may be viewed as two-tier architectures utilizing clients and local or remote databases. EDMA, DS/DMA, and DDW/DMA may be viewed as adding middleware and tuple layers to earlier architectures to provide the capability to manage warehouse systems integration through unified logical views, monitoring, reporting, and intentional DBA maintenance activity. But tuple-layer based management still doesn’t provide automatic feedback of changes in one component of a data warehousing system to others.

DKM architecture may be viewed as adding an object layer to EDMA or to DDW/DMA to provide integration through automated change capture and management. Figure Six depicts DKM architecture.

 Why do we need a different schema for Data warehousing/Business Intelligence?

 A data warehouse stores historical data that is collected from various sources and the data is time variant. The data warehouse is updated in a scheduled and controlled manner. An online system usually does not store historical data, however if history data is stored it would pose severe performance issues, further being an Online environment 2 users accessing the same set of information are likely to derive different results. An online system would provide data only from a single source, whereas a data warehouse usually has data being put in from various sources.


What are the various features of Enterprise Data Warehouse?

The various features of an EDW are:

  • An EDW may or may not interact with Data Mart. The most important aspect of any Data Warehouse is the consistency of information.
  • An EDW may have same level of granularity as a Data Mart. This means that the data from the Data Mart will be dumped into an EDW.
  • An EDW may have more detailed level of data than a Data Mart. This means that the data requirement is at a more granular level that may not be catered by the Data Mart.