Data Warehousing

What is a data warehouse?
A data warehouse is a subject-oriented (organized around a
particular area of the business), integrated (combines data
from multiple source databases), time-variant (data is tied to
a time period, so history is kept), non-volatile (static or
read-only once loaded) collection of data that supports
decision making.

What is ETL?
 
The ETL process can be defined as the extraction of information
from various sources such as flat files, databases, or XML
data, performing transformations (manipulations on the data)
based on the user requirements, and loading the result into the
target location.
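As a rough illustration of the three steps, the sketch below extracts rows from a flat file (an in-memory CSV here), applies a simple transformation, and loads the result into a SQLite table standing in for the target. The sample data, the table name sales_target, and the column names are all assumptions made for this example.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source; a real job would read a file or query a database.
SOURCE_CSV = "id,name,amount\n1, alice ,10.50\n2,BOB,20\n"


def extract(text):
    """Extract: read raw rows from the flat-file source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and normalize names, convert ids and amounts to numbers."""
    return [
        (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_target "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_target VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run extract, transform, and load, then return what landed in the target."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(SOURCE_CSV)))
    return conn.execute(
        "SELECT id, name, amount FROM sales_target ORDER BY id"
    ).fetchall()
```

Keeping each step in its own function makes it easy to swap the source or the target without touching the transformation logic.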

 What is the difference between a data mart and a data warehouse?
Data warehouse: It is a collection of data marts and
                represents historical data.

The DW is the target for historical data and is used for
analysis.

Data mart: It is a subset of a data warehouse.
           It provides data for querying, reporting, and
           analysis.

A data mart is subject-oriented and, in a bottom-up design,
acts as a source for the DW.

 1. What does a fact table contain?

A fact table is the heart of a DWH or a data mart. It contains foreign keys that reference the primary keys of its dimension tables, facts (additive, semi-additive, non-additive), attributes, and degenerate dimensions.
# Additive: Additive facts are facts that can be summed up through all of the dimensions in the fact table.
# Semi-Additive: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
# Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
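The three kinds of fact can be illustrated with a small sketch. The rows below are invented daily account snapshots (day, account, deposit amount, balance, interest rate); the column choice is an assumption for the example, not something taken from a real schema. Deposits behave as an additive fact, balances as a semi-additive one, and the rate as a non-additive one.

```python
# Hypothetical fact rows: (day, account, deposit_amount, balance, interest_rate_pct)
FACTS = [
    ("2024-01-01", "A", 100.0, 100.0, 2.0),
    ("2024-01-01", "B", 50.0, 50.0, 4.0),
    ("2024-01-02", "A", 25.0, 125.0, 2.0),
    ("2024-01-02", "B", 10.0, 60.0, 4.0),
]


def total_deposits(rows):
    """Additive: deposits can be summed across every dimension (day and account)."""
    return sum(r[2] for r in rows)


def total_balance_on(rows, day):
    """Semi-additive: balances sum across accounts for one day, but not across days."""
    return sum(r[3] for r in rows if r[0] == day)


def average_rate(rows):
    """Non-additive: a rate cannot be summed along any dimension.

    It is averaged instead (a simple unweighted mean in this sketch).
    """
    return sum(r[4] for r in rows) / len(rows)
```

Summing the balance column over all four rows would count each account once per day, which is why a balance is only semi-additive.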
 2. What are lookup tables? Why are they used? Where are they used?

When a foreign key exists in a table, the table that the foreign key references is sometimes referred to as a lookup table. The DEPARTMENT table in our example is a lookup table for the EMPLOYEE table. The value of an employee's department can be looked up...

 3. What is the difference between a database and a data warehouse?

Database: A database is a collection of related information stored in a structured form as tables, which makes insertion, deletion, and manipulation of data easier. A database consists of tables that contain attributes.
Data warehouse: a data ...

What is STAR SCHEMA?
A star schema consists of a fact table surrounded by one or
more dimension tables, which together form a star-like
structure. The primary keys of the dimension tables appear in
the fact table as foreign keys. The dimension data is stored in
a denormalized manner. A star schema is chosen for better query
performance.
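A minimal sketch of this layout, using SQLite from Python, is shown here. The table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_star(conn):
    """Create two dimension tables and one fact table that references them."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            product_name TEXT,
            category TEXT              -- kept on the dimension (denormalized)
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year INTEGER,
            month INTEGER
        );
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            sales_amount REAL
        );
        INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Mug', 'Kitchen');
        INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
        INSERT INTO fact_sales VALUES
            (1, 20240101, 5.0), (1, 20240201, 7.0), (2, 20240101, 12.0);
    """)


def sales_by_category(conn):
    """One join is enough: the category sits directly on the product dimension."""
    return conn.execute("""
        SELECT p.category, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
```

Calling build_star on a fresh in-memory connection and then sales_by_category returns one total per category.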

 

When should you use a STAR and when a SNOW-FLAKE schema?

STAR SCHEMA: If PERFORMANCE is the priority, then go for a
star schema, since its dimension tables are DE-NORMALIZED and
queries need fewer joins.

SNOW-FLAKE SCHEMA: If STORAGE SPACE is the priority, then go
for a snowflake schema, since its dimension tables are
NORMALIZED and hold less redundant data.
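To make the trade-off concrete, the sketch below normalizes a product dimension the snowflake way: the category name moves into its own table, so it is stored once instead of once per product, at the cost of an extra join. All table and column names are hypothetical.

```python
import sqlite3


def build_snowflake(conn):
    """Create a fact table whose product dimension is split into two tables."""
    conn.executescript("""
        CREATE TABLE dim_category (
            category_key INTEGER PRIMARY KEY,
            category_name TEXT
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            product_name TEXT,
            category_key INTEGER REFERENCES dim_category(category_key)
        );
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            sales_amount REAL
        );
        INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Kitchen');
        INSERT INTO dim_product VALUES (1, 'Pen', 1), (2, 'Pencil', 1), (3, 'Mug', 2);
        INSERT INTO fact_sales VALUES (1, 5.0), (2, 3.0), (3, 12.0);
    """)


def sales_by_category(conn):
    """Two joins are needed because the category name lives in its own table."""
    return conn.execute("""
        SELECT c.category_name, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        JOIN dim_category c ON c.category_key = p.category_key
        GROUP BY c.category_name
        ORDER BY c.category_name
    """).fetchall()
```

In a star schema the same report would join the fact table to a single product dimension that carries the category name on every row.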

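 What is a factless fact table?

A fact table without any measures is called a factless fact
table. It records only that an event occurred, as a combination
of dimension keys. Student attendance is a common example.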
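As a small sketch of the idea, the attendance table below holds only dimension keys and no measure column. Analysis is done by counting rows. The table name fact_attendance, its columns, and the sample rows are assumptions for this example.

```python
import sqlite3


def attendance_counts():
    """Count attendance events per class from a fact table with no measures."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Only keys: each row states that a student attended a class on a date.
        CREATE TABLE fact_attendance (
            student_key INTEGER,
            class_key INTEGER,
            date_key INTEGER
        );
        INSERT INTO fact_attendance VALUES
            (1, 10, 20240101),
            (2, 10, 20240101),
            (1, 11, 20240101),
            (1, 10, 20240102);
    """)
    return conn.execute("""
        SELECT class_key, COUNT(*)
        FROM fact_attendance
        GROUP BY class_key
        ORDER BY class_key
    """).fetchall()
```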

 

What is an operational data store (ODS)?

An Operational Data Store (ODS) integrates data from 
multiple business operation sources to address operational 
problems that span one or more business functions. An ODS 
has the following features:
       Subject-oriented — Organized around major subjects 
of an organization (customer, product, etc.), not specific 
applications (order entry, accounts receivable, etc.).
       Integrated — Presents an integrated image of 
subject-oriented data which is pulled from fragmented 
operational source systems.
       Current — Contains a snapshot of the current 
content of legacy source systems. History is not kept, and 
might be moved to the data warehouse for analysis.
       Volatile — Since ODS content is kept current, it 
changes frequently. Identical queries run at different 
times may yield different results.
       Detailed — ODS data is generally more detailed than 
data warehouse data. Summary data is usually not stored in 
an ODS; the exact granularity depends on the subject that 
is being supported.
 
What are MOLAP and ROLAP? What is the difference between them?
MOLAP stores aggregated values in a cube. Since the data is
already aggregated, query performance is fast.

ROLAP stores data in relational databases. Each query has to
access the database to retrieve the data, so performance is
slow compared to MOLAP. The data size is also larger than in
MOLAP.
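The difference can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular OLAP product is implemented: a dictionary of precomputed totals stands in for a MOLAP cube, and a SQLite table queried on demand stands in for ROLAP. The sample rows and all names are invented.

```python
import sqlite3
from collections import defaultdict

# Hypothetical detail rows: (region, year, amount)
SALES = [("East", 2023, 10.0), ("East", 2024, 15.0), ("West", 2024, 20.0)]


def build_cube(rows):
    """MOLAP-style: aggregate once up front and store a total per cell.

    None in a key position means "all values" of that dimension (a rollup).
    """
    cube = defaultdict(float)
    for region, year, amount in rows:
        cube[(region, year)] += amount
        cube[(region, None)] += amount  # rollup over all years
        cube[(None, year)] += amount    # rollup over all regions
        cube[(None, None)] += amount    # grand total
    return dict(cube)


def make_relational_store(rows):
    """ROLAP-style storage: keep the detail rows in a relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


def rolap_total(conn, region):
    """ROLAP-style query: aggregate from the detail rows at query time."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (region,)
    ).fetchone()
    return row[0]
```

A cube lookup such as build_cube(SALES)[("East", None)] is a single dictionary access, while rolap_total scans the table on every call.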