The Data Warehouse Development Life Cycle
Here, you can see that the data is pulled from the Oracle OLTP
database using Oracle’s SQL*Net facility. For details on this
distributed SQL technique, see Chapter 9, Distributed Oracle Data
Warehouses. Also, note the dissection of the sales_date column.
Rather than store the sales_date as a single column of the date
datatype, the Oracle data warehouse transformation stores the
sales_date in three columns: one for the day, another for the month,
and a third column for the year. The reason for this date breakout
will become clear later in this chapter when we discuss data query
analysis.
Data Warehouse Loading Analysis
The third process involved in data warehouse analysis is the
analysis of loading data into the data warehouse. Analysis of data
loading involves understanding how the data is going to be extracted
from the target system, how the data will be cleansed and validated,
and how the data will be loaded, aggregated, and summarized at
warehouse load time. Most Oracle professionals use Oracle’s
referential integrity to validate incoming warehouse data.
Referential integrity can be used to check for valid data values and
ensure that all data relationships are maintained. Because
referential integrity is only used at load and purge time, it causes
no performance problems to the Oracle data warehouse, especially
since loading is generally done during off hours. For more
information on using Oracle utilities with Referential Integrity,
see Chapter 9, Oracle Data Warehouse Utilities.. Figure 3.15
displays methods for cleansing, validating, and loading Oracle data
into a data warehouse.
Figure 3.15 Methods for cleansing, validating, and loading
Oracle data.
For more information on using Oracle’s SQL*Loader for
Oracle warehouse loads, see Chapter 11, Oracle Data Warehouse
Utilities.