The T-ADAPT site of Dr. Michael Hahne

Home

T-ADAPT

Model Management

Model Transformation

SAP BW Models

SQL Server Models

Visio Shapes

Hahne online

Imprint

T-ADAPT:

Conceptual design of temporal aspects in data warehousing with T-ADAPT

from Dr. Michael Hahne

Chapter 2: Temporal aspects in Data Warehousing

In temporal databases are stored versions of objects in order to track the evolution of these objects over a period of time and the therefore introduced attributes of time of these versions are named time stamps. They represent either a point of time or a time interval and the time axis is considered as a series of discrete time units [CS99]. The smallest relevant time unit determines the granularity of time measurement and is called a chronon [J+94, 55].

In information systems that recognize temporal aspects we distinguish between valid time and transaction time for time stamping [T+93].Whereas the valid time of an object or a fact is defined as the interval of time in which this object has a constant state, the transaction time determines the time when a fact is current and retrievable in the database [J+94, 53f.]. Databases that recognize either valid time or transaction time are named temporal database [SA85, 242f.].

It is sufficient to consider dimensions and their hierarchies in the following in order to recognize time stamping in the conceptional model since all facts in a Data Warehouse have already a relation to time because of time dimension [CS98, 516].

This problem was mentioned first by Kimball who discussed this under the expression ?slowly changing dimensions?. He proposed a solution to handle changes in consolidation trees in which the dimensions are time stamped with their validity interval [Ki96].

The problem of temporal OLAP is addressed by Bliujute et al. at the Time Center at the University of Arizona [B+98]. They compared the performance of various SQL queries over the approaches of fact tables that represent time series, fact tables that are event oriented as well as the Kimball approach. Beyond that, Mendelzon and Vaisman carried out the need for a temporal query language for OLAP, which was introduced as TOLAP. They also defined a temporal multidimensional data model [MV00].

Temporal views and their maintenance are in the focus of some recent work of Yang & Widom. They carried out a view definition language that operates over non-temporal data sources [YW98; YW00].

A formal model of dimension updates, which result from modifications in hierarchical structures, in a multidimensional model is carried out by Hurtado et al. [H+99a]. They also deal with the question of an implementation of the dimension updates in a ROLAP storage of the data warehouse [H+99b].

Pedersen & Jensen introduced a multidimensional model that is able to handle complex data and considers also temporal aspects. These are considered as one modelling issue among other data modelling problems [PJ99]. We also discuss the problem that arises with structural changes in consolidation trees as modelling issue, but namely already in the phase of conceptual design and with the need for an adequate graphical representation method.

The structural changes in consolidation hierarchies result from changing, deletion and adding nodes, which is exemplary represented in fig. 2 for the case of a two-level hierarchy with product groups and products. A change of the mapping to higher and subordinated elements in the hierarchy occurs in the case of the change of a hierarchy node. The consolidation hierarchy is also changed by admission of new nodes and deletion of existing nodes [SB03, 55-56].

Fig. 2. Structural changes in consolidation hierarchies

In order to clarify the effects of these structural changes they are explained together with the possible reporting requirements on the basis of a simple numerical example. All three changes of the hierarchy as shown in fig. 2 happened at the transition from a fictitious time period 1 to a second fictitious time period 2. The fictitious values to the corresponding accounting periods are summarized in table 1.

accounting period product value
period 1 P1 110
period 1 P2 120
period 1 P3 130
period 1 P4 140
period 1 P5 150
period 2 P1 110
period 2 P2 120
period 2 P3 130
period 2 P5 150
period 2 P6 160

Table 1. Values of the performance indicators of the example

The analysis of all data concerning the respective current period with regard to a historical period or to historical truth is a report requirement which can be found very often. Another reporting requirement can be to only consider those elements of the structure that are valid during the entire reporting period ("not to compare apples with pears") [Ha03, 70].

The results of the computation for the product groups for the representation of different report results, considering different requirements, in handling structural changes for the four listed scenarios are listed below.

First the computation on the basis of the dimension structure with validity in the old time period 1 is presented. This leads to the fact that values for P6 cannot be considered. The result of the query is summarized in table 2.

PG1 PG2
period 1 360 290
period 2 360 150

Table 2. Report with structure of a fixed date

The valid structure differs besides the new element P6 by the missing node P4 and rearranging of P3 in the second period regarded. Therefore no consideration of the values of P4 occurs with the result as represented in table 3.

PG1 PG2
period 1 230 280
period 2 230 440

Table 3. Report with actual structure

The computation for the historical truth clearly differs from these two points of view since the transaction data view is important. The computation is always done on the basis of the particular validity of the structure in each case then. The results are shown in table 4.

PG1 PG2
period 1 360 290
period 2 230 440

Table 4. Report according to the historical truth

All three variants have the disadvantage that the computation leads to no directly comparable values concerning time series analyses. Therefore it is often demanded to take up only elements that are valid during the entire query period (see table 5).

PG1 PG2
period 1 230 150
period 2 230 150

Table 5. Report with comparable results

The following four possibilities for the implementation of the different report requirements are available:

  1. Adjustment of historical data to new structures.
  2. Separate storage of the historical volume of data additionally to the complete existence of new structures.
  3. Structure of parallel hierarchies with old and/or new structures.
  4. Temporal data bases and time stamps.

The adjustment of the historical data has the advantage that the volume of data is not pushed up and the data dimension structures remain visible. Not all report requirements can be covered because old structures are lost.

Old evaluations can be retrieved by proceeding separate storage. This is achieved by a larger volume of data and a complex actualisation.

The solution alternative of parallel hierarchies offers the advantage that all numbers with arbitrary structures can be indicated. The developing dimension structures are rather unclear.

Maximum flexibility is made possible for the report system with arbitrary historical variants of structures by versioning time stamps. However, the tendentious worse performance and the partly still not fully developed concepts are unfavourable.


Home
Last update: 2010-01-13
webmaster@hahneonline.de