Data Warehousing

  • Middleware - ODBC, OLE, OLE DB, DCE, ORBs, and JDBC.
  • Database connectivity - ODBC, JDBC, OLE DB, and others.
  • Data management - ANSI SQL and FTP.
  • Network access - DCE, DNS, and LDAP.

Regardless of which standards they support, the major data warehousing tools are meta data-driven. However, they rarely share meta data with each other, and they vary in how open they are. “So research and shop for tools carefully,” advises Thornthwaite. “The architecture is your guide. And use IT advisory firms, like GartnerGroup, META Group, Giga, etc.”

How detailed does a data warehouse architecture need to be? The question to ask is this: Is this enough information to allow a competent team to build a warehouse that meets the needs of the business? As for how long it will take, the architecture effort grows exponentially as more people become involved in its development (i.e., it becomes “techno-politically complex”) and as the resulting system needs to be more complex (i.e., “functionally complex”).

As with almost everything in data warehousing, an iterative process is best. You can’t do it all at once because it’s too big, and the business won’t wait. Also, Thornthwaite says, the data warehouse market isn’t complete yet. So begin with the high-leverage, high-value parts of the process. Then use your success to make a case for additional phases.

Conclusions

To sum up, the benefits of having a data warehouse architecture are as follows:

  • Organizing framework - the architecture draws the lines on the map: what the individual components are, how they fit together, who owns which parts, and what the priorities are.
  • Improved flexibility and maintenance - new data sources can be added quickly, interface standards allow plug and play, and the model and meta data allow impact analysis and single-point changes.
  • Faster development and reuse - warehouse developers can more quickly understand the data warehouse process, database contents, and business rules.
  • Management and communications tool - the architecture helps define and communicate direction and scope, set expectations, identify roles and responsibilities, and communicate requirements to vendors.
  • Coordination of parallel efforts - multiple, relatively independent efforts have a chance to converge successfully. Data marts without an architecture become the stovepipes of tomorrow.
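
To make the impact analysis mentioned above concrete, here is a minimal sketch in Python. It assumes the meta data records which item feeds which; the LINEAGE map, every table and report name in it, and the impacted_by function are hypothetical illustrations, not anything described in the presentation. Given a changed source field, it walks the map to list everything downstream that would need review.

```python
from collections import deque

# Hypothetical meta data: each key feeds the items listed as its value.
# All names are illustrative assumptions, not taken from the presentation.
LINEAGE = {
    "orders_source.amount": ["staging.orders"],
    "staging.orders": ["warehouse.sales_fact"],
    "crm_source.region": ["warehouse.customer_dim"],
    "warehouse.sales_fact": ["report.monthly_revenue", "mart.finance"],
    "warehouse.customer_dim": ["report.monthly_revenue"],
    "mart.finance": ["report.budget_variance"],
}


def impacted_by(changed, lineage=LINEAGE):
    """Return every downstream item reachable from `changed`, sorted by name."""
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        # Visit each direct dependent once, then continue from it.
        for dependent in lineage.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


if __name__ == "__main__":
    for item in impacted_by("orders_source.amount"):
        print(item)
```

Because the dependencies live in one place, a change to a source field can be traced from a single lookup rather than by inspecting each load job and report separately, which is the sense in which meta data supports single-point changes.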

Thornthwaite recommends that companies align with business requirements but remain practical. He also emphasizes the importance of keeping up with advances in the data warehouse industry. Finally, remember that there is always an architecture: implicit or explicit, “almost in time” or planned. Experience shows that the planned and explicit ones have a better chance of succeeding.

Source

“From Bauhaus to warehouse: Understanding data warehouse architecture requirements,” presented by Warren Thornthwaite at DCI’s Data Warehouse Summit, held December 8-10, 1998 in Phoenix, AZ.

References

1. Star schema designs produce a large fact table and many smaller dimension tables that extend the different aspects of the facts. By processing the dimension tables first, fewer of the detailed records in the fact table need to be scanned to complete the query. (From Interactive Data Warehousing, by Harry Singh. Prentice-Hall, 1999, p. 163.)

2. Normalization is the process of removing all model structures that provide multiple ways to know the same fact; it is a method of controlling and eliminating redundancy in data storage. (From Interactive Data Warehousing, by Harry Singh. Prentice-Hall, 1999, p. 444.)
