“It is the stars, the stars above us, that govern our conditions.”
~William Shakespeare
Need
Business users need to query the data warehouse and to create reports with little or no intervention from IT. Adopting a reporting and online analytical processing (OLAP) tool that engages a
full semantic layer with cubing capabilities will improve the business user experience and lessen the reliance on IT. OLAP tools, while capable of connecting to a third normal form (3NF) database,
prefer a star schema-designed database. The goal of any semantic layer should be to produce user-defined queries with only inner JOINs in the SQL statement.
Star Schema Data Models
Star schema data warehouse models are designed for analytics. Specifically, a star schema:
- Is very easy to understand, even for non-technical business managers.
- Provides better performance and shorter query times.
- Is extensible and will handle future changes more easily than the source transactional system.
Why It Works
In the star schema design, the fact table contains only keys and additive facts, or integers and some decimal numbers. The keys are also numbers. Databases store numbers very efficiently. A star
schema uses two kinds of database tables: a dimension table and a fact table. Each type of table plays a unique role in the overall design allowing an efficient inner JOIN only.
Dimensions can be thought of as categories. Dimensions usually have hierarchies within them. For example, a Campus dimension could have University (all campuses) at the top of the hierarchy.
University can then be broken down into many Campuses and Colleges and can then be further broken down into many Departments and Majors.
A fact table contains the measurements or metrics or facts of the business processes, in this case student registrations. A measurement of this business process would be student FTE, revenue and
headcount. The tuition revenue is also captured in the fact table.
Star SQL
Star schemas are extremely user-friendly and allow users to generate SQL statements that are efficient and can be processed easily by a database. Using a star schema and a business intelligence
(BI) tool, users will generate queries like the samples provided here. This kind of query technique will allow users to run queries across different units of the business. Users may change columns
needed in the report and may apply different filters in the WHERE clause noted here. From an IT perspective, users can generate a standard and yet optimized query with great flexibility and without
much IT intervention.
Here are two examples of the expected SQL statement. The first is a query against one star and the second of a cross-star join.
-----
Related content from the BeyeNETWORK
Why are traditional business intelligence projects giving way to faster, less expensive ones?
Claudia Imhoff answers that question during this web seminar where you'll also learn how to create accessible, technically feasible, low-cost BI solutions.
-----
-
Christina Rouse, Ph.D.
Christina is the Chief Architect at Incisive Analytics,
LLC. An improvement catalyst, Chris applies business intelligence strategy for performance improvement. Leveraging two decades of data experience on a broad range of technical platforms, she
developed a technology-agnostic approach to business intelligence consulting. Clients rave about Chris' unique blend of business acumen, technical architect and trainer skills. She may be
contacted at Christina.Rouse@IncisiveAnalytics.com.
Recent articles by Christina Rouse, Ph.D.
Comments
Want to post a comment? Login or become a member today!
Posted June 22, 2009 by Christina Rouse chris.rouse@incisiveanalytics.com
Well said Andrea and your advice about the caution in a cross star join is heeded. For the results to be accurate the fact table grains must indeed be the same. Some of the new OLAP packages on the market are addressing this very problem. For example, Microsoft's Analysis Services 2005 and 2008 allows two or more different grain level fact tables to exist in the same cube (data source view). To manage the problem, one must manually control the alignment of facts to dimensions in the dimension matrix (cube designer). Business Objects does a nice job of allowing independent queries to be executed and then a desktop join of the results. When all else fails, the UNION works well. Consider UNION ALL over UNION as it does not require a disk sort. Good advice Andrea! ~ Chris
Is this comment inappropriate? Click here to flag this comment.
Posted June 22, 2009 by ANDREA VINCENZI andrea.vincenzi@tiscali.it
Christina, unless I didn't fully understand your example, I think that any discussion of cross-star queries should include a warning of the problems that can arise from different granularities of the fact tables, otherwise called "Chasm Trap". In my experience, querying more than one star at a time shoud be considered with great caution, and the query should be structured as a union.
I'd like to hear if you have found this kind of queries to be working (i.e. delivering the correct results with reasonable response times)
Andrea
Is this comment inappropriate? Click here to flag this comment.