Some People Are Evil… and So Are NULLS

July 29, 2011 Leave a comment

From an end-user “information consumption” standpoint, I absolutely hate NULLS.  I think they are evil (like that girl in The Exorcist) and if I had the power to send either into oblivion, I would (and its now obvious that Jesus’ love has not been made complete in me yet.)  Any project that I am on, I will generally find an opportunity to state that “NULLS are evil and should be avoided”.  The ETL’ers are generally not real happy because they have to code for it.  The data modelers are not thrilled because, well, I’m telling them what they shouldn’t do (and they hate the fact that I know what Ralph Kimball really said.)  And if I’m really on my A-game that day, the PM won’t be happy either because our meeting just went 15-minutes over talking about this, but I have been on too many projects to know better.  NULLS generally mean OUTER JOINS to those of us who build solutions around consuming the data.  And OUTER JOINS means a degradation in performance because OUTER JOINS require the database to do some real goofy stuff that isn’t efficient for BI purposes.  And if you get past all of that, let’s just hope you don’t have to teach the end-user how to pull a query that says “Give me ALL of the records where column_a DOES NOT equal XYZ” when NULLS exist in the column.  Good luck explaining that one to the customer when all they want to do is drag-and-drop in Cognos! (Update:  Cognos 10 does provide this funtionality in Query Studio.  Nice!)

So, its safe to say that I don’t see any benefit in using NULLS in a reporting data mart, except that it makes things easier for IT.  Unfortunately, marts don’t exist to satisfy IT’s need to build something cool (no matter what your management – who just came back from a conference – tells you).  If you want to use NULL as an indicator for when no data came across from the source system, then insert “No Data Available” or something similar.  Just don’t allow a NULL value to be populated into the mart because it has absolutely zero value to reporting.  It just causes headaches.

More times than not, I’ve found that this issue is really based more on education and experience vs. people trying to cut corners.  Because I’ve been blessed with opportunities to work across all the different segments of a BI solution lifecycle, I am able to see the big picture and have a better understanding of “cause-and-effect”.  This is why ultimately it is the responsibility of the BI Architect to educate those on the project team (who may not agree) because the only reason a BI solution is even considered is because we are trying to get the right information to the right people so that they can make the right decisions at the right time (not that you haven’t heard that before, right?)  Anything that we can do as IT professionals to help foster that is a good thing because that helps us stay employed.

So no matter how you “slice it” (that was for all you OLAP folks), a good mart will simply not have NULLS enabled on any of the columns.  That will ensure that a default value is always used and that no OUTER JOINS will have to be performed.  Just make sure that the default value used is one that the business agrees to since they will be the ones looking at the data on a report and not IT.  Values like “~” or “!” or “^” means nothing to the business.  Always assume that you have someone new using the mart and design toward that.

P.S . Don’t replace NULLS with a whitespace.  That is like replacing bacon with turkey bacon.  While bacon isn’t good for you, to replace it with something like turkey bacon is just… well, its sick.  So if anyone tells you that turkey bacon (or whitespaces) is just as good or better as bacon (or NULLS), immediately run the opposite direction and don’t look back because they are probably morphing into some sort of creature and something really, really bad is about to happen.  You’ve been warned.


My IBM Cognos Wish List

July 26, 2011 5 comments

I’ve been using the BI tool Cognos since 1999 and while there have been VAST improvements made over the years (and I mean VAST), there are still items that I wish IBM Cognos would implement. Here are a few things I wish they would add (and yes I know “few” technically means “three”)

Okay, here we go:

1. Give the ability to color code query subject icons in Framework Manager (and subsequently the published package). Why? Have you ever done a model with multiple facts and dozens of dimensions but not every dimension ties to every fact? (If you say no, you must be lying or you are confused.) The ability to color code the query subject icons would allow the modeler to easily designate for the end-user which dimensions belong to which fact and/or highlight the dimensions that are conformed across all facts.

2. Conditional joins in Framework Manager. So if condition A exist, then use Join A. If condition B exists, then use Join B. That would be cool. I would also probably be the only one to ever use it.

3. When using a DMR in Query Studio (or even Report Studio), currently you can right-click to drill-up or drill-down, but there is no way to drill to the top or drill to the bottom. Solution? Add drill-to-top and drill-to-bottom to the options. Easy.

4. Have you ever done a report in Report Studio that has at least 20 queries total? Of course you have. How about 30? Maybe a few of you. 40? It happens. Solution? FOLDERS in QUERY EXPLORER. I wish Cognos would add the option to be able to create a folder in Query Explorer so that you can group your queries and manage them better. That way if I have 10 queries that support one page of my report, I can group them in a folder I title “Page 1”, then maybe I create another folder called “Prompt Queries” and I put the 10 or so queries there. Now when I go to Query Explorer, I’m not having to wait for JavaScript to catch up or try to scroll in that little window to locate my query. This would help exponentially with portfolio type reports were there are 10+ separate pages on a single report.

I know there are a few other items that I’m missing but these are the big ones that immediately come to mind. Are there any others? Maybe a certain IBM guy will stumble across this blog and make it happen? (Heh, you never know!)