Wednesday, June 10, 2009

BI Mashup Maturity Model

James Kobielus wrote a blog article in anticipation of his upcoming report: "Mighty Mashups: Do-It-Yourself Business Intelligence for the New Economy".

In it he lays out the 4 levels of maturity of BI mashup within the enterprise. My paraphrased and simplified list:
  1. Parameterized reports
  2. Analytic dashboards
  3. Data mashup
  4. Collaboration with governance
It is very encouraging to see Forrester's research and predictions in line with our product strategy, so I'll forgive Jim for not giving InetSoft a shout-out as the one vendor who does enable all 4 of these levels of BI mashup.

Monday, April 6, 2009

Charts and Graphs 2: Dimensional Analysis

When displaying a single variable, you should use a single dimension, that is, a point on an axis. Among multiple points, what is being compared is each point's distance from the origin. To make this explicit, a line can be drawn from the origin to the point. With a linear scale, a line with half the length represents half the value.

To make these lines more visually salient, they are often made into bars. As long as the bars have equal width, the areas of the bars are still in the same proportion as the simple lines.

Many chart engines offer 3D bars for their visual appeal. As long as each rectangular solid has equal depth, the volumes are in the same proportion as the bars and lines.

The same idea can be used in bubble charts. A variable can be represented by the size of the point. For visual comparison, this variable should be tied to the area of the point. Some engines mistakenly tie the variable to the radius of the circle, which exaggerates differences because area grows with the square of the radius. It is hard to tell whether one point has exactly half the area of another, but in terms of visual salience, area works.
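Here is a minimal Python sketch of the area-versus-radius point. The function names, the 40-pixel maximum radius, and the sample values are all my own assumptions for illustration; this is not how any particular chart engine does it.

```python
import math

def bubble_radius(value, max_value, max_radius=40.0):
    """Return a radius whose circle AREA is proportional to value.

    max_radius is a hypothetical pixel size for the largest bubble.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Area scales with radius squared, so take the square root of the share.
    return max_radius * math.sqrt(value / max_value)

def circle_area(radius):
    return math.pi * radius ** 2

# Area-based sizing: half the value gives half the area.
full = bubble_radius(100, 100)
half = bubble_radius(50, 100)
area_ratio = circle_area(half) / circle_area(full)

# Radius-based sizing (the mistake): half the value gives a quarter of the area.
naive_half = 40.0 * (50 / 100)
naive_area_ratio = circle_area(naive_half) / circle_area(40.0)
```

With area-based sizing the smaller bubble covers about half the area of the larger one. With radius-based sizing it covers only about a quarter.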

Now, look at the bubbles used by Advizor Analyst/X in: Multivariate analysis using parallel coordinates.

The "bubbles" are shaded to look 3D, that is, like spheres. Should we compare the volumes of the spheres? Unfortunately, the volume of a sphere is not linearly proportional to the area of its 2D projection (a circle). In fact, the volume of a sphere (4/3 × πr³) is 4r/3 times the area of the circle (πr²), so the ratio itself grows with the radius, and the comparison of relative values is skewed.
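To put a number on the skew, here is a small sketch under one assumption: the circles are sized so that their areas match the data, and the reader then judges them as spheres. The function name is hypothetical.

```python
def perceived_ratios(value_ratio):
    """For two bubbles whose circle areas are in value_ratio,
    return (radius_ratio, area_ratio, volume_ratio)."""
    if value_ratio <= 0:
        raise ValueError("value_ratio must be positive")
    radius_ratio = value_ratio ** 0.5      # area grows with r squared
    area_ratio = radius_ratio ** 2         # recovers the value ratio
    volume_ratio = radius_ratio ** 3       # volume grows with r cubed
    return radius_ratio, area_ratio, volume_ratio

radius_ratio, area_ratio, volume_ratio = perceived_ratios(2.0)
```

For a value that is twice as large, the circle has twice the area, but the implied sphere has 2 to the power 1.5 (roughly 2.8) times the volume.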

Visual appeal in a chart is nice to have, but not at the expense of the information it represents.

Thursday, January 8, 2009

Charts and Graphs 1: Missing Data and Irregular Intervals

Stephen Few wrote about Line Graphs and Irregular Intervals, and the debate rages on.

I think the original graph of postage stamp prices is fine. The x-axis uses regular intervals, and the points demarcate the actual known values. I agree that a step/bar graph might be preferable for some purposes, but if you want to see if the rise in stamp prices is in line with inflation, the line is better. If a step graph is used, the trend line should connect the midpoints of the bars (see my version, which includes a CPI line). In effect this spreads out the changes as though they were more continuous, and the total area under the graphs would be about the same.

However, the example of households with computers and internet access has many problems. The original is missing data, but the x-axis it uses is categorical. Instead, it should use a continuous scale so that the gaps are apparent. Like the first one, it uses both points and lines. The points indicate the known data, and the lines help you to interpolate what the missing values might be. Another problem with it is that it seems to have a different purpose. The title of this chart says, "In 2003, more than 88% of households owning a computer were online, up 40% from 1997." To arrive at this fact requires dividing the Internet Access number by the Presence of Computer number. Instead, why not just graph this ratio on the chart? That's what I've done in the second chart below. Notice how this allows you to also see that a greater percentage of computer households had internet access in 2001 than in 2003.
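The division behind that ratio is simple enough to sketch. The function name and the two input figures below are placeholders I made up to show the arithmetic; they are not the survey's actual numbers.

```python
def online_share(internet_pct, computer_pct):
    """Percent of computer-owning households that also have internet access.

    Both inputs are percentages of ALL households.
    """
    if computer_pct <= 0:
        raise ValueError("computer_pct must be positive")
    return 100.0 * internet_pct / computer_pct

# Hypothetical figures, for illustration only (not the real survey data).
example_share = online_share(internet_pct=45.0, computer_pct=50.0)
```

Graphing this derived value per year puts the chart's headline number directly on the chart, instead of leaving the reader to divide one line by the other.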

Here's a corrected version that I created using Style Chart. Notice the missing bars and points.

To handle missing data points, there are a few different options:

  • Drop the lines altogether. However, the slope of a drawn line is easier to see than the slope between two points that you have to connect mentally.

  • Don't draw a line when values are missing. This is okay when you have 1 line, but is hard to look at when there are multiple.

  • Draw a different connector, e.g. a dotted line. This helps in slope analysis, but it must be made very clear that the missing data has been interpolated.

  • Draw points and lines. This is the most common and, in my opinion, the most intuitive. Line graphs always do some amount of interpolation; otherwise, it's really a point graph.
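Whichever option you pick, the chart needs to know which values are real and which are interpolated. Here is a minimal sketch of that bookkeeping, assuming a series sorted on a continuous x-axis with None marking a missing value. The function name and the sample series are hypothetical, and this is not how Style Chart does it internally.

```python
from bisect import bisect_left

def fill_gaps(points):
    """Linearly interpolate missing y values between known neighbors.

    points: list of (x, y) sorted by x, where y may be None.
    Returns a list of (x, y, interpolated) tuples. Gaps at either end
    are left as None because there is nothing to interpolate between.
    """
    known = [(x, y) for x, y in points if y is not None]
    known_x = [x for x, _ in known]
    filled = []
    for x, y in points:
        if y is not None:
            filled.append((x, y, False))
            continue
        i = bisect_left(known_x, x)   # index of the first known x after this gap
        if i == 0 or i == len(known):
            filled.append((x, None, False))
            continue
        (x0, y0), (x1, y1) = known[i - 1], known[i]
        t = (x - x0) / (x1 - x0)
        filled.append((x, y0 + t * (y1 - y0), True))
    return filled

# Made-up series: 2001 is missing in the middle, 2003 is missing at the end.
series = [(2000, 10.0), (2001, None), (2002, 30.0), (2003, None)]
result = fill_gaps(series)
```

A renderer could then draw a marker only where the flag is False, and use a dotted connector for any segment that touches a point where the flag is True.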

Monday, December 15, 2008

Speaking of Data Mashup

I've been invited to give a presentation to the Data Management Association of Minnesota on Data Mashup this Wednesday.

Also, last month I was interviewed by TDWI's Linda Briggs on the topic of Data Mashup. Read the full transcript.

In both of these venues, two of the key points are:

Data Mashup is a compromise between database administrators and Excel jockeys. It allows flexibility and self-service power while maintaining security, integrity, and transparency.

Data Mashup is a complement to a data warehouse, not a replacement. A data warehouse is not a goal, but rather a solution to certain problems. Data Mashup solves some of the same problems, and some different ones. While there is some overlap, both technologies have their place.

Friday, October 10, 2008

Happy Birthday Business Intelligence and Google

Business Intelligence is celebrating 50 years since conception in a paper by Hans Luhn, and Google turned 10. Upon re-reading A Business Intelligence System from 1958, I see a great many parallels with Google and indications of their future.

The essence of Luhn's idea is a super-librarian who knows the details of all the books and documents in the library, knows the concerns and preferences of all the people with library cards, and plays matchmaker.

The first concept is an auto-abstract, essentially a summary of the document (not unlike the little blurb under a search result). With advances in natural language analysis, search engines like www.cuil.com are focusing even more on content and relevance than on PageRank.

Next, Luhn mentions that after new documents are analyzed, parties who might be interested in them should be notified of their existence. I love Google Alerts, because they help me stay abreast of the latest mentions of my name, my company, and my industry.

Then there is the ability to query the librarian, which is just a search. According to the 50-year-old article, the request for information should yield a list of abstracts ordered by relevance to the user. The user can then request the complete documents they choose.

Where will Google go from here?

The system that Luhn defined has profiles of its users that are more abstract and change over time based on feedback. Imagine a search engine that learns that when you refer to "fencing" you mean the sport instead of the building supply, by tracking which results you click. Google can already capture your web history.

Also, the article talks about "internal documents", which are user-created. Google owns Blogger and YouTube, and offers services for creating web sites, documents, and spreadsheets. They could leverage this information to help develop user profiles and even connect users with similar interests.

I don't know if Hans Peter Luhn was reincarnated as Larry Page or Sergey Brin, but Google is pretty close to his vision of A Business Intelligence System.

Wednesday, September 17, 2008

Unique, Like Everyone Else

I gave a presentation at one of our partners' user conferences this morning, and stayed for a panel discussion with some industry experts. Even though the industry in question was enterprise asset management, I heard some familiar comments that I suspect are universal truths.

First was advice about getting executive support for a new project. The answer (no surprise here) was being able to demonstrate ROI. Whether it is BI or an equipment maintenance initiative, executives need to see the impact on the bottom line.

Second was talk about metrics. The gist was that measuring key areas of your business, and tracking the results of attempted improvements are incredibly important. Again, because of the success of Tom Davenport's Competing on Analytics, this idea is not coming out of left field.

Third was a focus on management. The manufacturing sector's trend toward outsourcing has had a ripple effect by depleting the maintenance engineer talent pool of new workers. The fact is that young people aren't training for this industry, and therefore companies need to do more with less. Software helps to some degree, by squeezing efficiency out of every area, but the real key comes down to the leadership. You can provide a great BI tool, but with a weak manager you will not be able to save a failing business.

In summary, no matter which vertical you work in, the challenges and best practices are the same at the core.

Wednesday, September 10, 2008

Historical Quadrant

I read an enlightening article called Analysts as a lagging indicator of success that explains issues with some of the large analyst firms very clearly.

In a nutshell, companies that have deeper pockets and/or more market share get more coverage. There is nothing inherently wrong with this setup. It's just that people evaluating options for a new project (e.g. a Business Intelligence deployment) should keep this in mind when looking at the analysts' research. The most established or historically successful companies do not necessarily offer the best solutions for the problems you are facing right now.

Small analyst firms tend to be more focused on real technology innovations and customer experiences. This information is of more value when you are evaluating a nascent opportunity.

That being said, for a vendor it feels good to be recognized by the big name analysts. It's validation that you did your job well for the past few years.