Wednesday, July 30, 2008

BI in the Cloud

Have you heard people talk about on-demand hosted subscription SaaS platform in the cloud grid computing?

Colin White wrote a good article that helps to make sense of all the different words and acronyms that companies use to describe these offerings - Business Intelligence in the Cloud: Sorting Out the Terminology

No matter what you call it, BI SaaS seems to be most suited to mid-size firms that meet the following criteria:
  • they are large enough to need BI
  • they are small enough to be unable to host their own
  • the data they have can be provided to the 3rd party
Maybe it's just me, but I don't see this being a large segment of the market.

SaaS is appropriate for applications that produce their own data (like CRM), but BI is all about tapping into the data you already have. It seems that BI SaaS gets a lot of hype, but let me know if you've actually deployed it and why you chose this option.

Monday, July 28, 2008

Emerging Market: India

It seems that India is the latest hot market, especially for Business Intelligence software. Just in the last few days there were announcements around servicing the subcontinent better.

My employer, InetSoft, announced a channel partnership that will provide direct sales to India, and other markets including UAE.

QlikTech is opening an office in India.

Rolta India will acquire an unnamed US-based BI firm.

Also, a number of India-based companies are starting to provide BI, like AnalyticsWorks and MAIA Intelligence.

I think that the relationship of US companies with India for outsourcing has resulted in an influx of capital that has bolstered their economy, and introduced a culture of business analytics and performance management. If this rationale is accurate, we can expect China, Russia, Philippines, Mexico, and Ireland to follow suit.

Wednesday, July 23, 2008

Private = Stable

I read an interesting article that talked about SAS, a privately held software company.

The points made are:

A private firm can be more stable because it is not at the mercy of the market. The sacrifice is the influx of capital from an IPO, but what is gained is consistency.

Public companies can be acquired by hostile takeover, and their mission statement becomes "increasing shareholder value". By keeping the reins, the owner(s) have more control over their future.

It ends with a quote from the founder/CEO, Jim Goodnight, saying, "The capital market guys are the ones that made all these really bad investments. I'm not sure anybody should ever listen to Wall Street again. They don't know what they're doing."

There are advantages to remaining a private firm.

Monday, July 21, 2008

Geographic Charts

proportional (prə-pōr'shə-nəl) adj.
  1. properly related in size, degree, or other measurable characteristics
Visualization is supposed to make it easier to understand data, not mislead the user.



It seems like a requirement that all BI software provide the ability to display data on a map. I just read that Tableau introduced this in their latest release, but most other vendors have provided this ability for a couple years (including my employer, InetSoft).

Maps have a few things going for them. They are familiar ways of displaying geopolitical entities (States, Countries, etc.), and they represent spatial data well.

The issue is that maps are often used to display data, like sales, by filling the interior of the region with a highlight color. Why is this a problem? Because the size and shape of a region directly impacts its visual notability, but has little correlation to population, market size, and general importance.

Consider low sales in Massachusetts and high sales in Montana. Because Montana covers so much more of the US map, most of the highlighted area would look good, even though the far more populous market is the one underperforming. I am reminded of a model that shows what a human body would look like if each part was in proportion to the area of the brain involved in its sensory perception.
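To put rough numbers on the Massachusetts and Montana example, here is a minimal sketch that compares each state's share of the map with its share of the population. The area and population figures are approximate, rounded values used only for illustration, and the `share` helper is a hypothetical name.

```python
# Approximate, rounded figures for illustration only.
states = {
    "Massachusetts": {"area_sq_mi": 10_500, "population": 6_500_000},
    "Montana": {"area_sq_mi": 147_000, "population": 1_000_000},
}


def share(values):
    """Return each key's fraction of the total."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


# Visual weight on a filled map is driven by area, not by market size.
area_share = share({name: s["area_sq_mi"] for name, s in states.items()})
population_share = share({name: s["population"] for name, s in states.items()})

for name in states:
    print(
        f"{name}: {area_share[name]:.0%} of the colored area, "
        f"{population_share[name]:.0%} of the people"
    )
```

Between these two states, the one with the overwhelming majority of the colored area holds only a small minority of the people, which is exactly the distortion a filled map introduces.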

I am not as much a stickler as Stephen Few when it comes to visual display. I am in favor of nice looking graphs, but not at the expense of conveying accurate information.

Friday, July 18, 2008

Pervasive BI Hurdles

I wrote previously about the Pervasive BI report from Wayne Eckerson at TDWI.

Here I want to highlight just the issues that are cited as reasons why BI has not spread more.

"The biggest impediments to BI adoption are the time and complexity to deploy BI tools followed by the cost of BI licenses, according to our survey."

So BI needs to be easier to get up and running. This is no great surprise, and recently there has been a real push by some technology providers, including ourselves, to deliver more intuitive tools.

The cost of BI licenses is an interesting one, because it seems that most vendors charge per named user. That is, if Bob sees Mary's dashboard and wants his own, they have to pay more money to the vendor. No wonder this reduces the BI adoption rate. A fiscally conservative firm will say, "Do you really need a dashboard? Can't you just use Excel?"

"Once BI tools are in-house, the biggest impediments to greater usage are poor data quality, overly complex tools, slow query response times, lack of executive backing, and the existence of other tools, according to respondents. To accelerate usage, they recommend integrating BI with Microsoft Office, implementing dashboards, embedding BI into a business process, and delivering highly interactive and self-service BI."

The old phrase of "garbage in, garbage out" holds true. In most situations, the consumers of data are in the same department or line of business as the producers of the data. If the BI user can explore the data, drill down to the detail, and fix the problems they discover, then the data quality problem will solve itself.

The BI tool needs to be easy to deploy AND easy to use. If users don't get frustrated, they may even enjoy the time they spend with the tool.

Users complain about slow performance, because they want to use BI more interactively. If they didn't, they would just schedule the necessary reports and move on.

The lack of executive support is sort of a catch-22. If an executive is behind a BI deployment, it will get resources to do it right. If not, large BI projects are doomed to failure, and executives will have their doubts confirmed. Two solutions are: convince a C-level sponsor to take a risk; or deploy a smaller BI project first to gain momentum. One of the best ways to have a successful initial implementation is to get the users participating early and often.

The existence of other tools is a chicken-egg situation. Users may be driven to using the other tools because of the other issues with traditional BI products. Also, "other tools" means desktop tools where the user is omnipotent.

It seems pretty unanimous that in order to use BI more, people want more self-service from their BI tools.

Thursday, July 17, 2008

Future Cloud

dystopia (dĭs-tō'pē-ə) noun.
  1. a society characterized by human misery, as squalor, oppression, disease, and overcrowding
The hyperbole employed by Nicholas Carr in The Big Switch: Rewiring the World, from Edison to Google may turn out to be an accurate prediction.



The basic idea is that cloud computing is going to catch on, this time. What this means for IT staff is joblessness, and what this means for the world is: the failure of traditional businesses; rampant "big brother"-ism; a further shift of profits from producers to aggregators; the continued rotting of human culture; and growing political rifts among the populace.

Many companies, like Amazon, Google, and Salesforce, are already starting to provide grid computing to the general public. Many businesses, especially small ones, will opt for paying pennies for data storage and processing power instead of maintaining their own staff and hardware.

We will probably also see a reemergence of dumb terminals that simply access the network/internet. Also, zero-client, web 2.0 technology will go from a feature to a requirement.

It's a very interesting time for the world, and hopefully it's not the beginning of the end.

Tuesday, July 15, 2008

BI-Phone

There have been some recent press releases from Pentaho and Oracle announcing their support of the iPhone as a client for BI. This got me thinking about the role that mobile devices play in business intelligence, and business in general.

The reason to have a BlackBerry, Palm, Pocket PC, iPhone is to keep informed when away from your desk. Think of receiving urgent emails while in a meeting.

Apply this idea to BI, and it makes sense to be notified proactively about certain exceptions via email, or maybe view a scorecard (raw numbers, and a traffic light). It does not seem efficient to view reports, interact with dashboards (charts and gauges), or analyze data on a 3.5" screen.
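The kind of scorecard that does fit a small screen can be sketched in a few lines. This is only an illustration under assumed rules: the metric names, the targets, and the 90% warning threshold are all hypothetical, and the email delivery itself is left out.

```python
def traffic_light(actual, target, warn_ratio=0.9):
    """Map a metric to green, yellow, or red relative to its target."""
    if actual >= target:
        return "green"
    if actual >= target * warn_ratio:
        return "yellow"
    return "red"


def exceptions(metrics):
    """Return only the metrics worth an alert (anything that is not green)."""
    alerts = []
    for name, actual, target in metrics:
        light = traffic_light(actual, target)
        if light != "green":
            alerts.append((name, actual, target, light))
    return alerts


# Hypothetical scorecard rows: (metric, actual, target).
scorecard = [
    ("Weekly sales", 120_000, 100_000),
    ("New leads", 95, 100),
    ("On-time shipments", 70, 100),
]

for name, actual, target, light in exceptions(scorecard):
    print(f"[{light.upper()}] {name}: {actual} vs. target {target}")
```

Only the exceptions are sent, so the message stays short enough to read between meetings.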

I like gadgets as much as the next geek, but let's be practical.

Monday, July 14, 2008

Data Warehousing Optional

In my previous post, Sea Change in BI, I briefly mentioned an article by Colin White, Is Data Warehousing Essential to Business Intelligence? I'd like to highlight a few of the key parts here.

Colin reminds us of the reasons why data warehouses were introduced in the first place:
  1. the data was not usually in a suitable form for reporting
  2. the data often had quality issues
  3. decision support processing degraded business transaction performance
  4. data was often dispersed across many different systems
  5. there was a general lack of historical information
"Data warehousing was introduced to help solve these data and performance issues. While there is no question that data warehousing helped improve business decision making, it is important to realize, nevertheless, that it was introduced primarily to solve design issues in business transaction systems, and also for performance reasons."

"... at present, business intelligence is synonymous with data warehousing. This thinking is wrong and needs to be changed. Data warehousing is a component of business intelligence, but business intelligence may employ data in other data stores. In some cases, a BI application may not even use data managed in a data warehouse."

"Another issue is that people have forgotten that data warehousing was created to overcome deficiencies in business transaction systems. Many of these issues are now solvable."

"The bottom line is that data warehousing is still an important component of business intelligence, but it is no longer the foundation on which all BI projects have to be built."

It is very encouraging to hear this kind of pragmatic approach from an analyst.

Friday, July 11, 2008

BI Mythbusters

debunk (dē-bŭngk') tr. v.
  1. To expose or ridicule the falseness, sham, or exaggerated claims of
Don't get caught up in unsubstantiated buzz.



DM Review had an interesting article that debunked some very common BI claims.

Respected analysts from Boris Evelson to Neil Raden have been talking about the data explosion. The data that DM Review presents shows that the sheer volume of data used for BI has not grown significantly in the last 3 years.

The positioning of "BI for the Masses" seems as prevalent in BI today as "new and improved" is in the detergent market. This claim has amounted to little more than wishful thinking over the past 12 months. The technologies seem to be available, so maybe the lag is due to the culture trying to catch up.

Similar to the promise of pervasive business intelligence, many vendors have been throwing around "Enterprise BI" as a best practice. The fact, as presented by the article and as confirmed by InetSoft's customer base, is that BI is more often deployed successfully at the department level. This is not hard to understand, because Enterprise BI can seem like boiling the ocean.

Last is the myth that "bigger is better" when it comes to BI vendors. The data show that users are more satisfied with smaller vendors, due in large part to the better customer service. In the post-consolidation market, this is great news for the independent innovators.

Don't believe the hype.

Thursday, July 10, 2008

Spread Marts

spread mart (sprěd märt) noun.
  1. a spreadsheet that is used for data integration
Spread marts solve some problems but introduce others.



In the latest issue of DM Review, there is an article Business Intelligence The Self-Service Way by Shailesh Kosambia from Tata Consultancy Services. It talks briefly about the data issues inhibiting self-service, saying "Business users do not have access to consolidated, unified data..." Shailesh continues, "When a data warehouse does not adequately cover all lines of business, the users have to get data from different sources and consolidate it in one place, leading to several spread marts within the same organization."

Claudia Imhoff discusses ways to solve the issues of "spread marts" in How to "Excel" in Your Business Intelligence Environment:
  1. Being able to "follow the data"
  2. Scheduled data updates
  3. Expiration dates
  4. Securing certain data
  5. Preserving formats, formulas
  6. Link back to live data
Why are people so hung up on using Excel?

Neil Raden of Smart (enough) Systems explains the infatuation with Excel, saying it's because it is "subversive". To paraphrase, spreadsheets allow users to do things behind the back of IT.

What if users could get all the benefits of using Excel (easy to use; data manipulation and combination; no IT needed) without the problems around transparency, latency, authority, and consistency?

To the man with a hammer, every problem looks like a nail. For another tool, see my previous article on Data Mashup.

Wednesday, July 9, 2008

Pervasive BI by TDWI

pervasive (pər-vā'sĭv) adj.
  1. spreading or spread throughout
  2. having the quality or tendency to become spread throughout all parts of
When pervasive is used as a modifier with BI, it is always this second definition. Often it is also just wishful thinking more than reality.



Wayne Eckerson of TDWI just published a new report: Pervasive Business Intelligence - Techniques and Technologies to Deploy BI on an Enterprise Scale. Along with the 35 page report, he also gave a webinar today highlighting some of the key ideas and points he raised.

One of the main ideas he put forward, as a way of expanding usage of BI tools, was that of information sandboxes. In these interactive dashboards, users can explore and analyze data without performing traditional ad hoc report creation. I feel that Wayne is playing catch-up, because Boris Evelson at Forrester published his research on BI Workspaces (he initially called them "analytical sandboxes") last month, and vendors have been providing this capability for over a year. Better late than never.

Also, to my pleasant surprise Wayne, Mr. Data Warehouse, acknowledged that the advent of new technologies can make a data warehouse optional. I am paraphrasing, but I am glad some of these BI traditionalists are no longer in denial.

Tuesday, July 8, 2008

Self-Service vs. IT?

symbiosis (sĭm'bē-ō'sĭs) noun.
  1. any interdependent or mutually beneficial relationship between two persons, groups, etc.
Users need IT and vice versa.



The question mark in the title is on purpose. I believe it is an open question, with no right answer.

Just a few short years ago, I don't think anyone would have doubted the requirement for the IT department. Nowadays, with SaaS, appliances, and Web 2.0, many people and companies are doubting it.

Specifically in the BI space, Gartner published an article: Emerging Technologies Will Drive Self-Service Business Intelligence. Techworld cited it in a release: IT's role in business intelligence to lessen. The Gartner publication is also the focus of an article published today by DM Review called Gartner Says Emerging Technologies Will Marginalize IT's Role in Business Intelligence.

Aside: The focus of the article is on interactive visualization, and alternatives to data warehousing. I couldn't have made a better case for InetSoft products if I had written it myself.

I don't think the role of IT will be reduced. Rather, I believe that IT will spend less time on the mundane tasks like producing reports, and be able to focus on other, more advanced, BI projects. For example, predictive modeling, decision automation, and embedding BI in business processes are all still firmly in the realm of IT.

Monday, July 7, 2008

Post-pre-aggregation

heuristic (hyŏŏ-rĭs'tĭk) adj.
  1. of, pertaining to, or based on experimentation, evaluation, or trial-and-error methods
The definition of multi-dimensional cubes is an exercise in successive approximation.



The traditional bottom-up design of pre-aggregated data is based on user input collected at the beginning of the project. Then, as they start to use the system, their feedback is incorporated into re-engineering efforts. This type of system has a lot of inertia, and can easily frustrate the user population.

Let's take a step back and remember why pre-aggregates are employed in the first place. Running a report or viewing a dashboard that is based on a large dataset will be slow if it is done on the fly. If we can anticipate the queries that will be executed, and run them before they are needed, we can give the appearance of a faster response.

I think the trouble enters in the interpretation of anticipate. The traditional model aims to address future usage in the data design stage. Consider an alternative of preparing at runtime.

Let me expand on this to clarify what I mean. In the traditional model, data is pre-aggregated, then reports are created. In the alternative model, reports are created, then data is pre-aggregated.

This means that the end products can be used to define what data is pre-calculated and how. If a report displays quarterly sales information, the sales data can be pre-aggregated at the quarter level. This takes the guesswork out of the process, guaranteeing that all scheduled summations are used, and that all end products can take advantage of preprocessing.
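The report-first model can be sketched as follows. This is a minimal illustration rather than how any particular product does it: the detail rows, the report definitions, and the `build_aggregates` helper are all hypothetical. The point is that the set of pre-aggregates is derived from the reports that exist, not guessed in advance.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detail rows: (sale date, region, amount).
sales = [
    (date(2008, 1, 15), "East", 100.0),
    (date(2008, 2, 20), "East", 150.0),
    (date(2008, 4, 5), "West", 200.0),
    (date(2008, 5, 9), "East", 50.0),
]

# Hypothetical report definitions: each report names the grain it displays.
reports = {
    "Quarterly Sales": ("quarter",),
    "Quarterly Sales by Region": ("quarter", "region"),
}

# How to derive each grouping level from a detail row.
LEVELS = {
    "quarter": lambda row: f"{row[0].year}-Q{(row[0].month - 1) // 3 + 1}",
    "region": lambda row: row[1],
}


def build_aggregates(rows, report_defs):
    """Pre-aggregate only at the grains that existing reports actually use."""
    aggregates = {}
    for grain in set(report_defs.values()):
        totals = defaultdict(float)
        for row in rows:
            key = tuple(LEVELS[level](row) for level in grain)
            totals[key] += row[2]
        aggregates[grain] = dict(totals)
    return aggregates


aggregates = build_aggregates(sales, reports)
print(aggregates[("quarter",)])
```

Adding a new report with a new grain simply adds one more entry to the set of grains, and removing a report removes a summation that nothing uses.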

Thursday, July 3, 2008

Build vs. Buy

opportunity cost (ŏp'ər-tōō'nĭ-tē kôst) noun.
  1. the cost of an alternative that must be forgone in order to pursue a certain action
  2. the benefits you could have received by taking an alternative action
The choice between build and buy requires more than simple arithmetic.



In many organizations, the choice between purchasing a software package and building your own comes down to license costs. This is a very myopic/nearsighted/shortsighted view.

Hopefully some of you are already thinking in terms of TCO. This is Total Cost of Ownership, and represents the entire cost, which is not so easy to calculate.

First, it is important to understand the difference between cost and outlay. An outlay is cash exchanged for acquisition or usage (license cost) and the concept of cost is a superset that includes outlays and other expenses (maintenance, salaries).

Next, we also have to consider opportunity cost. Consider the following two scenarios: purchasing a product that is easy to deploy and maintain; or having an in-house developer use a couple open source packages to build the desired functionality. It is clear that the second option wins the "lowest license cost" battle. In some cases, a new developer must be hired, in which case his or her salary must factor into the TCO of the second option. In other cases, when a developer is already on staff, there is the opportunity cost to consider. What else could the developer work on, instead of reinventing the wheel? What benefits could that project have yielded? If there is nothing else the developer could have done, then costs can be saved in the first scenario by reducing staff. There is also opportunity cost for the user population, because they must wait for the homegrown solution.
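A small worked example may make the arithmetic concrete. Every figure below is invented purely for illustration, and the `tco` function is a hypothetical simplification: it counts salary when a developer must be hired and opportunity cost when an existing developer is diverted, as described above. Different inputs can easily reverse the result.

```python
def tco(license_cost, annual_maintenance, years, developer_months=0,
        monthly_salary=0, monthly_forgone_value=0, new_hire=False):
    """Rough total cost of ownership over a number of years.

    A new hire adds salary to the cost; an existing developer adds the
    value of the work they would otherwise have done (opportunity cost).
    """
    per_month = monthly_salary if new_hire else monthly_forgone_value
    return (license_cost
            + annual_maintenance * years
            + developer_months * per_month)


# Hypothetical three-year comparison with an existing developer on staff.
buy = tco(license_cost=50_000, annual_maintenance=10_000, years=3,
          developer_months=1, monthly_forgone_value=12_000)
build = tco(license_cost=0, annual_maintenance=15_000, years=3,
            developer_months=9, monthly_forgone_value=12_000)

print(f"Buy:   {buy:,}")
print(f"Build: {build:,}")
```

With these made-up inputs, the option that wins on license cost loses on total cost. The sketch leaves out risk and the cost of users waiting for the homegrown solution, both of which are harder to quantify.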

The last item that is a little more difficult to quantify is risk. This applies to the initial creation, where developing a project internally is more likely to fail or suffer from underestimated costs. It also applies to the risk going forward of supporting and maintaining the homegrown system without the help of a vendor.

I am not saying that buy is always better than build. It is important to accurately assess the options, considering the total costs over time.

Wednesday, July 2, 2008

Sea Change in BI

sea change (sē chānj) noun. idiomatic
  1. profound transformation: "Full fathom five thy father lies: / Of his bones are coral made: / Those are pearls that were his eyes: / Nothing of him that doth fade / But doth suffer a sea change / Into something rich and strange." -Shakespeare
  2. a striking change, as in appearance, often for the better
The data warehouse has gone from being a means to being an end in itself.

It's time to step back and look at the forest.



In thinking about data mashup, I asked myself why more vendors don't provide it. My initial thought was that it was an innovation that was never thought of before, but I then humbled myself and returned to reality. I now believe a combination of two factors causes the lack of availability of data mashup.

First, my vision of user-driven data mashup would not be an effective standalone product. Think of using a separate application to pull/massage/transform/combine disparate data sources. Now, what is the result? In only a few cases is the query the end of the story. Typically that dataset now needs to be presented in an interesting/useful/appealing way. To have this dataset be the input to a separate BI product would require some integration point, like a web service, and a generic way of communicating a recordset of any size in both dimensions. This starts to get complex, and requires skills that are possessed only by IT professionals.

Second, let's ask the question of why other BI vendors don't provide data mashup. It is not very hard to see that it is not because they can't, but rather they choose not to. The remainder of this article is devoted to this topic.

Tom Gonzalez wrote a blog entry called What is wrong with the Business Intelligence Industry?. As an "outsider", Tom brings a fresh perspective from his experience applying next generation technology to business problems.

Some industry analysts are even starting to challenge the status quo, which is refreshing.

Colin White wrote an article, Is Data Warehousing Essential to Business Intelligence?, that concludes, "No, it's not." He reminds us of the 5 issues that data warehouses address, and that these issues are solvable through other means.

At the risk of spreading an unconfirmed rumor, Claudia Imhoff, co-author of books on data warehousing, has allegedly admitted that some BI implementations do not require a DW. A baby step, but at least it's in the right direction.

More on this later.

Tuesday, July 1, 2008

Data Mashup Defined

mashup (māsh'ŭp') noun.
  1. an audio recording that is a composite of samples from other recordings, usually from different musical styles: Danger Mouse created the Grey Album, which is a mashup of The Beatles' White Album and Jay-Z's Black Album.
  2. the creation of a new work from two sources that were not initially designed to be combined
Apply this idea to the internet and you get things like Google Maps with Subway lines.

Apply this idea to data and you have an alternative to data warehousing.



Some have tried to define data mashup as data federation that makes use of web services, screen scraping, and even Yahoo! Pipes. This is all well and good, but doesn't really live up to the spirit of mashup because of the dependence on IT skills. After all, you wouldn't label ETL as "data mashup". I think that the key component that is missing is user involvement.

From a Business Intelligence perspective, users have had ad hoc query abilities, to some degree, for years. Imagine for a moment, a business user pulling together data from a sales system, a marketing database, and a local spreadsheet, creating a dataset that will help guide future decisions.

This is certainly possible if IT populated a data warehouse with all this information, or if IT built the federation infrastructure to present all this data as a virtual warehouse. Either way, it is a static definition that IT needs to build and maintain. Changes of a permanent nature are passed down from users to developers, and changes of a temporary/hypothetical nature are left unaddressable.

Enter what I call true Data Mashup. The user-defined variety that takes ad hoc query to the next level. It gives users the lego bricks, and lets them build a car, house, or Taj Mahal.

The IT staff will always be responsible for defining the lego bricks, but end users can play with the data in ways that are unanticipated. They can combine datasets that were not initially designed to be combined.
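As a rough sketch of the idea, here is what combining a sales system, a marketing database, and a local spreadsheet might look like once IT has exposed each one as a simple keyed dataset. The data, the field names, and the `mashup` helper are all hypothetical, and a real tool would do this through a visual interface rather than code.

```python
# Hypothetical "lego bricks": three unrelated sources, each keyed by region.
sales_system = {"East": 250_000, "West": 180_000}        # revenue
marketing_db = {"East": 40_000, "West": 20_000}          # campaign spend
local_spreadsheet = {"East": 300_000, "West": 150_000}   # the user's own targets


def mashup(**sources):
    """Join any number of keyed datasets into one row per shared key."""
    shared_keys = set.intersection(*(set(data) for data in sources.values()))
    return {
        key: {name: data[key] for name, data in sources.items()}
        for key in sorted(shared_keys)
    }


combined = mashup(revenue=sales_system, spend=marketing_db,
                  target=local_spreadsheet)

# Derived measures that none of the three sources could answer alone.
for region, row in combined.items():
    row["revenue_per_spend"] = row["revenue"] / row["spend"]
    row["on_target"] = row["revenue"] >= row["target"]
    print(region, row)
```

The user decides which sources to combine and which measures to derive, so a temporary or hypothetical question does not have to wait for a change to the warehouse.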