Sunday, October 10, 2010

BI For The Masses: 3 Solutions That Will Never Happen

I was reading a Business Intelligence (BI) white paper feed – I can’t remember which – when I happened across one whose title was, more or less, “Data Visualization: Is This the Way To Attract the Common End User?” And I thought, boy, here we go again.

You see, the idea that just a little better user interface will finally get Joe and Jane (no, not you, Mr. Clabby) to use databases dates back at least 26 years. I know, because I had an argument with my boss at CCA, Dan Ries iirc (a very smart fellow), about it. He was sure that with a fill-out-the-form approach, any line employee could do his or her own ad-hoc queries and reporting. Based on my own experiences as a na├»ve end user, I felt we were very far from being able to give the average end user an interface that he or she would be able or motivated to use. Here we are, 26-plus years later, and all through those years, someone would pipe up and say, in the immortal words of Bullwinkle, “This time for sure!” And every time, it hasn’t happened.

I divide the blame for this equally between vendor marketing and IT buying. Database and BI vendors, first and foremost, look to extend the ability of specific targets within the business to gain insights. That requires ever more sophisticated statistical and relationship-identifying tools. The vendor looking to design a “common-person” user interface retrofits the interface to these tools. In other words, the vendor acts like it is selling to a business-expert, not a consumer, market.

Meanwhile, IT buyers looking to justify the expense of BI try to extend its use to upper-level executives and business processes, not demand that it extend the interface approach of popular consumer apps to using data, or that it give the line supervisor who uses it at home a leg up at work. And yet, that is precisely how Word, Excel, maybe PowerPoint, and Google search wound up being far more frequently used than SQL or OLAP.

I have been saying things like this for the last 26 years, and somehow, the problem never gets solved. At this point, I am convinced that no one is really listening. So, for my own amusement, I give you three ideas – ideas proven in the real world, but never implemented in a vendor product – that if I were a user I would really like, and that I think would come as close as anything can to achieving “BI for the masses.”

Idea Number 1: Google Exploratory Data Analysis

I’m reading through someone’s blog when they mention “graphical analysis.” What the hey? There’s a pointer to another blog, where they make a lot of unproven assertions about graphical analysis. Time for Google: a search on graphical analysis results in a lot of extraneous stuff, some of it very interesting, plus Wikipedia and a vendor who is selling this stuff. Wikipedia is off-topic, but carefully reading the article shows that there are a couple of articles that might be on point. One of them gives me some of the social-networking theory behind graphical analysis, but not the products or the market. Back to Google, forward to a couple of analyst market figures. They sound iffy, so I go to a vendor site and get their financials to cross-check. Not much in there, but enough that I can guesstimate. Back to Google, change the search to “graphical BI.” Bingo, another vendor with much more market information and ways to cross-check the first vendor’s claims. Which products have been left out? An analyst report lists the two vendors, but in a different market, and also lists their competitors. Let’s take a sample competitor: what’s their response to “graphical analysis” or graphical BI? Nothing, but they seem to feel that statistical analysis is their best competitive weapon. Does statistical analysis cover graphical analysis? The names SAS and SPSS keep coming up in my Google searches. It doesn’t seem as if their user manuals even mention the word “graph”. What are the potential use cases? Computation of shortest path. Well, only if you’re driving somewhere. Still, if it’s made easy for me … Is this really easier than Mapquest? Let’s try a multi-step trip. Oog. It definitely could be easier than Mapquest. Can I try out this product? All right, I’ve got the free trial version loaded, let’s try the multi-step trip. You know, this could do better for a sales trip than my company’s path optimization stuff, because I can tweak it for my personal needs. Combine with Google Maps, stir … wouldn’t it be nice if there was a Wikimaps, so that people could warn us about all these little construction obstructions and missing signs? Anyway, I’ve just given myself an extra half-hour on the trip to spend on one more call, without having to clear it.

Two points about this. First, Google is superb at free-association exploratory analysis of documents. You search for something, you alter the search because of facts you’ve found, you use the results to find other useful facts about it, you change the topic of the search to cross-check, you dig down into specific examples to verify, you even go completely off-topic and then come back. The result is far richer, far more useful to the “common end user” and his or her organization, and far more fun than just doing a query on graphical data in the company data warehouse.

Second, Google is lousy at exploratory data analysis, because it is “data dumb”: It can find metadata and individual pieces of data, but it can’t detect patterns in the data, so you have to do it yourself. If you are searching for “graphical analysis” across vendor web sites, Google can’t figure out that it would be nice to know that 9 of 10 vendors in the market don’t mention “graph” on their web sites, or that no vendors offer free trial downloads.

The answer to this seems straightforward enough: add “guess-type” data analysis capabilities to Google. And, by the way, if you’re at work, make the first port of call your company’s data-warehouse data store, full of data you can’t get anywhere else. You’re looking for the low-priced product for graphical analysis? Hmm, your company offers three types through a deal with the vendor, but none is the low-cost one. I wonder what effect that has had on sales? Your company did a recent price cut; sure enough, it hasn’t had a big effect. Except in China: does that have to do with the recent exchange rate manipulations, and the fact that you sell via a Chinese firm instead of on your own? It might indeed, since Google tells you the manipulations started 3 weeks ago, just when the price cut happened.

You get the idea? Note that the search/analysis engine guessed that you wanted your company’s data called out, and that you wanted sales broken down by geography and in a monthly time series. Moreover, this is exploratory data analysis, which means that you get to see both the summary report/statistics and individual pieces of raw data – to see if your theories about what’s going on make sense.

In Google exploratory data analysis, the search engine and your exploration drive the data analysis; the tools available don’t. It’s a fundamental mind shift, and one that explains why Excel became popular and in-house on-demand reporting schemes didn’t, or why Google search was accepted and SQL wasn’t. One’s about the features; the other’s about the consumer’s needs.

Oh, by the way, once this takes off, you can start using information about user searches to drive adding really useful data to the data warehouse.

Idea Number 2: The Do The Right Thing Key

Back in 1986, I loved the idea behind the Spike Lee movie title so much that I designed an email system around it. Here’s how it works:

You know how when you are doing a “replace all” in Word, you have to specify an exact character string, and then Word mindlessly replaces all occurrences, even if some should be capitalized and some not, or even if you just want whole words to be changed and not character strings within words? Well, think about it. If you type a whole word, 90% of the time you want only words to be replaced, and capitals to be added at the start of sentences. If you type a string that is only part of a word, 90% of the time you want all occurrences of that string replaced, and capitals when and only when that string occurs at the start of a sentence. So take that Word “replace” window, and add a Do the Right Thing key (really, a point and click option) at the end. If it’s not right, the user can just Undo and take the long route.

The Do The Right Thing key is a macro; but it’s a smart macro. You don’t need to create it, and it makes some reasonable guesses about what you want to do, rather than you having to specify what it should do exactly. I found when I designed my email system that every menu, and every submenu or screen, would benefit from having a Do The Right Thing key. It’s that powerful an idea.

How does that apply to BI? Suppose you are trying to track down a sudden drop in sales one week in North America. You could dive down, layer by layer, until you found that stores in Manitoba all saw a big drop that week. Or, you could press the Break in the Pattern key, which would round up all breaks in patterns of sales, and dig down not only to Manitoba but also to big offsetting changes in sales in Vancouver and Toronto, with appropriate highlighting. 9 times out of ten, that will be the right information, and the other time, you’ll find out some other information that may prove to be just as valuable. Now do the same type of thing for every querying or reporting screen …

The idea behind the Do The Right Thing key is actually very similar to that behind Google Exploratory Data Analysis. In both cases, you are really considering what the end user would probably want to do first, and only then finding a BI tool that will do that. The Do The Right Thing key is a bit more buttoned-up: you’re probably carrying out a task that the business wants you to do. Still, it’s way better than “do it this way or else.”

Idea Number 3: Build Your Own Data Store

Back in the days before Microsoft Access, there was a funny little database company called FileMaker. It had the odd idea that people who wanted to create their own contact lists, their own lists of the stocks they owned and their values, their own grades or assets and expenses, should be able to do so, in just the format they wanted. As Oracle steadily cut away at other competitors in the top end of the database market, FileMaker kept gaining individual customers who would bring FileMaker into their local offices and use it for little projects. To this day, it is still pretty much unique in its ability to let users quickly whip up small-sized, custom data stores to drive, say, class registrations at a college.

To my mind, FileMaker never quite took the idea far enough. You see, FileMaker was competing against folks like Borland in the days when the cutting edge was allowing two-way links between, let’s say, students and teachers (a student has multiple teachers, and teachers have multiple students). But what people really want, often, is “serial hierarchy”. You start out with a list of all your teachers; the student is the top level, the teachers and class location/time/topic the next level. But you next want to see if there’s an alternate class; now the topic is the top level, the time at the next level, the students (you, and if the class is full) at a third level. If the number of data items is too small to require aggregation, statistics, etc.; you can eyeball the raw data to get your answers. And you don’t need to learn a new application (Outlook, Microsoft Money, Excel) for each new personal database need.

The reason this fits BI is that, often, the next step after getting your personal answers is to merge them with company data. You’ve figured out your budget, now do “what if”: does this fit with the company budget? You’ve identified your own sales targets, so how do these match up against those supplied by the company? You download company data into your own personal workspace, and use your own simple analysis tools to see how your plans mesh with the company’s. You only get as complex a user interface as you need.

Conclusions

I hope you enjoyed these ideas, because, dollars to doughnuts, they’ll never happen. It’s been 25 years, and the crippled desktop/folder metaphor and its slightly less crippled cousin, the document/link browser metaphor, still dominate user interfaces. It’s been fifteen years, and only now is Composite Software’s Robert Eve getting marketing traction by pointing out that trying to put all the company’s data in a data warehouse is a fool’s errand. It’s been almost 35 years, and still no one seems to have noticed that seeing a full page of a document you are composing on a screen makes your writing better. At least, after 20 years, Google Gmail finally showed that it was a good idea to group a message and its replies. What a revelation!

No, what users should really be wary of is vendors who claim they do indeed do any of the ideas listed above. This is a bit like vendors claiming that requirements management software is an agile development tool. No; it’s a retrofitted, slightly less sclerotic tool instead of something designed from the ground up to serve the developer, not the process.

But if you dig down, and the vendor really does walk the walk, grab the BI tool. And then let me know the millennium has finally arrived. Preferably not after another 26 years.

Thursday, October 7, 2010

IBM's Sustainability Initiative: Outstanding, and Out of Date

IBM’s launch of its new sustainability initiative on October 1 prompted the following thoughts: This is among the best-targeted, best-thought-out initiatives I have ever seen from IBM. It surprises me by dealing with all the recent reservations I have had about IBM’s green IT strategy. It’s all that I could have reasonably asked IBM to do. And it’s not enough.

Key Details of the Initiative
We can skip IBM’s assertion that the world is more instrumented and interconnected, and systems are more intelligent, so that we can make smarter decisions; it’s the effect of IBM’s specific solutions on carbon emissions that really matters. What is new – at least compared to a couple of years ago – is a focus on end-to-end solutions, and on solutions that are driven by extensive measurement. Also new is a particular focus on building efficiency, although IBM’s applications of sustainability technology extend far beyond that.

The details make it clear that IBM has carefully thought through what it means to instrument an organization and use that information to drive reductions in energy – which is the major initial thrust of any emission-reduction strategy. Without going too much into particular elements of the initiative, we can note that IBM considers the role of asset management, ensures visibility of energy management at the local/department level, includes trend analysis, aims to improve space utilization, seeks to switch to renewable energy where available, and optimizes HVAC for current weather predictions. Moreover, it partners with others in a Green Sigma coalition that delivers building, smart grid, and monitoring solutions across a wide range of industries, as well as in the government sector. And it does consider the political aspects of the effort. As I said, it’s very well targeted and very well thought out.
Finally, we may note that IBM has “walked the walk”, or “eaten its own dog food”, if you prefer, in sustainability. Its citation of “having avoided carbon emissions by an amount equal to 50% of our 1990 emissions” is particularly impressive.

The Effects
Fairly or unfairly, carbon emission reductions focus on reducing carbon emissions within enterprises, and emissions from the products that companies create. Just about everything controllable that generates emissions is typically used, administered, or produced by a company – buildings, factories, offices, energy, heating and cooling, transportation (cars), entertainment, and, of course, computing. Buildings, as IBM notes, are a large part of that emissions generation, and, unlike cars and airplanes, can relatively easily achieve much greater energy efficiency, with a much shorter payback period. That means that a full implementation of building energy improvement across the world would lead to at least a 10% decrease in the rate of human emissions (please note the italics; I will explain later). It’s hard to imagine an IBM strategy with much greater immediate impact.

The IBM emphasis on measurement is, in fact, likely to have far more impact in the long run. The fact is that we are not completely sure how to break down human-caused carbon emissions by business process or by use. Therefore, our attempts to reduce them are blunt instruments, often hitting unintended targets or squashing flies. Full company instrumentation, as well as full product instrumentation, would allow major improvements in carbon-emission-reduction efficiency and effectiveness, not just in buildings or data centers but across the board.

These IBM announcements paint a picture of major improvements in energy efficiency leading, very optimistically, to 30% improvements in energy efficiency and increases in renewable energy over the next 10 years – beyond the targets of most of today’s nations seeking to achieve a “moderate-cost” ultimate global warming of 2 degrees centigrade, in their best-case scenarios. In effect, initiatives like IBM’s plus global government efforts could reduce the rate of human emissions (I will explain the italics later) beyond existing targets. Meanwhile, Lester Brown has noted that from 2008 to 2009, measurable US human carbon emissions from fossil fuels went down 9 percent.

This should be good news. But I find that it isn’t. It’s just slightly less bad news.

Everybody Suffers
Everyone trying to do something about global warming has been operating under a set of conservative scientific projections that, for the most part, correspond to the state of the science in 2007. As far as I can tell, here’s what’s happened since, in a very brief form:

1. Sea rise projections have doubled, to 5 feet of rise in 80 years. In fact, more rapid than expected land ice loss means that 15 feet of rise may be more likely, with even more after that.

2. Scientists have determined that “feedback loops” such as loss of the ability of ice to reflect back light and therefore decrease ocean heat, which loss in turn increases global temperature, are in fact “augmenting feedbacks”, meaning that they will contribute to additional global warming even if we decrease emissions to near zero right now.

3. Carbon in the atmosphere is apparently headed still towards the “worst case” scenario of 1100 ppm. That, in turn, apparently means that the “moderate effect” scenario underlying all present global plans for mitigation of climate change with moderate cost (450 ppm) will in all likelihood not be achieved . Each doubling of ppm leads to 3.5 degrees centigrade or 6 degrees Fahrenheit average rise in temperature (in many cases, more like 10 degrees Fahrenheit in summer), and the start level was about 280 ppm, so we are talking 12 degrees Fahrenheit rise from reaching 1100 ppm , with follow-on effects and costs that are linear up to 700-800 ppm and difficult to calculate but almost certainly accelerating beyond that.

4. There is growing consensus that technologies to somehow sequester atmospheric carbon or carbon emissions in the ground, if feasible, will not be operative for 5-10 years, not at full effectiveness until 5-10 years after that, and not able to take us back to 450 ppm for many years after that – and not able to end the continuing effects of global warming for many years after that, if ever .
Oh, by the way, that 9 % reduction in emissions in the US? Three problems. First, that was under conditions in which GNP was mostly going down. As we reach conditions of moderate or fast growth, that reduction goes to zero. Second, aside from recession, most of the reductions achieved up to now come from low-cost-to-implement technologies. That means that achieving the next 9%, and the next 9% after that, becomes more costly and politically harder to implement. Third, at least some of the reductions come from outsourcing jobs and therefore plant and equipment to faster-growing economies with lower costs. Even where IBM is applying energy efficiencies to these sites, the follow-on jobs outside of IBM are typically less energy-efficient. The result is a decrease in the worldwide effect of US emission cuts. As noted above, the pace of worldwide atmospheric carbon dioxide rise continued unabated through 2008 and 2009. Reducing the rate of human emissions isn’t good enough; you have to reduce the absolute amount of human, human-caused (like reduced reflection of sunlight by ice) and follow-on (like melting permafrost, which in the Arctic holds massive amounts of carbon and methane) emissions.

That leaves adaptation to what some scientists call climate disruption. What does that mean?

Adaptation may mean adapting to a rise in sea level of 15 feet in the next 60 years and an even larger rise in the 60 years after that. Adaptation means adapting to disasters that are 3-8 times more damaging and costly than they are now, on average (a very rough calculation, based on the scientific estimate that a 3% C temperature rise doubles the frequency of category 4-5 hurricanes; the reason is that the atmosphere involved in disasters such as hurricanes and tornados can store and release more energy and water with a temperature rise). Adaptation means adjusting to the loss of food and water related to ecosystems that cannot move north or south, blocked by human paved cities and towns. Adaptation means moving to lower-cost areas or constantly revising heating and cooling systems in the same area, as the amount of cooling and heating needed in an area changes drastically. Adaptation means moving food sources from where they are in response to changing climates that make some areas better for growing food, others worse. Adaptation may mean moving 1/6 of the world’s population from one-third of the world’s cultivable land which will become desert . In other words, much of this adaptation will affect all of us, and the costs of carrying out this adaptation will fall to some extent on all of us, no matter how rich. And we’re talking the adaptation that, according to recent posts , appears to be already baked into the system. Moreover, if we continue to be ineffectual at reducing emissions, each decade will bring additional adaptation costs on top of what we are bound to pay already.

Adaptation will mean significant additional costs to everyone – because climate disruption brings costs to everyone in their personal lives. It is hard to find a place on the globe that will not be further affected by floods, hurricanes, sea-level rise, wildfires, desertification, heat that makes some places effectively unlivable, drought, permafrost collapse, or loss of food supplies. Spending to avoid those things for one’s own personal home will rise sharply – well beyond the costs of “mitigating” further climate disruption by low-cost or even expensive carbon-emission reductions.

What Does IBM Need To Do?
Obviously, IBM can’t do much about this by itself; but I would suggest two further steps.

First, it is time to make physical infrastructure agile. As the climate in each place continually changes, the feasible or optimum places for head offices, data centers, and residences endlessly change. It is time to design workplaces and homes that can be inexpensively transferred from physical location to physical location. Moving continually is not a pleasant existence to contemplate; but virtual infrastructure is probably the least-cost solution.

Second, it is time to accept limits. The effort to pretend that we do not need to accept the need to reduce emissions in absolute, overall terms, because technology, economics, or sheer willpower will save us, as we have practiced it since our first warning in the 1970s, is failing badly. Instead of talking in terms of improving energy efficiency, IBM needs to start talking in terms of absolute carbon emissions reduction every year, for itself, for its customers, and for use of its products, no matter what the business’ growth rate is.

One more minor point: because climate will be changing continually, adjusting HVAC for upcoming weather forecasts, which only go five days out, is not enough. When a place that has seen four days of 100 degree weather every summer suddenly sees almost 3 months of it, no short-term HVAC adjustment will handle continual brownouts adequately. IBM needs to add climate forecasts to the mix.

Politics, Alas
I mention this only reluctantly, and in the certain knowledge that for some, this will devalue everything I have said. But there is every indication, unfortunately, that without effective cooperation from governments, the sustainability goal that IBM seeks, and avoidance of harms beyond what I have described here, are not achievable.

Therefore, IBM membership in an organization (the US Chamber of Commerce) that actively and preferentially funnels money to candidates and legislators that deny there is a scientific consensus about global warming and its serious effects undercuts IBM’s credibility in its sustainability initiative and causes serious damage to IBM’s brand. Sam Palmisano as Chairman of the Board of a company (Exxon Mobil) that continues to fund some “climate skeptic” financial supporters (the Heritage Foundation, at the least) and preferentially funnels money to candidates and legislators that deny the scientific consensus does likewise.

Summary
IBM deserves enormous credit for creating today comprehensive and effective efforts to tackle the climate disruption crisis as it was understood 3 years ago. But they are three years out of date. They need to use their previous efforts as the starting point for creating new solutions within the next year, solutions aimed at a far bigger task: tackling the climate disruption crisis as it is now.