Monday, September 24, 2012

A Horrible Quote

I found the following quote today in www.climateprogress.com:

"Responding to Rupert Murdoch’s disinformation campaign, one Australian climate scientist put it bluntly: 'The Murdoch media empire has cost humanity perhaps one or two decades of time in the battle against climate change.'"

Or, to put it another way: perhaps 500 million murders.

I really, really, really, really hope that I am wildly exaggerating, even though the evidence suggests that I may not be.

Because 500 million here, 500 million there, and pretty soon -- perhaps 40 years from now -- those 500 million we delay too late to save will be our own great-great-grandchildren.

Sunday, September 23, 2012

Is Your Organization Suffering From Data Warehouse Disease?

I have a feeling that a fair number of readers – especially vendors and IT BI types – are going to be upset by what I have to say in this post. However, viewing some of the material that has passed across my desk recently, I really think it’s time to raise the question of whether too much organizational power given to data warehouse folks is beginning to cause some significant under-performance in meeting today’s key organizational information management needs.

The immediate occasion for these reflections is that I am partway through a book on a related subject that goes into some detail on data warehousing’s view of the world: how BI should be handled, what the organizational information architecture should be, and how we got this way. This book will remain nameless, because in many ways it’s an excellent primer. However, over the last 22-31 years (depending on whether you count my software development days), I have had a cross-organization, cross-vendor view of the same area, and I have to say that the book redefines history and the purposes of various things in the ideal information architecture in major ways.

Usually, I find that going over history just wastes time in a blog post – but here, it helps to see how data warehouse concepts of common information management terms make them reinterpret the purposes of the underlying products, making the information architecture – and the whole information handling process – potentially (and, probably, actually) less effective in the medium and long term. So let’s combine history and exposition of my assertion.

A Data Warehousing View of the World

In brief, the book’s view of the information architecture seems to be as follows: Data of all types comes in to production systems, which immediately pass it on to the data warehouse for cleansing and aggregation. Behind the data warehouse is an optional operational data store for key data, and things like master data management operate in parallel with the data warehouse to provide a global view of multiple local ways to store customer data. On top of the data warehouse are key Business Intelligence applications, which include both repetitive, scheduled reporting and analytics.

Now, this view of the world seems reasonable if you were born yesterday, or if you’ve spent the last fifteen years entirely in data warehousing. However, there are, in my view, some major problems with it.

In the first place, afaik, only in data warehousing are the databases at the initial entry point referred to as “production systems”. For twenty years, I have been calling them “operational databases”. In fact, they were business-critical before data warehousing existed, and so were the apps on top of them – like ERP.

Why does this matter? Because it allows data warehouse folks to shift the “operational data store” behind the data warehouse. The operational data store is a later concept, and one that I (among others, I assume) wrote papers proposing around 2004 and 2005. The idea is that the data warehouse is simply too slow to react immediately to key operational data – but that operational data is scattered across multiple operational databases, and so an “operational data store” makes sure that a subset of operational data for quick decision-making is either put in a central point for quick analysis in parallel with its arrival, or monitored by a central “virtual database.” Putting the operational data store behind the data warehouse defeats its entire purpose.

Likewise, the master data management system. I wrote papers on this in assessing IBM’s version of the concept in 2006 and 2007. Again, the notion was of combining operational data coming in to operational databases – in this case, by enforcing a common format that allowed cross-organization and cross-country leveraging of operational data by ERP and customer intelligence apps. By redefining master data management as existing within the data warehouse or at the same remove from operational databases, data warehouse folks ensure that master data management moves no faster than the data warehouse.

And finally, there is the idea that (implicitly) analytics is entirely contained in BI, and hence is entirely dependent on the data warehouse. On the contrary, an increasing amount of analytics goes on outside of BI. For example, analytics is part of products that analyze computer infrastructure semi-automatically to optimize performance or detect upcoming problems. Or, it is used to analyze key computer-supported business processes. This is “intelligence” in the sense of “military intelligence” – proactively going out and finding out what’s going on – but it is not “business intelligence” in the sense of finding out what’s going on inside and outside the business on the basis of data that is handed to you, and that your reporting tools are too slow or shallow to tell you. In other words, these applications of analytics are entirely outside of a reactive data warehouse.

Why It Matters

There are two places where over-emphasis on data warehousing can impede organizational BI and other information management effectiveness: the information architecture, and the organization’s “agility” in responding to new kinds of information from outside. As I’ve suggested in the previous section, a data warehousing view of the information architecture shifts operations that involve lots of “updates” and data just arrived from outside to the data warehouse or behind it. That means going through the data-warehouse cleansing and aggregation process and arriving in a centralized location that is handling queries from all over the organization and is optimized for adding new data not “on the fly” but in delayed bursts. There is simply no way that is going to be as timely as performing tasks on the data as it arrives in the operational systems.
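
To make the timeliness point concrete, here is a minimal sketch in Python. Everything in it is assumed for illustration: the function names, the hourly load schedule, and the minute counts are hypothetical and do not describe any particular product. It simply compares when a newly arrived record becomes usable if it must wait for the next warehouse load and cleansing pass, versus being handled where it arrives.

```python
def warehouse_available_at(arrival_minute: int,
                           batch_interval: int,
                           cleanse_minutes: int) -> int:
    """Minute at which a record is queryable when loaded in periodic bursts.

    All parameters are hypothetical, chosen only for illustration.
    """
    # The record waits for the first batch boundary at or after its arrival
    # (ceiling division written with integer arithmetic).
    next_batch = -(-arrival_minute // batch_interval) * batch_interval
    # It then waits for cleansing and aggregation to finish.
    return next_batch + cleanse_minutes


def operational_available_at(arrival_minute: int,
                             handling_minutes: int) -> int:
    """Minute at which a record is usable when handled on arrival."""
    return arrival_minute + handling_minutes


if __name__ == "__main__":
    arrival = 61  # record arrives one minute after an hourly load began
    via_warehouse = warehouse_available_at(arrival, batch_interval=60,
                                           cleanse_minutes=15)
    on_arrival = operational_available_at(arrival, handling_minutes=1)
    print("delay via warehouse:", via_warehouse - arrival, "minutes")
    print("delay on arrival:  ", on_arrival - arrival, "minutes")
```

With these made-up numbers the record that just misses an hourly load waits over an hour, while the same record handled on arrival is usable within a minute. Real intervals will differ, but the structure of the delay is the same.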

Just as troubling, the entire emphasis of the organization is now more reactive and focused farther away from the organization’s “antennae” to the outside environment. The IT organization appears to be focused on responding to new demands from business for timelier data, not actively seeking the latest new information and merging it back into existing systems. The IT organization appears to emphasize cleaning up the data and merging it and only then analyzing it at an internal “choke point”, rather than handling the information faster where it arrives.

If you think these concerns are theoretical, think about the case of social-media Big Data. Yes, Oracle as a major vendor is emphasizing inhaling huge amounts of this data from multiple clouds into the data warehouse and then analyzing it – when the whole purpose of the NoSQL movement is to allow rapid in-cloud analysis of inconsistent, uncleansed data – but it would not do so unless there was some organizational push to avoid analytics outside the data warehouse. I conclude that there is some strong evidence that a data warehousing focus is impeding organizational ability to process and feed to business decision makers key information in as timely a fashion as possible.

Moreover, there is some sense that this is not an organizational quirk but a tendency so embedded in the IT organization that this impediment is a symptom not of a temporary problem that is easy to fix, but rather of an organizational “disease.” In other words, simply directing the organization to pay more attention to doing social-media processing in the cloud will probably not work.

Action Strategies and Conclusion

First (although I think there is little danger of this) I must caution against throwing the baby out with the bathwater. There are very good reasons to have a data warehouse performing the core functions of querying for BI. I have, in the past, conjectured that if I were to design a new information architecture today, I might not create a data warehouse or data mart at all – instead, I might impose “data virtualization” and master data management tools over existing operational databases. However, practically speaking, in most if not all cases, the sheer experience behind today’s data warehousing products makes them far preferable for core functions.

Rather, I would suggest that data warehousing be placed under, and be responsive to rather than dominant over, an information architecture and information strategy function aimed more at the edge of the organization than its central data center. This is not a matter of making the organization more responsive to the business; it is a matter of making the IT organization more agile (by my definition, which stresses the utility of proactive and outside-the-organization-directed agility).

Until I saw this book, which suggested that data warehouse folks had gone too far in asserting “IT information handling is all about the data warehouse”, I was not too concerned about data warehousing folks; I would get into annoying arguments with folks who thought I just didn’t “get” data warehousing, but it seemed to me that the benefits of a powerful database-related IT function outweighed the negatives of data warehouse folks’ “not invented here” blind spots. Now, I am rethinking my position. If the result of this type of rewriting of history is an increasingly sub-optimal information architecture, then such a “disease” is not so harmless after all.

Does your organization suffer from data warehouse disease? If so, what do you think should be done about it?

Monday, September 17, 2012

Update to My Last 2 Posts

Arctic sea ice extent has now reached about 3.47 mkm2, about 21% below the previous record; area has reached about 2.23 mkm2, about 23% below the previous record; and scientists report that up to 150 miles from the Pole (as far as they investigated) ice was very thin and broken into small pieces. Apparently, this means that present measures of extent are overestimating it. For the first time since monitoring began, measures of air temperature above 80 degrees North are not decreasing towards the refreeze point. What more can I say than I have said?

Monday, September 3, 2012

How To Tell If It's a Lie: Arctic Sea Ice

The last five or so years have provided a useful case study in lying and how to see through the lies – useful in assessing products, in assessing strategies, and in reassessing one’s national and global views that affect how we all act in business and out of it. I am referring to the fascinating case of following the ups and downs of Arctic sea ice.

What does this have to do with lying in daily life? We’ll see.

Setting the Stage

Starting in the late 1960s, scientists began to establish that climate – the overall patterns of temperature, wind, and precipitation at various points on the globe within which weather fluctuates – was being affected by carbon in the atmosphere, and the data began to suggest that human carbon emissions were a major if not the primary cause of this change.

In reaction, self-called “skeptics” began to deny the role of humans in climate change. Over time, these became known as “climate change deniers” or “climate deniers” for short.

One of the key areas of focus of both climate scientists and deniers has been Arctic sea ice. Climate change science predicts that Arctic sea ice will melt due to human-caused climate change, first to almost nothing at minimum in September, and eventually year-round. Satellite and buoy data available beginning in 1979 has kept track of the area and extent (that is, area including cells with both open water and ice). A model supplemented by sampling has estimated volume (including the depth of the ice), and this year for the first time a good method of measuring volume has supplemented the model.
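
As an illustration of the difference between area and extent, here is a minimal Python sketch. It assumes the commonly used convention that a grid cell counts toward extent when its ice concentration is at least 15%. The function name, the cell size, and the concentration values are made up for the example and are not real measurements.

```python
def extent_and_area(concentrations, cell_area_km2, threshold=0.15):
    """Return (extent, area) in square kilometres for a list of grid cells.

    concentrations: fraction of each cell covered by ice, from 0.0 to 1.0
    cell_area_km2:  area of one grid cell (hypothetical, same for all cells)
    threshold:      minimum concentration for a cell to count as ice-covered
                    (15% is an assumed convention, not taken from the post)
    """
    # Keep only the cells that count as ice-covered.
    icy = [c for c in concentrations if c >= threshold]
    # Extent counts the whole cell, open water included.
    extent = len(icy) * cell_area_km2
    # Area counts only the ice-covered fraction of each such cell.
    area = sum(c * cell_area_km2 for c in icy)
    return extent, area


if __name__ == "__main__":
    # Four hypothetical 25 km x 25 km cells: full ice, half ice,
    # a trace of ice below the threshold, and open water.
    cells = [1.0, 0.5, 0.1, 0.0]
    extent, area = extent_and_area(cells, cell_area_km2=625.0)
    print("extent:", extent, "km2")
    print("area:  ", area, "km2")
```

Because extent counts partly open cells in full, it is always at least as large as area, which is why broken, scattered ice can make extent look healthier than the ice actually is.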

The reason that Arctic sea ice is of such fascination is that it is the equivalent of a “canary in a coal mine.” Like the canaries that coal miners carried with them whose sickness and death were a first warning of bad air in the mine, Arctic sea ice tells just how imminent major human-caused climate change is, and how quickly it is proceeding – and it is one of the first really visible signs of major change.

However, until very late in the process of melting, Arctic sea ice diminution is not very visible. What we see on the surface is the area and extent, and the ice is being melted on the top, bottom, and sides every year, and then in winter it is being frozen again. In essence, Arctic sea ice is more or less like a giant thin ice cube floating in the Arctic Ocean, with wind and currents constantly pushing ice out of the ocean to melt at one end and new ice forming at the other end. As a result, volume may drop steadily year after year, and only in September of one year late in the process (when the ice becomes too thin at minimum) do we see major drops in area and extent.

The Lie

The basic lie of the climate denier is that there is no such thing as human-caused global warming. Behind that lie is a psychological message: You (the listener) need not be forced to do anything about it, or even think about it, except as an amusing hobby. Those who insist are “them”, and they are trying to bother “us” for selfish purposes. Stand guard against “them”, and do not be fooled.

Behind that lie is an endless series of “fall-back positions.” Global climate change is not occurring. It is not human-caused, but caused by many other factors. The data on each bit of evidence is wrong, or not to be trusted, because it comes from “them.” And each argument visibly and clearly refuted simply means that the denier stops talking about that argument and focuses on the next one, while preserving the basic lie.

In politics, there is a final fall-back position, in which the politician pretends that he or she never was a proponent of the lie in the first place. However, the psychological message remains: Yes, I agree with human-caused climate change whole-heartedly and always have (!), but it’s no big deal. Let’s do as little as possible, as slowly as possible, because the methods of dealing with it are the ones being pushed by “them,” for their own selfish purposes.

Politics is particularly relevant here, because in this case governments fund data collection. The less data collected, the less easily the lie is exposed. The corresponding case in business is the collection of customer and accounting data. The “power center” in the company has a vested interest in saying that present strategies and tactics are not wrong-headed. In many cases, it can be difficult to tell the source of failure. There is, for example, the reported case of the performance-testing team that reported a slow software product, to the point of likely major customer dissatisfaction – the person in charge simply fired the team, and blame for the resulting poor sales was passed to his successor.

So the denier alleges initially that there is no change in Arctic sea ice that is not accounted for by “natural variability.” He or she drops or alters arguments to suit over the years as data comes in. And each new or continuing listener, safe in the cocoon of the lie, moves ever further into delusion.

The Lie Exposed

Perhaps the foremost exponent of the Arctic sea ice variant of the lie is Anthony Watts of WattsUpWithThat – although Andrew Revkin of the NY Times in his blog has apparently persistently played a subtler denier role. Over the last 2-3 years I have followed at a distance the evolution of their arguments as the data on Arctic sea ice continues to come in. However, as we will see, the fallout from 30 and more years of previous lies has also affected what happens as the lie gets exposed.

Let’s start with an odd event: Al Gore a little over 2 years ago embracing some scientific predictions that Arctic sea ice would go to near zero by about 2016. Now, I know that some people reading this will immediately want to stop reading, because they have an image of Al Gore as an untrustworthy politician. Unfortunately for that preconception, there is ample testimony from climate scientists that Gore has taken great pains to understand climate science better, and is therefore to an astonishing degree reasonably close to representing fairly the scientific findings and what they mean. To put it bluntly: whatever you think of Al Gore in other areas, he is not a typical politician in this area, and therefore your mistrust is just plain wrong.

Gore’s remark was immediately seized on as yet another proof of the ludicrousness of climate change predictions in general and Arctic sea ice ones in particular. In 2007, there had been some concern among a few, as, aided by a confluence of weather factors, Arctic sea ice area and extent had fallen to a new low (since 1979) in early September. However, due to the absence of these weather factors, area and extent at minimum had rebounded somewhat in 2008 and 2009, and deniers pointed to these numbers and asked how one could possibly believe that it would all be gone in five years. Loosed by Watts and his ilk, “trolls” haunted serious or denial-countering sites jeering at those who, like Neven (see one of my previous blog posts), were attempting to follow the clear thread of the data.

A particular focus of their ire was the use of a “speculative” volume model – clearly, not in anyone’s scientific mainstream. The fact that the model had been refined and checked by physical sampling was of no relevance, nor did deniers raise the point that, if accurate, it was a better measure of what was going on.

And then, in September of 2010, area and extent turned downward again – and volume took a major plunge. None of that was reflected in Watts – it was just part of “natural variability.” By September of 2011, area and extent had reached close to their 2007 lows, and volume continued to decrease, while the only semi-troll on the Neven site tried to argue that even if other areas melted, the Central Arctic Basin would take a long time to do so, if ever.

By this time, a little sporting competition had developed, with scientific models and enthusiastic amateurs filing their predictions for this year’s minimum area and extent. In 2012, Watts finally abandoned his perennial prediction that these would move back to pre-2007 levels – but he was still on the high side, with a 4.7 mkm2 extent prediction. And, of course, there was no indication in his blog that he was wrong in the slightest, or that there was anything amiss.

And now here we are at the beginning of September, and all previous records have been easily shattered. Extent is at 3.67 mkm2, and probably will wind up below 3.5 mkm2. Area is already almost 20% below 2011 and 2007, and probably will wind up at 20% below. Volume is already 10% below 2011, and will probably wind up 15%-20% below. The Central Arctic Basin is already easily at a record low. And weather conditions have not been favorable for records at all.

So what has Watts said and done? At first, Watts kept pointing to the records that had not yet been broken. Then, he resorted to comparing the largest measure of extent in 2012 to one of the smallest measures in 2007. And now, apparently, he has ascribed this year to, in one commentator’s pithy phrase, “natural unnatural variability” – the argument that this is a once in heaven-knows-how-many-years occurrence. Other, that is, than not talking about it at all. Neven at one point posted a comment on Watts’ blog saying, sarcastically, “Hey, there’s nothing going on with Arctic sea ice, right?” and Watts’ only response was to dismiss him as a troll.

When a Lie Is Exposed and No One Notices

But it has been the reaction of most of the world that makes it very clear how much most of us have been affected by the lie. Since shortly after the beginning of August, the Neven web site and Joe Romm at www.climateprogress.com have been telling us this was coming and how serious it is. In fact, since at least 2010, both have been telling us how serious the situation was. And so what was the reaction of the world?

Well, in the US, I can find no major publication – or even minor one – pointing out this was coming. When the records actually fell, pretty much all in one week, a week before the end of August, no major publication reported it until the very end of August, more than a week later than most of the record-setting. Of those that have – Bloomberg BusinessWeek, NBC News, and US News and World Report are a reasonable sample – none has come anywhere near understanding the magnitude of the loss, nor the implications. Over the last three years, only Joe Romm among major commentators has shown an appreciation for the likelihood that this would happen. Only very recently did Paul Krugman connect the dots between his reading of Joe and the implications for climate change’s effect on the global economy. The rest of the news and commentary? Just about nothing.

Meanwhile, in politics, only www.dailykos.com, a so-called “liberal” web site (clearly, part of “them”) has paid this subject the attention it deserves, and then only in the last half-month. We continue to see the spectacle of the Republican party and the 46% of voters who support it denying that either global warming or its human cause is settled science, and pledging to do even less than is already being done to combat it. Abroad, the Australian Prime Minister is threatened with being voted out of office primarily for having pushed a clearly inadequate attempt to combat carbon emissions. Canada’s Harper has persistently been quoted as believing that in the foreseeable future, Arctic sea ice will not melt enough that shippers can bypass Canada’s Northwest Passage. The powers that ring the Arctic Ocean are busy contemplating oil drilling in the Arctic that would increase carbon emissions in future, and the first vessels from Shell, an oil company, only failed to start exploration this summer because they failed to ready themselves in time.

In other words, some form of belief in the lie is pervasive. Either we believe that global warming isn’t happening, or that it isn’t human-caused, or that Arctic sea ice has nothing to do with it, or that we don’t need to do something about it, because it won’t happen or affect us in the near future. And even when the data becomes overwhelming and visible in Arctic sea ice, we don’t revisit or connect the dots.

How can this be? And how can we do better at detecting lies?

Doing Better

The first thing to notice about such lies is that they work only if they become embedded in some way in “history.” Often, this happens when the public notices an accusation but not its disproof, as witness the idea that Al Gore claimed he invented the Internet, or John Kerry’s “swiftboating.” Or, the lie simply becomes repeated long enough that those not paying attention assume it’s true. To take a recent example, the so-called Simpson-Bowles commission made no majority recommendation at all – and yet, we hear politicians from both parties claiming the contrary. A couple of years ago I was shocked, when attending a graduation, to hear a prominent business/economics professor at Yale refer to 1933, “when the Great Depression was starting.” Not only did this ignore the steady unraveling since about the great stock market crash of 1929, complete with starving veteran Bonus Marchers in Washington; it also ignored the ways in which revered figures played a role, with America refusing to forgive any of its WWI loans, Winston Churchill clinging disastrously to a gold standard, Andrew Jackson’s deep-sixing of a National Bank leading to a series of severe recessions of which this was only the latest, and the failure to regulate separation of bank and investment company – hence the junking of those FDR regulations, mainly by business and Republicans, in the late 1990s. I have written about similar “false memories,” as I perceive them, in the computer industry.

And so, the work of doing better begins with combating the lack of accurate “institutional memory.” This should not be as difficult as it sounds, in business as in politics, because the particular person who has something to gain from a particular version of the lie has often moved on by five years down the line. It is therefore important, even if it seems not so, to bring back the truth if it has been distorted, and to keep the truth alive in your mind. It is important to remember.

But that, it seems to me, is only half the task. We may, at some times, be constantly bombarded by these lies. Those who surround themselves with a cocoon of lies create a worldview and invite you in – and even if you do not enter, it is very hard to not always have second thoughts or to begin to think the same way. There’s a marvelous Mark Twain joke about the man who hated his neighbors and started a rumor there was gold in Hell. A little while later a friend stopped by and saw him packing to go there himself. Why? the friend asked. Well, the man said, I got to thinking there must be something in that rumor.

However, lies are crafted, piece by piece, as needed, and the contradictions and seams begin to show more and more as you examine them. The truth, by contrast, hangs together – the loose ends are those that have not yet been fully investigated. And this is particularly true of scientific truth – which we call scientific theory. Your job, as laymen, is to look at the information provided and ask, what’s the model? How does it cover everything? What does it predict in these situations? And only then do you ask, are those predictions near reality, as far as you can tell? For example, ask, what is a free market? And only after that do you ask, does that make sense to you? Does it really seem to capture what happens to you in your work? What more is needed?

And finally, I should add that we should be humble about connecting the virtues and vices of the person with whether something is lies or the truth. Yes, there’s a connection, as in the old joke that noted that once a person first starts in to murder, eventually even Sabbath-breaking is not beyond his capability for evil. But it’s not a simple connection. The connection is more between the person’s ability to perceive reality and the truth or between the person’s expertise and the subject at hand. Al Gore understands climate science pretty well; but given a choice between his model and that of James Hansen, I’ll start with Hansen first, even though Hansen’s politics is alien to me.

It’s Just a Flesh Wound!

We laugh at the memorable Monty Python routine in which the Black Knight, amputated in most limbs, refuses to recognize any problem and demands that our hero continue fighting – “It’s just a flesh wound!” And that is precisely what Anthony Watts, James Inhofe, and the like are saying today – and will probably continue to say, in one form or another, indefinitely.

However, I must point out that in this, at least, I and many more like me have been able, even as laymen, to see through the lie. And I did it more or less as I described above: refused to accept the false implanted memories of Al Gore, refused to buy into the assertions of “us” vs. “them”, and took some time to put together a model in layman’s terms for Arctic sea ice and global warming in general, based on reflection on scientific papers as much as or more than assurances by folks such as Neven and Joe Romm. And so, for more than two years, I have been saying that this time was coming sometime between 2012 and 2015, that volume would turn out to be the key metric, and that decline was exponential, not linear. And we’ve been in the middle of the plausible, not on the outer fringe, as denier interpretations of scientific conservatism would have you believe. So maybe it’s time for you to consider applying this either to global warming – which is about as important as it gets – or to ideas like data virtualization or agile marketing.

And one more thing: once you’ve handled the lie, one effective way of combating it going forward is simply, whenever possible, to nail a specific version of it that’s clearly false. Not in the denier’s cocoon; outside, in blog comments where all are welcome, or in conversations where it is permitted to say, that’s not true. It is amazing how that kind of modest but powerful statement gets across to the persuadable, where ad hominem argument obeys a kind of Gresham’s Law and makes the reader see all as indistinguishably bad.

Watts may be mortal, but lies are much more durable. It is one of our tasks, not to hope that all the big questions will not force us to do something, but to make a good effort to perceive big lies, so that the truth never quits, either. Because if the truth never quits, then there is some hope that a big lie will be only a flesh wound. Rather than the cause of a massive human disaster. Happy Labor Day.

Thoughts From a Retired Software IT Analyst