Tuesday, November 19, 2013

Mini-Post: The Past Isn't Even Past -- Unless You Use "Proactive Analytics"

I'm referring here to a quote from William Faulkner:  "The past is never dead.  It's not even past."  More specifically, I am noting the degree to which past mistakes in dealing with the customer get embedded in the organization, and hence repeated, leading to greater and greater aggravation that eventually results in a divorce over what seems to the organization to be the most trivial of customer pretexts.

I was reminded of this quote when I visited one of the demo booths at IOD 2013 and found a fascinating example of the application of artificial intelligence to analytics -- so-called "proactive analytics."  Essentially, the app combs through past data to see if it's consistent, or whether some of it is possibly wrong, and then comes up with a suggested change to the data or metadata (as in, a change to one's model of the customer) to fit the amended data.

To me, this is one of the good things about artificial intelligence, as opposed to the "knowledge-base" approach that is somehow supposed to produce human-equivalent intelligence (and about which I have always been a bit skeptical).  One of the key original insights of AI was to divide reasoning into logical processes (where there is an adequate base of facts to establish a good rule about what to do) and intuitive ones (where facts are incomplete or unclear, so you have to muddle through to a rule).  In intuitive situations, AI found, the quickest way to a rule that gets closest to the underlying reality is to focus on the data that doesn't fit, rather than on data that seems to confirm an initial hypothesis.  And so, proactive analytics apparently applies this highly useful AI insight, focusing on what doesn't fit and thereby iteratively coming up with a much better model of the customer or the process.

All this is abstract; but then the demo person came up with an example that every company should drool over:  combing through customer records to detect deceased, moved, or terminated customers.  And to me as a customer, this holds out the real possibility that I will no longer get sales literature about an estate-property-sale LLC long since terminated, letters addressed to "Wayne Keywochew, Aberdeen Group" (a stint that ended 9 years ago), or endless magazine subscription offers because of something I once ordered.  Multiply me by millions, and you have an incredible upgrade in your customer relations.
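To make that concrete, here's a minimal sketch of the kind of consistency check such a tool might run.  The record fields, thresholds, and rules below are my own illustrative assumptions, not anything IBM showed at the booth:

```python
from datetime import date, timedelta

# Hypothetical customer records; the field names are illustrative assumptions.
customers = [
    {"id": 1, "name": "Acme Estates LLC", "status": "active",
     "last_transaction": date(2004, 6, 30), "returned_mail_count": 7},
    {"id": 2, "name": "Jane Doe", "status": "active",
     "last_transaction": date(2013, 10, 2), "returned_mail_count": 0},
]

def flag_suspect_records(records, stale_after=timedelta(days=5 * 365)):
    """Flag records whose data no longer fits the 'active customer' model."""
    today = date.today()
    flagged = []
    for rec in records:
        reasons = []
        if today - rec["last_transaction"] > stale_after:
            reasons.append("no activity in over five years")
        if rec["returned_mail_count"] >= 3:
            reasons.append("mail repeatedly returned (moved or terminated?)")
        if reasons and rec["status"] == "active":
            flagged.append((rec["id"], rec["name"], reasons))
    return flagged

for cust_id, name, reasons in flag_suspect_records(customers):
    print(f"Review customer {cust_id} ({name}): {'; '.join(reasons)}")
```

The suggested change to the data or the customer model would then be reviewed by a human, which is exactly the "proactive" part of the pitch.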

Monday, November 11, 2013

James Hansen’s Climate Change Magnum Opus: Unsurpassed Horror and Sad Beauty

I warn you that this description of the latest draft paper by James Hansen and others should horrify you, if you are sane.  After Joe Romm first published its sound bite (30 degrees Fahrenheit increase if most fossil fuels are burned, 50 degrees at higher latitudes), I delayed reading it in detail. Now that I have, I find it builds on his (and others’) 40 years of work in the area and the latest research to provide an up-to-date climate change model whose implications are mostly more alarming than any I have seen elsewhere. 

What follows is my layperson’s attempt to summarize and draw further conclusions. I scant the discussion of new analyses of previous episodes of global warming (and cooling) that allow the development of the new model, focusing instead on the mechanisms and implications of the model.  Please note that, afaik, this is the first model that attempts to fully include the effects of methane and permafrost melting.

It’s About CO2

The first insight in the new model I summarize as follows:
As atmospheric CO2 increases or decreases, global average temperature increases or decreases proportionally, with a lag either way typically of a few decades.
This increase or decrease can be broken down into three parts:
1. The immediate effect of the CO2 itself -- perhaps 60% of the total effect.

2. The immediate and “over a few decades” effect of other “greenhouse gases”, or GHGs (here we are talking particularly about the methane in permafrost and methane hydrates on the continental shelves, released by warming, as well as GHGs such as nitrous oxide) – perhaps 20% of the total effect.

3. The so-called “fast feedback” effects, in which the released CO2 and other factors (e.g., decreased albedo as ice and snow melt) lead to additional warming “over a few decades”.
Two quick notes:  First, Hansen does not do my split; instead, he distinguishes between the effects of CO2 and the effects of other GHGs over the medium term (about 75-25) and then separately distinguishes between the immediate overall “climate sensitivity” and the medium-term or total “climate sensitivity” (again, about 75% immediately and 100% in the long term).  Second, the “over a few decades” is my interpretation of how quickly “X times CO2” seems to match global temperature data in more recent data sets.  Hansen might very well say that this may or may not occur quite this rapidly, but it doesn’t matter to him, because even with a thousand-year time frame for the full effect, CO2 will not be recycled out of the atmosphere for a “few thousand years”, so we still reach the full “climate sensitivity”.
Just to get the usual objections out of the way, Hansen is not saying that CO2 always leads the way – on the contrary, in Ice Age scenarios in which a certain point in a Milankovitch cycle causes extreme winters in our northern hemisphere, leading to increased glaciation and therefore increased albedo and CO2 drawdown from the atmosphere, CO2 follows other factors.  Today, however, primarily because of fossil-fuel emissions, CO2 is leading the way.
A sub-finding, still important, is that there is a linear relationship (again, sometimes with a lag of decades) between deep-ocean temperature change and atmospheric temperature change (expressed as “the change in temp at the surface is somewhere between 1.5 and 2.5 times the change in temp of the deep ocean” – or, about 67% of the global temperature increase goes into surface temps, 33% into deep ocean temps).  I include this because it seems that the recent “slowdown” in global surface temperature ascent is primarily caused by increased accumulation in the deep ocean.  However, again in a relatively short time frame, we should go back to more rapid average global surface temperature increases, because we’re still increasing atmospheric CO2 rapidly and 2/3 of that will start again going back into surface temps.

The Effect of “CO2 Plus” Is Bigger Than We Thought

In the past, Hansen among others has seen the effect of doubled CO2 as somewhere in the 2-3 degrees Celsius range.  Now, he sees a range of 3-4 degrees C – apparently, primarily because he now takes into account “other GHGs”.  To put it more pointedly, in my own interpretation:
Each doubling of CO2 leads to a global temperature change of 2.25-3 degrees Celsius (4-5.4 degrees F) “over a few decades”, and to a change of 3-4 degrees C (5.4-7.2 degrees F) “over 1 or 2 centuries.”
I mention this not only because the consequences of today’s global warming are more dire than we thought (i.e., the effects of that warming, immediately and over the next century or two), but also because many of us are still hung up on that “stop emissions and hold the increase to 2 degrees C” target that was the main topic at recent global governmental summits.  The atmospheric CO2 level at the beginning of the Industrial Revolution was about 280 parts per million (ppm), and it is now at about 400 ppm.  If you do the math (a rough check follows below), CO2 alone has already baked in somewhere around 1.5-2 degrees C of eventual warming, and counting the other GHGs in Hansen’s split takes that to 2 degrees C or more, even if emissions stopped tomorrow. After 15 years of inaction, that target now has essentially zero chance of success.
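Here is the back-of-the-envelope check I have in mind.  The ~280 ppm pre-industrial figure and the standard logarithmic dependence of warming on CO2 concentration are values I am supplying, not numbers taken from Hansen’s paper, and everything is rounded:

```latex
\Delta T_{\mathrm{CO_2}} \;\approx\; S\,\frac{\ln(C/C_0)}{\ln 2}
  \;=\; S\,\frac{\ln(400/280)}{\ln 2}
  \;\approx\; 0.51\,S
  \;\approx\; 1.5\text{--}2.1\ {}^{\circ}\mathrm{C}
  \qquad (S = 3\text{--}4\ {}^{\circ}\mathrm{C}\ \text{per doubling})

\Delta T_{\mathrm{total}} \;\approx\; \Delta T_{\mathrm{CO_2}} / 0.75
  \;\approx\; 2.0\text{--}2.8\ {}^{\circ}\mathrm{C}
  \qquad (\text{scaling by the rough 75/25 CO$_2$-versus-other-GHG split above})
```

Even before counting the slower feedbacks or a single additional ton of emissions, that lands at or above the 2-degree target.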
At this point, I want to do a shout-out to those wonderful folks at the Arctic Sea Ice blog and forum.  Hansen specifically notes the data supporting melting of Arctic sea ice, plus collapse of the Greenland and West Antarctic ice sheets, at levels slightly below today’s CO2.  He also notes data supporting the idea that Greenland and West Antarctica can go pretty rapidly, “in a few centuries”, iirc – I interpret “in a few centuries” as within 250-450 years from now.

The Percent of Fossil Fuels We Need To Leave In The Ground Forever Is Greater Than We Thought

Before I get to the consequences if we don’t leave a percentage of fossil fuels in the ground, let’s see how the minimum amount of fossil fuels burned before we reach “worst consequences” has changed. Today’s estimate of total recoverable fossil-fuel reserves (coal, oil [primarily tar sands and oil shale], and natural gas) is about the equivalent of 15,000 Gt C (billions of tons of carbon emitted).  Of this, coal is roughly 7,300-11,000 Gt C, and the rest is split roughly evenly between natural gas and tar sands/oil shale. Originally, we thought that burning 10,000 Gt C in the next century would get us to “worst consequences”.  Now, Hansen places the correct amount as somewhere between 5,000 Gt C and 10,000 Gt C.  Reading between the lines, I am placing the range as 6,000-7,000 Gt C, with 5,000 Gt C if we want to be ultra-safe, and I’m estimating coal as 60% of the emittable total, 20% tar sands/oil shale/oil, 20% natural gas.  Note, btw, that according to Hansen fossil-fuel emissions have increased consistently by about 3% per year since 1950, including last year. At that rate, we’d reach 6,000-7,000 Gt C in about 65-70 years.

Again, note that Hansen breaks the fossil fuels down as coal, traditional oil/gas, and oil shale/tar sands/fracked gas, so I’m guesstimating the equivalents.

So here’s the way it works out:
If we burn all the coal plus a very minor amount of everything else, we reach “worst consequences.”
If we burn everything except coal, plus 33% of the coal, we reach “worst consequences”.
If we burn 17% of the coal, 50% of the natural gas, and all the tar sands/oil shale/oil, we reach “worst consequences”.
So this, imho, is why I agree with Hansen that allowing the Keystone XL pipeline is “game over” for the climate, as in “worst consequences almost inevitable”.  The Keystone XL pipeline is a “gateway drug” for tar sands and oil shale.  The source (Alberta, Canada) has a large part of the known tar sands oil, and presents difficulties in extracting and processing similar to those of oil shale.  It’s the furthest along in terms of entering the world market.  If that source succeeds, as the saying goes, once the nose of the camel is in the tent, you may expect the rest of the camel to enter.  In this case, if Alberta succeeds in getting the Keystone XL pipeline, it is probably the case that most of the tar sands and oil shale will be used; if not, probably not.
Right now, Alberta has no real buyers except the US, and the US is not set up to accept the oil, nor Canada to ship it there in bulk.  The pipeline would effectively create an infrastructure to ship it, primarily to the rest of the world, which presumably would accept it – especially China – creating a market that allows Alberta profitability.  Alternatives are much more costly, are susceptible to pressure from the US, and would probably not be undertaken at all.  Note that increased shipment via truck is more costly, and would probably require major investments in trucking infrastructure to handle the more toxic tar-sands crude, so it is probably not a large-scale alternative that would make the project a success.  Likewise, trains and tracks to the Canadian ports to ship directly to world markets would probably prove too costly.
Now go back to the model.  It’s pretty darn likely we’ll burn 17% of the coal no matter what, and the majority of the natural gas.  Now add the tar sands and oil shale.  Worst consequences, here we come.

The Worst Is Likelier Than We Thought, Arrives Sooner, Is Almost As Bad As Our Worst Nightmare, And Is More Inescapable Once We Get There Than We Hoped

We’ve already dealt with “likelier than we thought”, and we can guess from the rapidity of response to atmospheric CO2 rise and the increase in the estimated climate sensitivity to atmospheric CO2 that it arrives sooner than we had projected. But what is this “worst consequences almost as bad as our worst nightmare”, and “worst consequences, once arrived, more inescapable than we hoped”?
For us, the worst consequences are not “snowball Earth”, locked in eternal ice, but “runaway GHG Earth” a la Venus, with the surface and air too hot and too acid to support water or any life at all (any rain evaporates from the heat long before it reaches the surface).  It’s an inescapable condition: once the atmosphere locks in the heat, the Sun’s energy trapped by the CO2 and other gases balances the heat escaping from the top of the atmosphere only at extremely high surface temperatures.  Hansen’s model shows that we are still 100 million to 1 billion years from being able to reach that state, even by burning all fossil fuels in a gigantic funeral pyre. 
The worst consequence is therefore the one cited at the very beginning, Joe Romm’s sound bite:  30 degrees F increase globally, 50 degrees in the high latitudes.  Here’s Hansen’s take on what that means:  it would push all areas of the Earth except the mountains above a 35-degree-C “wet bulb temperature” during their summers.  That in turn, according to Hansen, would mean the following: 
In the worst-consequence world, humans could survive outdoors below the mountains only for short periods of time during summer days, and there would be few if any places to grow grains. 
Effectively, most areas of the globe would be Death Valley-like or worse, at least during the summer.
Here I think Hansen, because he properly doesn’t diverge into the poleward movement of weather patterns and the effects of high water and possible toxic blooms, underestimates the threat to humanity’s survival.  Recent research suggests that with global warming, tropical climates stretch northwards.  Thus, the projection for the US (not to mention Europe below Scandinavia, Australia, southern Africa, and southern Russia) is for extreme drought.  How can this be, when there will be lots of increased water vapor in the air?  Answer: rain will fall more rarely, and far more massively and violently when it does.  The heat will bake the ground hard, so that when it does rain, the rain will merely bounce off the ground and run off (with possible erosion), rather than irrigating anything.  Add depletion of aquifers and of ice-pack runoff, and it will be very hard to grow anything (I suppose, mountains partially excepted) below Siberia, northern Canada/Alaska, and Scandinavia. 
However, these have their own problems:  rains too massive (and violent) to support large-scale agriculture – which is why you don’t see much farming on Washington State’s Olympic Peninsula.  The only “moderate-rainfall” areas projected as of now, away from the sea and the equator, are a strip in northern Canada, one in northern Argentina, one in Siberia, and possibly one in Manchuria. Most of this land is permafrost right now.  To even start farming there would require waiting until the permafrost melts, and moving in the meantime to “intermediate” farming areas.  Two moves, minimal farmland, and greater challenges from violent weather.  Oh, and if you want to turn to hunting, you’ll be lucky if you have an ecosystem that supports top-level meat animals, not to mention that 90% of plant and animal species will likely be extinct by then. As for the ocean, forget about it as a food source, unless you like jellyfish (according to research done recently for the UN).
In my version of Hansen's worst-consequence world, we would try to survive on less than 10% of today's farmland, less than 10% of the animal and vegetable species (with disrupted ecosystems), and practically zero edible ocean species, in territory that must be developed before it is usable, in dangerous weather, for thousands of years.
Hansen notes that one effective animal evolutionary response to past heat episodes has been hereditary dwarfism.  Or, as I like to think about it, we could all become hobbits.  However, because we are heading towards this excessive heat much faster than in those times, we can’t evolve fast enough; so that’s out.
What about inescapable?  Well, according to Hansen, CO2 levels would not get out of what he calls the “moderately moist greenhouse” area for thousands of years, and would not come back close to where we are now until 10,000-100,000 years hence.  By which time not only will we be dead, but so will most of humanity, if not all of it.
Now, I had feared the Venus scenario, so the worst consequences are not as bad as I thought.  However, the increased estimate for temperatures in the moderately moist greenhouse and the wet-bulb-temperature consideration make the next-worst scenario more likely than before to end humanity altogether. 

Snowball Earth:  Sad Beauty of a Sidelight

Having said all this, Hansen at least gives a beautiful analysis of why we don’t wind up a “snowball Earth” (the opposite scenario from a “runaway greenhouse”).  He notes that once the Earth is covered with ice, carbon can’t be recycled to the Earth via “weathering” (absorption from the atmosphere by rocks whose surfaces are abraded by wind and water).  So volcanic emissions and the like put more and more carbon dioxide in the atmosphere, until the temperature warms up enough and melting of the ice begins.  Apparently, evidence suggests that this may have happened once or twice in the past, when the Sun was delivering less light and hence heat.
Envoi
The usual caveats apply.  Primarily, they fall in the category of “I was reading Hansen out of fear, and so I may be stretching the outer limits of what may happen, just as Hansen may be understating out of scientific conservatism.”  Make up your own mind.

I am reminded of a World War II sketch from the British comedy revue Beyond the Fringe, suitably amended:

“Go up in the air, carbon. Don’t come back.”
“Goodbye, sir.  Or perhaps it’s ‘au revoir’?”
“No, carbon.”
And what will it take for humanity to really start listening to Hansen, and to the science?

 

Sunday, November 10, 2013

Mini-Post -- IBM IOD -- Consider Informix For Time

One more IOD mini-post, and then I'm out of time (pun intended).

It was nice to see the Informix folks again, and for all their past database innovation I owe them some mention.  They introduced object-relational, among other things, plus "blades" to provide specific performance speed-up for particular querying tasks, and they even (although I never saw it implemented) proposed a relational feature that eliminated the sort step.

In today's IBM, their impact is necessarily muffled, although they still have the same virtues they always had:  An indirect channel untouchable by Oracle that services the prevalent medium-sized businesses of Germany and the new SMBs of China, among others; mature object-relational and relational technologies with scalability in certain query types, and with administrative simplicity relative to the Oracles of the world; and mastery of time-series data.  As I understand it, their near-term plans involve attempting to expand these strengths, e.g., by providing "edge" Web streaming-data processing for SMBs.

From my viewpoint, fairly or unfairly, the place the general user public ought to pay attention to Informix is in handling time-series data.  In recent years (again, this is my imperfect understanding) Informix has expanded its intertemporal "blade" technology to provide compression of (and hence very rapid querying of) time-series data via a process very similar to that of storage companies' dedupe "windows" -- taking a certain time frame, noting that only changes need to be stored given an original value, and hence compressing the time-series data by (in storage's case, and probably in Informix's) 70-90%. I should also note that Informix can take advantage of IBM DB2 BLU Acceleration to further compress and speed querying of 1-to-2-column-per-query time-series data -- it's an earlier version of BLU, apparently, but one tuned more, and Informix provides a simple "fork" when it's time to go columnar.
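As an illustration only -- this shows the general "store a base value plus the changes" idea, not Informix's actual on-disk format -- here is a minimal sketch of window-style delta encoding of a time series:

```python
def delta_encode(series):
    """Store the first value of a window plus only the changes that follow."""
    if not series:
        return []
    encoded = [series[0]]                                     # base value for the window
    encoded += [b - a for a, b in zip(series, series[1:])]    # deltas only
    return encoded

def delta_decode(encoded):
    """Rebuild the original series from the base value and its deltas."""
    series = [encoded[0]]
    for delta in encoded[1:]:
        series.append(series[-1] + delta)
    return series

# A slowly changing reading encodes to mostly zeros and small deltas, which
# downstream compression (run-length or columnar) then shrinks dramatically.
readings = [100, 100, 100, 101, 101, 102, 102, 102]
encoded = delta_encode(readings)
assert delta_decode(encoded) == readings
print(encoded)   # [100, 0, 0, 1, 0, 1, 0, 0]
```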

How would I suggest any enterprise -- not just present Informix customers -- could use this time capability?  Well, let's face it, more and more data from the Web derives part of its value from assessing behavior over time -- social network analysis, buying patterns, sensor data, the buying process. For this type of analytics, I would argue that neither the streams processor up front (it can't store enough historical data) nor the data warehouse (not optimized for time-series data) is the best fit.  Rather, I would suggest a "mart" running Informix parallel to the main data warehouse, storing only time-series data, with a common front end for querying -- somewhat akin to what I have seen called an "operational data store", or ODS (I tend to use the term ODS for something quite different, but that's a whole other conversation).
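Here is a rough sketch of the common front end I have in mind; the routing rule, class names, and stub backends are my own assumptions, not a description of any shipping product:

```python
class StubBackend:
    """Stand-in for a real connection (e.g., to an Informix time-series mart
    or to the main data warehouse)."""
    def __init__(self, name):
        self.name = name
    def execute(self, sql):
        return f"[{self.name}] would run: {sql}"

class QueryRouter:
    """Common front end: send time-series queries to the time-series mart,
    everything else to the main data warehouse."""
    def __init__(self, timeseries_mart, warehouse):
        self.timeseries_mart = timeseries_mart
        self.warehouse = warehouse
    def run(self, sql, involves_time_series=False):
        target = self.timeseries_mart if involves_time_series else self.warehouse
        return target.execute(sql)

router = QueryRouter(StubBackend("ts_mart"), StubBackend("warehouse"))
print(router.run("SELECT avg(price) FROM ticks WHERE ...", involves_time_series=True))
print(router.run("SELECT region, sum(revenue) FROM sales GROUP BY region"))
```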

This would be of value not merely because it provides better performance on time-series queries.  Let's face it, time-series data has up to now not been considered worthy of separate consideration by today's enterprises.  And yet, our understanding of the customer could be far richer than it has been if we understood customer processes and how customers change over time.  Creating such an Informix ODS would at least start the IT-business-exec conversation.

Just a thought ... and hopefully a timely one :)  Ttfn.

Mini-Post -- IBM IOD -- The Customer-Driven Big Data Enterprise

I have been challenged by several people as follows:  If you think that users are not spending as much as they should on Big Data because they don't see it as a process, but rather as a series of one-shot "big value-add insights", what then is the process they should create?  I don't pretend to have all the answers, but here, off the cuff, are my thoughts.

My first reflex in answering these questions is to recommend something "agile" (my definition, not the usual marketing hype).  However, in today's un-agile enterprise, and particularly dealing with un-agile senior executives, that won't work.  Btw, there's a wonderful phrase for this kind of problem, which I credit to Jim Ewel of Agile Marketing Manifesto fame:  HIPPO.  It stands for the Highest-Paid Person's Opinion, and it refers to the tendency for decisions to be made according to the market beliefs of the HIPPO rather than customer data. Still, I believe something can be salvaged from agile marketing practices in answering the question -- the idea of being customer-data-driven.

Next, I assert that the process should have a Big Data information architecture aimed at supporting it. If we are to use Big Data in gaining customer insights, then our architecture should allow access to, and support integration of, the three types of Big Data sources:  loosely, (1) sensor-driven (real-time streams of data from Web sensors such as GPS tracking and smartphone video), (2) social-media (the usual Facebook/Pinterest sources of unstructured customer interest/interaction data), and (3) the traditional ingested in-house data that tends to show up in the data warehouse or data marts.

The process itself would be one of iteratively deepening understanding of the customer:  equal parts understanding the customer as he/she is now (buying procedures/patterns, plus how to chunk the present and potential parts of the market) and where he/she is going (changes in buying behavior, new technologies/customer interests, and evolution of present changes via predictive analytics carefully applied -- because agile marketing tells us there's a danger in uncritical over-application of predictive analytics).  It would proceed by rapid iteration of customer-focused, Big-Data-using insights, typically produced by data scientists, often feeding the CMO and marketing first, as befits the increased importance of the CMO in today's large enterprise.

What I suggest you end up with is a Big-Data-focused, customer-insight-driven analytical process that drives the enterprise.  Or, for short, the "customer-driven enterprise."  As in, Big Data for the customer-driven enterprise.

Mini-Post -- IBM IOD -- DB2 BLU Acceleration Revolution

Rounding out my quickies from the IBM Information on Demand conference ... I note from conversations with BLU experts that they are seeing "2-5 times acceleration" via tuning over the initial release; and I would expect that acceleration to show up in a new release some time in the next 3-6 months.  I did expect a performance boost, although more in the 50%-100% range.  If form holds true, we should see another performance boost through customization for particular workload types, maybe of 50-100%, conservatively.  Total at some point in the next 9 months, at a guess:  3-10 times acceleration.  Or, an order of magnitude on top of an order of magnitude, leading to 2 orders of magnitude compared to the obvious competition.
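For what it's worth, my arithmetic on the compounded factors cited above (my reading of the numbers, not IBM's):

```latex
(2\text{--}5\times)_{\text{tuning}} \;\times\; (1.5\text{--}2\times)_{\text{workload customization}}
  \;\approx\; 3\text{--}10\times \ \text{additional acceleration over the initial release.}
```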

When are people going to admit that this is not just another database technology?

Tuesday, November 5, 2013

Mini-Post -- IBM IOD -- Data Scientist

This is the first in a series of mini-posts based on what I’ve been hearing at IBM IOD.  They are mini-posts because there are too many thoughts here worth at least mentioning, and hence no time to develop the thoughts fully.

One key difference from past vendor data-related presentations is the prominence of the “data scientist.”  I wish folks hadn’t chosen that term; I find it confuses more than it enlightens, suggesting scientific rigor, data governance, and above all an emphasis on unmassaged, and often unenlightening, raw data.  Rather, I see the “data scientist” more as an analyst using Big Data to generate company-valuable informational insights iteratively, building on the last insight – an “information analyticist,” for short.  Still, it appears we’re stuck with “data scientist”.

The reason I think users ought to pay attention to the data scientist is that, in business terms, he or she is the equivalent of the agile developer for information leveraging.  The typical data scientist, as presented in past studies, goes out and whips up analysis after analysis to pursue cost-cutting or customer insights.  This is particularly useful to the CMO, who is now much more aware of the need to understand the customer better and to get the organization in sync with company strategy – an organization that is often entirely unmotivated to do so now, as a result of its focus on cost-cutting.

Effectively, a focus on the data scientist as the spearpoint of a Big Data strategy ensures that such a strategy is far more likely to be successful, because it will be based on the latest customer data rather than senior executive opinion.  If vendors truly want Big Data to be successful, the data scientist role in an organization is one that they and the firms themselves badly need to encourage.

Thursday, October 10, 2013

The Good News From Composite Software/Cisco: To ‘Global’, Faster Data Virtualization And Beyond


Like some of the Composite Software users represented at their annual “Data Virtualization Day” today, I had concerns about the future of Composite as a Cisco acquisition that had not been completely allayed before the conference – and yet, by the end, I can say that my original concerns have been replaced by a hope that Composite Software will deliver user benefits over the next few years well beyond what I had anticipated from Composite Software going it alone. In fact – and this I really did not expect – I believe that some of these benefits will lie well outside the traditional turf of data virtualization.
Of course, with hope comes new concerns.  Specifically, Composite Software’s roadmap now involves an ambitious expansion of their solutions, and therefore of product-development tasks.  With Composite Software’s track record and intellectual capital, I have little doubt that these tasks will be accomplished; with new folks to be brought on board, I am not sure how long full implementation will take.  And, as an analyst greedy on behalf of users, I would argue that implementing most of the goals set forth within the next two years would be far more valuable to IT than a longer rollout.  But this is a nit compared to my earlier worry about the future of data virtualization without the impetus of its typical technology leader.
My change of mind happened with a speech by Jim Green, long-time technology driver at Composite and now General Manager of his own Business Unit within Cisco.  It was, imho, the best speech, for breadth and accuracy of vision, I have heard him give.  Enough of the lead-in; let’s go on to my analysis of what I think it all means.

Business As Unusual Plus Three New Directions


When I say “business as unusual” I mean that many of the upcoming products and aims that Jim or others have mentioned fall firmly in the category of extensions of already evident technology improvements – e.g., continued performance fine-tuning, and support for more Web use cases such as those involving Hadoop.  I don’t want to call this “business as usual”, because I don’t see too many other infrastructure-software companies out there that continue to anticipate as well as reactively fulfil the expressed needs of users dealing with Web Big Data.  Hence, what seems usual Composite-Software practice strikes me as unusual for many other companies. And so, when Jim Green talks about extending data-virtualization support from the cloud to “global” situations, I see business as unusual.
Beyond this, I hear three major new directions:
  1. Software/app-driven transactional network optimization;
  2. The “virtual data sandbox”; and
  3. “composite clouds”.
Let’s take each in turn.

Software/app-driven transactional network optimization

It has been obvious that a driver of the acquisition was the hope on the part of both Composite Software and Cisco that Composite could use Cisco’s network dominance to do Good Stuff.  The questions were, specifically what Good Stuff, and how can it be implemented effectively without breaking Composite’s “we handle any data from anyone in an open way” model.
Here’s the way I read Jim Green’s answer to What Good Stuff?  As he pointed out, the typical Composite cross-database query takes up 90% of its time passing data back and forth over the network – and we should note that Composite has done quite a bit of performance optimization over the years via “driving querying to the best vendor database instance” and thus minimizing data transmission.  The answer, he suggested, was to surface the network’s decisions on data routing and prioritization, and allow software to drive those scheduling decisions – specifically, software that is deciding routing/prioritization based on transactional optimization, not on a snapshot of an array of heterogeneous packet transmission demands. To put it another way, your app uses software to demand results of a query, Composite software tells the network the prioritization of the transmissions involved in the resulting transactions from you and other users, and Cisco aids the Composite software in this optimization by telling it what the state of the network is and what the pros and cons of various routes are.
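Purely as a thought experiment -- none of these interfaces exist as far as I know, and every name below is a placeholder rather than a Cisco or Composite API -- the division of labor might look something like this:

```python
# Hypothetical sketch of app-driven transactional network optimization:
# the network surfaces its state, and the data-virtualization layer decides
# transmission priority and routing for its in-flight transactions.

class NetworkState:
    """Stand-in for what the network layer might surface to software."""
    def __init__(self, route_latency_ms):
        self.route_latency_ms = route_latency_ms   # e.g. {"route_a": 12, "route_b": 40}
    def best_route(self):
        return min(self.route_latency_ms, key=self.route_latency_ms.get)

class QueryPlanner:
    """Stand-in for the virtualization layer scheduling its transmissions."""
    def schedule(self, transmissions, network):
        # Highest-priority transactional data goes first, over the best route.
        ordered = sorted(transmissions, key=lambda t: t["priority"], reverse=True)
        route = network.best_route()
        return [(t["query_id"], route) for t in ordered]

network = NetworkState({"route_a": 12, "route_b": 40})
plan = QueryPlanner().schedule(
    [{"query_id": "q1", "priority": 1}, {"query_id": "q2", "priority": 5}],
    network,
)
print(plan)   # [('q2', 'route_a'), ('q1', 'route_a')]
```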
The answer to avoiding breaking Composite’s open stance is, apparently, to use Cisco’s open network software and protocols.  As for implementation, it appears that Cisco surfacing the network data via its router and other network software (as other networking vendors can do as well), plus Composite embedding both transactional network optimization and support for app-developer network optimization in its developer-facing software, is a straightforward way to do the job. 
What is relatively straightforward in implementation should not obscure a fundamentally fairly novel approach to network optimization.  As in the storage area, it used to be the job of the bottom-of-the-stack distributed devices to optimize network performance.  If we now give the top-of-the-stack applications the power to determine priorities, we are (a) drawing a much more direct line between corporate user needs and network operation, and (b) squarely facing the need to load-balance network usage between competing applications. It’s not just a data-virtualization optimization; it’s a change (and a very beneficial one) in overall administrative mindset and network architecture, useful well beyond the traditional sphere of data virtualization software.

The “Virtual Data Sandbox”

Jim described a Collage product that allows self-service BI users to create their own spaces in which to carry out queries, and administrators to support them.  The idea is to isolate the data with which the ad-hoc BI end user is playing, where appropriate, by copying it elsewhere, while still allowing self-service-user queries on operational databases and data warehouses where they are not too impactful.  More broadly, the idea is to semi-automatically set up a “virtual data sandbox” in which the data analyst can play, allowing IT to focus on being “data curators” or managers rather than putting out unexpected self-service-user “query from hell” fires all the time.
My comment from the peanut gallery is that this, like the software-driven transactional optimization described in the previous section, will take Composite well beyond its traditional data-virtualization turf, and that will turn out to be good for both Composite/Cisco and the end user.  Necessarily, evolving Collage will mean supporting more ad-hoc, more exploratory BI – a business-user app rather than an IT infrastructure solution.  This should mean such features as the “virtual metadata sandbox”, in which the analyst not only searches for answers to initial questions but then explores what new data types might be available for further exploration – without the need for administrator hand-holding, and allowing administrators to do role-based view limitation semi-automatically.  Meanwhile, Composite and Cisco will be talking more directly with the ultimate end user of their software and hardware, rather than an endless series of IT and business mediators.
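A minimal sketch of the “copy the data elsewhere when the query would be too impactful” idea; the table, filter, and SQLite stand-in below are invented for illustration, not a description of Collage:

```python
import sqlite3

def provision_sandbox(source_db, analyst_name, table, where_clause):
    """Copy a filtered slice of a production table into an analyst's own sandbox,
    so ad-hoc queries can't become 'queries from hell' against operational systems."""
    sandbox = sqlite3.connect(f"sandbox_{analyst_name}.db")
    rows = source_db.execute(f"SELECT * FROM {table} WHERE {where_clause}").fetchall()
    cols = [d[0] for d in source_db.execute(f"SELECT * FROM {table} LIMIT 1").description]
    sandbox.execute(f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(cols)})")
    sandbox.executemany(
        f"INSERT INTO {table} VALUES ({', '.join('?' for _ in cols)})", rows)
    sandbox.commit()
    return sandbox

# Usage sketch: build a tiny 'production' table, then carve out a sandbox slice.
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
prod.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "EMEA", 100.0), (2, "AMER", 250.0), (3, "EMEA", 75.0)])
box = provision_sandbox(prod, "analyst1", "orders", "region = 'EMEA'")
print(box.execute("SELECT count(*) FROM orders").fetchall())   # [(2,)] on a fresh run
```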

The “Composite Cloud”

Finally, Jim briefly alluded to software to provide a single data-virtualization view and database veneer for heterogeneous data (e.g., social-media data and Hadoop file systems) from multiple cloud providers – the so-called “composite cloud.”  This is a more straightforward extension of data virtualization – but it’s a need that I have been talking about, and users have been recognizing, for a couple of years at least, and I don’t hear most other Big Data vendors – if any – talking about it. 
It is also a welcome break in the hype about cloud technology.  No, cloud technology does not make everything into one fuzzy “ball” in which anything physical is transparent to the user, administrator, and developer.  Location still matters a lot, and so does which public cloud or public clouds you get your data from.  Thus, creation of a “composite cloud” to deal with multiple-cloud data access represents an important step forward in real-world use of the cloud.
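Again purely as a sketch -- the provider names and result shapes are placeholders of my own -- the “composite cloud” idea boils down to one query surface fanned out over several clouds:

```python
# A toy "composite cloud" view: one logical query over data living in
# multiple cloud providers. The provider clients are simple stand-ins.

class CloudSource:
    def __init__(self, name, rows):
        self.name, self.rows = name, rows
    def fetch(self, predicate):
        return [dict(r, source=self.name) for r in self.rows if predicate(r)]

def composite_query(sources, predicate):
    """Fan the same logical query out to every cloud and merge the results."""
    merged = []
    for source in sources:
        merged.extend(source.fetch(predicate))
    return merged

clouds = [
    CloudSource("cloud_provider_a", [{"customer": "acme", "mentions": 12}]),
    CloudSource("cloud_provider_b", [{"customer": "acme", "mentions": 3},
                                     {"customer": "globex", "mentions": 8}]),
]
print(composite_query(clouds, lambda r: r["customer"] == "acme"))
```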

Interlude:  The Users Evolve

I should also note striking differences in user reports of usage of data virtualization software, compared with the last few years I’ve attended Data Virtualization Day and spoken with them.  For one thing, users were talking about implementing global metadata repositories or “logical data models” filled with semantic information on top of Composite, and it was quite clearly a major strategic direction for the firms – e.g., Goldman Sachs and Sky, among the largest of financial-service and TV/entertainment companies.  Moreover, the questions from the audience centered on “how to”, indicating corresponding strategic efforts or plans among plenty of other companies.  What I among others envisioned as a strategic global metadata repository based on data-virtualization software more than a decade ago has now arrived.
Moreover, the discussion showed that users now “get it” in implementation of such repositories.  There is always a tradeoff between defining corporate metadata and hence constraining users’ ability to use new data sources within the organization, and a Wild West in which no one but you realizes that there’s this valuable information in the organization, and IT is expected to pick up after you when you misuse it.  Users are now aware of the need to balance the two, and it is not deterring them in the slightest from seeing and seizing the benefits of the global metadata repository.  In effect, global metadata repositories are now pretty much mature technology.
The other striking difference was the degree to which users were taking up the idea of routing all their data-accessing applications through a data virtualization layer.  The benefits of this are so great in terms of allowing data movement and redefinition without needing to rewrite hundreds of ill-documented applications (and, of course, loss of performance due to the added layer continues to be minimal or an actual performance gain in some cases), as I also wrote a decade ago, that it still surprises me that it took this long for users to “get it”; but get it they apparently have.  And so, now, users see the benefits of data virtualization not only for the end user (originally) and the administrator (more recently), but the developer as well.

Conclusion:  The IT Bottom Line

It remains true that good data virtualization solutions are thin on the ground, hence my original worry about the Cisco acquisition.  The message of Data Virtualization Day to customers and prospects should be that not only Composite Software’s solutions, but also data virtualization solutions in general, are set for the near and medium-term future on their present course.   Moreover, not only are the potential benefits as great as they ever were, but now, in just about every area, there is mature, user-tested technology to back up that potential.
So now we can move on to the next concern, about new potential benefits.  How important are software/app-driven transactional network optimization, the “virtual data sandbox”, and “composite clouds”, and how “real” is the prospect of near-term or medium-term benefits from these, from Composite Software or anyone else?  My answer to each of these questions, respectively, is “the first two are likely to be very important in the medium term, the third in the short term”, and “Composite Software should deliver; the only question is how long it takes them to get there.” 
My action items, therefore, for IT, are to check out Composite Software if you haven’t done so, to continue to ramp up the strategic nature of your implementations if you have, and to start planning for the new directions and new benefits.  Above all, bear in mind that these benefits lie not just in traditional data virtualization software uses – but in areas of IT well beyond these.

Wednesday, October 9, 2013

It’s Time to Finally Begin to Create An Enterprise Information Architecture

The sad fact is that, imho, neither vendors nor users are really supporting the building of a real-world enterprise information architecture – and yet, the crying need for such an architecture and such support was apparent to me eight years ago.   The occasion for such musings is a Composite Software/Cisco briefing I am attending today, in which users are recognizing as never before the need for, and prerequisites of, an enterprise information architecture, and Composite Software is taking a significant step forward in handling those needs.  And yet, this news fills me with frustration rather than anticipation.

This one requires, unfortunately, a fair bit of explanation that I wish was not still necessary.  Let’s start by saying what I mean by an enterprise information architecture, and what it requires.

The Enterprise Information – Not Data – Architecture 

What theory says is that an enterprise information architecture gets its hands around what data and types of data exist all over the organization (and often needed data outside the organization), and also what that data means to the organization – what information the data conveys.  Moreover, that “meta-information” can’t just be a one-shot, else what is an enterprise information architecture today quickly turns back into an enterprise data architecture tomorrow.  No, the enterprise information architecture has to constantly evolve in order to stay an enterprise information architecture.  So theory says an enterprise information architecture has to have a global semantics-rich metadata repository and the mechanisms in place to change it constantly, semi-automatically, as new data and data types arrive.

Now the real world intrudes, as it has over the past 15 years in just about every major organization I know of.  To the extent that users felt the need for an enterprise information architecture, they adopted one of two tactics:
  1.  Copy everything into one gigantic data warehouse, and put the repository on top of that (with variants of this tactic having to do with proliferating data marts coordinating with the central data warehouse), or
  2. “Muddle through” by responding reactively to every new data need with just enough to satisfy end users, and then trying to do a little linking of existing systems via metadata ad-hoc or on a per-project basis.

As early as 10 years ago, it was apparent to me that (1) was failing.  I could see existing systems in which the more the data warehouse types tried to stuff everything into the global data-warehouse data store, the further behind they fell in the face of the proliferation of data stores in the lines of business and regional centers (not to mention data on the Internet).  That trend has continued up to now, and was amply testified to by two presenters from major financial firms at today’s briefing, with attendees’ questions further confirming it.  Likewise, I saw (2) among initial users of data virtualization software five to eight years ago, and today I overheard a conversation in which two IT types were sharing the news that there were lots of copies of the same data out there and they needed to get a handle on it, as if this were some startling revelation.

The long-term answer to this – the thing that makes an enterprise data architecture an enterprise information architecture, and keeps it that way – is acceptance that some data should be moved and/or copied to the right, more central physical location, and some data should be accessed where it presently resides.  The costs of not doing this, I should note, are not just massive confusion on the part of IT and end users leading to massive added operational costs and inability to determine just where the data is, much less what information it represents; these costs are also, in a related way, performance and scalability costs – you can’t scale in response to Big Data demands, or it costs far more.

The answer to this is as clear as it was 8 years ago:  an architecture that semi-automatically, dynamically, determines the correct location of data to optimize performance on an ongoing basis. An enterprise information architecture must have the ability to constantly optimize and re-optimize the physical location of the data and the number of copies of each datum.
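To make “semi-automatically, dynamically determines the correct location of data” concrete, here is a toy placement rule; the access counts, site names, and threshold are invented for illustration, not drawn from any product:

```python
# Toy re-placement rule: put each dataset where it is most heavily accessed,
# and add a replica when a second site generates a large share of the accesses.

def plan_placement(access_counts, replica_threshold=0.35):
    """access_counts: {dataset: {site: accesses_per_day}} -> {dataset: [sites]}"""
    plan = {}
    for dataset, by_site in access_counts.items():
        total = sum(by_site.values())
        primary = max(by_site, key=by_site.get)
        locations = [primary]
        for site, count in by_site.items():
            if site != primary and count / total >= replica_threshold:
                locations.append(site)      # heavy remote use: keep a copy there too
        plan[dataset] = locations
    return plan

usage = {
    "trades_2013": {"boston": 900, "beijing": 650},   # heavy use on both sides
    "hr_records":  {"boston": 400, "beijing": 10},    # essentially one-site data
}
print(plan_placement(usage))
# {'trades_2013': ['boston', 'beijing'], 'hr_records': ['boston']}
```

Run repeatedly against fresh access statistics, a rule like this is what keeps the allocation tracking a constantly changing optimum.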

The Sad State of the Art in Enterprise Information Architectures

Today’s briefing is reminding me, if I needed reminding, that the tools for such a global meta-information architecture are pretty well advanced, and that users are beginning both to recognize the need for such a repository and to create it.  There was even recognition of the Web equivalent of the repository problem, as Composite tackles the fact that users are getting their “cloud information” from multiple providers, and this information must be coordinated via metadata between cloud providers and with internal enterprise information. All very nice.

And yet, even in this, a conference of the “enlightened” as to the virtues of a cross-database architecture, there was very little recognition of what seemed to me to scream from the presentations and conversations:  there is a crying need for dynamic optimization of the location of data.  Those who think that the cloud proves that simply putting a transparent veneer over physically far-flung data archipelagoes solves the problem should be aware that, since the advent of public clouds, infrastructure folks have been frantically putting in kludges to cope with the fact that petabyte databases with terabyte-per-minute additions simply can’t be copied from Beijing to Boston in real time to satisfy an American query.

And if the Composite attendees don’t see this, afaik, just about every other vendor I know about, from IBM to Oracle to Microsoft to HP to SAP to yada, sees even less and is doing even less.  I know, from conversations with them, that many of them are intellectually aware that this would be a very good thing to implement; but the users don’t push them, and they don’t ask the users, and so it never seems to be top of mind.

An Action Item – If You Can Do It

I am echoing one of the American Founding Fathers, who, when asked what they were crafting, replied:  “A republic – if you can keep it.”  An enterprise information architecture is not only very valuable, now as then, but also very doable – if vendors have the will to support it, and users have the will to implement it with the additional support.

For vendors, that means simply creating the administrative software to track data location, determine optimal data location and number of copies, and change locations to move towards optimal allocation, over and over – because optimal allocation is a constantly changing target, with obvious long-term trends.  For users, that means using this support to the hilt, in concert with the global metadata repository, and translating the major benefits accruing from more optimal data allocation to terms the CEO can understand.

For now, we can measure those benefits by just how bad things are right now.  One telling factoid at today’s conference:  in the typical query in Composite’s highly location-optimized software, 90% of the performance hit was in passing data/results over the network.  Yes, optimizing the network as Cisco has suggested will help; but, fundamentally, that’s a bit like saying your football team has to block and tackle better, while requiring that they always start a play in the same positions on the field.  You tell me what response times anywhere from double to 10 times what they should be, endless queries from hell, massive administrative time to retrofit data to get it physically close to the user, and the like are costing you.

I would hope that, now that people are finally recognizing location problems, we can begin to implement real enterprise information architectures.  At the least, your action item, vendor or user, should be to start considering it in earnest.

Wednesday, October 2, 2013

Composite Software, Cisco, and the Potential of Web Data in Motion

The long-term customer benefits of the acquisition of Composite Software, one of the pre-eminent data virtualization vendors, by Cisco, long known primarily for its communications prowess, aren’t obvious at first sight – but I believe that in one area, there is indeed major potential for highly useful new technology. Specifically, I believe that Cisco is well positioned to use Composite Software to handle event-driven processing of “data in motion” over the Web.

Why should this matter to the average IT person? Let’s start with the fact that enormous amounts of data (Big Data, especially social-media data) pass between smartphone/tablet/computer and computer on a minute-by-minute and second-by-second basis on the Internet – effectively, outside of corporate boundaries and firewalls. This data is typically user data; unlike much of corporate data, it is semi-structured (text) or unstructured (graphics, audio, video, pictures) or “mixed”. In fact, the key to this data is that it is not only unusually large-chunk but also unusually variant in type: what passes over the Internet at any one time is not only a mix of images and text, but also a mix that changes from second to second.

Up to now, customers have been content with an arrangement in which much of the data eventually winds up in huge repositories in large server farms at public cloud provider facilities. In turn, enterprises dip into these repositories via Hadoop or mass downloads. The inevitable delays in data access inherent in such arrangements are seen as much less important than the improvements in social-data and Big-Data access that such an architecture provides.

Now suppose that we could add an “event processor” to “strain”, redirect, and preliminarily interpret this data well before it arrives at a repository, much less before the remote, over-stressed repository finally delivers the data to the enterprise. It would not replace the public cloud repository; but it would provide a clear alternative for a wide swath of cases with far superior information delivery speed.
This would be especially valuable for what I have called “sensor” data. This is the mass of smartphone pictures and video that reports a news event, or the satellite and GPS data that captures the locations and movement of people and packages in real time. From this, the event processor could distil and deliver alerts of risks and buying-pattern changes, daily or hourly shifts in the rhythms of commerce and customer preferences beyond those typically visible to the enterprise itself, and opportunities available to fast responders.
Does such an event processor exist now? No, and that’s the point. To fulfill its full potential, that event processor would need to be (1) just about ubiquitous, (2) highly performant, and (3) able to analyze disparate data effectively. No event processor out there truly meets any but the second of these requirements.
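To give a feel for what “strain, redirect, and preliminarily interpret” might mean in code, here is a toy filter; the event fields and rules are made up, and this is not a description of any Cisco or Composite product:

```python
# Toy "event strainer": run next to the data path, flag a small subset of
# in-flight records as alerts, and let everything else continue to the repository.

def strain(events, alert_rules):
    alerts, pass_through = [], []
    for event in events:
        matched = [name for name, rule in alert_rules.items() if rule(event)]
        (alerts if matched else pass_through).append((event, matched))
    return alerts, pass_through

rules = {
    "crowd_forming": lambda e: e.get("type") == "gps" and e.get("density", 0) > 500,
    "buying_spike":  lambda e: e.get("type") == "purchase" and e.get("rate", 0) > 3.0,
}
stream = [
    {"type": "gps", "density": 820, "where": "stadium"},
    {"type": "purchase", "rate": 1.1, "sku": "umbrella"},
]
alerts, rest = strain(stream, rules)
print(alerts)      # flagged for immediate delivery to the enterprise
print(len(rest))   # the bulk still flows on to the cloud repository
```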

"It … Could … Be … Done!”

Those old enough to remember will recognize these words from Mel Brooks’ movie Young Frankenstein, when the hero is shocked to realize that his father’s work was not, in fact, as he had put it, “complete doo-doo.” My point in echoing them here is to say that, in fact, the combination of Cisco and Composite Software is surprisingly close to fulfilling all of the requirements cited above.

Let’s start with “just about ubiquitous.” As regards “data in motion”, Cisco with its routers fills the bill as well as anyone. Of course, event processors on each router would need to be coordinated (that is, one would prefer not to send spurious alerts when data flowing over an alternate route and reunited at the destination might cause us to say “oops, never mind”). However, both Cisco and Composite Software have a great deal of experience in handling in a coordinated fashion, in parallel, multiple streams of data. We do not have to achieve data integrity across millions of routers, merely local coordination centers that adequately combine the data into a composite picture (pardon the pun) – which Composite Software is well experienced in doing.

How about “able to analyze disparate data effectively”? Here is where Composite Software really shines, with its decade-plus of fine-tuning of cross-data-type distributed data analysis. Better than most if not all conventional databases, Composite Server provides a “database veneer” that offers transparent performance optimization of distributed data access over all the data types prevalent both in the enterprise and on the Internet.

It is, indeed, the “highly performant” criterion where Composite Software plus Cisco is most questionable right now. Neither Composite Server nor Cisco’s pre-existing software was designed to handle event processing as we know it today. However, it could be said that today’s event processors conceptually could be split into two parts: (a) a pre-processor that makes initial decisions that don’t require much cross-data analysis, and (b) a conventional database that uses a “cache” data store (still in almost real time) for deeper analysis before the final action is taken. Composite Server probably can handle (b), with some cross-router or cross-machine processing thrown in, while a conventional event processor could be inserted to handle (a).

The IT Bottom Line: Making Do Until Nirvana

Is there nothing that can be done, then, except wait and hope that Composite Software and Cisco recognize the opportunity and fill in the pieces, or some other vendor spends the time to reproduce what they already have? Actually, I think there may be. It’s not the long-term solution; but it mimics to some extent a ubiquitous Web event processor.

I am talking about setting up Composite Software as a front end rather than a back end to public cloud provider databases. A simple multiplexer could “strain” and feed data to multiple data stores using multiple conventional operational databases for the constant stream of updates, as well as to backend Hadoop/MapReduce file systems and traditional databases. Composite Server would then carry out queries across these “data type specialists”, in much the same way it operates now. The main difference between this approach and what is happening now is that Composite Software will get a much smaller subset of provider data at the same time as the file system – and hence will at least deliver alerts on some key “sensor” data well ahead of the stressed-out Big-Data Hadoop data store.

My suggested action item for IT, therefore, is to start conceptualizing such a means of handling Web “data in motion,” and possibly to set up a Composite-Server testbed, to lead on to implementation of an interim solution. I would also appreciate it if IT would gently indicate to Cisco that they would find a full-fledged solution highly desirable. A Web “data in motion” event processor would be a Big new extension of Big Data, with Big benefits, and it seems to me that Composite Software and Cisco are best positioned to make such a solution available sooner rather than later.

It … could … be … done! Let’s … do … it … now!

Monday, September 9, 2013

All Us Drowned, Burned, Starved Rats

Recently I came across a purported map by National Geographic of what the world would look like if all the ice melted. They got some things wrong (they failed to note that higher ocean temperatures would expand the water, leading to about another 22 feet of rise beyond the 216 feet from land ice melting), and they failed to make clear that higher energy in storms means higher storm surges (more like 40-50 feet instead of 10-20), effectively pushing the real area affected by ocean salt water (read: it gets into the fresh-water supply, so you can’t live there) to close to 300 feet above today’s sea level. They did note that there would be a vast increase in deserts, and in places uninhabitable due to heat (according to Joe Romm’s citation of a study, there’s a 2% decrease in productivity for every degree above 77 F, meaning that when you get to 110 degrees, you can’t do much). Still, the general outlines are clear.

So here’s a brief summary, in no particular order:

• Florida, gone. Louisiana, gone. Mississippi, mostly gone.
• Major US cities on the coast from Boston to NYC to Charleston to Houston to LA to much of San Francisco (plus a big inland sea), gone.
• Greenland (they got this wrong), an archipelago about ½ its present size. Same for Antarctica.
• Holland, Belgium, Denmark, gone. Eastern England and London, northern Germany, Venice, gone.
• Caspian Sea doubles in size, connects with Black Sea.
• Habitable Libya on the seacoast becomes an island.
• Most of eastern China from Beijing to Shanghai goes underwater, and so does most of Southern Vietnam and Cambodia.
• Bangladesh and much of Pakistan (the Indus valley) are beneath the waves.
• Australia loses much of the coastal strip where four of five Aussies now live, and there’s a new big inland sea or maybe an inlet of the ocean in the center.
• It goes without saying that most of the Pacific islands are gone, gone, gone.
• And, of course, the world’s deltas, where a large percentage of food is grown today, are gone.

All this is for our descendants – like, 2200, not 2030. But we are locking in its inevitability right now. And, as noted, we will also be burned by the heat and starved by the destruction of crop-growing land.

Not a pretty picture.

Monday, August 26, 2013

A Lousy Job Of Embalming Microsoft


So now that Steve Ballmer has announced he’s leaving, the usual suspects – and some on-high commentators who also seem to be missing the mark – have gathered to criticize Microsoft via Ballmer.  I am no unthinking fan of Microsoft, and I have had my share of strong disagreements with their strategies and solutions over the years.  However, I have found myself in the last decade defending them frequently, simply because their critics seemed to fundamentally misunderstand them.  And now, here we go again.
Let me start with a quick thumbnail sketch of Microsoft’s history and nature as I see it – not at all the typical view presented these days.  First of all, Microsoft started by finding itself (by virtue of a lucky contract with IBM, then dominant in PC hardware) with an incipient monopoly in operating system software.  The key word here is software – that uniquely flexible foundation for building “meta-solutions” that adapt more rapidly than anything else as technology and customer needs evolve.  Microsoft, even then, had two valuable characteristics:  it could create foundational software like Microsoft Word that was “orthogonal” – relatively user-friendly because the commands it offered were compact and powerful – and it could keep at a solution until it got it right.  As a result, Microsoft was able to use its favored position in operating systems to create an effective monopoly in office-system suites, especially when the Windows version of its operating system led to the success of the mouse-driven “graphical user interface.”  All of this pretty much happened in the 1980s.
Also in the 1980s, Microsoft took the tack of releasing “rough and ready” versions of the operating system that were open to developers, and it took great pains to nurture those developers.  Here, I am talking about the ecosystem of developers working on their own time outside their employer, or in small companies dedicated to producing Windows versions of applications.  Apple, by contrast, wanted to control the “quality” of the operating system and sell hardware, so Apple’s share rapidly shrank to the entertainment and education markets, while Microsoft took a larger and larger software share of PCs that were now becoming ubiquitous and the machine of choice as cheap scale-out servers.  And so, by the end of the 1990s, Microsoft had adapted enough to the Internet via Internet Explorer to ensure a strong presence in scale-out server farms (the ancestors of the cloud), in businesses, and in the PCs typically used to access the Internet.
What followed in the decade of the 2000s was not so much the encroachment of competition from the likes of Google and Apple as the saturation of existing markets.  Under Ballmer, Microsoft adapted to federal regulation by making a sort of peace with its enemies, from Sun to Apple to IBM, so that Microsoft joined the prevailing “coopetition” approach to the market.  Far more importantly, Microsoft got business customers into its “DNA” through hires and investment, so that it was able to balance its consumer markets with a secondary but still very important business market relatively impervious to saturation of the consumer market.
Over the last five years or so, however, Microsoft’s rapid revenue growth has finally stopped seeming to defy gravity.  Microsoft could see the handwriting on the wall, and has been trying to establish a strong position – necessarily, via innovation – in new markets.  It has had some success in innovating, and therefore in reaping a strong market position, with the Kinect video-game controller, and Windows 8 does represent some needed innovation in its present markets.  However, this is too little to move revenues dramatically in what is a very-large-revenue company.  And it’s not clear what will change that.
What’s blocking Microsoft is fundamentally the hold of Apple and Google on developers of smartphone apps.  It’s not that Microsoft has lost its own developers, but if it wants a good revenue jump it needs an equal or greater mass of developers in some other market where it is a credible competitor.  And its recent moves to encourage app development suggest that it’s not doing enough to create a relatively friendly app-dev environment for mobile smartphones/tablets.

The Usual Misunderstandings


You see very little of this in today’s commentary on Steve Ballmer.  There are notes about how he didn’t understand Microsoft’s friction with US and EU governments over its monopolies – actually, that has died down, precisely because he did enough to defang much of it.  There’s the “dominance of iOS on mobile” – granted, Apple’s laptops have also made major inroads in the PC/laptop market, but that still leaves Windows with the overwhelming majority of sales and existing boxes, and, as noted, the non-smartphone/tablet market is saturated but not in major decline, especially if you look at IBM Windows-PC sales compared to Unix/Linux.  There’s “it’s all about Apple design”, while a cursory inspection reveals that the Samsung Galaxy, based on Google Android, is more than competitive with Apple design these days, and Samsung is seeing the results in its sales.  I’ve already noted the exceptions to the “Microsoft can’t innovate” mantra.
Then there are the more global economic commentators – now that they’ve finally realized that computing software matters, although they’re still underestimating the importance of software compared to hardware.  Prof. Krugman posited that Microsoft might do OK despite Apple’s seemingly obvious and eternal dominance, because it is now focused on business.  Um, no, Microsoft’s revenues and employees are still not primarily focused on business.  Microsoft will do OK because its present markets are saturated, not defunct, because Apple’s relative “innovative” status is going away where it counts, in the actual products, and because Apple is not doing enough for its own developer ecosystem.  Hence, as the need for larger-form-factor products persists – driven in part by the enduring popularity of longer blog posts (like this!) – Microsoft will get its share via not-dead-yet laptops and their hybrid variants.
Or, we can cite Alex Tabarrok as simply noting that Microsoft’s stock jumped when Ballmer announced his resignation.  Sorry, but the stock market is notorious for having predicted nine of the last five recessions, as the old saying goes.
Above all, commentators continue to misunderstand the ongoing global economic contribution of software to such factors as productivity and “technological” advances.  Microsoft’s commitment to a developer ecosystem and its “orthogonal” designs (in the best cases) are simply examples of how the software industry enables a welter of new applications that actually do, once the dust finally settles, produce fundamental innovation that leads to productivity improvement as measured by our economic statistics. 
Resignation?  Who cares?  I don’t.  Microsoft?  Who cares any more?  I do; and you should.  For all the reasons no one seems to be talking about.

Wednesday, August 14, 2013

We Have Blamed the Vendors and IT; Is It Now Time to Blame the Business Types?


Reflecting on the recent financial reports of the giants of computing infrastructure – companies like IBM, HP, Oracle, and Microsoft – I am struck by the continuation of a troubling trend:  the technology and its usefulness are moving ahead as fast as, if not faster than, ever before; IT has made some striking adjustments; but the revenues to vendors from this, and therefore IT spend, are flat if not falling. Why?
Let’s see.  In databases, Oracle and IBM are almost flat, despite (at least in IBM’s case) two technologies in two years that potentially deliver performance and price-performance gains well ahead of Moore’s Law.  I find plenty to criticize in Windows 8, but its implementation with regard to business needs, in Windows and Office usage, should deliver benefits comparable to those of the introduction of the mouse-based user interface – and Microsoft enterprise revenues are flat.  Cloud technology is not replacing, but rather complementing, in-house architectures, and therefore should require, initially, more rather than less total spending – and yet, increases in cloud spending are instead paired with overall flat or lower user computing spend.  No, I don’t buy the “public cloud saves money” rationale – the benefits come later rather than sooner, and involve mainly flexibility and, to a much lesser extent, multi-tenancy and grid savings, especially when many users are employing multiple cloud providers that must be coordinated.  And the IBM and Oracle/SPARC data suggest the biggest hardware decline is in Unix/Linux hardware, the darling of the cloud set, not in mainframes or PCs.
In the past, a typical explanation for this has been the need for the business to cut back IT spend in response to decreasing profits.  Of course, that did not prevent IT and computing from becoming a larger and larger part of overall business spending during the 1980s and 1990s.  In the early 2000s, IT spend actually decreased proportionally to the cost-cutting of the rest of the business – and yet, despite an initially tepid overall business recovery, computing spending from about 2003 to 2008 rebounded and grew quite briskly. So this pattern of user computing spending is unprecedented in three ways:  its length, its unremitting focus on cost-cutting apparently comparable in size to the rest of the business, and its seeming inability to reflect major technology advances with a good claim to deliver major cost or other benefits to the rest of the business.

The Blame Game


In the past, I have found, when things go wrong in computing spending the first reaction is to blame the vendors.  In this case, the obvious critique is that they have failed to communicate the benefits of the technology, cost and otherwise.  Except that, as I can attest from my experience as an analyst, both the technology advances and the communications from the vendors are as good as, if not in many cases better than, those of most or all of the past.  Big Data, for example, is not pure hyperbole, and vendors have done a reasonable job by traditional standards of highlighting the ability of targeted Big-Data analytics to explain and bind the customer as never before, in a cost-effective way. 
OK, then the next line of analysis is to blame IT, typically for failing to align itself with the business’ strategy, or (when times are tough) to identify ways of lifting the dead hand of older systems and other seemingly outdated costs.  To this, I have one response:  agile IT.  As never before, IT is aligning itself with agile development, by supporting a software lifecycle that integrates operational feedback with development, and by encouraging automation of functions that provides a basis for rapid response and for identification of architectural problems and opportunities at the administrator level.  As my studies have shown, these, plus the increasingly agile nature of the software that IT makes available to the business, translate into major cost-cutting and major benefits beyond what traditional IT approaches can deliver.  And as for aligning itself with business strategy, IT is well aware of Big Data as a hot topic, hence the sudden and somewhat odd demand for Hadoop experts.  No, whatever IT may have done in the past, it seems clear that it has upped its game.
OK, then, who does that leave?  And yet, how can we blame the business types?  Aren’t these the same folk who drove those increases in IT percentage of spending over the 1980s and 1990s?  Aren’t these the most receptive listeners when we talk about the power of embedded analytics to improve business processes and the importance of analyzing the “customer of one”?

There’s Something Not Going On Here …


To understand why there might be cause for concern and, yes, for blame in business strategy – driven by the CEO, the CFO, and their staff – let’s look in very broad-brush terms at what has been going on, not just for the last five years, but for the last three decades.  Here are a few of the highlights, as suggested by various studies:

1. Advances in productivity have been accruing 90% to the CEO – ½ in his income as CEO, and ½ in his increases in wealth as an investor.  While this still represents a fairly small percentage of overall expense, it does unnecessarily increase the focus on cutting costs for all except the CEO.

2. Hedge funds that sometimes act as turnaround artists or “shadow banks” have taken a good 90% or so of the funds’ investment profits for themselves, and thus receive compensation not just in the tens of millions of dollars but, at the top, in billions of dollars, despite the fact that they deliver less to the investor than an index fund in most if not all cases.  While it is difficult to see direct effects of this “fleecing of the rich” on corporate strategies, one significant effect is to increase fleeced CEOs’ focus on growing profits uber alles, to at least get some return from their investments – with the Googles of the world the few happy exceptions.

3. The present long-lasting recession/stagnation has effectively removed perhaps 10% of the workforce from work over a long period of time.  This has the effect, since businesses often discriminate against the long-term underemployed, of creating a permanent gap between “potential” GNP and real GNP.  To put it another way, even if the economy grows at a reasonable pace, it will still be much smaller than it should be.  That means smaller markets and, again, more relative emphasis on cost-cutting to achieve profit growth rather than on growing revenues to grow profits.
There are three concerns about this excessive focus on cost-cutting.  The first, which probably is not serious right now but is far more troubling than it was five years ago, is that cost-cutting eventually runs out as a strategy if revenues continue to be flat (and, don’t forget, revenues are pretty flat for the bulk of businesses, hence the 1% growth in GNP over the first part of this year).  The second concern is that this cost-cutting, applied economy-wide (as it seems to be), reinforces and cements smaller markets by eliminating the part of the market funded by the now underemployed.  Ordinarily, recessions are too short for this to happen; but it seems to be happening now, as new jobs continue to make no dent in the workforce/working-age-population ratio.
The third concern, which I for one find the most troubling, is the possibility that reflexive cost-cutting becomes the answer to every situation, the reflex of the business.  If that is true, it would suggest that businesses are blowing opportunities for savvy increases in computing technology spend because it’s all about cost-cutting, not strategic investment. 

Blame or Not, What Might Business Strategists Do Better?


First, I would suggest that businesses set up a long-term plan and process for using analytics to better understand and improve the relationship with the customer.  This means acquiring data virtualization software and the like, to allow aggressive searching out and combination of all the new customer information constantly being created.  It also means buying the new infrastructure tools (database and information architecture) that can scale to handle aggregation of social-media data on the Web and in-house customer-experience data. 
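
To illustrate the kind of thing I mean, here is a toy sketch in Python of the “virtual” join such data virtualization software performs; the table names, feed, and data are all made up for illustration (no real product’s API), and the point is simply that in-house customer-experience data and Web social-media data are combined into one customer view on demand rather than first being copied into a single warehouse.

    # Hypothetical sketch of a federated "customer view": one in-house source,
    # one Web-derived source, joined on demand without a prior bulk copy.
    import sqlite3

    # In-house customer-experience data (a toy in-memory table standing in for the CRM).
    crm = sqlite3.connect(":memory:")
    crm.execute("CREATE TABLE support_cases (customer_id TEXT, sentiment REAL)")
    crm.execute("INSERT INTO support_cases VALUES ('c-01', -0.4), ('c-02', 0.7)")

    # Social-media mentions pulled from the Web (a toy list standing in for a live feed).
    social_mentions = [
        {"customer_id": "c-01", "mentions": 12},
        {"customer_id": "c-02", "mentions": 3},
    ]

    # The "virtual" join: build the combined view only when it is asked for.
    support = dict(crm.execute("SELECT customer_id, sentiment FROM support_cases"))
    combined = [
        {"customer_id": m["customer_id"],
         "mentions": m["mentions"],
         "support_sentiment": support.get(m["customer_id"])}
        for m in social_mentions
    ]
    print(combined)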
Second, I would argue (not that I expect many businesses to listen) that in order to grow revenues as well as profits (and probably grow profits faster), business types need to acquire and foster use of agile tools and processes – such as agile marketing, with its emphasis on data-driven understanding of the customer and prospect rather than the anecdotal, top-down opinion that has been all too common in the past.  As I noted in a recent blog post, an agile business is a possibility, if the CEO and his lieutenants really commit to it – and it typically means not only the CEO using a Scrum-type planning process, but also constant modification and constant feedback up and down the organization about plans.  I should also note that an agile process specifically builds in slack for learning and review, and yet (according to my studies) it produces much better cost-saving and profit results than a “cram as much work as possible in, be as cost-efficient as possible” approach, for the CEO as well as the developer.
And then there is the really controversial stuff.  Sustainability efforts are, by and large, underperforming; a recent article on www.thinkprogress.com confirms my impression that gains in the US in carbon emissions are effectively coming from offshoring the problem.  Businesses need to acquire effective global carbon-tracking software, and use it.  Businesses need to join together to push shared quality regulations that will move everyone towards new computer and software technology that is more productive, rather than clinging to the infrastructure of the past that in the long run costs more – the electrical/energy grid vs. software-coordinated regional combined solar/wind, for example – and rather than enabling the block-all-regulation political excesses of the U.S. Chamber of Commerce and its ilk.
I’m not saying that all this is doable, or that the case for questioning the approach of the overall business to computing technology is clear.  I am saying that it’s time we recognize the seriousness of the overall problem of which computing spend is a part, and we stop blaming the usual suspects.  Five years is enough.  Thirty years is enough.  In the long run, a cost-cutting strategy is not enough – and the long run is beginning to arrive.