Monday, October 20, 2014

Data Virtualization Day 2: Looking Ahead


 

One of the most interesting questions raised during DV Day was asked of the audience by a speaker:  We have presented our vision of the future; now, what do you think would be a visionary new idea for using data virtualization?  Pretty much on the spot, I came up with two interrelated ideas:
1. Metadata mining, and

2. Trend discovery via data virtualization’s data-discovery features.

Metadata Mining

As data mining spreads increasingly outside the enterprise, via mining of social-media and sensor data, the value of a global metadata repository such as those provided by Cisco/Composite and other DV vendors lies not merely in the ability to coordinate and correct the inconsistencies in internal data, but also in the ability to see the connections between internal and external data – for example, when these represent information about the same person or thing – and to track how the frequency of use of certain metadata changes over time.  For example, if less data is video and more is texting, that set of facts, surfaced by access counts for video and text metadata, tells us about our customers’ changes in media use – and does so immediately, while being based on actual behavior rather than self-report surveys.

I conjecture that the main value of such metadata mining in the long run will lie in one of two areas: (a) providing new ways to slice the data underlying the metadata, and (b) offering new, broader, and more flexible ways of aggregating the data.  I see (a) happening because while new applications for BI tend today to be generated by new product ideas or new customer trends as publicized on the Web, new data-mining insights from metadata come from actual behavior and information that can’t be easily fit into existing categories.  I see (b) occurring due to metadata’s natural function of fitting data into broad categories and detecting connections between nominally different sets of data, constantly and semi-automatically.
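To make the access-count example above a bit more concrete, here is a minimal sketch, in Python, of the kind of trend-spotting a metadata-mining tool might do over a DV repository’s usage log.  The log layout and the numbers are entirely invented for illustration; a real tool would read them from whatever usage-tracking store the DV product actually provides.

# Minimal sketch: spotting shifts in metadata use from access counts.
# The (month, category, access_count) rows below are invented for illustration;
# a real version would query the DV repository's own usage-tracking store.
from collections import defaultdict

usage_log = [
    ("2014-08", "video", 120000), ("2014-08", "text", 80000),
    ("2014-09", "video", 110000), ("2014-09", "text", 95000),
    ("2014-10", "video", 100000), ("2014-10", "text", 115000),
]

def month_over_month_change(log):
    """Return {category: [(month, pct_change_vs_previous_month), ...]}."""
    by_category = defaultdict(list)
    for month, category, count in sorted(log):
        by_category[category].append((month, count))
    trends = {}
    for category, series in by_category.items():
        trends[category] = [
            (cur_month, 100.0 * (cur_count - prev_count) / prev_count)
            for (prev_month, prev_count), (cur_month, cur_count) in zip(series, series[1:])
        ]
    return trends

for category, changes in sorted(month_over_month_change(usage_log).items()):
    print(category, changes)
# In this toy data, "text" access grows while "video" access shrinks: exactly the
# kind of behavioral shift the repository could surface immediately.

Nothing here is specific to any vendor; the point is simply that access counts plus timestamps are enough to see the shift, without asking customers anything.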

Trend Discovery

In a similar way, the data-discovery features of data virtualization tools could be used to detect new trends relevant to the business by spotting their “footprint” on the Web.  That is, data discovery can constantly monitor the Web for new types of metadata that don’t fit easily into the old categories, and surface them to data scientists, who can then alert the business to new types of customer or prospect behavior.
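To make the “surface what doesn’t fit” step a bit more concrete, here is a minimal sketch, again in Python, under the assumption that a data-discovery crawl hands us a stream of field or category names it has found on the Web.  The known-category list and the sample feed below are purely hypothetical.

# Minimal sketch: flag repeatedly seen metadata that matches no known category.
# KNOWN_CATEGORIES and the sample feed are hypothetical stand-ins for whatever a
# real data-discovery crawl would produce.
from collections import Counter

KNOWN_CATEGORIES = {"name", "email", "phone", "address", "purchase_history"}

def surface_unfamiliar_fields(discovered_fields, min_occurrences=3):
    """Return (field, count) pairs seen at least min_occurrences times that match
    no known category, most frequent first: candidates for a data scientist to review."""
    counts = Counter(field.lower() for field in discovered_fields)
    unfamiliar = [(field, n) for field, n in counts.items()
                  if field not in KNOWN_CATEGORIES and n >= min_occurrences]
    return sorted(unfamiliar, key=lambda pair: -pair[1])

# Toy feed: "wearable_heart_rate" shows up repeatedly and matches nothing we track,
# so it gets surfaced as a possible new kind of customer behavior.
feed = ["email", "wearable_heart_rate", "wearable_heart_rate", "phone",
        "wearable_heart_rate", "beacon_dwell_time", "email"]
print(surface_unfamiliar_fields(feed))   # -> [('wearable_heart_rate', 3)]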

Automating this kind of trend discovery is particularly attractive because a survey I did four years ago indicated that, at that time, business execs tended to learn about key new trends on the Web six months or more after they first arrived.  That’s a long time these days, in product-version lifecycle terms – and I see no clear signs of major speedup since then.
Anyway, just some visionary thoughts, for fun …

Thursday, October 16, 2014

Climate Change Odds and Ends

Recently, I have not been commenting on climate change news, mostly because either (a) developments and new research don’t change the overall picture significantly, or (b) because it wasn’t clear whether some developments did indeed constitute significant good or bad news.  At this point, I feel comfortable putting most of the news in category (a) – because for every piece of good news, there’s a piece of bad news.

So, in no particular order, here are some of the things worth noting as extending our knowledge but not changing the overall trend.

First, in the last two years the Arctic sea ice minimum has rebounded from the awful 2012 season, to a point somewhere around the 2007 (initial plunge) state.  This appears to be a result of a prolonged negative North Atlantic Oscillation (NAO) during melting season, which has allowed ice volume in the Central Arctic to recover in 2013 and then stay the same in 2014.  The NAO, in turn, seems to be related to the lack of an El Niño event – of which, more later.

This would be good news except that it is counterbalanced by two pieces of bad news.  Second (that is, the second piece of news), it turns out that southern seas were undermeasured until now and the amount of heat they have absorbed is much greater than originally thought.  Thus, the “momentum” towards global warming already in the system is greater than we thought.  Third, it now appears that 2014 may well be the warmest year on record, if present trends hold, easily beating 1998, that old standby of climate change deniers. 

The fourth piece of news again is positive:  the El Niño event that seemed very likely to have unprecedented force (and therefore to produce a giant leap in temperature, as in 1998) is now projected to be weak and to start between October and December.  Fifth, the bad news associated with that is that the weakening of El Niño is related to unprecedented warming in the waters off Asia that spawn El Niños – warming that again adds “momentum” to the underlying global warming trend.

The sixth piece of news is that the slowdown in US carbon emissions does now appear to be real, and related to a significantly faster uptake of solar energy than anticipated.  The seventh piece of news, which is bad, is that this has had zero effect on overall emissions, due especially to China’s increases in use of coal, plus the effective exporting of other countries’ emissions to China via outsourcing.

Finally, we may note an interesting argument recently put forward in Daily Kos:  fracking is not really economical, since each source is exhausted much more rapidly than in the case of conventional oil.  However, it seems clear that, even were this so, we are still in for several years of carbon pollution from that source, as natural-gas fracking companies seek to hide their losses via sweetheart deals with state and local governments that reduce the cost of drilling for new sources and of taking care of the side effects of the old, exhausted ones.

Meanwhile, the rate of carbon emissions rise apparently continues to increase.  Happy Halloween.

Data Virtualization Day: A New, Useful Way To Migrate Legacy Databases

Twenty-odd years ago I compared vendor databases to boat anchors:  they (and the business-critical apps depending on them) are very difficult to sunset, and they tend to act as a drag on needed performance improvements in everything data-related.  Moreover, the basic technologies for performing “database migration” seem to be much the same as they were fifteen years ago:  conversion of dependent apps to the new interface by SQL “veneers” plus modification, reverse engineering, or full rewrite of the underlying app, one by painful one.  You can imagine my happy surprise, therefore, when a customer at this year’s Cisco/Composite Software DV Day testified that they were beginning to use a new technique for legacy database migrations, one that should significantly speed up and improve the safety of these migrations.

Before I go into detail about this new technique, I should mention some of the other great and useful ideas that, as usual, surfaced at DV Day.  Among these I might number:

1. Use of data virtualization to combine traditional Big Data and the streaming events typical of the sensor-driven Web;
2. A “sandbox” to allow real-world testing of DV apps before rollout;
3. Additional hybrid-cloud support.

I hope to go into more detail on these in a later post.  The legacy migration idea, however, is worth its own post – indeed, worth attention from all large enterprises jaded by legacy-migration solutions that advance incrementally while the database-dependent app “legacy problem” grows apace.

The Previous State of the Art

Briefly, the problem of database migration – especially across database vendors – is typically much more a problem of migrating the applications written to take advantage of it than of migrating the data itself.  The typical such application, whether it be written using the commands of IBM IMS, CCA MODEL 204, DATACOM DB, Pervasive SQL, Sybase SQL Server, or any other such database, is 10-30 years old and not always that well documented, is partly written using database-specific commands or “tricks” in order to maximize performance, and does not share code with the tens or hundreds of other apps on the same database.  Therefore, migration time will often be projected at more than a year, no matter how much person-power one throws at it.

Broadly speaking, until now each such app migration has involved one of three approaches:

1. Rip and replace, in which the entire app is rewritten for the new database;
2. Emulation, in which an “old-database” veneer is placed over the new database, with rewrites only applied where this fails;
3. Reverse engineering, in which the function of the app is described and then a new, supposedly identical app is generated from this “model” for the new database.

None of these approaches is one-size-fits-all.  Rip and replace runs the risk of missing key functionality in the old application.  Emulation often produces performance problems, as the tricks used to maximize performance in the old database can have the opposite effect in the new one.  Because of the lack of documentation, reverse engineering may be impossible to do, and it also may miss key functionality – although it often pays for itself by “doing it right the first time” in the new environment.
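For readers who haven’t seen the emulation approach up close, here is a toy sketch of the “veneer” idea.  Everything in it is invented: the record-at-a-time “find”/“get_next” calls stand in for whatever the old database’s API actually looked like, and SQLite stands in for the new database.

# Toy illustration of the "emulation veneer" idea -- all names here are made up.
# An adapter exposes the old database's call style (a fictional record-at-a-time
# "find"/"get_next" interface) on top of a new SQL database, so old app code can
# run unchanged while hot spots are rewritten.
import sqlite3

class LegacyVeneer:
    """Fictional old-style record-at-a-time API, emulated on top of SQL."""
    def __init__(self, conn):
        self._conn = conn
        self._cursor = None
    def find(self, table, **criteria):
        # Old apps said e.g. find("CUSTOMER", REGION="NE"); we translate to SQL.
        where = " AND ".join("%s = ?" % key for key in criteria) or "1 = 1"
        self._cursor = self._conn.execute(
            "SELECT * FROM %s WHERE %s" % (table, where), tuple(criteria.values()))
    def get_next(self):
        # Record-at-a-time fetch, as the old interface expected.  This row-by-row
        # style is exactly the kind of "trick" that can hurt performance on a
        # set-oriented SQL database, as noted above.
        return self._cursor.fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (ID INTEGER, REGION TEXT)")
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)", [(1, "NE"), (2, "SW")])
db = LegacyVeneer(conn)
db.find("CUSTOMER", REGION="NE")
print(db.get_next())   # -> (1, 'NE')

The last comment in the sketch is the point of the exercise: the veneer keeps the old call style alive, and with it the old performance assumptions.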

The New Data Virtualization Approach

As Anderson of HSBC described it at DV Day, the data virtualization approach uses a DV tool both as an integral part of a “sandbox” for apps being migrated and as a “recorder” of transactions fired at the old database, with the recorded stream serving as a test scenario for the new app.  Separately, none of these “innovations” is new; it is the combination that makes the approach novel and more useful.  (A rough sketch of the record-and-replay idea appears after the list below.)

Specifically, I see the “data virtualization” legacy-migration approach as new in several ways:

· Data virtualization already creates database portability for apps, if one writes all apps to its “veneer” API.  The new approach allows the migrating app to join this ultra-portable crowd – and, don’t forget, DV has had almost 15 years of “embedded” experience in providing such a common interface to all sorts of data types and data-management interfaces.
· The new approach allows a more flexible, staged approach to migrating hundreds of apps.  That is, the DV tool semi-automates the process of creating a performance-aware emulation, and the test scenarios then allow rollout when they indicate the app is ready for prime time, rather than when a separate full-scale test is run with fingers crossed.
· The DV-tool process of “metadata discovery” means that migration often comes with additional knowledge – in effect, better documentation – of the apps.
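As promised above, here is a rough sketch of the record-and-replay idea.  To be clear, this is not a description of Cisco/Composite’s actual mechanism; it is a minimal illustration, using SQLite as a stand-in for both the old and the new databases, of how recording the workload against the old database yields a ready-made test scenario for the new one.

# Rough sketch of the record-and-replay idea; NOT any vendor's actual mechanism.
# A thin wrapper logs each statement issued against the old database; the log is
# then replayed against the new database and the result sets compared before the
# app is cut over.  SQLite stands in for both databases purely for illustration.
import sqlite3

class RecordingCursor:
    """Wraps a DB-API cursor, recording every statement and its parameters."""
    def __init__(self, cursor, log):
        self._cursor, self._log = cursor, log
    def execute(self, statement, params=()):
        self._log.append((statement, params))
        return self._cursor.execute(statement, params)
    def fetchall(self):
        return self._cursor.fetchall()

def replay(log, new_conn):
    """Re-run the recorded workload against the new database; return its result sets."""
    results = []
    for statement, params in log:
        cursor = new_conn.cursor()
        cursor.execute(statement, params)
        results.append(cursor.fetchall())
    return results

# Toy usage: record a query against the "old" database, replay it against the
# "new" one, and check that the answers match.
old, new = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
for conn in (old, new):
    conn.execute("CREATE TABLE accounts (id INTEGER, balance REAL)")
    conn.execute("INSERT INTO accounts VALUES (1, 100.0)")
log = []
recording = RecordingCursor(old.cursor(), log)
recording.execute("SELECT balance FROM accounts WHERE id = ?", (1,))
old_results = recording.fetchall()
assert replay(log, new) == [old_results]   # same answers from old and new: safer cutover

Presumably, in the real approach the recording and replay go through the DV layer’s common interface rather than raw connections, and the recorded traffic becomes the test scenario mentioned above; the sketch only shows the shape of the process.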

The net of these novelties, I conjecture, is faster (more parallel and more automated) migration of legacy apps, with better performance (counterintuitively, using DV can actually improve app transactional performance), with better future portability, better documentation, and better ability to share data with other apps in the enterprise’s BI-app portfolio via a global metadata repository.  Not bad at all.

The Net-Net


Is short and sweet.  I urge all large enterprises with significant legacy-database concerns to consider this new DV approach and kick the tires.  It is early days yet, but its value-add can hardly help but be significant.

Wednesday, October 1, 2014

How Software Makes Nonsense of Ayn Rand

It amazes me, nowadays, just how many political and business figures – from Rand Paul in the U.S. Congress to Alan Greenspan, the former Fed chairman – cite Ayn Rand as an inspiration.  I first read The Fountainhead more than 50 years ago, and was interested until the climactic trial, in which the prosecution gave a speech unlike any real-life prosecutorial speech I have ever heard.  I have since heard excerpts from the climactic speech in Atlas Shrugged, which suggest that Ayn Rand’s themes in the two books are part of an overall effort to define “libertarianism” as a way of distinguishing the “makers” from the “takers” in a typical society.

Thus, for example, in The Fountainhead the “maker” is an architect who is put on trial because he destroys the corrupted version the contractor (apparently, local or state government) has made of his quality work.  In Atlas Shrugged, apparently, similar indignities drive Dagny Taggart and like-minded individuals to withhold from the “takers” and their governments the “makers’” technical ability – the ability needed to run modern machinery such as a radio – and to withdraw from society until it realizes its need for them.

Out of such writings, it seems, are political and economic philosophies made.  Let us also note that at the time Ayn Rand wrote, it was still barely possible to see the work world and national economies much as Rand saw them, run by inventors and hands-on builders who were also heads of large enterprises, like Henry Ford and Thomas Edison.  Even in the post-war period, with the advent of the veteran CEO who had learned how to get things done from hands-on training in the Army or Navy, there were possible examples of “makers” to cite as the ones who really made the world work, and made the world a better place.

Today, however, more and more of our products, our solutions, our processes, and our lives are infused with computer software.  And as a result, it is impossible to make Ayn Rand’s vision a reality.  In fact, today, computer software makes what Ayn Rand wrote complete nonsense.

Joining the Few Who Are Far Too Many

 An old British comedy routine spoofs WWII newsreels by intoning, “People flocked to join the few.”

“Please, sir, I want to join the few.”

“I’m sorry, there are far too many.”

That is the first bit of nonsense in Ayn Rand’s philosophy:  that relatively few “makers”, like Atlas lifting the world on his shoulders, run businesses, make things work, and keep progress moving forward.  No:  take any product, from a smartphone to an oil rig, and very few people actually build or design the new hardware.  On the other hand, many programmers over the years have built the software that connects the smartphone to the Web and to others (just like radio!) or monitors the performance of the oil rig, reporting that performance to remote sites that look for promising well sites and stand ready to react in the case of disaster.  We need hardly add that a vanishingly small percentage of this software was made by executives.

In other words, over the last fifty years, programmers (and make no mistake, programmers are creative and do their own thing) have become more and more prevalent and have taken over more and more of what makes the world run and “progress.”  Take a vacation from making the world run?  How would you contact all those programmers, all over the world, much less get them to agree?  And where would they go to take a vacation until the world needs them?

That leads on to the second bit of nonsense from Ayn Rand:  That “makers” work harder, while “takers” seek to parasitically feed for their own benefit off “makers’” work.  Hedge fund executives work hard, and think creatively about investments; but do they work harder than anyone else?  Heck, no.  Programmers work just as long hours (as do, say, some police or construction workers, to cite a few examples), are just as creative, and their creativity has far more to do with how well the world runs than the creativity of the private equity firm. 

The key point is that if anyone is parasitically feeding off anyone, it’s the CEO or the program-trading hedge fund manager feeding off the programmer.  Moreover, if we remove the manager from the equation, it is relatively easy to substitute other wannabe rich folk, while if we remove all programmers, things would indeed slow down quite a bit.  And yet, by Ayn Rand’s definition, programmers are not “makers”, because they don’t know how the underlying hardware works, nor how to run a business.  In summary:  if programmers aren’t “makers”, then all those execs she was glorifying don’t work harder (and aren’t more creative or more impactful in moving the world forward) than programmer “takers”; and if programmers are “makers”, then – since execs are also, according to Rand, “makers” – the main people feeding off programmers to the tune of billions in cash are other “makers” (i.e., execs).

That leads to the third bit of nonsense:  that government of all stripes is on the side of the “takers.”  We have seen plenty of examples of “maker” programmers inside and outside of government – or, for that matter, hackers.  We should also note the “open source” movement, which seeks to ensure the free availability of software outside of both business and government, and which both government and business are now “parasiting.”  We could cite further examples ad nauseam; but this Randism, today, just doesn’t come close to being true.

It’s Okay, Honey, You Fulfill a Vital Function

So why don’t the Alan Greenspans and Rand Pauls of the world recognize how completely Ayn Rand fails to fit today’s software world?  My dark suspicion is that they want to believe that somehow they are doing something that makes the world run better, and that “the gummint” (due credit to Walt Kelly’s Pogo) is the faceless entity representing those who want to take away the well-earned money gained from hard, creative work.

I like the way Douglas Adams puts it in one of the Hitchhiker books.  One of his characters, on the run, bumps into a prostitute in a back alley.  Only she isn’t selling sex; she’s selling making businessmen feel better about themselves.  “It’s OK, honey,” she croons to a possibly depressed businessman john, “You’re really needed; you fulfill a vital function; our economy wouldn’t work without you.”

However, because of software, CEOs and investment firms are less and less vital to the functioning of an economy.  Yes, as we’ve seen, their “creativity” can screw things up; but it is less and less important to moving the technology forward.  Yes, they work hard and can be creative; but they simply don’t have the positive impact that they used to.  And so, I wonder if much of the buzz, past and present – about really earning outsized salaries and stock options, about sweetheart deals with lobbyists after you leave the political arena, about paeans to the “free market” and “laissez-faire” (which, according to economic theory, don’t exist in the real world, by definition, because real-world markets don’t have perfect information equally shared), and the like – is the output just as much of the prostitute seeking some share of the billionaire’s billions as of the income-unequal CEO himself or herself.

And so, thanks to software, Ayn Rand’s philosophical justification has become real-world nonsense.  In fact, I wonder if what’s going on isn’t the ultimate irony: to see Rand used to justify “taking” by non-governmental CEOs and the like?

But then, I still haven’t forgiven her for that prosecutorial speech.   I hope she’s being forced to code in FORTRAN down in the nether regions.