Like some Composite Software users represented at their annual “Data Virtualization Day” today, my concerns about the future of Composite as a Cisco acquisition had not been completely allayed before the conference – and yet, by the end, I can say that my original concerns have been replaced by a hope that Composite Software will deliver user benefits over the next few years well beyond what I had anticipated from the company going it alone. In fact – and this I really did not expect – I believe that some of these benefits will lie well outside of the traditional turf of data virtualization.
Of course, with hope comes new concerns. Specifically, Composite Software’s roadmap now involves an ambitious expansion of their solutions, and therefore of product-development tasks. With Composite Software’s track record and intellectual capital, I have little doubt that these tasks will be accomplished; with new folks to be brought on board, I am not sure how long full implementation will take. And, as an analyst greedy on behalf of users, I would argue that an implementation of most of the goals set forth, within the next two years, would be far more valuable to IT. But this is far more of a nit than a reason to question the future of data virtualization without the impetus of its typical technology leader.
My change of mind happened with a speech by Jim Green, long-time technology driver at Composite and now General Manager of his own Business Unit within Cisco. It was, imho, the best speech for breadth and accuracy of vision that I have heard him give. Enough of the lead-in; let’s go on to my analysis of what I think it all means.
Business As Unusual Plus Three New Directions
When I say “business as unusual” I mean that many of the upcoming products and aims that Jim or others have mentioned fall firmly in the category of extensions of already evident technology improvements – e.g., continued performance fine-tuning, and support for more Web use cases such as those involving Hadoop. I don’t want to call this “business as usual”, because I don’t see too many other infrastructure-software companies out there that continue to anticipate as well as reactively fulfill the expressed needs of users dealing with Web Big Data. Hence, what seems usual Composite-Software practice strikes me as unusual for many other companies. And so, when Jim Green talks about extending data-virtualization support from the cloud to “global” situations, I see business as unusual.
Beyond this, I hear three major new directions:
- Software/app-driven transactional network optimization;
- The “virtual data sandbox”; and
- “composite clouds”.
Let’s take each in turn.
Software/app-driven transactional network optimization
It has been obvious that a driver of the acquisition was the hope on the part of both Composite Software and Cisco that Composite could use Cisco’s network dominance to do Good Stuff. The questions were, specifically what Good Stuff, and how can it be implemented effectively without breaking Composite’s “we handle any data from anyone in an open way” model.
Here’s the way I read Jim Green’s answer to What Good Stuff? As he pointed out, the typical Composite cross-database query takes up 90% of its time passing data back and forth over the network – and we should note that Composite has done quite a bit of performance optimization over the years via “driving querying to the best vendor database instance” and thus minimizing data transmission. The answer, he suggested, was to surface the network’s decisions on data routing and prioritization, and allow software to drive those scheduling decisions – specifically, software that is deciding routing/prioritization based on transactional optimization, not on a snapshot of an array of heterogeneous packet transmission demands. To put it another way, your app uses software to demand results of a query, Composite software tells the network the prioritization of the transmissions involved in the resulting transactions from you and other users, and Cisco aids the Composite software in this optimization by telling it what the state of the network is and what the pros and cons of various routes are.
The answer to avoiding breaking Composite’s open stance is, apparently, to use Cisco’s open network software and protocols. As for implementation, it appears that having Cisco surface the network data via its router and other network software (as other networking vendors can do as well), while Composite embeds both transactional network optimization and support for app-developer network optimization in its developer-facing software, is a straightforward way to do the job.
What is relatively straightforward in implementation should not obscure a fundamentally fairly novel approach to network optimization. As in the storage area, it used to be the job of the bottom-of-the-stack distributed devices to optimize network performance. If we now give the top-of-the-stack applications the power to determine priorities, we are (a) drawing a much more direct line between corporate user needs and network operation, and (b) squarely facing the need to load-balance network usage between competing applications. It’s not just a data-virtualization optimization; it’s a change (and a very beneficial one) in overall administrative mindset and network architecture, useful well beyond the traditional sphere of data virtualization software.
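To make the mindset shift concrete, here is a minimal sketch of what app-driven transactional prioritization might look like. Everything in it is hypothetical – the class names, the link-state fields, and the greedy scheduling rule are my own illustration, not any actual Composite or Cisco API; the point is simply that the application’s business priority, not the device, drives which transmission gets the best path.

```python
from dataclasses import dataclass

@dataclass
class Link:
    """A hypothetical network path whose state the router layer surfaces."""
    name: str
    latency_ms: float
    utilization: float  # 0.0 (idle) .. 1.0 (saturated)

@dataclass
class Transfer:
    """One data transmission within a cross-database query."""
    query_id: str
    bytes_needed: int
    business_priority: int  # higher = more important to the app

def schedule(transfers, links):
    """Serve higher business priorities first, assigning each transfer
    to the least-loaded, lowest-latency link -- the app, not the
    bottom-of-the-stack device, decides what matters most."""
    plan = []
    for t in sorted(transfers, key=lambda t: -t.business_priority):
        best = min(links, key=lambda l: (l.utilization, l.latency_ms))
        plan.append((t.query_id, best.name))
        # crude load model: account for the bytes we just placed
        best.utilization = min(1.0, best.utilization + t.bytes_needed / 1e9)
    return plan

links = [Link("path-a", 12.0, 0.2), Link("path-b", 5.0, 0.7)]
transfers = [
    Transfer("q1", 800_000_000, business_priority=9),
    Transfer("q2", 200_000_000, business_priority=3),
]
plan = schedule(transfers, links)
print(plan)
```

Note how the load-balancing concern from point (b) above falls out naturally: once the high-priority transfer saturates one path, the scheduler steers the next transfer elsewhere.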
The “Virtual Data Sandbox”
Jim described a Collage product that allows self-service BI users to create their own spaces in which to carry out queries, and administrators to support them. The idea is to isolate the data with which the ad-hoc BI end user is playing, where appropriate, by copying it elsewhere, while still allowing self-service-user queries on operational databases and data warehouses where they are not too impactful. More broadly, the aim is to semi-automatically set up a “virtual data sandbox” in which the data analyst can play, allowing IT to focus on being “data curators” or managers rather than constantly putting out unexpected self-service-user “query from hell” fires.
My comment from the peanut gallery is that this, like the software-driven transactional optimization described in the previous section, will take Composite well beyond its traditional data-virtualization turf, and that will turn out to be good for both Composite/Cisco and the end user. Necessarily, evolving Collage will mean supporting more ad-hoc, more exploratory BI – a business-user app rather than an IT infrastructure solution. This should mean such features as the “virtual metadata sandbox”, in which the analyst not only searches for answers to initial questions but then explores what new data types might be available for further exploration – without the need for administrator hand-holding, and allowing administrators to do role-based view limitation semi-automatically. Meanwhile, Composite and Cisco will be talking more directly with the ultimate end user of their software and hardware, rather than an endless series of IT and business mediators.
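The routing decision at the heart of the sandbox idea can be sketched in a few lines. This is purely illustrative – the row-count threshold, the function name, and the stand-in for “copy the data elsewhere” are my assumptions, not how Collage actually works – but it captures the semi-automatic choice between hitting the live source and confining a heavy query to an isolated copy.

```python
def route_query(table, estimated_rows, sandboxed, threshold=1_000_000):
    """Decide which copy of the data a self-service query should hit.
    Cheap queries go to the live source; expensive 'queries from hell'
    are confined to a sandbox copy (created on first use)."""
    if estimated_rows <= threshold:
        return ("live", table)
    if table not in sandboxed:
        sandboxed.add(table)  # stand-in for copying the data elsewhere
    return ("sandbox", table)

sandboxed = set()
r1 = route_query("orders", 50_000, sandboxed)     # light query: live source
r2 = route_query("orders", 5_000_000, sandboxed)  # heavy query: sandboxed
print(r1, r2)
```

A real implementation would of course lean on query-cost estimation and role-based policy rather than a single row-count cutoff, but the curator-versus-firefighter tradeoff is the same.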
The “Composite Cloud”
Finally, Jim briefly alluded to software to provide a single data-virtualization view and database veneer for heterogeneous data (e.g., social-media data and Hadoop file systems) from multiple cloud providers – the so-called “composite cloud.” This is a more straightforward extension of data virtualization – but it’s a need that I have been talking about, and users have been recognizing, for a couple of years at least, and I hear few if any other Big Data vendors talking about it.
It is also a welcome break in the hype about cloud technology. No, cloud technology does not make everything into one fuzzy “ball” in which anything physical is transparent to the user, administrator, and developer. Location still matters a lot, and so does which public cloud or public clouds you get your data from. Thus, creation of a “composite cloud” to deal with multiple-cloud data access represents an important step forward in real-world use of the cloud.
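In the simplest terms, a “composite cloud” query fans out to several cloud-resident sources and hands back one merged result, so the caller never has to know which provider holds what. The sketch below is my own toy illustration under that assumption – the source names and the tagging of each row’s origin are invented, and a real veneer would push predicates down to each provider rather than filter locally.

```python
def federated_query(predicate, sources):
    """Run one logical query across several cloud-resident sources,
    returning a single merged result tagged with each row's origin."""
    results = []
    for name, rows in sources.items():
        for row in rows:
            if predicate(row):
                results.append({**row, "_source": name})
    return results

# hypothetical data sitting in two different public clouds
sources = {
    "aws_s3": [{"user": "ann", "clicks": 12}],
    "azure":  [{"user": "bob", "clicks": 3}],
}
hits = federated_query(lambda r: r["clicks"] > 5, sources)
print(hits)
```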
Interlude: The Users Evolve
I should also note striking differences in user reports of usage of data virtualization software, compared with the last few years, when I attended Data Virtualization Day and spoke with them. For one thing, users were talking about implementing global metadata repositories or “logical data models” filled with semantic information on top of Composite, and it was quite clearly a major strategic direction for the firms – e.g., Goldman Sachs and Sky, among the largest of financial-service and TV/entertainment companies. Moreover, the questions from the audience centered on “how to”, indicating corresponding strategic efforts or plans among plenty of other companies. What I among others envisioned as a strategic global metadata repository based on data-virtualization software more than a decade ago has now arrived.
Moreover, the discussion showed that users now “get it” in implementation of such repositories. There is always a tradeoff between defining corporate metadata and hence constraining users’ ability to use new data sources within the organization, and a Wild West in which no one but you realizes that there’s this valuable information in the organization, and IT is expected to pick up after you when you misuse it. Users are now aware of the need to balance the two, and it is not deterring them in the slightest from seeing and seizing the benefits of the global metadata repository. In effect, global metadata repositories are now pretty much mature technology.
The other striking difference was the degree to which users were taking up the idea of routing all their data-accessing applications through a data virtualization layer. As I also wrote a decade ago, the benefits of this are great: data can be moved and redefined without rewriting hundreds of ill-documented applications, and the performance cost of the added layer continues to be minimal (in some cases, it is an actual performance gain). It still surprises me that it took this long for users to “get it”; but get it they apparently have. And so, now, users see the benefits of data virtualization not only for the end user (originally) and the administrator (more recently), but for the developer as well.
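The “move data without rewriting applications” benefit is, at bottom, a level of indirection. A minimal sketch, assuming nothing about Composite’s actual product: applications ask for a logical view name, the layer resolves it to a physical location, and a migration becomes a one-line mapping change instead of an edit to every application.

```python
class VirtualLayer:
    """Toy data-virtualization layer: applications query logical view
    names; the layer maps them to physical sources. Moving data means
    changing one mapping here, not editing hundreds of applications."""
    def __init__(self):
        self._mapping = {}

    def define(self, view, physical_source):
        self._mapping[view] = physical_source

    def resolve(self, view):
        return self._mapping[view]

dv = VirtualLayer()
dv.define("customers", "oracle://dw1/crm.customers")
# an application only ever asks for the logical name:
loc = dv.resolve("customers")
# later, the data migrates; applications are untouched:
dv.define("customers", "hive://hadoop/warehouse/customers")
```

(The connection strings are invented for illustration.) The same indirection is what lets the layer rewrite or redirect queries for performance without the application’s knowledge.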
Conclusion: The IT Bottom Line
It remains true that good data virtualization solutions are thin on the ground, hence my original worry about the Cisco acquisition. The message of Data Virtualization Day to customers and prospects should be that not only Composite Software’s solutions, but also data virtualization solutions in general, are set for the near and medium-term future on their present course. Moreover, not only are the potential benefits as great as they ever were, but now, in just about every area, there is mature, user-tested technology to back up that potential.
So now we can move on to the next concern, about new potential benefits. How important are software/app-driven transactional network optimization, the “virtual data sandbox”, and “composite clouds”, and how “real” is the prospect of near-term or medium-term benefits from these, from Composite Software or anyone else? My answer to each of these questions, respectively, is “the first two are likely to be very important in the medium term, the third in the short term”, and “Composite Software should deliver; the only question is how long it takes them to get there.”
My action items, therefore, for IT, are to check out Composite Software if you haven’t done so, to continue to ramp up the strategic nature of your implementations if you have, and to start planning for the new directions and new benefits. Above all, bear in mind that these benefits lie not just in traditional data virtualization software uses – but in areas of IT well beyond these.