According to Barry Brunelli of TechTarget, a recent
Forrester Research report places IBM and Informatica at the top of the heap,
ahead of Composite Software and Denodo – and I disagree. I do, in fact, have a lot more respect for
the data management folks at Forrester than I do for their development folks,
who produced a report a couple of years
back with a very poor (imho) understanding of the nature of agile
development. And I do believe that Forrester
deserves credit, compared apparently to Gartner, for recognizing both the
increasing importance and the ongoing potential for business benefits of data
virtualization. However, I would
continue to put Composite Software (now under Cisco) and Denodo ahead of IBM
and Informatica in functionality, fit to customer need, and ongoing value-add
in the immediate future. Why?
The Importance Of Paying One’s Dues In Data Virtualization
Understanding Composite’s and Denodo’s advantages
begins with the fact that, since its inception, data virtualization has often been confused with
a technology originally called EAI, or Enterprise Application Integration. Both integrate data, but they aim that
integration at very different purposes. EAI originally aimed (and in some cases still aims)
to pass data between two or more enterprise applications, such as SAP
and Oracle Apps. As a result, EAI vendors
created gateways that converted this (usually bulk) data to a common format,
and then retranslated as necessary to pass to the target enterprise app. As it turned out, this conversion to a common
format is exactly what is needed to provide a front end to handle data
streaming to a data warehouse – and thus, EAI and ETL (extract, transform,
load) tools share a fair amount of functionality. However, there is no sense of urgency about
this conversion; it is for populating a database, not for immediately providing
an answer to a query.
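The common-format idea above can be sketched in a few lines. This is a minimal illustration, not any vendor’s actual implementation; the field names and mappings are invented for the example.

```python
# Hypothetical EAI/ETL "common format" flow: records from two enterprise
# apps are translated into one canonical schema, then retranslated
# (here, into a warehouse load row) as needed. All schemas are invented.

def sap_to_common(rec):
    # Hypothetical SAP-style record -> canonical customer record
    return {"customer_id": rec["KUNNR"], "name": rec["NAME1"],
            "country": rec["LAND1"]}

def oracle_to_common(rec):
    # Hypothetical Oracle Apps-style record -> the same canonical schema
    return {"customer_id": rec["cust_no"], "name": rec["cust_name"],
            "country": rec["country_code"]}

def common_to_warehouse_row(rec):
    # ETL-style retranslation: canonical record -> warehouse load tuple
    return (rec["customer_id"], rec["name"].upper(), rec["country"])

batch = [sap_to_common({"KUNNR": "1001", "NAME1": "Acme", "LAND1": "US"}),
         oracle_to_common({"cust_no": "2002", "cust_name": "Globex",
                           "country_code": "DE"})]
rows = [common_to_warehouse_row(r) for r in batch]
```

Note that nothing here is query-driven: the whole batch is converted and loaded regardless of whether anyone ever asks about it, which is exactly the "no sense of urgency" point.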
By contrast, data virtualization from the start aimed to
provide querying (and, eventually, updates) across multiple databases and data
management tools. That, in turn, meant
leaving most of the data on the device on which it already resided, and
converting and combining only those parts of the data needed for a result – and
so, high-performance querying became part of the package from the get-go. Moreover, figuring out how to optimize
queries effectively in this way takes quite a while, and new data types (e.g.,
social media, Hadoop) and data stores (e.g., data from multiple clouds) keep
coming along and must be handled.
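The contrast with the ETL sketch can itself be sketched: push the query’s filter down to each source, and move only the qualifying rows. This is a toy illustration of the federated-query idea, assuming two in-memory SQLite databases as stand-ins for distinct back-end data stores; the table and column names are invented.

```python
import sqlite3

def make_source(rows):
    # Each "source" keeps its own data; nothing is bulk-copied upfront.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
    db.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return db

east = make_source([(1, "EMEA", 500.0), (2, "APAC", 120.0)])
west = make_source([(3, "EMEA", 900.0), (4, "AMER", 50.0)])

def federated_query(sources, region):
    # Push the predicate down to each source, so only matching rows
    # "cross the wire"; the federator merely combines the results.
    result = []
    for db in sources:
        result.extend(db.execute(
            "SELECT order_id, amount FROM orders WHERE region = ?", (region,)))
    return sorted(result)

emea = federated_query([east, west], "EMEA")
```

A real data virtualization engine must, of course, decide *where* each predicate, join, and aggregate runs across wildly different back ends, which is why the query-optimization expertise takes years to accumulate.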
As I recall, Composite Software has been continually refining
its software since at least 2003. IBM
originally had a matching product (now apparently part of InfoSphere). However, in the mid-2000s, IBM chose to focus
on the newly-acquired Ascential (more of an EAI-type product) instead, and only
recently have they begun to re-focus on data virtualization technologies, with
the acquisition of an unstructured-data virtualization company and with
increased (and welcome!) attention paid, notably during the recent Information
Management conference. Based on
my last conversations with IBM, I suspect that they have a fair amount of work
still to do to upgrade the unstructured-data acquisition’s cross-database
querying with many more use cases, from cloud data to object, streaming/sensor,
data-warehouse, and IMS/Informix data types – not to mention integrating it
with Master Data Management, operational-data querying needs, and features such
as information governance. And, of
course, I’ve left out such newer functionality as cross-database updates,
cross-database access control, developer support, and administrator support.
Informatica, apparently, is starting from behind even what IBM
has. For most of the last decade, it has
been playing in the EAI and “data integration” (including ETL) space; only
over the last two or three years has it publicized its “data virtualization” capabilities
– nor is it clear where it got its cross-database querying chops. Certainly, most of the smaller players from
10 years ago have already been acquired, and are suffering under the negligent hand of
their masters – Oracle, for example, acquired an already-neglected AquaLogic product
with its buyout of BEA. In similar
fashion, SAP has wound up with a Sybase-acquired product, and Red Hat with the
granddaddy of data virtualization, MetaMatrix.
In any case, large marketing claims do not substitute for a demonstrated
pedigree of functional development.
Lessons For Users
So where do I view Forrester as having gone wrong, and how
can IT buyers avoid buying less than the needed functionality? I don’t know for sure, but I suspect that
underlying the Forrester take was (a) confusion between EAI-type and
data-virtualization-type “data integration” as well as a misunderstanding of
what “data virtualization” really means, and (b) a subconscious belief that
when a large and a small company say they have something, typically the large
company wins because of breadth of features and support.
Let’s take the confusion first. I am one who wonders if “data virtualization”
hasn’t caused as much confusion as attention. Originally, the technology was
called Enterprise Information Integration, which at least gets across the idea
that the technology delivers value-add (timely, cross-data-type-contexted “information”). “Data virtualization”, however, suggests that
the main value of the technology, like that of storage and server virtualization,
is to provide a single view that allows better load balancing. On the contrary, data virtualization products
also provide the basis for global metadata repositories, distributed master
data management data-store query optimization, cross-the-hybrid-cloud data
discovery, developer data abstraction for longer-lasting code, single-key
cross-database administration, and semi-automated data governance, not to
mention cross-cloud querying. Given
these additional features, users, unlike Forrester, must carefully probe
whether vendors aside from Composite Software and Denodo are really walking the
walk.
For the same reason, (b) doesn’t apply: you can’t simply feel
comfortable with the large company’s features and support, because the features
and support may very well not cover the types of things that data
virtualization does well out of the box.
To put it another way, at present, IBM and Informatica have excellent and
extensive data-integration and EAI features; but trying to do flexible data management, Web data
discovery and ad-hoc querying, near-realtime data warehousing, and global
metadata repositories for data governance without a well-optimized data virtualization product is like trying to fight with one hand
tied behind one’s back.
Data virtualization now matters more than ever to you, the
IT buyer. Forrester admits it, IBM
admits it, and it seems that folks like Microsoft are now beginning to admit
it. If you don’t get 90% of the
potential benefit because someone told you to use flawed criteria, you will
be missing out on the things that make companies like Qualcomm achieve
real value-add, not just now but well into the future. Whether I’m right about Forrester or not, the
important thing is not to sell data virtualization short. Now, go out and kick
those tires – the right way.