Wednesday, August 6, 2014

In Shocking Praise of Cisco’s Hadoop-Using Data Warehouse

In the past, I have been quite critical both of some Cisco forays into the server space and of users' over-use of Hadoop. I find, however, to my own surprise, that Cisco’s new Hadoop-using approach to data warehousing is potentially very useful in Big Data warehouses. Here is a short thought piece as to why this might be so.

First, a brief description of some of the key aspects of the solution, as I see them. The Cisco approach is to view both a traditional data warehouse and the rest of the Big Data needed to provide fairly quick answers to business-critical data-scientist questions as one “virtual warehouse”, with Cisco’s data virtualization solution (based on Composite Software’s solution) as the veneer/umbrella. Once you view all of these piece parts as part of a data-warehouse whole, it becomes possible to use not only lower-cost storage for “less-used” Big Data, but also different databases, including access to operational OLTP data stores and “mixed” query/update enterprise-app data stores. These, however, can traditionally handle queries only on much smaller data stores, because of their dual purpose and the competition from updates. Even master-data-management systems suffer from this type of dual-purpose limitation, because rigidly copying everything to a central data store can be too constraining.

The potential of a Hadoop database as a kind of “overload” locus, it seems to me, is that one takes a database optimized for querying data that is so Big that relational approaches alone cannot process it, and uses it as “overflow” space for handling data that is so Big that a traditional data warehouse cannot handle it. A potential side benefit is that, these days, much of the massive “overflow” data may very well be social-media information, the type of information on which Hadoop, MapReduce, and Hive cut their teeth. And, of course, however inefficient in-house Hadoop has been, here at least is one area in which IT's Hadoop experience allows better optimization of the Hadoop side of the virtual data warehouse.
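To make the “overflow” idea concrete, here is a minimal sketch of how I picture it, and not a description of Cisco’s actual product. Everything in it is an assumption made for illustration: the `VirtualWarehouse` class, the row-count capacity, and the data set names are all hypothetical. The point is only that a single virtual layer can decide which physical store answers a query.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualWarehouse:
    """Toy facade: one logical warehouse over two physical stores (hypothetical)."""

    warehouse_capacity_rows: int  # assumed limit of the traditional warehouse
    placement: dict = field(default_factory=dict)
    warehouse_rows_used: int = 0

    def register(self, dataset: str, rows: int) -> str:
        """Place a data set in the warehouse if it fits, else in Hadoop overflow."""
        if self.warehouse_rows_used + rows <= self.warehouse_capacity_rows:
            self.warehouse_rows_used += rows
            self.placement[dataset] = "warehouse"
        else:
            self.placement[dataset] = "hadoop"
        return self.placement[dataset]

    def route(self, dataset: str) -> str:
        """Return the name of the store that should answer queries on this data set."""
        return self.placement[dataset]


# Illustrative sizes only; they are not taken from any real deployment.
vw = VirtualWarehouse(warehouse_capacity_rows=1_000_000)
vw.register("sales_2014", 400_000)
vw.register("social_media_feed", 50_000_000)

print(vw.route("sales_2014"))
print(vw.route("social_media_feed"))
```

In this toy version the querying user sees one warehouse and never has to know that the social-media feed, too big for the traditional store, lives on the Hadoop side.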

Likewise, I am prepared to cut Cisco some slack when it says it intends to use its scale-out UCS servers in Hadoop “clusters”.  Despite the hype, it appears that no scale-out solution is coming close to the cost efficiency of scale-up servers in either public or private clouds, but if you’re going to go the scale-out route, UCS servers don’t stick out as especially cost-ineffective, and they have the benefit of Cisco’s networking strengths in their clustering. 

Above all, Cisco’s solution is nice because it adds a major new option to the information architecture.  When that has happened before, savvy users have usually found a way to make it work for their needs better than the old set of choices.  Again, I say to my surprise – check out Cisco’s new Hadoop-using approach to data warehousing.  I believe it’s worth a close look.

