Monday, October 20, 2014

Data Virtualization Day 2: Looking Ahead


 

One of the most interesting questions raised during DV Day was put to the audience by one of the speakers: we have presented our vision of the future; now, what do you think would be a visionary new idea for using data virtualization?  Pretty much on the spot, I came up with two interrelated ideas:
1. Metadata mining, and

2. Trend discovery via data virtualization’s data-discovery features.

Metadata Mining

As data mining spreads increasingly outside the enterprise, via mining of social-media and sensor data, the value of a global metadata repository such as those provided by Cisco/Composite and other DV vendors lies not merely in the ability to coordinate and correct inconsistencies in internal data.  It also lies in the ability to see the connections between internal and external data (for example, when these represent information about the same person or thing) and to understand how the frequency of use of particular metadata changes over time.  For example, if less of the data accessed is video and more is text messaging, that set of facts, surfaced by access counts for video and text metadata, tells us about our customers’ changes in media use.  It does so immediately, and it is based on actual behavior rather than on self-report surveys.
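As a rough illustration of the access-count idea, here is a minimal Python sketch. It assumes a hypothetical access log of (period, metadata category) pairs; the log format, the function names, and the sample data are all invented for illustration and do not correspond to any DV vendor's API.

```python
from collections import Counter, defaultdict


def usage_share_by_period(access_log):
    """Return {period: {category: share of accesses}}.

    access_log is an iterable of (period, metadata_category) pairs.
    This log format is an assumption made for the sketch.
    """
    counts = defaultdict(Counter)
    for period, category in access_log:
        counts[period][category] += 1

    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {cat: n / total for cat, n in counter.items()}
    return shares


def share_change(shares, category, first_period, last_period):
    """Change in a category's share of accesses between two periods."""
    before = shares[first_period].get(category, 0.0)
    after = shares[last_period].get(category, 0.0)
    return after - before


# Hypothetical sample data: video accesses fall, text accesses rise.
access_log = [
    ("2014-Q1", "video"), ("2014-Q1", "video"),
    ("2014-Q1", "video"), ("2014-Q1", "text"),
    ("2014-Q2", "video"), ("2014-Q2", "text"),
    ("2014-Q2", "text"), ("2014-Q2", "text"),
]

shares = usage_share_by_period(access_log)
video_change = share_change(shares, "video", "2014-Q1", "2014-Q2")
text_change = share_change(shares, "text", "2014-Q1", "2014-Q2")
print(f"video share change: {video_change:+.2f}")
print(f"text share change:  {text_change:+.2f}")
```

Working with shares rather than raw counts keeps the comparison meaningful when overall access volume differs from one period to the next.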

I conjecture that the main value of such metadata mining in the long run will lie in two areas: (a) providing new ways to slice the data underlying the metadata, and (b) offering new, broader, and more flexible ways of aggregating the data.  I see (a) happening because, while new BI applications today tend to be generated by new product ideas or by new customer trends as publicized on the Web, new data-mining insights from metadata come from actual behavior and from information that can’t easily be fit into existing categories.  I see (b) occurring because of metadata’s natural function of fitting data into broad categories and detecting connections between nominally different sets of data, constantly and semi-automatically.

Trend Discovery

In a similar way, the data-discovery features of data virtualization tools could be used to detect new trends relevant to the business by seeing their “footprint” on the Web.  That is, data discovery can constantly monitor the Web for new types of metadata that don’t fit easily into the old categories, and surface them to data scientists as alerts to new types of customer or prospect behavior.
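To make the “doesn’t fit the old categories” test concrete, here is a minimal Python sketch. It assumes that discovery yields records as dictionaries of field names and that the business keeps a set of already-known field names; both assumptions, along with the function name, the field names, and the `min_count` threshold, are hypothetical and not taken from any particular DV tool.

```python
from collections import Counter

# Hypothetical set of metadata fields the business already tracks.
KNOWN_FIELDS = {"customer_id", "order_date", "product_sku"}


def find_unfamiliar_fields(discovered_records, known_fields, min_count=2):
    """Return field names that are not in known_fields and that appear
    in at least min_count discovered records, sorted alphabetically.

    The min_count threshold is an assumed, crude filter for one-off noise.
    """
    counts = Counter(
        field
        for record in discovered_records
        for field in record          # iterating a dict yields its keys
        if field not in known_fields
    )
    return sorted(field for field, n in counts.items() if n >= min_count)


# Hypothetical discovered records.
discovered = [
    {"customer_id": 1, "wearable_device": "band"},
    {"customer_id": 2, "wearable_device": "watch", "order_date": "2014-10-01"},
    {"product_sku": "A1", "one_off_tag": "x"},
]

unfamiliar = find_unfamiliar_fields(discovered, KNOWN_FIELDS)
print(unfamiliar)
```

Here the recurring, unrecognized field is flagged for a data scientist to review, while the field seen only once is ignored. A real system would need fuzzier matching than exact field names.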

This is particularly attractive because a survey I conducted four years ago indicated that, at that time, business executives tended to learn about key new trends on the Web six months or more after those trends first appeared.  That’s a long time these days in product-version lifecycle terms, and I see no clear signs of a major speedup since then.

Anyway, just some visionary thoughts, for fun …
