Wednesday, January 25, 2012

The Other BI: Oracle TimesTen and In-Memory-Database Streaming BI

This blog post highlights a software company and technology that I view as potentially useful to organizations investing in business intelligence (BI) and analytics in the next few years. Note that, in my opinion, this company and solution are not typically “top of the mind” when we talk about BI today.

The Importance of TimesTen-Type In-Memory Database Technology to BI

All right, now I’m really stretching the definition of “other”. Let’s face it, Oracle is “top of the mind” when we talk about BI, and they recently announced a TimesTen appliance, so TimesTen is not an invisible product, either. And finally, the hoopla about SAP HANA means that in-memory database technology itself is probably presently pretty close to the center of IT’s radar screen.

So why do I think Oracle’s TimesTen is in some sense not “top of the mind”? Answer: because there are potential applications of in-memory databases in BI for which the technology itself, much less any vendor’s in-memory database solution, is not a visible presence. In particular, I am talking about in-memory streaming databases.

To understand the relevance of in-memory databases to complex event processing and BI, let’s review the present use cases of in-memory databases. Originally, in-memory technology was just the thing for analyzing medium-scale amounts of financial-market information in real time, information such as constantly changing stock prices. Lately, in-memory databases have added two more BI duties: (a) serving as a “cache” database for enterprise databases, to speed up massive BI where smaller chunks of data could be localized, and (b) serving as a really-high-performance platform for mission-critical small-to-medium-scale BI applications that require less scaling year-to-year, such as some SMB reporting. These new tasks have arrived because rapid growth in main-memory storage has inevitably allowed in-memory databases to tackle a greater share of existing IT data-processing needs. To put it another way, when you have an application that is always going to require 100 GB of storage, sooner or later it makes sense to use an in-memory database and drop the old disk-based one, because in-memory database performance will typically be 10 to 100 times faster.
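To make the “cache database” idea in (a) concrete, here is a minimal read-through sketch in Python. It is an illustration only and not TimesTen’s actual API: the class name `ReadThroughCache`, the `customers` and `cache` tables, and the `backend_reads` counter are all my own assumptions, and SQLite’s `:memory:` mode stands in for a real in-memory database.

```python
import sqlite3


class ReadThroughCache:
    """Hypothetical read-through cache: an in-memory SQLite table in front of a backend DB."""

    def __init__(self, backend):
        # `backend` stands in for the disk-based enterprise database (assumption).
        self.backend = backend
        # SQLite's ":memory:" database plays the role of the in-memory cache database.
        self.memory = sqlite3.connect(":memory:")
        self.memory.execute(
            "CREATE TABLE cache (key TEXT PRIMARY KEY, value TEXT)"
        )
        self.backend_reads = 0  # counts how often we had to go to the backend

    def get(self, key):
        # First try the in-memory copy.
        row = self.memory.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]

        # Cache miss: fall back to the backend and remember the answer.
        self.backend_reads += 1
        row = self.backend.execute(
            "SELECT value FROM customers WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        self.memory.execute(
            "INSERT INTO cache (key, value) VALUES (?, ?)", (key, row[0])
        )
        return row[0]
```

The point of the sketch is simply that the second request for the same key never touches the backend. A real cache database also has to handle writes, invalidation, and consistency with the enterprise database, which this toy ignores.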

Now let’s consider event-processing or “streaming” databases. Their main constraint today in many cases is how much historical context they can access in real time in order to deepen their analysis of incoming data before they have to make a routing or alerting decision. If that data can be accessed in main memory instead of disk, effectively 10 to 100 times as much “context” information can be brought to bear in the analysis in the same amount of time.
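As a rough illustration of what “historical context” means here, the following Python sketch keeps a bounded window of recent values per key in main memory and checks each incoming value against it before deciding whether to alert. Everything in it is an assumption made for the example: the name `ContextWindow`, the window size, and the simple standard-deviation rule are mine, not anything a particular event-processing or in-memory product does.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class ContextWindow:
    """Hypothetical in-memory context: the last `size` observations per key."""

    def __init__(self, size=1000, threshold=3.0):
        self.size = size
        self.threshold = threshold  # alert when value is this many std devs from the mean
        # Each key gets its own bounded deque; old values fall off automatically.
        self.history = defaultdict(lambda: deque(maxlen=size))

    def process(self, key, value):
        """Return True if `value` looks anomalous against stored history, then record it."""
        past = self.history[key]
        alert = False
        if len(past) >= 2:
            mu = mean(past)
            sigma = pstdev(past)
            # With no variation in the history, there is no basis for an alert.
            alert = sigma > 0 and abs(value - mu) > self.threshold * sigma
        past.append(value)
        return alert
```

For example, feeding a stock symbol a run of prices near 10 and then a price of 50 would trigger an alert on the last value only. The larger the window you can afford to hold in memory, the more context each decision gets, which is the trade-off the paragraph above describes.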

In other words, for streaming BI, IT potentially has two choices: (1) a traditional event-processing database that is often entirely separate from a back-end disk-based database, or (2) a traditional main-memory database already pre-optimized for in-depth main-memory analysis and usually pre-integrated with a disk-based database (as TimesTen is with Oracle Database) as a “cache database” in cases where disk must be accessed. How to choose between the two? Well, if you don’t need much historical context for analysis, the event-processing database probably has the edge – but if you’re looking to upgrade your streaming BI, that’s not likely to be the case. In other cases, such as those where the processing is “routing-light” and “analysis-heavy”, an in-memory database not yet optimized for routing but far more optimized for in-depth analytics performance would seem to make more sense.

Thus, one way of looking at the use case of in-memory database event processing is to distinguish between in-enterprise and extra-enterprise data streams (more or less). Big Data is an example of an extra-enterprise stream, and can involve a fire hose of “sensor-driven Web” (GPS) and social media data that needs routing and alerting as much as it needs analytics. Business-critical-application-destined and embedded-analytics data streams are an example of in-enterprise data, even if admixed with a little extra-enterprise data; they require heavier-duty cross-analysis of smaller data streams. For these, the in-memory database’s deeper analysis before a split-second decision is made is probably worth its weight in gold, as it is in the traditional financial in-memory-database use case.

Won’t having two databases carrying out the general task of handling streaming data complicate the enterprise architecture? Not really. Past experience shows us that using multiple databases for finer-grained performance optimization actually decreases administrative costs, since the second database, at least, is typically much more “near-lights-out,” while switching between databases doesn’t affect users at all, because a database is infrastructure software that presents the same standard SQL-derivative interfaces no matter what the variant. And, of course, the boundary between event-processing database use cases and in-memory ones is flexible, allowing new ways of evolving performance optimization as user needs change.

The Relevance of Oracle TimesTen to Streaming BI

In many ways, TimesTen is the granddaddy of in-memory databases, a solution that I have been following for fifteen years. It therefore has leadership status in in-memory database use-case experience, and especially in the financial-industry stock-market-data applications that resemble my streaming-BI use case as described above. What Oracle has added since the acquisition is database-cache implementation and experience, especially integrated with Oracle Database. At the same time, TimesTen remains separable at need from other Oracle database products, as in the new TimesTen Appliance.

These characteristics make TimesTen a prime contender for the potential in-memory streaming BI market. Where SAP HANA is a work in progress, and approaches like Volt are perhaps less well integrated with enterprise databases, TimesTen and IBM’s solidDB stand out as combining both in-memory original design and database-cache experience – and of these two, TimesTen has the longer in-memory-database pedigree.

It may seem odd of me to say nice things about Oracle TimesTen, after recent events have raised questions in my mind about Oracle BI pricing, long-term hardware growth path, and possible over-reliance on appliances. However, inherently an in-memory database is much less expensive than an enterprise database. Thus, users appear to have full flexibility to use TimesTen separately from other Oracle solutions, free from worries about possible long-term effects of vendor lock-in.

Potential Uses of TimesTen-Type In-Memory Streaming BI for IT

As noted above, the obvious IT use cases for TimesTen-type streaming BI lie in driving deeper analysis in in-enterprise streaming applications. In particular, in the embedded-analytics area, in-memory performance speedups can allow consideration of a wider array of systems-management data in fine-tuning downtime-threat and performance-slowdown detection. In the real-time analytics area, an in-memory database might be of particular use in avoiding “over-steering”, as when predictable variations in inventories cause overstocking because of lack of historical context. In the Big Data area, an in-memory database might apply where the data has been pre-winnowed to certain customers, and a deeper analysis of those customers fine-tunes an ad campaign. For example, within a half-hour of the end of the game, Dick’s Sporting Goods had sent me an offer of a Patriots’ AFC Championship T-shirt, complete with visualization of the actual T-shirt – a reasonably well-targeted email. That’s something that’s far easier to do with an in-memory database.

IT should also consider the likely evolution of both event-processing and in-memory databases over the next few years, as their capabilities will likely become more similar. Here, the point is that event-processing databases often started out not with data-management tools, but with file-management ones – making them significantly less optimized “from the ground up” for analysis of data in main memory. Still, event-processing databases such as Progress Apama may retain their event-handling, routing, and alerting advantages, and thus the situation in which in-memory is better for in-enterprise and event-processing is better for extra-enterprise is likely to continue. In the meanwhile, increasing use of in-memory databases for the older use cases cited above means that in-memory streaming-BI databases offer an excellent way of gaining experience in their use, before they become ubiquitous. That, in turn, means that narrow initial “targets of opportunity” in one of the situations cited in the previous paragraph are a good idea, whatever the scope of one’s overall in-memory database commitment right now.

The Bottom Line for IT Buyers

In some ways, this is the least urgent and most speculative of the “other BI” solutions I have discussed so far. We are, after all, discussing additional performance and deeper analytics in a particular subset of IT’s needs, and in an area where the technology of in-memory databases and their event-processor alternatives is moving ahead rapidly. In a sense, this is really an opportunity for those IT shops that specialize in applying a little extra effort and “designing smarter” across multiple new technologies to provide a nice ongoing competitive advantage. For the rest, if the shoe can easily be made to fit, why not wear it?

My suggestion for most IT buyers, therefore, is to have a “back-pocket” in-memory-database-for-streaming-BI short list that can be whipped out at the appropriate time. IMHO, Oracle TimesTen right now should be on that list.

I hate to close without noting the overall long-term BI potential of in-memory databases. The future of in-memory databases is not, in my firm opinion, to supersede the IBM DB2s, Oracle Databases, and Microsoft SQL Servers of the world, at any time in the next four years. The hardware technologies to enable such a thing are not yet clear, much less competitive. Rather, the value of in-memory databases is to allow us to optimize our querying for both main-memory and disk storage – which are two very different things, and which will both apply appropriately to many key customer needs over the next few years. Overall, the effect will be another major ongoing jump in data-processing performance. As we enter this new database-technology era, those who initially kick the tires in a wider variety of BI projects will find themselves with a significant “experience” advantage over the rest, especially because the key to outstanding success will be determining the appropriate boundary between disk-based and in-memory database usage. Don’t force in-memory streaming BI into the organization. Do keep checking to see if it will fit your immediate needs. Sooner or later, it probably will.

1 comment:

Jerome said...

TimesTen is an in-memory database management technology that speeds access to data. Indexing, query optimization, and storage management.