From Data Virtualization to Data Services
Posted in BI and Analytics, Data Governance & Policy, Information Management, Virtualization on January 19th, 2011 by DStodder
With margins for transactional operations getting thinner, organizations in many industries are focused on extracting more value from their data. Nowhere is this truer than in the financial services industry, where the onrushing spread of algorithmic trading is changing…well, everything, including the role of information in guiding trading and investment decisions. This rather unnerving article in Wired by Felix Salmon and Jon Stokes captures what’s happening quite well. The speed with which organizations need to turn data into both better business insight and marketable data services has many looking closely at data virtualization.
As an analyst, I am fortunate to be a (remote) participant in the Boulder BI Brain Trust (BBBT), the Boulder, Colorado-based brainchild of Claudia Imhoff. A couple of times a month, this illustrious group of experts gathers for half-day briefings with vendors in the business intelligence, analytics and data warehousing space; the briefings are always highly informative (on Twitter, watch for the hashtag #BBBT). On Friday, January 14, the topic was data virtualization: the BBBT met with Composite Software, a leading vendor of tools and solutions for data virtualization. Composite brought along a customer – and no small customer, either – NYSE Euronext. Emile Werr, NYSE Euronext’s VP of Global Data Services and head of Enterprise Data Architecture, gave us a briefing on how the company is developing a data virtualization layer using Composite’s products.
Wikipedia has a good definition of data virtualization: “to integrate data from multiple, disparate sources – anywhere across the extended enterprise – in a unified, logically virtualized manner for consumption by nearly any front-end business solution, including portals, reports, applications, search and more.” As the Wiki entry notes, data virtualization (or “data federation”) is an alternative to data consolidation or replication into data warehouses and data marts. When this concept was first introduced, it was the cause of fiery debates at TDWI events and elsewhere. Now, it has settled in as a complement to the other more entrenched approaches.
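To make the contrast with consolidation concrete, here is a minimal sketch in Python of the federation idea. It is not how Composite's products work; the two in-memory SQLite databases, the table names, the `VirtualLayer` class and the sample rows are all hypothetical stand-ins for separate source systems. The point is only that the join happens at query time and nothing is copied into a warehouse first.

```python
import sqlite3


def make_trades_db():
    """Hypothetical transactional source: one row per trade."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trades (symbol TEXT, qty INTEGER, price REAL)")
    conn.executemany(
        "INSERT INTO trades VALUES (?, ?, ?)",
        [("ABC", 100, 10.0), ("XYZ", 50, 20.0), ("ABC", 200, 10.5)],
    )
    return conn


def make_reference_db():
    """Hypothetical reference-data source held in a separate system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listings (symbol TEXT, company TEXT)")
    conn.executemany(
        "INSERT INTO listings VALUES (?, ?)",
        [("ABC", "ABC Corp"), ("XYZ", "XYZ Ltd")],
    )
    return conn


class VirtualLayer:
    """Presents one logical view over both sources, resolved per request."""

    def __init__(self, trades, reference):
        self.trades = trades
        self.reference = reference

    def traded_value_by_company(self):
        # Fetch from each source where it lives, then combine in the layer.
        names = dict(
            self.reference.execute("SELECT symbol, company FROM listings")
        )
        rows = self.trades.execute(
            "SELECT symbol, SUM(qty * price) FROM trades "
            "GROUP BY symbol ORDER BY symbol"
        )
        return [(names.get(symbol, symbol), total) for symbol, total in rows]


layer = VirtualLayer(make_trades_db(), make_reference_db())
result = layer.traded_value_by_company()
print(result)
```

A consumer such as a report or portal would call the view and never need to know that two systems sit behind it. A real virtualization product adds query optimization, caching and security on top of this basic pattern.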
Virtualization helps when organizations can’t really wait for standard data integration and loading into a data warehouse. That is very much the challenge facing NYSE Euronext, which is using Composite tools to develop a virtual layer to improve data access for internal executives and to establish a platform for creating data services. “We have so many companies trying to connect into us, and we want to serve standardized information out to companies around the world,” Werr said. NYSE Euronext is moving away from its old method of dumping transaction data into a warehouse; it wants to put more intelligence into the virtual layer. And to help build this layer, it is hiring people with business skills who understand processes and how to derive business value from data. “[These professionals] are the most productive people on my team right now,” he said.
The BBBT session featured an interesting debate about how data governance fits with data virtualization. Can data quality and governance rules be managed and invoked from the virtual layer? Should they be managed at the source or as part of extract, transform and load (ETL) processes, as many organizations do now? The discussion began to turn toward master data management and the option of creating a central hub or registry to implement governance for access to multiple sources. Highly regulated industries such as financial services and healthcare should consider this approach because of the need to invoke regulatory provisions for data access and sharing. Werr discussed these requirements and how his organization hopes to use the Composite virtual layer to support metadata governance and access from multiple BI tools.
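One side of that debate, invoking access rules in the virtual layer rather than at each source, can be sketched roughly as follows. This is only an illustration under my own assumptions: the roles, the column names, the `POLICY` table and the `governed_view` function are hypothetical and do not describe NYSE Euronext's or Composite's actual design.

```python
# Hypothetical central policy: the columns each role is allowed to see.
POLICY = {
    "regulator": {"trade_id", "symbol", "qty", "account"},
    "analyst": {"trade_id", "symbol", "qty"},
}


def governed_view(rows, role):
    """Apply the policy once, in the layer, whatever source the rows came from."""
    allowed = POLICY.get(role, set())  # unknown roles see nothing
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


# Hypothetical rows as they might arrive from any underlying source.
source_rows = [
    {"trade_id": 1, "symbol": "ABC", "qty": 100, "account": "A-17"},
    {"trade_id": 2, "symbol": "XYZ", "qty": 50, "account": "B-42"},
]

analyst_rows = governed_view(source_rows, "analyst")
regulator_rows = governed_view(source_rows, "regulator")
unknown_rows = governed_view(source_rows, "guest")
```

The appeal of this placement is that every BI tool going through the layer gets the same rule. The tradeoff raised in the session still applies: anything that reaches the sources directly bypasses it.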
Putting intelligence into a virtual layer fits with the IT infrastructure trend toward virtualization and cloud computing, and may become even more important because of this trend. Service-oriented applications running on cloud and virtual platforms frequently require access to multiple, disparate data sources. From a business standpoint, data virtualization is going to be critical to moving data quickly from the back office outward, to where it can be packaged into “information-as-a-service” offerings that customers will buy – and that will improve the seller’s profit margins.