Posts Tagged ‘master data management’

From Data Virtualization to Data Services

Posted in BI and Analytics, Data Governance & Policy, Information Management, Virtualization on January 19th, 2011 by DStodder

With margins for transactional operations getting thinner, organizations in many industries are focused on leveraging more value from their data. This could not be truer than in the financial services industry, where the onrushing spread of algorithmic trading is changing…well, everything, including the role of information in guiding trading and investment decisions. This rather unnerving article in Wired by Felix Salmon and Jon Stokes captures what’s happening quite well. The speed with which organizations need to turn data into both better business insight and marketable data services has many looking closely at data virtualization.

As an analyst, I am fortunate to be a (remote) participant in the Boulder BI Brain Trust (BBBT), the Boulder, Colorado-based brainchild of Claudia Imhoff. A couple of times a month, this illustrious group of experts gathers for half-day briefings with vendors in the business intelligence, analytics and data warehousing space; the briefings are always highly informative (on Twitter, watch for the hashtag #BBBT). On Friday, January 14, the topic was data virtualization: the BBBT met with Composite Software, a leading vendor of data virtualization tools and solutions. Composite brought along a customer – and no small customer, either – NYSE Euronext. Emile Werr, NYSE Euronext’s VP of Global Data Services and head of Enterprise Data Architecture, gave us a briefing on how the company is developing a data virtualization layer using Composite’s products.

Wikipedia has a good definition of data virtualization: “to integrate data from multiple, disparate sources – anywhere across the extended enterprise – in a unified, logically virtualized manner for consumption by nearly any front-end business solution, including portals, reports, applications, search and more.” As the Wikipedia entry notes, data virtualization (or “data federation”) is an alternative to data consolidation or replication into data warehouses and data marts. When the concept was first introduced, it sparked fiery debates at TDWI events and elsewhere. Now, it has settled in as a complement to those more entrenched approaches.
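To make the idea concrete, here is a minimal sketch of data federation in Python. It is purely illustrative (the sources, table layout and the virtual_trade_view function are invented for this example, and nothing here reflects how Composite’s products work): two disparate sources, a relational table and a flat reference feed, are joined at query time behind a single logical view, with no copy loaded into a warehouse.

```python
# A hypothetical sketch of data federation: two disparate sources are joined
# on demand behind one "virtual view", with no copy loaded into a warehouse.
import sqlite3

# Source 1: a relational system holding trade transactions (in-memory for the demo).
trades_db = sqlite3.connect(":memory:")
trades_db.execute("CREATE TABLE trades (account_id TEXT, symbol TEXT, qty INTEGER)")
trades_db.executemany("INSERT INTO trades VALUES (?, ?, ?)",
                      [("A100", "XYZ", 500), ("A200", "ABC", 250)])

# Source 2: a reference feed delivered as flat records (think: a CSV drop).
accounts_feed = [
    {"account_id": "A100", "owner": "Acme Capital", "region": "EU"},
    {"account_id": "A200", "owner": "Blue Harbor", "region": "US"},
]

def virtual_trade_view():
    """Resolve the join at query time; callers never touch the underlying sources."""
    owners = {rec["account_id"]: rec for rec in accounts_feed}
    for account_id, symbol, qty in trades_db.execute("SELECT * FROM trades"):
        ref = owners.get(account_id, {})
        yield {"owner": ref.get("owner"), "region": ref.get("region"),
               "symbol": symbol, "qty": qty}

for row in virtual_trade_view():
    print(row)
```

The point is the consumption pattern: the caller sees one logical view and never needs to know where each column physically lives.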

Virtualization helps when organizations can’t really wait for standard data integration and loading into a data warehouse. That is very much the challenge facing NYSE Euronext, which is using Composite tools to develop a virtual layer to improve data access for internal executives and to establish a platform for creating data services. “We have so many companies trying to connect into us, and we want to serve standardized information out to companies around the world,” Werr said. NYSE Euronext is moving away from its old method of dumping transaction data into a warehouse; it wants to put more intelligence into the virtual layer. And to help build this layer, it is hiring people with business skills who understand processes and how to derive business value from data. “[These professionals] are the most productive people on my team right now,” he said.

The BBBT session featured an interesting debate about how data governance fits with data virtualization. Can data quality and governance rules be managed and invoked from the virtual layer? Should they be managed at the source or as part of extract, transformation and loading (ETL) processes, as many organizations do now? The discussion began to turn toward master data management and the option of creating a central hub or registry to implement governance for access to multiple sources. Highly regulated industries such as financial services and healthcare should consider this approach because of the need to invoke regulatory provisions for data access and sharing. Werr discussed these requirements and how his organization hopes to use the Composite virtual layer to support metadata governance and access from multiple BI tools.
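To give a flavor of what invoking governance from the virtual layer could look like, here is a hypothetical Python sketch (the masking rule, column names and governed_view function are my own illustration, not Composite functionality): a masking rule is applied as rows pass through the virtual view, so consumers without the right entitlement never see the raw value.

```python
# Hypothetical sketch: a governance rule invoked inside the virtual layer,
# so downstream BI tools only ever receive already-masked data.
MASKING_RULES = {"account_id": lambda value: value[:2] + "***"}

def governed_view(rows, user_entitlements):
    """Apply masking rules to any column the requesting user is not entitled to see."""
    for row in rows:
        out = dict(row)
        for column, mask in MASKING_RULES.items():
            if column in out and column not in user_entitlements:
                out[column] = mask(out[column])
        yield out

source_rows = [{"account_id": "A10042", "symbol": "XYZ", "qty": 500}]
for row in governed_view(source_rows, user_entitlements={"symbol", "qty"}):
    print(row)  # account_id is returned masked as 'A1***'
```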

Putting intelligence into a virtual layer fits with the IT infrastructure trend toward virtualization and cloud computing, and may become even more important because of this trend. Service-oriented applications running on cloud and virtual platforms frequently require access to multiple, disparate data sources. From a business standpoint, data virtualization is going to be critical to moving data quickly from the back office outward, to where it can be packaged into “information-as-a-service” offerings that customers will buy – and that will improve the seller’s profit margins.

Informatica and the Identity Opportunity

Posted in BI and Analytics, Data Governance & Policy, Information Management on March 8th, 2010 by admin

As we move further into our information-rich age of multiple sales and service channels, social media and surveillance, identity is becoming a hot topic. First, there’s identity theft: According to a recent study by the Ponemon Institute (sponsored by Experian’s ProtectMyID.com and reported by The Medical News), “nearly 1.5 million Americans have been victims of medical identity theft.” Credit fraud, reputation fraud and other harms are further consequences of having sensitive information about ourselves spread across the information ecosphere.

Then, there’s identity surveillance. Law enforcement and intelligence services must deal every day with identity confusion as they try to work within legal constraints to find wanted criminals and potential terrorists. Adding complexity, law enforcement will need to determine identity not just from traditional data but from multimedia as well; an example is this current caper reported by the Tallahassee (Florida) Democrat.

Identity surveillance and watch lists are rising as political and policy challenges. Canada and the United States are in the news here and here, tussling over the implementation of Secure Flight, the plan under which the Transportation Security Administration of the U.S. Department of Homeland Security will collect more passenger data for watch-list screening. See this Intelligent Enterprise blog from last June by Rajan Chandras for some background.

In the middle of all of this are software providers, primarily IBM (with its InfoSphere Identity Insight Solutions), Infoglide (which is providing software for the DHS) and Informatica. In February, I attended the Informatica Analyst Conference and had a chance to talk to execs there about the Informatica Identity Resolution (IIR) solution and how it fits with other solutions and technologies such as master data management (MDM). I came away with a strong sense of how IIR is opening doors to new business opportunities for Informatica in government, and potentially in areas where Informatica has greater market strength but where identity recognition and resolution software has not traditionally been applied.

Identity recognition and resolution systems enable organizations to use data matches to gain a better understanding of identity across multiple systems. This could include not just individual identities but also networks and relationships: that is, who people know and how they are connected. The tools generally apply algorithms and rules engines to automate and systematize steps that would obviously take gumshoe detectives far longer as they seek clues, patterns and a risk assessment about possible terrorists, fraudsters, money launderers and regulatory violators.
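The core matching step can be suggested with a small sketch. The Python below is a deliberately crude illustration (the records, the normalize and match_score functions, and the 0.8 threshold are all invented for this example; commercial engines such as IIR use far more sophisticated linguistic, phonetic and probabilistic techniques): it scores name similarity across two systems, boosts the score when dates of birth agree, and flags pairs above a threshold as a possible single identity.

```python
# Deliberately crude sketch of identity matching across two systems; commercial
# engines use far richer linguistic, phonetic and probabilistic techniques.
from difflib import SequenceMatcher

crm_records = [
    {"id": "C1", "name": "Jonathan Q. Smith", "dob": "1975-03-02"},
    {"id": "C2", "name": "Maria Delgado", "dob": "1981-11-19"},
]
watch_records = [
    {"id": "W9", "name": "Jon Smith", "dob": "1975-03-02"},
]

def normalize(name):
    # Crude normalization: lowercase, strip punctuation, drop single-letter initials.
    tokens = [t.strip(".,") for t in name.lower().split()]
    return " ".join(t for t in tokens if len(t) > 1)

def match_score(a, b):
    name_similarity = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    dob_bonus = 0.2 if a["dob"] == b["dob"] else 0.0
    return name_similarity + dob_bonus

for crm in crm_records:
    for watch in watch_records:
        score = match_score(crm, watch)
        if score >= 0.8:  # threshold would be tuned per use case
            print(f"possible match: {crm['id']} ~ {watch['id']} (score {score:.2f})")
```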

When Informatica acquired Identity Systems from Nokia in the spring of 2008, it looked like simply a smart addition to the company’s data quality toolbox. However, it is clear now that the acquisition was one of a series of decisive steps that have turned Informatica into a more broadly relevant information management (IM) solutions provider. The Identity Systems deal was followed in 2009 by the acquisition of AddressDoctor GmbH, a provider of postal address cleansing and verification software. And of course, Informatica made its biggest move early this year by acquiring Siperian, a provider of MDM tools.

IIR is an important component of Informatica’s complete MDM solution, and will help organizations implementing MDM gain the much-sought single view of identities (customers, patients, criminals and more) across multiple data sources. A key capability to look for in identity recognition and resolution tools is functionality in multiple languages and countries; with AddressDoctor added to the mix, Informatica has tools for locating and matching identities around the world. And thinking beyond law enforcement uses, global corporations with diverse markets need better tools for identity network analysis to improve marketing, billing, service and more, especially in this age of social media.

IIR can also help internally, given that data is often hidden in applications and obscure databases. A healthcare firm at the analyst conference described how it is using IIR for operations between its mainframes and users’ 30,000 Microsoft Access databases. Finally, one of the more interesting technology pairings I learned about at the conference was the real-time application of IIR for “identity-aware” event processing using Agent Logic, Informatica’s CEP engine. Watch lists and other espionage uses are an obvious application of this combination, but it could also be applied in systems for financial services, healthcare, retail and other industries.
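To illustrate the shape of identity-aware event processing (a hypothetical sketch, not a model of Agent Logic or IIR), the snippet below resolves the identity behind each incoming event and fires a rule when the same resolved identity appears in two related event types.

```python
# Hypothetical sketch of identity-aware event processing: a rule fires when the
# same resolved identity appears in two related event types in the stream.
from collections import defaultdict

def resolve_identity(name):
    # Stand-in for a real identity-resolution call; here, close name variants
    # simply collapse to one normalized key.
    return " ".join(name.lower().replace(".", "").split())

cash_deposits = defaultdict(list)

def handle_event(event):
    key = resolve_identity(event["name"])
    if event["type"] == "cash_deposit":
        cash_deposits[key].append(event)
    elif event["type"] == "wire_out" and cash_deposits[key]:
        print(f"ALERT: {event['name']} deposited cash and then wired funds out")

stream = [
    {"type": "cash_deposit", "name": "Jon Smith", "amount": 9500},
    {"type": "wire_out", "name": "Jon. Smith", "amount": 9400},
]
for event in stream:
    handle_event(event)
```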

In the olden days, identity might have seemed a simpler, more innocent matter, although viewing film noir and reading detective novels from the ’40s and ’50s might make you wonder. Today, however, there’s no question that identity is a complex topic that includes sensitive political and privacy ramifications. Software providers such as Informatica should be in for a wild ride.

Data Syndicators, Brokers and Providers, Oh My!

Posted in BI and Analytics, Information Management on August 27th, 2009 by admin

I’m writing this blog in the late afternoon in an alcove of the Hyatt Regency San Francisco’s famous atrium. A couple of hours ago, the MDM Summit finished up, and now men are rolling big carts stacked with crates of beer across the tile floor. Colored lights make the anodized aluminum bars of Charles O. Perry’s “Eclipse” sculpture look like a giant dusty rose. A tense-looking chef appears; buttoned up in his professional white smock, he walks with brisk steps toward the window to check his cell phone. Reception is bad in this hotel, which doesn’t look like it’s changed much since it opened rather spectacularly around 1970. Outside, the summer sun radiates light into the blue sky, blue bay and everything that’s moving: walkers, skateboarders, trolley cars and buses. In the distance, the first thin, grey vapor of fog stretches in a line across Angel Island. There’s more to come.

Before I succumb to the charms of this place and order an Anchor Steam, I’d like to offer some thoughts about master data management (MDM) and data governance based on what I heard at the conference. It will probably take a couple of blog installments to wrap it up. Here is the first.

Data is flying everywhere these days, particularly as more businesses turn to the Web for external sources that might give them an edge in customer intelligence. Social networks and communities could be rich sources of information about how customers relate to each other and self-define their communities. However, organizations are fooling themselves if they do not apply MDM processes to external sources with some of the same modeling, integration and quality effort they would devote to integrating views of data from internal sources. MDM processes are those that help organizations improve the quality and consistency of their data across multiple sources, increase their understanding of data relationships and manage how that data is accessed and distributed to users.
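A small, hypothetical sketch of what that discipline looks like in practice (the record layouts and the standardize and match_master functions are invented for illustration): an external profile is standardized and matched to an internal master record before any of its attributes are allowed to enrich the customer view, and the enrichment carries its source so it can be governed later.

```python
# Hypothetical sketch: an external record gets the same standardization, matching
# and survivorship care as an internal source before it enriches the master view.
master_customers = {
    "c-001": {"name": "Maria Delgado", "email": "maria.delgado@example.com"},
}

external_profile = {  # e.g. harvested from a social community or data provider
    "display_name": "  maria delgado ",
    "email": "Maria.Delgado@EXAMPLE.com",
    "interests": ["golf", "sailing"],
}

def standardize(profile):
    return {
        "name": " ".join(profile["display_name"].split()).title(),
        "email": profile["email"].strip().lower(),
        "interests": profile.get("interests", []),
    }

def match_master(record):
    # Simplistic exact match on email; a real MDM hub would use weighted, fuzzy rules.
    for customer_id, master in master_customers.items():
        if master["email"].lower() == record["email"]:
            return customer_id
    return None

clean = standardize(external_profile)
customer_id = match_master(clean)
if customer_id:
    # Enrich the master record, keeping provenance so the attribute can be governed.
    master_customers[customer_id]["interests"] = {
        "value": clean["interests"], "source": "external-community"}
print(master_customers)
```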

William McKnight, a partner with US-Analytics, and Louie Torres, now director of Business Solutions at Forbes after serving as its director of Information Systems, offered a useful presentation on “Incorporating Syndicated Data into Your MDM Environment.” The speakers noted that IT often exercises little control over the use of external sources; business units go out on their own to engage data syndicators, brokers and providers, which can create new data silos in the organization. However, they do this for business reasons – and as Torres pointed out, they own the P&L. “We are very interested in ‘psychographics’ that tell us what people like to do,” he said. To be closer to business units’ decisions about data sources and more helpful to their use of them, Torres, as director of Business Solutions, has moved out of IT.

With data syndicators such as InfoUSA offering to send as many as 6,000 data points on customers, the speakers noted that it is important to look at what you really need – and whether you need to pay to have it updated. “How many birthday or gender updates do we really need?” said Torres. Forbes is looking for granular data about who has a yacht, who likes to play golf and other rather moneyed customer activities. So the company will focus most closely on data points that deliver on those matters. The speakers suggested that companies should determine what they want, and not pay for what they don’t need. Nor should they take in data points that could offer suspect information and invite unnecessary information management headaches into the organization.
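As a trivial illustration of paying only for what you need (the attribute names and the filter_syndicated function are hypothetical), the sketch below keeps only the agreed-upon data points from a syndicated record and quarantines suspect values instead of loading them.

```python
# Hypothetical sketch: keep only the syndicated attributes the business actually
# wants, and quarantine suspect values instead of loading them into the hub.
WANTED_ATTRIBUTES = {"owns_yacht", "plays_golf", "household_income_band"}

def filter_syndicated(record):
    kept, quarantined = {}, {}
    for attribute, value in record.items():
        if attribute not in WANTED_ATTRIBUTES:
            continue  # not worth paying to acquire or maintain
        if value in (None, "", "UNKNOWN"):
            quarantined[attribute] = value  # suspect: review before loading
        else:
            kept[attribute] = value
    return kept, quarantined

incoming = {"owns_yacht": "Y", "plays_golf": "UNKNOWN", "birthday": "03-02",
            "gender": "F", "household_income_band": "250K+"}
kept, quarantined = filter_syndicated(incoming)
print("load:", kept)           # {'owns_yacht': 'Y', 'household_income_band': '250K+'}
print("review:", quarantined)  # {'plays_golf': 'UNKNOWN'}
```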

In part, headaches come because data brokers and syndicators are buying and selling data behind the scenes like mad. You often don’t know what they have pulled together to create a sellable data package. This can create data quality and consistency problems, making it important for organizations to use MDM processes to ascertain and ensure the quality of customer views. It also means that organizations should develop policies for external sources as part of their data governance. I will cover this aspect in a later blog.

What about the MDM industry itself? I caught the part of conference chair Aaron Zornes’ talk where he was discussing the systems integrators (SIs) and the importance of their role in MDM. He said that his organization, The MDM Institute, will have a report out on this subject shortly. One important point he made: There’s been a lot of “moving and shaking” in the SI industry, particularly regarding information management practices. Thus, Zornes offered a word of caution to buyers: organizations should make sure that the expertise an SI promises is still what the firm is capable of delivering.