DATA VIRTUALIZATION: TOP USE CASES THAT MAKE A DIFFERENCE
/by Dimitri Maesfranckx
Traditional data warehouses centered on repositories are no longer sufficient to support today’s complex data and analytics landscape. The logical data warehouse (LDW) combines the strengths of traditional repository warehouses with alternative data management and access strategies to improve your agility, accelerate innovation, and respond more efficiently to changing business requirements.
LDW requirements include:
- A data services approach that separates data access from processing, processing from transformation, and transformation from delivery
- Diverse analytic tools and users
- Diverse data types and sources including traditional data repositories, distributed processing (big data), virtualized sources, and analytic sandboxes
- Unified business ontologies that resolve diverse IT taxonomies via common semantics
- Unified information governance including data quality, master data management, security, and more
- Service level agreement (SLA) driven operationalization
Data Virtualization provides a virtualization-centric LDW architecture solution.
With Data Virtualization you can:
- Access any source including traditional data repositories, distributed processing (big data), virtualized sources, and analytic sandboxes, both on-premises and in the cloud
- Model and transform data services quickly, in conformance with semantic standards
- Deliver data in support of a wide range of use cases via industry-standard APIs including ODBC, JDBC, SOAP, REST, and more (a consumer-side code sketch follows this list)
- Share and reuse data services across many applications
- Automatically allocate workloads to match SLA requirements
- Align data access and use with enterprise security and governance requirements
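To make the delivery capability above concrete, here is a minimal consumer-side sketch, assuming a data virtualization server that exposes its virtual views over ODBC. The DSN, credentials, view name, and columns are all invented for illustration; any DV product with an ODBC endpoint would be addressed in roughly this way.

```python
import pyodbc

# Connect to a hypothetical data virtualization server through ODBC.
# "DV_SERVER" is an assumed DSN configured on the client machine.
conn = pyodbc.connect("DSN=DV_SERVER;UID=analyst;PWD=secret")
cursor = conn.cursor()

# "customer_360" is an illustrative virtual view: behind it, the DV layer
# may federate a warehouse, a SaaS API, and a big data cluster, but the
# consumer sees a single SQL-addressable dataset.
cursor.execute(
    "SELECT customer_id, region, lifetime_value "
    "FROM customer_360 WHERE region = ?",
    ("EMEA",),
)
for row in cursor.fetchall():
    print(row.customer_id, row.lifetime_value)

conn.close()
```

The same virtual view could be reached over JDBC or REST without changing anything on the source side; that is the point of the standards-based delivery layer.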
With Data Virtualization you get:
- One logical place to go for analytic datasets regardless of source or application
- Better analysis from broader data access and more complex transformations
- Faster analysis time-to-solution via agile data service development and reuse
- Higher quality analysis via consistent, well-understood data
- Higher SLAs via loose coupling and optimization of access, processing, and transformation
- Flexibility to add or change data sources or application consumers as required
- More complete and consistent enterprise data security and governance
The idea of having all of your data stored and available in a data lake sounds wonderful to everyone. Finally, the promise of Big Data can become reality. But what is the reality behind implementing a data lake? Virtually all organizations already have numerous data repositories – data warehouses, operational data stores, data stored in files, and so on. It is impossible to load all of this data into a data lake and give everyone access to it. Beyond data volume, you should also think about other elements such as different data formats, data quality, and even data security.
All this complexity, however, doesn’t mean that you should forget about a data lake. Using Data Virtualization, it is possible to leave the data in its current environment and create a virtual, or better, a logical data lake (a toy federation sketch closes this use case).
Challenges include:
- Moving all the data into a single location
- Moving all the data to that single location in a timely manner
- Delivering the data as integrated data
- Delivering fresh data
- Deciding where to store the data, and limiting the impact when this storage mechanism needs to change
- Avoiding, or at least limiting, the impact of ever-changing technologies
Data Virtualization lets you:
- Isolate your data consumers from the underlying data architecture
- Be flexible in changing the underlying architecture without impacting your data consumers
- Deliver fresh data without the need to move it upfront
With Data Virtualization you get:
- Fresh data for your data consumers
- Agile data delivery in the shape the data consumers like to have it
- Data delivered independently of whether it is located in traditional systems, multiple big data systems, or a combination of these
- Agility to change the data architecture without impacting the data consumers
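As the toy illustration promised above: the sketch below joins a relational table and a flat file at access time, without first copying either into a central store. The table, file, and columns are invented, and a real logical data lake would delegate this federation to the DV engine rather than hand-written code.

```python
import csv
import sqlite3

# Source 1: an operational table in a relational database (SQLite stands in
# for any warehouse or operational store).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (order_id INTEGER, customer_id TEXT, amount REAL)")
db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "C-001", 120.0), (2, "C-002", 75.5), (3, "C-001", 19.9)],
)

# Source 2: a flat file, standing in for data landed in a lake or an export.
# Assumed layout: customer_id,segment
with open("customers.csv", "w", newline="") as f:
    csv.writer(f).writerows([["C-001", "gold"], ["C-002", "silver"]])

segments = {}
with open("customers.csv", newline="") as f:
    for customer_id, segment in csv.reader(f):
        segments[customer_id] = segment

# "Logical" join at query time: neither source was moved or replicated.
for order_id, customer_id, amount in db.execute(
    "SELECT order_id, customer_id, amount FROM orders"
):
    print(order_id, customer_id, segments.get(customer_id, "unknown"), amount)
```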
Physical integration is a proven approach to analytic data integration; however, the long lead times associated with it (on average 7+ weeks, according to TDWI) can delay realizing business value. Further, physical integration requires significant data engineering effort and a complex software development lifecycle.
Challenges include:
- Requirements. Business requirements are not always clear at the start of a project and thus can be difficult for business users to clearly communicate.
- Design. Identifying and associating new mappings, new ETLs, and schema changes is complex. Further, current data engineering staff may not understand older schemas and ETLs. This makes detailed technical specifications a key requirement.
- Development. Schema changes and ETL builds are required prior to end user validation. Resultant rework cycles often delay solution delivery.
- Deployment. Modifying existing warehouse / data mart schemas and ETLs can be difficult and/or risky.
Data Virtualization lets you:
- Interactively refine requirements, and based on actual data, build virtual data services side-by-side with business users.
- Quickly deploy agreed datasets into production to meet immediate business needs.
- Invest additional engineering efforts on physical integration later, only if required.
- If required, use mappings and destination schema within the proven dataset as a working prototype for physical integration ETLs and schema changes.
- Once physical integration is tested, transparently migrate from virtual to physical without loss of service.
With Data Virtualization you get:
- Faster time-to-solution than physical integration, and accelerated business benefits
- Less effort spent on upfront requirements definition and technical specification
- The right level of data engineering required to meet requirements, while avoiding unnecessary over-engineering
- Less disruption of existing physical repositories, schemas, and ETLs
Vendor-specific analytic semantic layers provide specialized data access and semantic transformation capabilities that simplify your analytic application development.
However, these vendor-specific semantic layer solutions have limitations including:
- Delayed support of new data sources and types
- Inability to share analytic datasets with other vendors’ analytic tools
- Federated query performance that is not well optimized
- Limited range of transformation capabilities and tools
Data Virtualization provides a vendor-agnostic data access and semantic layer that addresses these challenges.
Data Virtualization lets you:
- Access any data source required
- Model and transform analytic datasets quickly
- Deliver analytic data to a wide range of analytics vendor tools via industry-standard APIs including ODBC, JDBC, SOAP, REST, and more
- Share and reuse analytic datasets across multiple vendors’ tools
- Automatically optimize queries
- Conform analytic data access and delivery to enterprise security and governance requirements
With Data Virtualization you get:
- One place to go for analytic datasets regardless of analytic tool vendor
- Better analysis from broader data access and more complex transformations
- Lower costs, with reuse of analytic datasets across diverse analytic tools and users
- Faster query performance
- Greater analytic data security and governance
Self-service data preparation has proven to be a great way for business users to quickly transform raw data into more analytics-friendly datasets. However, some agile data preparation needs require data engineering skills and higher-level integration capabilities.
Challenges include:
- Support for increasingly diverse and distributed data sources and types
- Limited range of transformation capabilities and tools
- Constraints on securing, governing, sharing, reusing, and productionizing prepared datasets
Data Virtualization provides an agile data preparation solution for data engineers that complements business user data preparation tools.
Data Virtualization lets you:
- Interactively refine requirements and prepare datasets with business users based on actual data
- Prepare datasets that may require complex transformations or high-performance queries
- Leverage existing datasets when preparing new datasets
- Quickly deploy prepared datasets into production when appropriate
- Align data preparation activities with enterprise security and governance requirements
With Data Virtualization you get:
- Rapid, IT-grade datasets that meet analytic data needs
- The right level of data engineering required to meet requirements, while avoiding unnecessary over-engineering
- Less effort spent productionizing datasets
- More complete and consistent data security and governance
Physical operational data stores (ODSs) have proven to be a useful compromise, balancing operational data access needs with operational system SLAs.
However, replicating operational data in an ODS is not without its costs.
Challenges include:
- Significant development investments for ODS set up, and for integration projects that move data to them.
- Higher operating costs for managing the associated infrastructure.
- Integration workloads on the operational system.
- Often the operational source is not resource-constrained, or operational queries are light enough not to create significant workloads, which calls the need for a physical ODS into question.
- When operational data is in an ODS, it may still require further transformations to make it useful for diverse analysis needs.
Data Virtualization lets you:
- Access any operational data or other sources as required
- Model and transform operational datasets quickly
- Deliver data to a wide range of operational applications via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
- Share and reuse analytic datasets across applications
- Reduce the impact on operational sources via query optimization and intelligent caching
- Conform operational data access and delivery to enterprise security and governance requirements
With Data Virtualization you get:
- One virtual place to go for operational data
- Better analysis from broader data access and more flexible transformations
- Lower costs due to less replicated data maintained in physical ODSs
- Query performance that is more than good enough, without impacting operational system SLAs
The data hub is a logical architecture that enables data sharing by connecting producers of data (applications, processes, and teams) with consumers of data (other applications, processes, and teams). Master data hubs, logical data warehouses, customer data hubs, reference data stores, and more are examples of different kinds of data hubs. Data hub domains might be geographically focused, business process-focused, or application-focused.
Challenges include:
- The data hub must provision data to and receive data from analytic and operational applications
- Hub data must be governed and secure
- Data flows into and out of the hub must be visible
Data Virtualization’s data hub solution delivers on these requirements.
Data Virtualization lets you:
- Introspect sources and identify potential data hub entities and relationships
- Access any data hub data source
- Model and transform data hub datasets
- Deliver data hub datasets to diverse analytic and operational applications via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more (a REST sketch follows this list)
- Share and reuse data hub datasets across multiple applications
- Conform data hub access and delivery to enterprise security and governance requirements
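As a hedged sketch of the delivery item above, the snippet below publishes a hub dataset as a plain REST/JSON data service using Flask. The endpoint path and payload are invented; in a real deployment the list would be a governed virtual view over the hub’s sources, not an in-memory constant.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Illustrative hub dataset; in practice this would be federated on demand
# from the hub's underlying sources.
PRODUCTS = [
    {"sku": "A-100", "name": "Widget", "status": "active"},
    {"sku": "B-200", "name": "Gadget", "status": "retired"},
]

@app.route("/datahub/products")
def products():
    # Consumers get the governed hub dataset over plain REST/JSON,
    # regardless of where the underlying data actually lives.
    return jsonify(PRODUCTS)

if __name__ == "__main__":
    app.run(port=8080)
```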
With Data Virtualization you get:
- A complete solution for data hub implementations
- Better analysis and business processes via consistent use of data hub datasets
- Higher analytic and operational application quality via consistent use of data hub datasets
- Greater agility when adding or changing data hub datasets
- Complete visibility into data hub data flows
- End-to-end data hub security and governance
Master Data Management (MDM) is an essential capability. Analyst firms such as Gartner have identified four MDM implementation styles (consolidation, registry, centralized, and coexistence) that you can deploy independently or combine to help enable successful MDM efforts.
Challenges include:
- Access to master and reference data from diverse sources
- A cross-reference table (index) that reconciles and links related master data entities and identifiers by source
- Data services that expose the cross-reference table to analytic and operational applications that require master data from one or more sources
- Data federation that leverages the cross-reference table when querying detailed data associated with master entities
Data Virtualization is a proven technology for registry-style MDM solutions.
Data Virtualization lets you:
- Introspect sources and identify potential master data entities and relationships
- Build a physical master data registry that relates and links master data across sources (a toy cross-reference sketch follows this list)
- Cache registry copies adjacent to MDM user applications to accelerate frequent MDM queries
- Combine master, detail, and non-master data to provide more complete 360-degree views of key entities
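To ground the registry idea, here is a minimal sketch of one shape a cross-reference table can take and how a 360-degree lookup might use it. All source names, identifiers, and fields are invented; a production registry would be a physical table maintained by matching rules, with detail fetched via federated queries rather than in-memory dictionaries.

```python
# A toy registry-style cross-reference: one golden master key per customer,
# linked to the local identifier each source system uses for that entity.
XREF = {
    "MASTER-001": {"crm": "C-9912", "billing": "ACC-77", "support": "U-314"},
    "MASTER-002": {"crm": "C-1044", "billing": "ACC-80"},
}

# Stand-ins for per-source detail lookups; in practice these would be
# queries pushed down to each source system.
SOURCES = {
    "crm":     {"C-9912": {"name": "Acme NV"}, "C-1044": {"name": "Globex"}},
    "billing": {"ACC-77": {"balance": 1200.0}, "ACC-80": {"balance": 0.0}},
    "support": {"U-314": {"open_tickets": 2}},
}

def master_view(master_id: str) -> dict:
    """Assemble a 360-degree view by resolving the cross-reference entry
    and pulling detail from every source that knows this entity."""
    view = {"master_id": master_id}
    for source, local_id in XREF[master_id].items():
        view.update(SOURCES[source].get(local_id, {}))
    return view

print(master_view("MASTER-001"))
# {'master_id': 'MASTER-001', 'name': 'Acme NV', 'balance': 1200.0, 'open_tickets': 2}
```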
With Data Virtualization you get:
- A complete solution for registry-style MDM implementations
- Better analysis via more complete views of master data entities across sources
- Higher analytic and data quality via consistent use of master and reference data
- Faster query performance and less disruption to master data sources
- Greater agility when adding or changing master and reference data sources
New technology provides more advanced capabilities and lower-cost infrastructure, and you want to take advantage. However, migrating legacy data repositories or legacy applications to new technology is not easy.
Challenges include:
- Business continuity requires non-stop operations before, during, and after the migration.
- Applications and data repositories are often tightly coupled making them difficult to change.
- Big bang cutovers are problematic due to so many moving parts.
- Too often, testing and tuning only happen after the fact.
Data Virtualization provides a flexible solution for legacy system migration challenges.
Data Virtualization lets you:
- Create a loosely coupled middle tier of data services that mirrors as-is data access, transformation, and delivery functionality (sketched in code after this list)
- Test and tune these data services on the sidelines without impacting current operations
- Modify the as-is data services to now support the future-state application or repository, then retest and retune
- Migrate the legacy application or repository
- Implement future-state data services to consume or deliver data to and from the new application or repository
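One way to picture that middle tier in code is the sketch below: a stable data-service contract that consumers call, with the legacy and future-state back ends as interchangeable implementations behind it. The service name and payloads are hypothetical.

```python
from typing import Protocol

class OrderService(Protocol):
    """The contract consumers code against; it stays stable across migration."""
    def get_order(self, order_id: int) -> dict: ...

class LegacyOrderService:
    """Mirrors as-is access against the legacy repository."""
    def get_order(self, order_id: int) -> dict:
        # ...query the legacy store here...
        return {"order_id": order_id, "source": "legacy"}

class NewOrderService:
    """Future-state implementation against the new repository."""
    def get_order(self, order_id: int) -> dict:
        # ...query the new store here...
        return {"order_id": order_id, "source": "new"}

# Cutover becomes a configuration change in the middle tier, not a change
# to every consumer: swap the binding once the new service is tested.
service: OrderService = LegacyOrderService()
print(service.get_order(42))   # before migration
service = NewOrderService()
print(service.get_order(42))   # after migration, same consumer code
```

Because each back end can be tested behind the same contract, the big-bang cutover the challenges list warns about is avoided.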
With Data Virtualization you get:
- To take advantage of new technology opportunities that can improve your business and cut your costs
- Loose coupling you need to divide complex migration projects into more manageable phases
- Less risk by avoiding big bang migrations
- Reusable data services that are easy to modify and extend for additional applications and users
Your applications run on data. However, application data access can be difficult.
Challenges include:
- The need to understand and access increasingly diverse and distributed data sources and types
- Difficulty in sharing data assets with other applications
- Federated query performance that may require optimization
- Complex transformations that may require specialized tools and techniques
- Complex data and application security requirements that need to be enforced
Data Virtualization provides a powerful solution to these application data access challenges.
Data Virtualization lets you:
- Access any data source required
- Model and transform application datasets quickly
- Deliver data to a wide range of application development tools via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
- Share and reuse application datasets across multiple analytic and operational applications
- Automatically optimize queries
- Conform data access and delivery to enterprise security and governance requirements
With the rise of cloud-based applications and infrastructure, more data than ever resides outside your enterprise. As a result, your need to share data across your cloud and enterprise sources has grown significantly.
Challenges include:
- The need to understand and access increasingly diverse cloud data sources and APIs
- Diverse data consumers, each with their own data needs and application technologies
- Complex transformations that may require specialized tools and techniques
- Wide-area network (WAN) query performance that may require optimization
- Complex cloud data security requirements that need to be enforced
Data Virtualization provides a powerful solution for these cloud data sharing challenges.
Data Virtualization lets you:
- Access any cloud data source
- Model and transform cloud datasets quickly
- Deliver cloud data to a wide range of application development tools via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
- Share and reuse cloud data across multiple applications
- Automatically optimize queries and apply caching to mitigate WAN latency (a toy cache sketch follows this list)
- Align data access and delivery to conform with enterprise and cloud data security and governance requirements
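As a toy version of the caching item above: a small time-to-live cache in front of a WAN-bound query, so repeated requests are served locally while the entry is fresh. The TTL, query, and fetch function are invented stand-ins; real DV engines manage cache refresh and invalidation policies for you.

```python
import time

CACHE_TTL_SECONDS = 300   # assumed freshness window; tune per dataset
_cache = {}               # query -> (timestamp, rows)

def fetch_from_cloud(query):
    """Stand-in for an expensive query over the WAN to a cloud source."""
    time.sleep(0.1)  # simulated network latency
    return [{"query": query, "row": i} for i in range(3)]

def cached_query(query):
    """Serve repeated queries locally while fresh; pay the WAN round trip
    only on a cache miss or after the TTL expires."""
    now = time.time()
    hit = _cache.get(query)
    if hit is not None and now - hit[0] < CACHE_TTL_SECONDS:
        return hit[1]
    rows = fetch_from_cloud(query)
    _cache[query] = (now, rows)
    return rows

cached_query("SELECT * FROM sales")  # goes over the WAN
cached_query("SELECT * FROM sales")  # served from the local cache
```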
With Data Virtualization you get:
- One place to go for cloud and enterprise data
- Better applications from broader cloud data access and more complex transformations
- Lower costs due to dataset reuse across diverse applications
- Faster query performance
- Greater cloud data security and governance
Would you like to know how Datalumen can also help you understand how your organization can benefit from using Data Virtualization? Contact us and start our data conversation.
TIBCO NOW 2020 – KEY DATA MANAGEMENT NEWS
/by Datalumen
Day 1-2-3
Day 2 built upon Day 1’s discussions of why sustainable business practices and data-centric tools are critical, with very practical discussions about how organizations can build a foundation for sustainable innovation. “Data is the common element,” explained Dan Streetman, TIBCO’s CEO. “How data connects systems, people, and devices. How we manage, unify, and bring order to that data and ultimately how we build on that data foundation to make faster, smarter decisions is how you build a framework for sustainable innovation.”
Day 3 was all about delivering on the promise of sustained innovation and was led by TIBCO’s Head of Customer Excellence, Jeff Hess. He explained that it all starts with putting your customers first. You need to constantly ask, “How can we make life better for our customers?” It’s a powerful clarifier in a world of competing priorities, and asking it every day is the only way to ensure you can deliver on the promise of sustained innovation.
What’s new in the TIBCO data management corner?
- TIBCO Any Data Hub
- TIBCO Data Virtualization
Next to that, there was also the necessary focus on the key Master Data Management (MDM) component, TIBCO EBX (formerly Orchestra Networks’ MDM, acquired by TIBCO), which still manages to dominate the MDM market. Linked to EBX, and better integrated with it, is the second generation of the recently introduced TIBCO Cloud Metadata, which step by step is extending the platform’s capabilities in the metadata management domain.
Ready for your TIBCO NOW binge-watching?
TIBCO NOW 2020 has finished, but be aware that you can still watch everything on demand through October 9, 2020. TIBCO is leaving registration open after the event, so if you haven’t already registered, you have one last chance. Sessions worth checking out:
- Empower Your Business with Any Data from Any Location at Any Speed. Watch ▶
- Build a virtual data layer that integrates your enterprise data silos. Watch ▶
- Govern All Your Master and Reference Data. Watch ▶
- Discover, Catalog, and Govern All Your Enterprise Metadata. Watch ▶
- OPG: Billion-dollar Nuclear Programs on Time and on Budget with MDM. Watch ▶
- What’s new about TIBCO EBX & TIBCO Cloud Metadata. Watch ▶
Did you miss any of these sessions? Do you have a specific data question? Would you like a briefing or a demo for your team? Let us know and we are happy to set up a one-on-one.
SUMMER READING TIP
/by Datalumen
Summer is here, and the longer days it brings mean more time available to spend with a ripping read. That’s how it ideally works, at least. We selected three valuable books worth your extra time.
The Chief Data Officer’s Playbook
The issues and profession of the Chief Data Officer (CDO) are of significant interest and relevance to organisations and data professionals internationally. Written by two practicing CDOs, this new book offers a practical, direct and engaging discussion of the role, its place and importance within organisations. Chief Data Officer is a new and rapidly expanding role, and many organisations are finding that it is an uncomfortable fit into the existing C-suite. Bringing together views, opinions and practitioners’ experience for the first time, The Chief Data Officer’s Playbook offers a compelling guide to anyone looking to understand the current (and possible future) CDO landscape.
Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility
Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility, the first book ever written on the topic of data virtualization, introduces the technology that enables data virtualization and presents ten real-world case studies that demonstrate the significant value and tangible business agility benefits that can be achieved through the implementation of data virtualization solutions. The book not only introduces the relationship between data virtualization and business agility but also gives you a more thorough exploration of data virtualization technology. Topics include what data virtualization is, why to use it, how it works, and how enterprises typically adopt it.
Start With Why
Simon Sinek started a movement to help people become more inspired at work, and in turn inspire their colleagues and customers. Since then, millions have been touched by the power of his ideas, including more than 28 million who’ve watched his TED Talk based on ‘Start With Why’ — the third most popular TED video of all time. Sinek starts with a fundamental question: Why are some people and organizations more innovative, more influential, and more profitable than others? Why do some command greater loyalty from customers and employees alike? Even among the successful, why are so few able to repeat their success over and over?
People like Martin Luther King, Steve Jobs, and the Wright Brothers had little in common, but they all started with Why. They realized that people won’t truly buy into a product, service, movement, or idea until they understand the Why behind it. ‘Start With Why’ shows that the leaders who’ve had the greatest influence in the world all think, act, and communicate the same way — and it’s the opposite of what everyone else does. Sinek calls this powerful idea The Golden Circle, and it provides a framework upon which organizations can be built, movements can be led, and people can be inspired. And it all starts with Why.
Summer Giveaways
We’re giving away 50 copies of ‘Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility’. Want to win? Just complete the form and cross your fingers. Good luck!
Winners are picked randomly at the end of the giveaway. Our privacy policy is available here.
FORRESTER DATA VIRTUALIZATION MARKET Q4 2017 UPDATE
/by Datalumen
Forrester Research recently published The Forrester Wave™: Enterprise Data Virtualization, Q4 2017 report. The firm profiled 13 vendors in this report. The last Wave on this topic was published a while ago, in March 2015, with 9 vendors. Here is an overview of what has changed in the last two-and-a-half years.
Data Virtualization Market has Expanded
According to this Forrester report, the enterprise data virtualization market has expanded along multiple dimensions – customer adoption, more industries, more use cases, new players, and acquisitions.
- More customer adoption – Forrester states customer adoption of data virtualization has been gaining momentum. In 2017, Forrester surveyed 2,106 global technology decision makers for its Global Business Technographics Data and Analytics Survey and found that “…56% of global technology decision makers in our 2017 survey tell us they have already implemented, are implementing, or are expanding or upgrading their implementations of DV technology, up from 45% in 2016.”
- More industries – Forrester states that in its early years, data virtualization was primarily used in financial services, telecom, and government sectors. In the last 5 years, however, Forrester has found significant adoption of DV in insurance, retail, healthcare, manufacturing, oil and gas, and eCommerce verticals as well.
- More use cases – Further, Forrester found that among the customers who have been using data virtualization, deployment has increased from a single use case, primarily customer analytics, to broader enterprise-wide use involving multiple use cases such as the internet of things, fraud detection, and integrated insights.
- New players – In the 2017 Enterprise Data Virtualization Wave report, four new vendors have been included, implying, in our opinion, an expanding data virtualization market.
- Acquisitions – In a sign that the data virtualization market is maturing, TIBCO Software recently acquired Cisco Information Server, thus entering the data virtualization market.
We think all these data points are significant indicators that the data virtualization market is a healthy, growing market that is reaching maturity.
Data Virtualization Poised for Further Growth Pushed Forward by Leaders
Forrester expects the data virtualization market to grow further “because more enterprise architecture (EA) professionals see data virtualization as critical to their enterprise data strategy.” It says that these EA pros are looking to support more complex data virtualization deployments. To satisfy such needs, the leaders featured in the report provide high-end scale, security, modeling, and broad use case support with their mature product offerings. “The leaders we identified offer large and complex deployments, and they support a broader set of use cases and more mature data management capabilities,” Forrester says. It is worth noting that four of the past five Leaders retained their positions, while one vendor slipped into the Strong Performers category.
Read the Complete Report
The Forrester Wave: Enterprise Data Virtualization, Q4 2017 is a must read for enterprise architecture (EA) professionals. According to Forrester, “Enterprise data virtualization has become critical to every organization in overcoming growing data challenges. These platforms deliver faster access to connected data and support self-service and agile data-access capabilities for EA pros to drive new business initiatives.”
Would you like to know what Data Virtualization can also mean for your organization? Have a look at our Data Virtualization section and contact us.
HOW TO REALLY SHIFT FROM IT-DRIVEN TO SELF-SERVICE ANALYTICS WITH DATA VIRTUALIZATION? LOOK BEYOND THE SHOP WINDOW.
/by Datalumen
Business intelligence & analytics today have dramatically shifted from the traditional IT-driven model to a modern self-service approach. This is due to a number of changes, including the fact that the balance of power has steadily shifted from IT to the business, and the fact that the business community now has access to more innovative technologies that give it powerful analytical and visualization capabilities (e.g. Tableau). This increased use and capability has put the business in the driver’s seat of much front-end BI decision-making.
In order to help your business community continue to increase its self-service capabilities, there is one important but often-overlooked item: many implementations fail to realize their full potential because they fall into the trap of building out just the proverbial shop window and forgetting the actual shop! It is just as important to add accessibility and flexibility to the underlying data layer (and ease the access, discovery, and governance of your data) as it is to provide users a powerful front end through analytics and visualization capabilities.
With respect to self-service analytics, four phases can be identified in the market. These also typically mirror how analytics are implemented in many companies. The following diagram describes, in four phases, how data virtualization can strengthen and enrich the self-service data integration capabilities of tools for reporting and analytics:
THE NEED FOR DATA PREPARATION AND DATA VIRTUALIZATION
To support both IT-driven and business-driven BI, two techniques are required: data preparation and data virtualization. There are many scenarios where you can use these techniques to strengthen and speed up the implementation of self-service analytics:
- Using data virtualization to operationalize user-defined data sets
- Using data virtualization as a data source for data preparation
- Using data virtualization to make data sets developed with data preparation available to all users
Read more in detail about the different scenarios in the ‘Strengthening Self-Service Analytics with Data Preparation and Data Virtualization’ whitepaper. In addition, this whitepaper describes how these two BI forms can operate side by side in a cooperative fashion without lowering the level of self-service for business users. In other words, it describes how the best of both worlds can be combined. The whitepaper is written by Rick van der Lans, an independent analyst and expert.
To learn how to succeed in your data journey, feel free to contact us. More info about our full spectrum of data solutions is also available on the Datalumen website.