2022 - Datalumen

COLLIBRA DATA CITIZENS 22 – INNOVATIONS TO SIMPLIFY AND SCALE DATA INTELLIGENCE ACROSS ORGANIZATIONS WITH RICH USER EXPERIENCES

03/11/2022/in Collibra, Data Governance, Data Quality/by Datalumen

Collibra has introduced a range of new innovations at the Data Citizens ’22 conference, aimed at making data intelligence easier and more accessible to users.

Collibra Data Intelligence Cloud has introduced various advancements to improve search, collaboration, business process automation, and analytics capabilities. Additionally, it has also launched new products to provide data access governance and enhance data quality and observability in the cloud. Collibra Data Intelligence Cloud merges an enterprise-level data catalog, data lineage, adaptable governance, uninterrupted quality, and in-built data privacy to deliver a comprehensive solution.

Let’s have a look at the new announced functionality:

Simple and Rich Experience is the key message

Marketplace	Frequently, teams face difficulty in locating dependable data for their use. With the introduction of the Collibra Data Marketplace, this task has become simpler and quicker than ever before. Teams can now access pre-selected and sanctioned data through this platform, enabling them to make informed decisions with greater confidence and reliability. By leveraging the capabilities of the Collibra metadata graph, the Data Marketplace facilitates the swift and effortless search, comprehension, and collaboration with data within the Collibra Data Catalog, akin to performing a speedy Google search.
Usage analytics	To encourage data literacy and encourage user engagement, it’s important to have a clear understanding of user behavior within any data intelligence platform. The Usage Analytics dashboard is a new feature that offers organizations real-time, useful insights into which domains, communities, and assets are being used most frequently by users, allowing teams to monitor adoption rates and take steps to optimize their data intelligence investments.
Homepage	Creating a user-friendly experience that allows users to quickly and easily find what they need is crucial. The revamped Collibra homepage offers a streamlined and personalized experience, featuring insights, links, widgets, and recommended datasets based on a user’s browsing history or popular items. This consistent and intuitive design ensures that users can navigate the platform seamlessly, providing a hassle-free experience every time they log into Collibra Data Intelligence Cloud.
Workflow designer	Data teams often find manual rules and processes to be challenging and prone to errors. Collibra Data Intelligence Cloud’s Workflow Designer, which is now in beta, addresses this issue by enabling teams to work together to develop and utilize new workflows to automate business processes. The Workflow Designer can be accessed within the Collibra Data Intelligence Cloud and now has a new App Model view, allowing users to quickly define, validate, and deploy a set of processes or forms to simplify tasks.

Improved performance, scalability, and security

Collibra Protect	Collibra Protect is a solution that offers smart data controls, allowing organizations to efficiently identify, describe, and safeguard data across various cloud platforms. Collibra has collaborated with Snowflake, the Data Cloud company, to offer this new integration that enables data stewards to define and execute data protection policies without any coding in just a matter of minutes. By using Collibra Protect, organizations gain greater visibility into the usage of sensitive and protected data, and when paired with data classification, it helps them protect data and comply with regulations at scale.
Data Quality & Observability in the Cloud	Collibra’s latest version of Data Quality & Observability provides enhanced scalability, agility, and security to streamline data quality operations across multiple cloud platforms. With the flexibility to deploy this solution in any cloud environment, organizations can reduce their IT overhead, receive real-time updates, and easily adjust their scaling to align with business requirements.
Data Quality Pushdown for Snowflake	The new feature of Data Quality Pushdown for Snowflake empowers organizations to execute data quality operations within Snowflake. With this offering, organizations can leverage the advantages of cloud-based data quality management without the added concern of egress charges and reliance on Spark compute.
New Integrations	Nowadays, almost 77% of organizations are integrating up to five diverse types of data in pipelines, and up to 10 different types of data storage or management technologies. Collibra is pleased to collaborate with top technology organizations worldwide to provide reliable data across a larger number of sources for all users. With new integrations currently in beta, mutual Collibra customers utilizing Snowflake, Azure Data Factory, and Google Cloud Storage can acquire complete visibility into cloud data assets from source to destination and offer trustworthy data to all users throughout the organization.

Some of this functionality was announced as beta and is available to a number of existing customers for testing purposes.

Want to Accelerate your Collibra time to value and increase adoption?

Would you like to find out how Datalumen can help? Contact us and start our data conversation.

HOW A BELGO-GLOBAL BIOPHARMACEUTICAL COMPANY ACCELERATES LIFE-SAVING DRUG DISCOVERIES BY REDUCING QUERY LATENCY 30X

09/06/2022/in Data Acceleration, SingleStore/by Datalumen

Discovering new medical therapies and drugs is a complex and time-consuming process. High-performance computing and interactive data analytics can make a substantial difference and accelerate discovery and time-to-market.

UCB is a multinational biopharmaceutical company headquartered in Brussels. UCB focuses primarily on research and development, specifically involving medications centered on epilepsy, Parkinson’s disease, and Crohn’s disease. The company’s efforts are focused on treatments for severe diseases treated by specialists, particularly in the fields of central nervous system disorders (including epilepsy), inflammatory disorders (including allergy) and oncology.

Challenges

The company’s data backbone built on Oracle database technology was struggling to meet the new data platform’s performance demands. Next to performance scalelability was also a point of attention as future data volumes would make the current system unstable.

UCB wanted to build a data platform for early stage drug discovery enabling scientists to quickly access research data via self-service and as a result accelerate the early discovery process. Its goals included:

Decreasing the time scientists spend waiting for data. Previously, scientists had to wait for days or weeks for the data to be made available for analysis by IT.
Helping scientists reach answers faster. A faster and easier-to-use system would allow scientists to test hypotheses quickly.
A user-friendly self-service interface. Scientists would be able to use this system without needing assistance from IT. They could easily pick and choose the right data for their experiments, prepared in the way that makes the most sense for the success of the experiment.

30X Speed improvements enabling real-time dashboards
48X batch data refresh ops improvement

Requirements

The new data platform needed to deliver faster query performance, while being able to easily scale to handle the large, complex queries scientists used to gather research data. The new solution had to accommodate massive data sets and growth so the company wouldn’t need to transition to another database in the future.

More importantly, UCB’s new early stage discovery data platform needed to be built around the FAIR principle: Findable, Accessible, Interoperable, and Reusable. The chosen database technology would need to fit this approach and easily connect with the rest of UCB’s data stack.

Platform choice

Up to 800 scientists worldwide rely on UCB’s data platform every day to power their essential research. “We chose SingleStore because it offered a modern database that can accelerate time to insight with ultra-fast ingestion, super-low latency, high-performance queries, and high concurrency,” said Frédéric Vanclef, Senior IT Expert, UCB. SingleStore offers parallel, high-scale streaming data ingest that can handle trillions of events per second for immediate availability and concurrency for tens of thousands of users, which supports UCB’s current needs and positions it for future growth.

Outcomes

With SingleStore, UCB is now giving scientists what they need in real time to get the answers they need faster to drive their research:

Empowering scientists with a high-performance self-service solution
UCB scientists can now collect and prepare their data themselves, getting exactly what they need for their experiments. The SingleStore-powered data backbone gathers all of the research and referential data into a single source of truth, allowing users to select from a wide range of high-quality data. Now, scientists can ask more questions during early stage drug discovery. “Thanks to SingleStore, we can do more and do it faster, which is invaluable to our research,” said Vanclef.
Elevating the role of IT from tactical to highly strategic
UCB has a dedicated IT support team to assist its scientists. Before the rollout of the early stage discovery data platform, this team was responsible for handling data requests from scientists. Unfortunately, this scenario meant scientists experienced long delays between when they asked for the data and actually received it. With the new self-service solution, the IT team was able to transform its mission and actions from tactical to highly strategic.
Accelerating drug discovery
UCB’s new data backbone has massive improvements in query speed and latency. Instead of being forced to wait up to 20 minutes for query results, scientists now have access to real-time data and query results in 20 seconds, reducing query latency by more than 30X. They can check on available data sets in real time, eliminating delays between data publication and availability in the data mart. The platform’s batch data refresh operation run times dropped by 48X: from 4 hours to 5 minutes(!).
Providing scientists with analytical flexibility
Each data type has its own optimal analytics approach, requiring different tools to handle the wide range of information that UCB scientists work with. SingleStore supports multiple popular analytics solutions so that scientists can work with their preferred applications for a particular experiment.

Source: SingleStore UCB casestory. Full document can be downloaded here.

Also in need for a data acceleration solution to boost your data?

Would you like to find out how Datalumen can also help you as a SingleStore partner? Contact us and start our data conversation.

THE MARKETING DATA JUNGLE

02/06/2022/in Data Governance, Data Quality, Data Virtualization, Master Data Management/by Datalumen

Customer & household profiling, personalization, journey analysis, segmentation, funnel analytics, acquisition & conversion metrics, predictive analytics & forecasting, … The marketing goal to deliver a trustworthy and complete insight in the customer across different channels can be quiet difficult to accomplish.

A substantial amount of marketing departments have chosen to rely on a mix of platforms going from CEM/CXM, CDP, CRM, eCommerce, Customer Service, Contact Center, Marketing Automation to Marketing Analytics. A lot of these platforms are best of breed and come from a diverse number of vendors who are leader in their specific market segment. Internal custom build solutions (Microsoft Excel, homebrew data environments, …) always complete this type of setup.

78% According to a Forrester study, although 78% of marketers claim that a data-driven marketing strategy is crucial, as many as 70% of them admit they have poor quality and inconsistent data.

The challenges

Creating a 360° customer view across this diverse landscape is not a walk in the park. All of these marketing platforms do provide added value but are basically separate silos. All of these environments use different data and the data that they have in common, is typically used in a different way. If you need to join all these pieces together, you need some magical super glue. Reality is that none of the marketing platform vendors actually have this in house.

Another point of attention is your data scope. We don’t need to explain you that customer experience is the hot thing in marketing nowadays. However marketeers need to do much more than just analyze customer experience data in order to create real customer insight.

Creating insight also requires that the data that you analyze goes beyond the traditional customer data domain. Combining customer data with i.e. the proper product/service, supplier, financial, … data is rather fundamental for this type of exercises. This type of extended data domains is usually lacking or the required detail level is not present in one particular platform.

38% Recent research from KPMG and Forrester Consulting shows that 38% of marketers claimed they have a high level of confidence in their data and analytics that drives their customer insights. That’s said, only a third of them seem to trust the analytics they generate from their business operations.

The foundations

Regardless of the mix of marketing platforms, many marketing leaders don’t succeed in taking full advantage of all their data. As a logical result they also fail to make a real impact with their data driven marketing initiatives. The underlying reason for this issue is that many marketing organizations lack a number of crucial data management building blocks that allow them to break out of these typical martech silos. The most important data capabilities that you should take into account are:

Capability	Description
Master Data Management (aka MDM)	Creating a single view or so called golden record is the essence of Master Data Management. This allows you to make sure that a customer, product, etc is consistent across different applications.
Business Glossary	Having the correct terms & definitions might seem trivial but reality is that in the majority of the organizations noise on the line is reality. However having crystal clear terms and definitions is a basic requirement to have all stakeholders manage the data in the same way and prevent conflicts and waste down the data supply chain.
Data Catalog	Imagine Google-like functionality to search through your data assets. Find out what data you have, what’s the origin, how and where it is being used.
Data Quality	The why of proper data quality is obvious for any data consuming organization. If you have disconnected data landscape, data quality is even more important because it also facilitates the automatic match & merge glue exercise that you put in place to come to a common view on your data assets.
Data Virtualization	Getting real-time access to your data in an ad hoc and dynamic way is one of the missing pieces to get to your 360° view in time and budget. Forgot about traditional consumer headaches such as long waiting times, misunderstood requests, lack of agility, etc.

We intentionally use the term capability because this isn’t a IT story. All of these capabilities have a people, process and technology aspect and all of them should be driven by the business stakeholders. IT and technology is facilitating.

The results

Datalumen_Marketing_data_People_Process_Technology

If you manage to put in place the described data management capabilities you basically get in control. Your organization can find, understand and make data useful. You improve the efficiency of your people and processes, and reduce your data compliance risks. The benefits in a nutshell:

Get full visibility of your data landscape by making data available and easily accessible across your organization. Deliver trusted data with documented definitions and certified data assets, so users feel confident using the data. Take back control using an approach that delivers everything you need to ensure data is accurate, consistent, complete and discoverable.
Increase efficiency of your people and processes. Improve data transparency by establishing one enterprise-wide repository of assets, so every user can easily understand and discover data relevant to them. Increase efficiency using workflows to automate processes, helping improve collaboration and speed of task completion. Quickly understand your data’s history with automated business and technical lineage that help you clearly see how data transforms and flows from system to system and source to report.
Reduce data and compliance risks. Mitigate compliance risk setting up data policies to control data retention and usage that can be applied across the organization, helping you meet your data compliance requirements. Reduce data risk by building and maintaining a business glossary of approved terms and definitions, helping ensure clarity and consistency of data assets for all users.

42% of data-driven marketers say their current technology in place is out of date and insufficient to help them do their jobs. Walker Sands Communications State of Marketing Technology report.

Conclusion

The data you need to be successful with your marketing efforts is there. You just have to transform it into useable data so that you can get accurate insights to make better decisions. The key in all of this is getting rid of your marketing platform silos by making sure that you have the proper data foundations in place. The data foundations to speed up and extend the capabilities of your datadriven marketing initiatives.

Need help unlocking your marketing data?

Would you like to find out how Datalumen can also help you with your marketing & data initiatives? Contact us and start our data conversation.

CHANGE & DATA GOVERNANCE – TAKE A LEAP FORWARD

19/05/2022/in Data Governance/by Datalumen

A successful data governance initiative is based on properly managing the People, Process, Data & Technology square. The most important element of these four is undoubtedly People. The reason for that is that at the end it boils down to people in your organization to act in a new business environment. This always implies change so make sure that you have an enabling framework for managing also the people side of change. Prepare, support and equip individuals at different levels in your organization to drive change and data governance success.

Change & the critical ingredient for data governance success.

Change is crucial in the success or failure of a data governance initiative for two reasons:

1First of all you should realize that with data governance you are going to tilt an organization. What we mean by this is that the situation before data governance is usually a silo-oriented organization. Individual employees, teams, departments, etc are the exclusive owner of their systems and associated data. With the implementation of data governance you will tilt that typical vertical data approach and align data flows with business processes that also run horizontally through an entire organization. This means that you need to help the organization to arrive at an environment where the data sharing & collaboration concept is the new normal.

2The second important reason is the so-called data governance heartbeat. What we see in many organizations is that there is a lot of enthusiasm at the start of a program. However, without the necessary framework, read also a change management plan, you run the fundamental risk that such an initiative will eventually die a silent death. People lose interest, no longer feel involved, no longer see the point of it. From that perspective, it is necessary to create a framework that keeps data governance’s heart beating.

How to approach change?

Change goes beyond training & communication. To facilitate the necessary changes, ChangeLab and Datalumen designed the ADKAR-based LEAP approach. LEAP is an acronym that stands for Learn, Envision, Apply & Poll. Each of these important steps help realize successful and lasting change.

LEARN

• Organizational culture and risk mitigation culture; “Readiness” for change

• Scope and type of the (sub)projects

• Timing and path for the change

• Set up the change team

• Stakeholder assessment & impact analysis (personnel analysis)

• Sponsorship analysis

• Risk assessment

ENVISION

APPLY

PERSIST

Need help covering change in the context of your data initiatives?

Would you like to find out how Datalumen can also help you with your Data Governance initiative? Contact us and start our data conversation.

CALCULATING DATA GOVERNANCE ROI

08/04/2022/in Data Governance/by Dimitri Maesfranckx

7 STEPS TO BUILD A SUCCESSFUL BUSINESS CASE FOR MDM PROGRAMS

16/03/2022/in Master Data Management/by Datalumen

TOP 5 DATA GOVERNANCE MISTAKES & HOW TO AVOID THEM

20/02/2022/in Data Governance/by Datalumen

The importance of data in a digital transformation context is known to everyone. Actually getting control and properly governing this new oil does not happen automatically. In this article we have summarized the top 5 Data Governance mistakes and also give you a number of tips on how to avoid them.

1. Data Governance is not business driven

Who is leading your Data Governance effort? If your initiative is driven by IT, you dramatically limit your chance of success. A Data Governance approach is a company-wide initiative and needs business & it support. It also needs support from the different organizational levels. Your executive level needs to openly express support in different ways (sponsorship but also communication). However this shouldn’t be a top down initiative and all other involved levels will also need to be on board. Keep in mind that they will make your data organization really happen.

2. Data Maturity level of your organization is unknown or too low

Being aware of the need for Data Governance is one thing. Being ready for Data Governance is a different story. In that sense it is crucial to understand the data maturity level of your organization.

There are several models to determine your data maturity level, but one of the most commonly used is the Gartner model. Surveys reveal that 60% of organizations rank themselves in the lowest 3 levels. Referring to this model, your organization should be close (or beyond) the systematic maturity level. If you are not, make sure to first fix this before taking next steps in your initiative. You need to have these basics properly in place. Without this minimum level of maturity, it doesn’t really makes sense to take the next steps. You don’t build a house without the necessary foundations.
3. A Data Governance Project rather than Program approach

A substantial amount of companies tend to start a Data Governance initiative as a traditional project. Think about a well-defined structure, the effort and duration is well known, the benefits have been defined, … When you think about Data Governance or data in general, you know that’s not the case. Data is dynamic, ever changing and it has far more touch points. Because of this, a Data Governance initiative doesn’t fit a traditional focused project management approach. What does fit is a higher level program approach in which you could have defined a number of project streams that focus on one particular area. Some of these streams can have a defined duration (i.e. implementation of a business glossary). Others (i.e. change management) can have a more ongoing character.

4. Big Bang vs Quick Win approach

Regardless of the fact that you have a proper company-wide program in place, you have to make sure that you focus on the proper quick wins to inspire buy-in and help build momentum. Your motto should not be Big Bang but rather Big Vision & Quick Wins.

Data Governance requires involvement from all levels of stakeholders. As a result you need to make everyone clear what your strategy & roadmap looks like.

With this type of programs you need to have the required enthusiasm when you take your first steps. It is key that you keep this heart beat in your program and for that reason you need to deliver quick wins. If you don’t do that, you strongly risk losing traction. Successfully delivering quick wins helps you gain credit and support with future steps.

5. No 3P mix approach

Data Governance has important People, Process and Platform dimensions. It’s never just one of these and requires that you pay the necessary attention to all of them.

When you implement Data Governance, people will almost certainly need to start working in a different way. They potentially may need to give up exclusive data ownership … All elements that require strong change management.
When you implement Data Governance you tilt your organization from a system silo point of view approach to a data process perspective. The ownership of your customer data is no longer just the CRM or a Marketing Manager but all the key stakeholders involved in customer related business processes.
When you want to make Data Governance a success you need to make it as efficient and easy as possible for every stakeholder. This implies that you should also thoroughly think about how you can facilitate them in the best possible way. Typically, this implies looking beyond traditional Excel, Sharepoint, Wiki type solutions and looking into implementing platforms that support your complete Data Governance community.

Also in need for data governance?
Would you like to know how Datalumen can also help you get your data agenda on track? Contact us and start our data conversation.

DATA VIRTUALIZATION: TOP USE CASES THAT MAKE A DIFFERENCE

17/01/2022/in Data Virtualization/by Dimitri Maesfranckx

Data Virtualization is definitely on the rise. At its Data and Analytics Summit in London, Gartner was projecting accelerated data virtualization adoption for both first-time and expanded deployments. Besides market analysts, we also see high demand and can confirm this being one of the hottest data solutions. But what are the top uses cases for data virtualization?

Logical Data Warehouse Architecture (LDW)

Traditional data warehouses entered on repositories are no longer sufficient to support today’s complex data and analytics landscape. The logical data warehouse (LDW) combines the strengths of traditional repository warehouses with alternative data management and access strategies to improve your agility, accelerate innovation, and respond more efficiently to changing business requirements.

Challenges include:

A data services approach that separates data access from processing, processing from transformation, and transformation from delivery
Diverse analytic tools and users
Diverse data types and sources including traditional data repositories, distributed processing (big data), virtualized sources, and analytic sandboxes
Unified business ontologies that resolve diverse IT taxonomies via common semantics
Unified information governance including data quality, master data management, security, and more
Service level agreement (SLA) driven operationalization

Data Virtualization provides a virtualization-centric LDW architecture solution.

With Data Virtualization you can:

Access any source including traditional data repositories, distributed processing (big data), virtualized sources, and analytic sandboxes, both onpremises and in the cloud
Model and transform data services quickly, in conformance with semantic standards
Deliver data in support of a wide range of use cases via industry-standard APIs including ODBC, JDBC, SOAP, REST, and more
Share and reuse data services across many applications
Automatically allocate workloads to match SLA requirements
Align data access and use with enterprise security and governance requirements

With Data Virtualization you get:

One logical place to go for analytic datasets regardless of source or application
Better analysis from broader data access and more complex transformations
Faster analysis time-to-solution via agile data service development and reuse
Higher quality analysis via consistent, well-understood data
Higher SLAs via loose-coupling and optimization of access, processing and transformation
Flexibility to add or change data sources or application consumers as required
More complete and consistent enterprise data security and governance

Logical Data Lake (LDL)

The idea of having all of your data stored and available in a data lake sounds like a wonderful idea to everyone. Finally the promise of Big Data can become reality. But what is the reality behind implementing a data lake? Virtually all organizations already have numerous data repositories – data warehouses, operational data stores, data stored in files, etc. It is impossible to load all of this data into a data lake and give everyone access to it. Next to data volume, You should also think about other elements like different data formats, data quality and even data security.

All this complexity however doesn’t mean that you should forget about a data lake. Using Data Virtualization, it is possible to leave the data in its current environment and create a virtual or better a logical data lake.

Challenges include:

Move all the data into a single location
Move all the data in a timely manner to that single location
Deliver the data as integrated data
Deliver fresh data
Where to store the date and how limit the impact when this storage mechanism needs to be changed
Avoid and limit the impact of the ever changing technologies

Data Virtualization lets you:

Isolate your data consumers from the underlying data architecture
Be flexible in changing the underlying architecture without impacting your data consumers
Deliver fresh data without the need to move it upfront

With Data Virtualization you get:

Fresh data for your data consumers
Agile data delivery in the shape the data consumers like to have it
Deliver data independent whether it is located in traditional systems, multiple big data systems or a combination of all of these
Agility to change the data architecture without impacting the data consumers

Prototyping For Physical Integration

Physical integration is a proven approach to analytic data integration; However, long lead times associated with physical integration — on average 7+ weeks according to TDWI — can delay realizing business value. Further, physical integration requires significant data engineering efforts and a complex software development lifecycle.

Challenges include:

Requirements. Business requirements are not always clear at the start of a project and thus can be difficult for business users to clearly communicate.
Design. Identifying and associating new mappings, new ETLs, and schema changes is complex. Further, current data engineering staff may not understand older schemas and ETLs. This makes detailed technical specifications a key requirement.
Development. Schema changes and ETL builds are required prior to end user validation. Resultant rework cycles often delay solution delivery.
Deployment. Modifying existing warehouse / data mart schemas and ETLs can be difficult and/or risky.

Data Virtualization lets you:

Interactively refine requirements, and based on actual data, build virtual data services side-by-side with business users.
Quickly deploy agreed datasets into production to meet immediate business needs.
Invest additional engineering efforts on physical integration later, only if required.
If required, use mappings and destination schema within the proven dataset as a working prototype for physical integration ETLs and schema changes.
Once physical integration is tested, transparently migrate from virtual to physical without loss of service.

With Data Virtualization you get:

Faster time-to-solution than physical integration, and accelerated business benefits
Less effort spent on upfront requirements definition and technical specification
The right level of data engineering required to meet requirements, while avoiding unnecessary over-engineering
Less disruption of existing physical repositories, schemas, and ETLs

Data Access / Semantic Layer For Analytics

Vendor-specific analytic semantic layers provide specialized data access and semantic transformation capabilities that simplify your analytic application development.

However, these vendor-specific semantic layer solutions have limitations including:

Delayed support of new data sources and types
Inability to share analytic datasets with other vendor’s analytic tools
Federated query performance that is not well optimized
Limited range of transformation capabilities and tools

Data Virtualization provides a vendor-agnostic solution to data access / semantic layer for analytics challenges.

Challenges include:

Access any data source required
Model and transform analytic datasets quickly
Deliver analytic data to a wide range of analytics vendor tools via industrystandard APIs including ODBC, JDBC, SOAP, REST, and more
Share and reuse analytic datasets across multiple vendors’ tools
Automatically optimize queries
Conform analytic data access and delivery to enterprise security and governance requirements

With Data Virtualization you get:

One place to go for analytic datasets regardless of analytic tool vendor
Better analysis from broader data access and more complex transformations
Lower costs, with reuse of analytic datasets across diverse analytic tools and users
Faster query performance
Greater analytic data security and governance

Data Preperation

Self-service data preparation has proven to be a great way for business users to quickly transform raw data into more analytic friendly datasets. However, some agile data preparation needs require data engineering skill and higher-level integration capabilities.

Challenges include:

Support for increasingly diverse and distributed data sources and types
Limited range of transformation capabilities and tools
Constraints on securing, governing, sharing, reusing, and productionizing prepared datasets

Data Virtualization provides an agile data preparation solution for data engineers that complements business user data preparation tools.

Data Virtualization lets you:

Interactively refine requirements and prepare datasets with business users based on actual data
Prepare datasets that may require complex transformations or highperformance queries
Leverage existing datasets when preparing new datasets
Quickly deploy prepared datasets into production when appropriate
Align data preparation activities with enterprise security and governance requirements

With Data Virtualization you get:

Rapid, IT-grade datasets that meet analytic data needs
The right level of data engineering required to meet requirements, while avoiding unnecessary over-engineering
Less effort spent productionizing datasets
More complete and consistent data security and governance

Virtual Operational Data Stores (VODS)

Physical operational data stores (ODS) have proven a useful compromise that balances operational data access needs with operational system SLAs.
However, replicating operational data in an ODS is not without it costs.

Challenges include:

Significant development investments for ODS set up, and for integration projects that move data to them.
Higher operating costs for managing the associated infrastructure.
Integration workloads on the operational system.
Often the operational source is not resource constrained, or operational queries may be light enough to not create significant workloads.
When operational data is in an ODS, it may still require further transformations to make it useful for diverse analysis needs.

Data Virtualization lets you:

Access any operational data or other sources as required
Model and transform operational datasets quickly
Deliver data to a wide range of operational applications via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
Share and reuse analytic datasets across applications
Reduce the impact on operational sources via query optimization and intelligent caching
Conform operational data access and delivery to enterprise security and governance requirements

With Data Virtualization you get:

One virtual place to go for operational data
Better analysis from broader data access and more flexible transformations
Lower costs due to less replicated data maintained in physical ODSs
More than good enough query performance without impacting operational system SLAs

Data Hub Enablement

The data hub is logical architecture that enables data sharing by connecting producers of data (applications, processes, and teams) with consumers of data (other applications, processes, and teams). Master data hubs, logical data warehouses, customer data hubs, reference data stores, and more are examples of different kinds of data hubs. Data hub domains might be geographically focused, business process-focused, or application-focused.

Challenges include:

The data hub provision data to and receive data from analytic and operational applications
Hub data is governed and secure
Data flows into and out of the hub are visible

Data Virtualization data hub solution delivers these requirements.

Data Virtualization lets you:

Introspect sources and identify potential data hub entities and relationships
Access any data hub data source
Model and transform data hub datasets
Deliver data hub datasets to diverse analytic and operational applications via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
Share and reuse data hub datasets across multiple applications
Conform data hub access and delivery to enterprise security and governance requirements

With Data Virtualization you get:

A complete solution for data hub implementations
Better analysis and business processes via consistent use of data hub datasets
Higher analytic and operational application quality via consistent use of data hub datasets
Greater agility when adding or changing data hub datasets
Complete visibility into data hub data flows
End-to-end data hub security and governance

Registry-Style Master Data Management (MDM)

Master Data Management (MDM)is an essential capability. Analyst firms such as Gartner have identified four MDM implementation styles (consolidation, registry, centralized, and coexistence) that you can deploy independently or combine to help enable successful MDM efforts.

Challenges include:

Access to master and reference data from diverse sources
A cross-reference table (index) that reconciles and links related master data entities and identifiers by source
Data services that expose the cross-reference table to analytic and operational applications that require master data from one or more sources
Data federation that leverages the cross-reference table when querying detailed data associated with master entities

Data Virtualization is a poven technology for registry-style MDM solutions.

Data Virtualization lets you:

Introspect sources and identify potential master data entities and relationships
Build a physical master data registry that relates and links master data across sources
Cache registry copies adjacent to MDM user applications to accelerate frequent MDM queries
Combine master, detail, and non-master data to provide more complete 360-degree views of key entities

With Data Virtualization you get:

A complete solution for registry-style MDM implementations
Better analysis via more complete views of master data entities across sources
Higher analytic and data quality via consistent use of master and reference data
Faster query performance and less disruption to master data sources
Greater agility when adding or changing master and reference data sources

Legacy System Migration

New technology provides more advanced capabilities and lower cost infrastructure. You want to take advantage. However, migrating legacy data
repositories to new ones or legacy applications to new applications technology is not easy.

Challenges include:

Business continuity requires non-stop operations before, during, and after the migration.
Applications and data repositories are often tightly coupled making them difficult to change.
Big bang cutovers are problematic due to so many moving parts.
Too often, testing and tuning only happen after the fact.

Data Virtualization provides a flexible solution for legacy system migration challenges.

Data Virtualization lets you:

Create a loosely coupled, middle-tier of data services that mirror as-is data access, transformation, and delivery functionality
Test and tune these data services on the sidelines without impacting current operations
Modify the as-is data services to now support the future-state application or repository, then retest and retune
Migrate the legacy application or repository
Implement future-state data services to consume or deliver data to and from the new application or repository

With Data Virtualization you get:

To take advantage of new technology opportunities that can improve your business and cut your costs
Loose coupling you need to divide complex migration projects into more manageable phases
Less risk by avoiding big bang migrations
Reusable data services that are easy to modify and extend for additional applications and users

Application Data Access

Your applications run on data. However, application data access can be difficult.

Challenges include:

The need to understand and access increasingly diverse and distributed data sources and types
Difficulty in sharing data assets with other applications
Federated query performance that may require optimization
Complex transformations that may require specialized tools and techniques
Complex data and application security requirements that need to be enforced

Data Virtualization provides a powerful solution to these application data access challenges.

Data Virtualization lets you:

Access any data source required
Model and transform application datasets quickly
Deliver data to a wide range of applications development tools via industry standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
Share and reuse application datasets across multiple analytic and operational applications
Automatically optimize queries
Conform data access and delivery to enterprise security and governance requirements

Cloud Data Sharing

With the rise of cloud-based applications and infrastructure, more data than ever resides outside your enterprise. As a result, your need to share data across your cloud and enterprise sources has grown significantly.

Challenges include:

The need to understand and access increasingly diverse cloud data sources and APIs
Diverse data consumers, each with their own data needs and application technologies
Complex transformations that may require specialized tools and techniques
Wide-area network (WAN) query performance that may require optimization
Complex cloud data security requirements that need to be enforced

Data Virtualization provides a powerful solution for these cloud data sharing challenges.

Data Virtualization lets you:

Access any cloud data source
Model and transform cloud datasets quickly
Deliver cloud data to a wide range of applications development tools via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
Share and reuse cloud data across multiple applications
Automatically optimize queries and apply caching to mitigate WAN latency
Align data access and delivery to conform with enterprise and cloud data security and governance requirements

With Data Virtualization you get:

One place to go for cloud and enterprise data
Better applications from broader cloud data access and more complex transformations
Lower costs due to dataset reuse across diverse applications
Faster query performance
Greater cloud data security and governance

Interested in Data Virtualization?
Would you like to know how Datalumen can also help you understand how your organization can benefit from using Data Virtualization? Contact us and start our data conversation.

Archive for year: 2022

Simple and Rich Experience is the key message

Improved performance, scalability, and security

Challenges

Requirements

Platform choice

Outcomes

The challenges

The foundations

The results

Conclusion

Change & the critical ingredient for data governance success.

How to approach change?

1. Data Governance is not business driven

2. Data Maturity level of your organization is unknown or too low

4. Big Bang vs Quick Win approach

5. No 3P mix approach

Challenges include:

With Data Virtualization you can:

With Data Virtualization you get:

I Need To

Solutions

Services

About Us