TOP 5 DATA GOVERNANCE MISTAKES & HOW TO AVOID THEM
/by Datalumen
The importance of data in a digital transformation context is known to everyone. Actually getting control of and properly governing this new oil, however, does not happen automatically. In this article we summarize the top 5 Data Governance mistakes and give you a number of tips on how to avoid them.
1. Data Governance is not business driven
Who is leading your Data Governance effort? If your initiative is driven by IT alone, you dramatically limit your chances of success. Data Governance is a company-wide initiative and needs both business & IT support. It also needs support from the different organizational levels. Your executive level needs to openly express support in different ways (sponsorship, but also communication). However, this shouldn't be a purely top-down initiative: all other involved levels need to be on board as well. Keep in mind that they are the ones who will make your data organization really happen.
2. Data Maturity level of your organization is unknown or too low
Being aware of the need for Data Governance is one thing. Being ready for Data Governance is a different story. In that sense it is crucial to understand the data maturity level of your organization.
There are several models to determine your data maturity level, but one of the most commonly used is the Gartner model. Surveys reveal that 60% of organizations rank themselves in the lowest three levels. Referring to this model, your organization should be close to (or beyond) the systematic maturity level. If it is not, make sure to fix this before taking the next steps in your initiative. You need to have these basics properly in place; without this minimum level of maturity, it doesn't really make sense to move forward. You don't build a house without the necessary foundations.
3. A Data Governance Project rather than Program approach
A substantial number of companies tend to start a Data Governance initiative as a traditional project: a well-defined structure, a known effort and duration, clearly defined benefits, and so on. When you think about Data Governance, or data in general, you know that is not the case. Data is dynamic, ever changing, and has far more touch points. Because of this, a Data Governance initiative doesn't fit a traditional, narrowly focused project management approach. What does fit is a higher-level program approach in which you define a number of project streams that each focus on one particular area. Some of these streams can have a defined duration (e.g. the implementation of a business glossary). Others (e.g. change management) have a more ongoing character.
4. Big Bang vs Quick Win approach
Even with a proper company-wide program in place, you still have to make sure that you focus on the right quick wins to inspire buy-in and help build momentum. Your motto should not be Big Bang but rather Big Vision & Quick Wins.
Data Governance requires involvement from all levels of stakeholders. As a result, you need to make clear to everyone what your strategy & roadmap look like.
With this type of program you need the necessary enthusiasm when you take your first steps. It is key that you keep this heartbeat in your program, and for that reason you need to deliver quick wins. If you don't, you strongly risk losing traction. Successfully delivering quick wins earns you credit and support for future steps.
5. No 3P mix approach
Data Governance has important People, Process and Platform dimensions. It’s never just one of these and requires that you pay the necessary attention to all of them.
- When you implement Data Governance, people will almost certainly need to start working in a different way. They may, for example, need to give up exclusive data ownership. All of these are elements that require strong change management.
- When you implement Data Governance, you tilt your organization from a system-silo point of view to a data-process perspective. The ownership of your customer data no longer sits with just the CRM system or a Marketing Manager, but with all the key stakeholders involved in customer-related business processes.
- When you want to make Data Governance a success, you need to make it as efficient and easy as possible for every stakeholder. This implies that you should also thoroughly think about how you can facilitate them in the best possible way. Typically this means looking beyond traditional Excel, SharePoint or wiki-type solutions and into platforms that support your complete Data Governance community.
Would you like to know how Datalumen can also help you get your data agenda on track? Contact us and start our data conversation.
DATA VIRTUALIZATION: TOP USE CASES THAT MAKE A DIFFERENCE
/by Dimitri Maesfranckx
Traditional data warehouses centered on repositories are no longer sufficient to support today's complex data and analytics landscape. The logical data warehouse (LDW) combines the strengths of traditional repository-based warehouses with alternative data management and access strategies to improve your agility, accelerate innovation, and respond more efficiently to changing business requirements.
Challenges include:
- A data services approach that separates data access from processing, processing from transformation, and transformation from delivery
- Diverse analytic tools and users
- Diverse data types and sources including traditional data repositories, distributed processing (big data), virtualized sources, and analytic sandboxes
- Unified business ontologies that resolve diverse IT taxonomies via common semantics
- Unified information governance including data quality, master data management, security, and more
- Service level agreement (SLA) driven operationalization
Data Virtualization provides a virtualization-centric LDW architecture solution.
With Data Virtualization you can:
- Access any source including traditional data repositories, distributed processing (big data), virtualized sources, and analytic sandboxes, both on-premises and in the cloud
- Model and transform data services quickly, in conformance with semantic standards
- Deliver data in support of a wide range of use cases via industry-standard APIs including ODBC, JDBC, SOAP, REST, and more
- Share and reuse data services across many applications
- Automatically allocate workloads to match SLA requirements
- Align data access and use with enterprise security and governance requirements
With Data Virtualization you get:
- One logical place to go for analytic datasets regardless of source or application
- Better analysis from broader data access and more complex transformations
- Faster analysis time-to-solution via agile data service development and reuse
- Higher quality analysis via consistent, well-understood data
- Higher SLAs via loose-coupling and optimization of access, processing and transformation
- Flexibility to add or change data sources or application consumers as required
- More complete and consistent enterprise data security and governance
The idea of having all of your data stored and available in a data lake sounds wonderful to everyone. Finally the promise of Big Data can become reality. But what is the reality behind implementing a data lake? Virtually all organizations already have numerous data repositories: data warehouses, operational data stores, data stored in files, etc. It is impossible to load all of this data into a data lake and give everyone access to it. Next to data volume, you should also think about other elements like different data formats, data quality and even data security.
All this complexity, however, doesn't mean that you should forget about a data lake. Using Data Virtualization, it is possible to leave the data in its current environment and create a virtual, or better, a logical data lake.
Challenges include:
- Move all the data into a single location
- Move all the data in a timely manner to that single location
- Deliver the data as integrated data
- Deliver fresh data
- Where to store the data and how to limit the impact when this storage mechanism needs to be changed
- Avoid and limit the impact of ever-changing technologies
Data Virtualization lets you:
- Isolate your data consumers from the underlying data architecture
- Be flexible in changing the underlying architecture without impacting your data consumers
- Deliver fresh data without the need to move it upfront
With Data Virtualization you get:
- Fresh data for your data consumers
- Agile data delivery in the shape the data consumers like to have it
- Data delivery that is independent of whether the data is located in traditional systems, multiple big data systems or a combination of all of these
- Agility to change the data architecture without impacting the data consumers
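To make the logical data lake idea above more concrete, here is a minimal sketch of federating two sources at query time instead of copying them into one store. It is illustrative only: the sources, column names and the pandas-based join are assumptions, not the API of any specific Data Virtualization product.

```python
# A minimal sketch of the "logical data lake" idea: data stays in its
# original systems (here a SQLite warehouse and a CSV file) and a thin
# virtual layer joins it on demand instead of copying it upfront.
import sqlite3
from io import StringIO

import pandas as pd

# Source 1: an existing relational warehouse (simulated in-memory).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
warehouse.executemany("INSERT INTO customers VALUES (?, ?)",
                      [(1, "Acme NV"), (2, "Globex BV")])
warehouse.commit()

# Source 2: a flat file sitting outside the warehouse.
orders_csv = StringIO("customer_id,amount\n1,250\n1,120\n2,90\n")

# The "virtual" view: federate both sources at query time.
customers = pd.read_sql("SELECT id, name FROM customers", warehouse)
orders = pd.read_csv(orders_csv)
customer_revenue = (
    orders.merge(customers, left_on="customer_id", right_on="id")
          .groupby("name", as_index=False)["amount"].sum()
)
print(customer_revenue)
```

The consuming code only ever sees the combined result, while each dataset stays where it already lives.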
Physical integration is a proven approach to analytic data integration. However, the long lead times associated with physical integration (on average 7+ weeks, according to TDWI) can delay realizing business value. Further, physical integration requires significant data engineering effort and a complex software development lifecycle.
Challenges include:
- Requirements. Business requirements are not always clear at the start of a project and thus can be difficult for business users to clearly communicate.
- Design. Identifying and associating new mappings, new ETLs, and schema changes is complex. Further, current data engineering staff may not understand older schemas and ETLs. This makes detailed technical specifications a key requirement.
- Development. Schema changes and ETL builds are required prior to end user validation. Resultant rework cycles often delay solution delivery.
- Deployment. Modifying existing warehouse / data mart schemas and ETLs can be difficult and/or risky.
Data Virtualization lets you:
- Interactively refine requirements, and based on actual data, build virtual data services side-by-side with business users.
- Quickly deploy agreed datasets into production to meet immediate business needs.
- Invest additional engineering efforts on physical integration later, only if required.
- If required, use mappings and destination schema within the proven dataset as a working prototype for physical integration ETLs and schema changes.
- Once physical integration is tested, transparently migrate from virtual to physical without loss of service.
With Data Virtualization you get:
- Faster time-to-solution than physical integration, and accelerated business benefits
- Less effort spent on upfront requirements definition and technical specification
- The right level of data engineering required to meet requirements, while avoiding unnecessary over-engineering
- Less disruption of existing physical repositories, schemas, and ETLs
Vendor-specific analytic semantic layers provide specialized data access and semantic transformation capabilities that simplify your analytic application development.
However, these vendor-specific semantic layer solutions have limitations including:
- Delayed support of new data sources and types
- Inability to share analytic datasets with other vendor’s analytic tools
- Federated query performance that is not well optimized
- Limited range of transformation capabilities and tools
Data Virtualization provides a vendor-agnostic solution to these data access / semantic layer challenges.
Data Virtualization lets you:
- Access any data source required
- Model and transform analytic datasets quickly
- Deliver analytic data to a wide range of analytics vendor tools via industry-standard APIs including ODBC, JDBC, SOAP, REST, and more
- Share and reuse analytic datasets across multiple vendors’ tools
- Automatically optimize queries
- Conform analytic data access and delivery to enterprise security and governance requirements
With Data Virtualization you get:
- One place to go for analytic datasets regardless of analytic tool vendor
- Better analysis from broader data access and more complex transformations
- Lower costs, with reuse of analytic datasets across diverse analytic tools and users
- Faster query performance
- Greater analytic data security and governance
Self-service data preparation has proven to be a great way for business users to quickly transform raw data into more analytics-friendly datasets. However, some agile data preparation needs require data engineering skills and higher-level integration capabilities.
Challenges include:
- Support for increasingly diverse and distributed data sources and types
- Limited range of transformation capabilities and tools
- Constraints on securing, governing, sharing, reusing, and productionizing prepared datasets
Data Virtualization provides an agile data preparation solution for data engineers that complements business user data preparation tools.
Data Virtualization lets you:
- Interactively refine requirements and prepare datasets with business users based on actual data
- Prepare datasets that may require complex transformations or high-performance queries
- Leverage existing datasets when preparing new datasets
- Quickly deploy prepared datasets into production when appropriate
- Align data preparation activities with enterprise security and governance requirements
With Data Virtualization you get:
- Rapid, IT-grade datasets that meet analytic data needs
- The right level of data engineering required to meet requirements, while avoiding unnecessary over-engineering
- Less effort spent productionizing datasets
- More complete and consistent data security and governance
Physical operational data stores (ODS) have proven a useful compromise that balances operational data access needs with operational system SLAs.
However, replicating operational data in an ODS is not without its costs.
Challenges include:
- Significant development investments for ODS set up, and for integration projects that move data to them.
- Higher operating costs for managing the associated infrastructure.
- Integration workloads on the operational system.
- Often the operational source is not resource constrained, or operational queries may be light enough not to create significant workloads, making a separate ODS unnecessary overhead.
- When operational data is in an ODS, it may still require further transformations to make it useful for diverse analysis needs.
Data Virtualization lets you:
- Access any operational data or other sources as required
- Model and transform operational datasets quickly
- Deliver data to a wide range of operational applications via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
- Share and reuse analytic datasets across applications
- Reduce the impact on operational sources via query optimization and intelligent caching (see the sketch below)
- Conform operational data access and delivery to enterprise security and governance requirements
With Data Virtualization you get:
- One virtual place to go for operational data
- Better analysis from broader data access and more flexible transformations
- Lower costs due to less replicated data maintained in physical ODSs
- More than good enough query performance without impacting operational system SLAs
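As an illustration of the caching idea mentioned in the list above, here is a minimal sketch of a time-to-live cache placed in front of an operational query, so repeated requests do not hit the live system. The fetch function, TTL value and order-status payload are invented for the example; Data Virtualization platforms handle this kind of caching transparently.

```python
# A minimal sketch of intelligent caching: repeated queries are served from
# a short-lived cache so the operational system is only hit when the cached
# copy has expired.
import time
from functools import wraps


def ttl_cache(ttl_seconds=30):
    """Cache results for ttl_seconds to spare the operational source."""
    def decorator(fetch):
        store = {}

        @wraps(fetch)
        def wrapper(*args):
            now = time.time()
            if args in store and now - store[args][0] < ttl_seconds:
                return store[args][1]          # serve from cache
            result = fetch(*args)              # hit the operational system
            store[args] = (now, result)
            return result
        return wrapper
    return decorator


@ttl_cache(ttl_seconds=30)
def current_order_status(order_id):
    # Placeholder for a query against the live order system.
    return {"order_id": order_id, "status": "SHIPPED"}


print(current_order_status(1042))   # first call queries the source
print(current_order_status(1042))   # second call is answered from cache
```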
The data hub is a logical architecture that enables data sharing by connecting producers of data (applications, processes, and teams) with consumers of data (other applications, processes, and teams). Master data hubs, logical data warehouses, customer data hubs, reference data stores, and more are examples of different kinds of data hubs. Data hub domains might be geographically focused, business-process focused, or application focused.
Challenges include:
- The data hub must provision data to and receive data from analytic and operational applications
- Hub data must be governed and secure
- Data flows into and out of the hub must be visible
A Data Virtualization-based data hub solution delivers on these requirements.
Data Virtualization lets you:
- Introspect sources and identify potential data hub entities and relationships
- Access any data hub data source
- Model and transform data hub datasets
- Deliver data hub datasets to diverse analytic and operational applications via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
- Share and reuse data hub datasets across multiple applications
- Conform data hub access and delivery to enterprise security and governance requirements
With Data Virtualization you get:
- A complete solution for data hub implementations
- Better analysis and business processes via consistent use of data hub datasets
- Higher analytic and operational application quality via consistent use of data hub datasets
- Greater agility when adding or changing data hub datasets
- Complete visibility into data hub data flows
- End-to-end data hub security and governance
Master Data Management (MDM) is an essential capability. Analyst firms such as Gartner have identified four MDM implementation styles (consolidation, registry, centralized, and coexistence) that you can deploy independently or combine to help enable successful MDM efforts.
Challenges include:
- Access to master and reference data from diverse sources
- A cross-reference table (index) that reconciles and links related master data entities and identifiers by source (see the sketch further below)
- Data services that expose the cross-reference table to analytic and operational applications that require master data from one or more sources
- Data federation that leverages the cross-reference table when querying detailed data associated with master entities
Data Virtualization is a proven technology for registry-style MDM solutions.
Data Virtualization lets you:
- Introspect sources and identify potential master data entities and relationships
- Build a physical master data registry that relates and links master data across sources
- Cache registry copies adjacent to MDM user applications to accelerate frequent MDM queries
- Combine master, detail, and non-master data to provide more complete 360-degree views of key entities
With Data Virtualization you get:
- A complete solution for registry-style MDM implementations
- Better analysis via more complete views of master data entities across sources
- Higher analytic and data quality via consistent use of master and reference data
- Faster query performance and less disruption to master data sources
- Greater agility when adding or changing master and reference data sources
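For readers who want to picture the registry style described above, below is a minimal sketch of a cross-reference structure that links a golden identifier to the source-specific identifiers. The system names and IDs are purely illustrative assumptions, not a prescribed schema.

```python
# A minimal sketch of a registry-style MDM cross-reference: master records
# stay in the source systems and only a mapping from a golden ID to the
# source-specific identifiers is maintained.
registry = [
    {"golden_id": "CUST-0001", "crm_id": "C-778", "erp_id": "40012"},
    {"golden_id": "CUST-0002", "crm_id": "C-912", "erp_id": "40388"},
]


def lookup(golden_id):
    """Resolve a golden ID to its identifiers in each connected source."""
    return next((row for row in registry if row["golden_id"] == golden_id), None)


# A consuming application asks the registry which records belong together,
# then queries the CRM and ERP directly for the detail attributes.
print(lookup("CUST-0001"))
```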
New technology provides more advanced capabilities and lower-cost infrastructure. You want to take advantage of it. However, migrating legacy data repositories to new ones, or legacy applications to new application technology, is not easy.
Challenges include:
- Business continuity requires non-stop operations before, during, and after the migration.
- Applications and data repositories are often tightly coupled making them difficult to change.
- Big bang cutovers are problematic due to so many moving parts.
- Too often, testing and tuning only happen after the fact.
Data Virtualization provides a flexible solution for legacy system migration challenges.
Data Virtualization lets you:
- Create a loosely coupled middle tier of data services that mirrors as-is data access, transformation, and delivery functionality (see the sketch below)
- Test and tune these data services on the sidelines without impacting current operations
- Modify the as-is data services to now support the future-state application or repository, then retest and retune
- Migrate the legacy application or repository
- Implement future-state data services to consume or deliver data to and from the new application or repository
With Data Virtualization you get:
- The ability to take advantage of new technology opportunities that can improve your business and cut your costs
- The loose coupling you need to divide complex migration projects into more manageable phases
- Less risk by avoiding big bang migrations
- Reusable data services that are easy to modify and extend for additional applications and users
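The loose coupling described above can be pictured with a small sketch: consumers depend on one data-service interface while the backing implementation is swapped from the legacy repository to the future-state one. The class and method names are illustrative assumptions, not a prescribed design.

```python
# A minimal sketch of a loosely coupled data-service layer used during a
# migration: the consumer talks to one interface, and the backend can be
# switched from legacy to new without changing the consumer.
from typing import Protocol


class CustomerService(Protocol):
    def get_customer(self, customer_id: str) -> dict: ...


class LegacyCustomerService:
    def get_customer(self, customer_id: str) -> dict:
        # Placeholder for a call against the legacy repository.
        return {"id": customer_id, "source": "legacy"}


class NewCustomerService:
    def get_customer(self, customer_id: str) -> dict:
        # Placeholder for a call against the future-state repository.
        return {"id": customer_id, "source": "new"}


def report(service: CustomerService, customer_id: str) -> None:
    # The consumer never knows (or cares) which backend is active.
    print(service.get_customer(customer_id))


report(LegacyCustomerService(), "C-778")   # before the migration
report(NewCustomerService(), "C-778")      # after the cutover
```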
Your applications run on data. However, application data access can be difficult.
Challenges include:
- The need to understand and access increasingly diverse and distributed data sources and types
- Difficulty in sharing data assets with other applications
- Federated query performance that may require optimization
- Complex transformations that may require specialized tools and techniques
- Complex data and application security requirements that need to be enforced
Data Virtualization provides a powerful solution to these application data access challenges.
Data Virtualization lets you:
- Access any data source required
- Model and transform application datasets quickly
- Deliver data to a wide range of application development tools via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more (see the sketch after this list)
- Share and reuse application datasets across multiple analytic and operational applications
- Automatically optimize queries
- Conform data access and delivery to enterprise security and governance requirements
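As a simple illustration of delivering a reusable dataset through an industry-standard API, here is a minimal REST sketch using Flask. The endpoint path, fields and sample data are assumptions for the example, not a specific Data Virtualization product interface.

```python
# A minimal sketch of exposing one shared dataset over REST so any
# application or tool that speaks HTTP/JSON can reuse it.
from flask import Flask, jsonify

app = Flask(__name__)

# In a real setup this would be a virtual view over one or more sources.
CUSTOMER_DATASET = [
    {"id": 1, "name": "Acme NV", "segment": "enterprise"},
    {"id": 2, "name": "Globex BV", "segment": "midmarket"},
]


@app.route("/datasets/customers")
def customers():
    # Every consumer gets the same governed dataset from the same endpoint.
    return jsonify(CUSTOMER_DATASET)


if __name__ == "__main__":
    app.run(port=5000)
```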
With the rise of cloud-based applications and infrastructure, more data than ever resides outside your enterprise. As a result, your need to share data across your cloud and enterprise sources has grown significantly.
Challenges include:
- The need to understand and access increasingly diverse cloud data sources and APIs
- Diverse data consumers, each with their own data needs and application technologies
- Complex transformations that may require specialized tools and techniques
- Wide-area network (WAN) query performance that may require optimization
- Complex cloud data security requirements that need to be enforced
Data Virtualization provides a powerful solution for these cloud data sharing challenges.
Data Virtualization lets you:
- Access any cloud data source
- Model and transform cloud datasets quickly
- Deliver cloud data to a wide range of application development tools via industry-standard APIs, protocols, and architectures including ODBC, JDBC, SOAP, REST, and more
- Share and reuse cloud data across multiple applications
- Automatically optimize queries and apply caching to mitigate WAN latency
- Align data access and delivery to conform with enterprise and cloud data security and governance requirements
With Data Virtualization you get:
- One place to go for cloud and enterprise data
- Better applications from broader cloud data access and more complex transformations
- Lower costs due to dataset reuse across diverse applications
- Faster query performance
- Greater cloud data security and governance
Would you like to know how Datalumen can also help you understand how your organization can benefit from using Data Virtualization? Contact us and start our data conversation.
THE GDPR BUSINESS VALUE ROADMAP
/by Datalumen
Getting a good understanding of the requirements, but also of the opportunities and business value, is not easy. We designed a GDPR business value roadmap to help you with this and to make clear which capabilities you need to get the job done.
Complete the form and download this Datalumen infogram (A3 PDF).
Would you like to know what Datalumen can also mean to your GDPR or other data governance initiatives? Have a look at our GDPR or Data Governance offerings, contact us and start our Data Conversation.
The Datalumen privacy policy can be consulted here.
TIBCO NOW 2020 – KEY DATA MANAGEMENT NEWS
/by Datalumen
Day 1-2-3
Day 2 built on the earlier discussions of why sustainable business practices and data-centric tools are critical, and moved to very practical discussions about how organizations can build a foundation for sustainable innovation. "Data is the common element," explained Dan Streetman, TIBCO's CEO. "How data connects systems, people, and devices. How we manage, unify, and bring order to that data, and ultimately how we build on that data foundation to make faster, smarter decisions, is how you build a framework for sustainable innovation."
Day 3 was all about delivering on the promise of sustained innovation and was led by TIBCO's Head of Customer Excellence, Jeff Hess. He explained that it all starts with putting your customers first. You need to constantly ask: "How can we make life better for our customers?" It's a powerful clarifier in a world of competing priorities. Asking this question every day is the only way to ensure you can deliver on the promise of sustained innovation.
What’s new in the TIBCO data management corner?
TIBCO Any Data Hub.
TIBCO Data Virtualization.
Next to that, there was also the necessary focus on the key Master Data Management (MDM) component, TIBCO EBX (the former Orchestra Networks MDM that was acquired), which still manages to dominate the MDM market. Linked to that, and better integrated, is the second generation of the recently introduced TIBCO Cloud Metadata, which step by step starts to extend the platform's capabilities in the metadata management domain.
Ready for your TIBCO NOW binge-watching?
TIBCO NOW 2020 has finished, but be aware that you can still watch everything on demand through October 9, 2020. TIBCO is leaving registration open after the event, so if you haven't already, you have one last chance to register. Sessions worth checking out:
- Empower Your Business with Any Data from Any Location at Any Speed. Watch ▶
- Build a virtual data layer that integrates your enterprise data silos. Watch ▶
- Govern All Your Master and Reference Data. Watch ▶
- Discover, Catalog, and Govern All Your Enterprise Metadata. Watch ▶
- OPG: Billion-dollar Nuclear Programs on Time and on Budget with MDM. Watch ▶
- What’s new about TIBCO EBX & TIBCO Cloud Metadata. Watch ▶
Did you miss any of these sessions? Do you have a specific data question? Would you like to have a briefing or a demo for your team? Let us know and we are happy to set up a one-on-one.
CDO EXCHANGE 2020 – KEY TAKEAWAYS
/by Datalumen
Key Takeaways
- No data program or initiative without a purpose. Sounds basic, but it is so true. It's not the first time that a data initiative is kicked off because it's innovative but in the end doesn't address any real business need.
- In order to be successful you need to inform business leaders at all levels. Not just C-level, but all leadership in your business stakeholder community. Of course, these business leaders also need to be open to listen. Reading tip: also see our post about MDM business case building.
- Correctly translating business needs is important, but understanding the right priorities is just as key. What are my real burning data issues?
- In the AI, ML, … basically the data science context, we are happy to see that a substantial number of business leaders have managed to gain an understanding of the most important principles of data science. Pay the necessary attention to this in your organization and make sure that you also remove the data & data science 'language barrier'.
- Don’t forget the company politics. In some organizations you will need to cope with individuals attempting to sabotage constructive change if it was ‘not invented here’ or ‘owned by us’. This point of attention was valid before, but unfortunately is still there and definitely present in the context of data programs.
- Ethics is not about a choice, it’s an obligation. Data can be powerful and can potentially deliver huge value. Besides this potential, it also comes with a duty of care. In an era where customer centricity is vital, you need to make sure that your data management is objective, trustworthy and transparent to your customers and other stakeholders.
- Ethics carries a cost. However, the cost of not doing things right is much higher. Think about the overall and longer-term reputational and commercial cost.
Do you have a data management question, or do you require some level of support with a data initiative in your organization? Feel free to reach out to us and schedule a free sync session.
SUMMER READING TIP
/by Datalumen
Summer is here, and the longer days it brings mean more time to spend with a ripping read. That's how it ideally works, at least. We selected three valuable books worth your extra time.
The Chief Data Officer’s Playbook
The issues and profession of the Chief Data Officer (CDO) are of significant interest and relevance to organisations and data professionals internationally. Written by two practicing CDOs, this new book offers a practical, direct and engaging discussion of the role, its place and importance within organisations. Chief Data Officer is a new and rapidly expanding role, and many organisations are finding that it is an uncomfortable fit in the existing C-suite. Bringing together views, opinions and practitioners' experience for the first time, The Chief Data Officer's Playbook offers a compelling guide to anyone looking to understand the current (and possible future) CDO landscape.
Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility
Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility, the first book ever written on the topic of data virtualization, introduces the technology that enables data virtualization and presents ten real-world case studies that demonstrate the significant value and tangible business agility benefits that can be achieved through the implementation of data virtualization solutions. The book introduces the relationship between data virtualization and business agility, but also gives you a more thorough exploration of data virtualization technology. Topics include what data virtualization is, why to use it, how it works and how enterprises typically adopt it.
Start With Why
Simon Sinek started a movement to help people become more inspired at work, and in turn inspire their colleagues and customers. Since then, millions have been touched by the power of his ideas, including more than 28 million who’ve watched his TED Talk based on ‘Start With Why’ — the third most popular TED video of all time. Sinek starts with a fundamental question: Why are some people and organizations more innovative, more influential, and more profitable than others? Why do some command greater loyalty from customers and employees alike? Even among the successful, why are so few able to repeat their success over and over?
People like Martin Luther King, Steve Jobs, and the Wright Brothers had little in common, but they all started with Why. They realized that people won’t truly buy into a product, service, movement, or idea until they understand the Why behind it. ‘Start With Why’ shows that the leaders who’ve had the greatest influence in the world all think, act, and communicate the same way — and it’s the opposite of what everyone else does. Sinek calls this powerful idea The Golden Circle, and it provides a framework upon which organizations can be built, movements can be led, and people can be inspired. And it all starts with Why.
Summer Giveaways
We’re giving away 50 copies of ‘Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility’. Want to win? Just complete the form and cross your fingers. Good luck!
Winners are picked randomly at the end of the giveaway. Our privacy policy is available here.
RABOBANK GIVES CUSTOMERS ANIMAL & PLANT NAMES TO ADDRESS GDPR REQUIREMENTS
/by Datalumen
The Dutch bank Rabobank has implemented a creative way of using customer data without having to request permission. If you are one of their customers and they use your data in internal tests to develop new services, there is a chance that you will get a different name. With special software the data is pseudonymized, using Latin plant and animal names.
Your first name will become, for example, Rosa arvensis, the Latin name of a forest rose, and your street name Turdus merula, the scientific name of a blackbird. It is a useful solution for the bank to be somewhat in line with the General Data Protection Regulation (GDPR) that takes effect on the 25th of May. When developing applications or services, analyzing data or executing marketing campaigns based on PII (Personally Identifiable Information) data, companies are required to have explicit consent. In order to still be able to do this after May without getting your consent, the bank uses data masking / pseudonymization techniques.
Explicit consent & pseudonymization
With the new privacy law, the personal data of citizens is better protected. One of the cornerstones of the GDPR is the requirement to obtain explicit consent and, linked to that, the purpose. Even with a general consent, companies do not get carte blanche to do whatever they want with your data. Organizations must explain how data is used and by whom, where it is stored and for how long (more info about GDPR). Companies can work around these limitations if they anonymize / pseudonymize this PII data, because they can still use and valorize the data but without a direct and obvious link to you as a person. You as a person become unrecognizable, but your data remains usable for analysis or tests.
Why scientific animal and plant names?
"You cannot use names that are traceable to a person according to the rules, but if the system still requires something that looks like a name made of letters, you have to come up with something else," explains the vendor that delivered the software. "That's how we came up with flower names: you cannot confuse them with real people, but to the system they look like names. Therefore, it is not necessary for organizations to change entire programs to comply with the new privacy law."°
Note that data anonymization / pseudonymization technology does not require you to use plant and animal names. Most implementations of this type convert real names to fictitious names and addresses that better reflect reality and perhaps also better match the usage requirements (e.g. specific application testing requirements). Typically substitution techniques are applied, where a real name is replaced with another real name.
Takeaways
Pseudonymization vs anonymization
Pseudonymization and anonymization are two distinct terms that are often confused in the data security world. With the advent of the GDPR, it is important to understand the difference, since anonymized data and pseudonymized data fall under very different categories in the regulation. They differ in one key aspect: anonymization irreversibly removes any way of identifying the data subject, while pseudonymization substitutes the identity of the data subject in such a way that additional information is required to re-identify the data subject. With anonymization, the data is cleansed of any information that may identify a data subject. Pseudonymization does not remove all identifying information from the data, but only reduces the linkability of a dataset with the original identity (using, for example, a specific encryption scheme).
Pseudonymization is a method to substitute identifiable data with a reversible, consistent value. Anonymization is the destruction of the identifiable data.
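A minimal sketch can make the distinction tangible. In the example below, pseudonymization replaces a name with a consistent token that can be re-identified using separately held information, while anonymization simply drops the identifying value. The secret key, field names and keyed-hash approach are illustrative assumptions; real masking tools typically use substitution dictionaries (such as the plant and animal names above) rather than hashes.

```python
# A minimal sketch: pseudonymization keeps a reversible, consistent mapping
# (held outside the dataset), anonymization irreversibly removes the identifier.
import hashlib
import hmac

SECRET_KEY = b"keep-this-outside-the-dataset"   # the "additional information"
lookup = {}                                      # pseudonym -> original name


def pseudonymize(name: str) -> str:
    token = hmac.new(SECRET_KEY, name.encode(), hashlib.sha256).hexdigest()[:12]
    lookup[token] = name          # re-identification stays possible via lookup
    return token


def anonymize(record: dict) -> dict:
    return {k: v for k, v in record.items() if k != "name"}   # irreversible


record = {"name": "Jan Janssens", "city": "Gent", "balance": 1200}
print({**record, "name": pseudonymize(record["name"])})   # still analyzable, still linkable via the key
print(anonymize(record))                                   # identity is gone for good
```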
Only for test data management?
You will need to look into your exact use cases and determine which techniques are the most appropriate. Every organization will most likely need both. Here are some use cases that illustrate this:
| Use case | Functionality | Technique |
| --- | --- | --- |
| Your marketing team needs to set up a marketing campaign and will need to use customer data (city, total customer value, household context, …). | Depending on the consent that you received, anonymization or pseudonymization techniques might need to be applied. | Data Masking |
| You are currently implementing a new CRM system and have outsourced the implementation to an external partner. | Anonymization needs to be applied. The data (including the sensitive PII data) that you use for test data management purposes will need to be transformed into data that cannot be linked to the original. | Data Masking |
| You are implementing a cloud-based business application and want to make sure that your PII data is really protected. You even want to prevent the IT team of your cloud provider (with full system and database privileges) from accessing your data. | Distinct from data masking, data encryption translates data into another form, or code, so that only people with access to a secret key or password can read it. People with access but without the key will not be able to read the real content of the data. | Data Encryption |
| You have a global organization also servicing EU clients. Due to the GDPR, you want to prevent your non-EU employees from accessing data of your EU clients. | Based on role and location, dynamic data masking accommodates data security and privacy policies that vary based on users' locations. Data encryption can also be set up to facilitate this. | Data Masking, Data Encryption |
| You have a brilliant team of data scientists on board. They love to crunch all your Big Data and come up with the best analysis. In order to do that, they need all the data you possibly have. | A data lake also needs to be in line with what the GDPR specifies. Depending on the usage, you may need to implement anonymization or pseudonymization techniques. | Data Masking |
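To illustrate the dynamic data masking row in the table above, here is a minimal sketch of a role- and location-based masking rule. The roles, fields and the policy itself are assumptions for the example, not a description of a specific product.

```python
# A minimal sketch of dynamic data masking: the same record is returned
# masked or in the clear depending on the user's role and location.
SENSITIVE_FIELDS = {"name", "email"}


def apply_masking(record: dict, user_role: str, user_region: str) -> dict:
    allowed = user_role == "dpo" or user_region == "EU"
    if allowed:
        return record
    return {k: ("***" if k in SENSITIVE_FIELDS else v) for k, v in record.items()}


row = {"name": "Rosa arvensis", "email": "r.arvensis@example.com", "city": "Utrecht"}
print(apply_masking(row, user_role="analyst", user_region="US"))  # masked
print(apply_masking(row, user_role="analyst", user_region="EU"))  # in the clear
```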
Is pseudonymization the golden GDPR bullet?
Pseudonymization or anonymization can be one aspect of a good GDPR approach. However, it is definitely not the complete answer, and you will also need to look into a number of other important elements:
Consent Mastering
Key to the GDPR is consent and the linked purpose dimension. In order to manage the complete consent state, you need to make sure that this information is available to all your data consumers and automatically applied. You can use consent mastering techniques such as master data management and data virtualization for this purpose.
Data Discovery & Classification
The GDPR is all about protecting personal data. Do you know where all your PII data is located? Data discovery will automatically locate and classify sensitive data and calculate risk/breach cost based on defined policies.
Data Register
A data register is also a key GDPR requirement. You are expected to maintain a record of processing activities under your responsibility, or in other words, you must keep an inventory of all personal data processed. The minimum information goes beyond knowing what data an organization processes. Also included should be, for example, the purposes of the processing, whether or not the personal data is exported, and all third parties receiving the data.
A data register that is integrated in your overall data governance program and that is linked with the reality of your data landscape is the recommended way forward.
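As an illustration of what a single entry in such a register could capture, here is a minimal sketch based on the elements listed above (purpose, export, recipients, retention). The field names are assumptions for the example, not the formal GDPR wording.

```python
# A minimal sketch of one record-of-processing-activities entry.
from dataclasses import dataclass, field


@dataclass
class ProcessingActivity:
    name: str
    purpose: str
    data_categories: list = field(default_factory=list)
    exported_outside_eu: bool = False
    third_party_recipients: list = field(default_factory=list)
    retention_period: str = "unspecified"


campaigns = ProcessingActivity(
    name="Marketing campaigns",
    purpose="Direct marketing to existing customers",
    data_categories=["name", "email", "purchase history"],
    exported_outside_eu=False,
    third_party_recipients=["email service provider"],
    retention_period="2 years after last purchase",
)
print(campaigns)
```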
° Financieele Dagblad
Would you like to know how Datalumen can also enable you to use your data assets in line with the GDPR?
Contact us and start our Data Conversation.
EUROPEAN RETAILERS ARE MISSING 35% OF SALES DUE TO LACK OF PRODUCT INFORMATION
/by Dimitri Maesfranckx
The holiday season is the most important sales moment of the year. Nevertheless, a Zetes study reveals that retailers miss about 35 percent of sales due to products not being immediately available or a lack of product information.
A quarter of consumers leave a store without actually buying anything if they do not immediately see the product they are looking for or if it is not immediately available. That is one of the conclusions of market research conducted by supply chain expert Zetes. The study specifically focused on buyer behavior during the annual peak period between November and January and analyzed both physical and online retail. 120 European retailers and over 2,000 consumers were interviewed for this study in the January 2018 timeframe.
Time is money
The study states that stores miss 35 percent of sales due to the unavailability of products. The main reason for this is the expectation of the customer: more than in the past, customers will simply leave a store if they do not immediately see a product, and they do not bother to talk to a shop assistant.
If customers do approach a shop assistant, they expect to receive more information about product availability within two minutes. A rather limited window of opportunity, especially if you know that the study also calculated that 51 percent of shop employees need to go to a cash desk to obtain the necessary information, and that 47 percent also need to check the warehouse to verify the availability of a product. Both actions cost time, and time is very expensive during the peak period. The study also reveals that 62 percent of retailers do not have access to real-time product data.
Return Management
Deliveries and returns also typically cause extra problems during these busy months. It is common knowledge that people tend to buy more quickly when they are sure that they can return a product if needed. The processing of returned parcels also causes problems: 26 percent of retailers indicate that they have problems during the peak, with the result that only 39 percent of the returned goods are available for sale within 48 hours.
Conclusion
"A lack of visibility of data is the core of these sales problems during the holidays," the report states. "Consumers want choices, and they want to be informed. Instead of a general 'not available' message, a retailer has a much greater chance of securing sales by telling the customer that a product will soon be back in stock and delivered within three days, or will be available for click & collect." There is still significant room for information management improvement, with direct sales optimization as a result. Would you like to know how Datalumen can also enable you to get real-time product information? Contact us and start our Data Conversation.
More information
https://www.zetes.com/en/white-papers
GARTNER SURVEY FINDS CHIEF DATA OFFICERS ARE DELIVERING BUSINESS IMPACT AND ENABLING DIGITAL TRANSFORMATION
/by Datalumen
By 2021, the CDO Role Will Be the Most Gender Diverse of All Technology-Affiliated C-level Positions.
As the role of chief data officer (CDO) continues to gain traction within organizations, a recent survey by Gartner, Inc. found that these data and analytics leaders are proving to be a linchpin of digital business transformation.
The third annual Gartner Chief Data Officer survey was conducted July through September 2017 with 287 CDOs, chief analytics officers and other high-level data and analytics leaders from across the world. Respondents were required to have the title of CDO, chief analytics officer or be a senior leader with responsibility for leading data and/or analytics in their organization.
“While the early crop of CDOs was focused on data governance, data quality and regulatory drivers, today’s CDOs are now also delivering tangible business value, and enabling a data-driven culture,” said Valerie Logan, research director at Gartner. “Aligned with this shift in focus, the survey also showed that for the first time, more than half of CDOs now report directly to a top business leader such as the CEO, COO, CFO, president/owner or board/shareholders. By 2021, the office of the CDO will be seen as a mission-critical function comparable to IT, business operations, HR and finance in 75 percent of large enterprises.”
The survey found that support for the CDO role and business function is rising globally. A majority of survey respondents reported holding the formal title of CDO, revealing a steady increase over 2016 (57 percent in 2017 compared with 50 percent in 2016). Those organizations implementing an Office of the CDO also rose since last year, with 47 percent reporting an Office of the CDO implemented (either formally or informally) in 2017, compared with 23 percent fully implemented in 2016.
“The steady maturation of the office of the CDO underlines the acceptance and broader understanding of the role and recognizes the impact and value CDOs worldwide are providing,” said Michael Moran, research director at Gartner. “The addition of new talent for increasing responsibilities, growing budgets and increasing positive engagement across the C-suite illustrate how central the role of CDO is becoming to more and more organizations.”
Budgets are also on the rise. Respondents to the 2017 survey report an average CDO office budget of $8 million, representing a 23 percent increase from the average of $6.5 million reported in 2016. Fifteen percent of respondents report budgets more than $20 million, contrasting with 7 percent last year. A further indicator of maturity is the size of the office of the CDO organization. Last year’s study reported total full time employees at an average of 38 (not distinguishing between direct and indirect reporting), while this year reports an average of 54 direct and indirect employees, representing the federated nature of the office of the CDO design.
Key Findings
CDOs shift from defense to offense to drive digital transformation
With more than one-third of respondents saying “increase revenue” is a top three measure of success, the survey findings show a clear bias developing in favor of value creation over risk mitigation as the key measure of success for a CDO. The survey also looked at how CDOs allocate their time. On a mean basis, 45 percent of the CDO’s time is allocated to value creation and/or revenue generation, 28 percent to cost savings and efficiency, and 27 percent to risk mitigation.
“CDOs and any data and analytics leader must take responsibility to put data governance and analytics principles on the digital agenda. They have the right and obligation to do it,” said Mario Faria, managing vice president at Gartner.
CDOs are responsible for more than just data governance
According to the survey, in 2017, CDOs are not just focused on data as the title may imply. Their responsibilities span data management, analytics, data science, ethics and digital transformation. A larger than expected percentage of respondents (36 percent) also report responsibility for profit and loss (P&L) ownership. “This increased level of reported responsibility by CDOs reflects the growing importance and pervasive nature of data and analytics across organizations, and the maturity of the CDO role and function,” said Ms. Logan.
In the 2017 survey, 86 percent of respondents ranked “defining data and analytics strategy for the organization” as their top responsibility, up from 64 percent in 2016. This reflects a need for creating or modernizing data and analytics strategies within an increasing dependence on data and insights within a digital business context.
CDOs are becoming impactful change agents leading the data-driven transformation
The survey results provided insight into the kind of activities CDOs are taking on in order to drive change in their organizations. Several areas seem to have a notable increase in CDO responsibilities compared with last year:
- Serving as a digital advisor: 71 percent of respondents are acting as a thought leader on emerging digital models, and helping to create the digital business vision for the enterprise.
- Providing an external pulse and liaison: 60 percent of respondents are assessing external opportunities and threats as input to business strategy, and 75 percent of respondents are building and maintaining external relationships across the organization’s ecosystem.
- Exploiting data for competitive edge: 77 percent of respondents are developing new data and analytics solutions to compete in new ways.
CDOs are diverse and tackling a wide array of internal challenges
Gartner predicts that by 2021, the CDO role will be the most gender diverse of all technology-affiliated C-level positions and the survey results reflect that position. Of the respondents to Gartner’s 2017 CDO survey who provided their gender, 19 percent were female and this proportion is even higher within large organizations — 25 percent in organizations with worldwide revenue of more than $1 billion. This contrasts with 13 percent of CIOs who are women, per the 2018 Gartner CIO Agenda Survey. When it comes to average age of CDOs, 29 percent of respondents said they were 40 or younger.
The survey respondents reported that there is no shortage of internal roadblocks challenging CDOs. The top internal roadblock to the success of the Office of the CDO is “culture challenges to accept change” — a top three challenge for 40 percent of respondents in 2017. A new roadblock, “poor data literacy,” debuted as the second biggest challenge (35 percent), suggesting that a top CDO priority is ensuring commonality of shared language and fluency with data, analytics and business outcomes across a wide range of organizational roles. When asked about engagement with other C-level executives, respondents ranked the relationship with the CIO and CTO as the strongest, followed by a broad, healthy degree of positive engagement across the C-Suite. Would you like to know what Datalumen can mean to your CDO Office? Have a look at our Services Offering,
contact us and start our Data Conversation.