The buzz about “big data” is here for a couple of years now. Have we witnessed incredible results? Yes. But maybe they aren’t as impressive as previously believed they would be. When it comes down to Big Data, we’re actually talking about data integration, data governance and data security. The bottom line? Data needs to be properly managed, whatever its size and type of content. Hence, total data management approaches as master data management are gaining momentum and are the way forward when it comes down to tackling an enterprise’s Big Data problem.
Your First Big Data Stepstone
In order to make Big Data work you need to address data complexity in the context of the golden V’s: Volume, Velocity and Variety. Accessing, ingesting, processing and deploying your data doesn’t automatically happen and traditional data approaches based on manual processes simply don’t work. The reason why these typically fails is you because:
- you need to be able to ingest data at any speed
- you need to process data in a flexible, read scalable and efficient, but also repetitive way
- and last but not least you need to be able to deliver data anywhere and with the dynamics of the ever changing big data landscape in mind, this is definitely a challenge
Your Second Big Data Stepstone
A substantial amount of people believe that Big Data is the golden grail and consider it as a magical black box solution. They believe that you can just get whatever data in your Big Data environment and it miraculously is going result into useful information. Reality is somehow different. In order to get value out of your initative, you also need to actually govern your Big Data. You need to govern it in two ways:
Your Big Data environment is not a trash bin.
Key for success is that you are able to cleanse, enrich and standardize your Big Data. You need to prove the added value of your Big Data initiative so don’t forget your consumers and make sure you are able to generate and share trusted insights. According to Experian’s 2015 Data Quality Benchmark Report, organizations suspect 26% of their data to be inaccurate. Reality is that with Big Data this % can be even be 2 to 3 times worse.
Your Big Data is not an island.
Governing your Big Data is one element but in order to get value out of it you should be able to combine it with the rest of your data landscape. According to Gartner, through 2017, 90% of the information assets from big data analytic efforts will be siloed and unleverageable across multiple business processes. That’s a pity given that using Master Data Management techniques you can break the Big Data walls down and create that 360° view on your customer, product, asset or virtually any other data domain.
Your Third Big Data Stepstone
With the typical Big Data volumes but also growth in mind, many organizations have limited to no visibility into the location and use of their sensitive data. However new laws and regulations like GDPR do require a correct understanding of the data risks based on number of elements like data location, proliferation, protection and usage. This obviously applies to traditional data but is definitely also needed for Big Data. Especially if you know that a substantial amount of organizations tend to use their Big Data environment as a black hole, the risk of having also unknown sensitive Big Data is real.
How do you approach this:
Classify your sensitive data. In a nutshell, data inventory, topology, business process and data flow mapping and operations mapping.
De-identifies your data so it can be used wherever you need it. Think about reporting and analysis environments, think about testing, etc. For this purpose masking and anonymization techniques and software can be used.
Once you know where your sensitive data is located you can actually protect it through tokenization and encryption techniques. These techniques are required if you want to keep and use your sensitive data in the original format.