Data cleansing is a critical step that all organizations must factor into their data management and technology efforts. Today, organizations collect data from multiple sources: their website, social media, in-app activity, brick-and-mortar stores, the call center, and more. Consumers in many jurisdictions now have the right to see and even delete their data, so if you don’t have an efficient system for compliant customer lookup and erasure, this could be a real problem.
1) Identify siloed data
Often, teams across an organization own different platforms that house data, leaving data scattered and mismanaged in silos. This not only creates redundant data records but can also quickly become a compliance and privacy risk (more on that later). The first step should be to identify every silo so that all data can feed into a data lake, a central storage location where the entire organization can access the data. To find these silos, have conversations with teams around your organization, even across regions if needed, to ensure that you account for every place that data lives and can bring it all together.
From there, build a step-by-step roadmap for how you want to centralize and host this data. This is a critical step because when the data comes together, it should be in a standardized, usable format for all the teams using it. Setting these standards is part of data governance, which establishes rules within the organization for how data will be formatted, when it should be used, and who has access.
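To make the idea of a standardized format concrete, here is a minimal Python sketch. It assumes two hypothetical exports (a web platform and a call center) with different field names and date formats, maps both into one agreed schema, and removes the duplicate records that silos tend to create. The schema, field names, and sample rows are illustrative assumptions, not a prescribed standard.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class CustomerRecord:
    """Hypothetical standardized schema agreed on by all teams."""
    email: str
    full_name: str
    signup_date: date
    source: str


def from_web(row: dict) -> CustomerRecord:
    # Assumed web export fields: "Email", "Name", "Created" (YYYY-MM-DD).
    return CustomerRecord(
        email=row["Email"].strip().lower(),
        full_name=" ".join(row["Name"].split()).title(),
        signup_date=datetime.strptime(row["Created"], "%Y-%m-%d").date(),
        source="web",
    )


def from_call_center(row: dict) -> CustomerRecord:
    # Assumed call-center fields: "email_addr", "first", "last", "joined" (MM/DD/YYYY).
    return CustomerRecord(
        email=row["email_addr"].strip().lower(),
        full_name=f"{row['first'].strip()} {row['last'].strip()}".title(),
        signup_date=datetime.strptime(row["joined"], "%m/%d/%Y").date(),
        source="call_center",
    )


def deduplicate(records):
    """Keep the earliest record for each email address."""
    earliest = {}
    for rec in records:
        kept = earliest.get(rec.email)
        if kept is None or rec.signup_date < kept.signup_date:
            earliest[rec.email] = rec
    return list(earliest.values())


# Illustrative sample rows; the same customer appears in both silos.
web_rows = [{"Email": " Ana@Example.com ", "Name": "ana  lopez", "Created": "2023-01-15"}]
call_rows = [{"email_addr": "ana@example.com", "first": "Ana", "last": "Lopez", "joined": "03/02/2023"}]

records = [from_web(r) for r in web_rows] + [from_call_center(r) for r in call_rows]
unique = deduplicate(records)
```

In this sketch the email address serves as the matching key. A real governance policy would decide which identifier to match on and which copy of a record wins when sources disagree.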
2) Adhere to data compliance
Some organizations have kept all their data since their inception and have never deleted, cleansed, or consolidated any of it, even as technologies have evolved. Because of this, teams may not even understand what the data is or why it was collected. To add a layer of complexity, when organizations are bought or sold, parent companies need to understand the acquired data and combine it with their own to create a unified ecosystem. When a company merges into a parent company, for example, siloed and potentially unaccounted-for data can sit in different environments and pose a compliance risk under CCPA, GDPR, and other privacy laws. Data that cannot be identified, categorized, and consolidated should be considered for possible deletion. As data managers come and go, data locations and access can get lost in the shuffle. This poses a potentially huge issue now that consumers have the right to request that their data be deleted.
It also increases the risk of unknown access to data: IT can’t keep up with where all the data is and who is accessing it, which is why organizations need a data governance policy. A policy that includes data cleansing can help you prepare for future privacy laws and ensure a standard, usable format.
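As a rough illustration of customer lookup and erasure across several systems, the Python sketch below models each platform as a hypothetical in-memory `DataStore`. The class, store names, and sample rows are assumptions for illustration only. A real deployment would call each platform's own interface and would likely need to cover backups and downstream copies as well.

```python
def normalize_email(email):
    """Compare emails case-insensitively and ignore stray whitespace."""
    return email.strip().lower()


class DataStore:
    """Hypothetical stand-in for one platform that holds customer data."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows

    def find(self, email):
        key = normalize_email(email)
        return [r for r in self.rows if normalize_email(r["email"]) == key]

    def erase(self, email):
        """Remove matching rows and return how many were deleted."""
        key = normalize_email(email)
        before = len(self.rows)
        self.rows = [r for r in self.rows if normalize_email(r["email"]) != key]
        return before - len(self.rows)


def lookup(stores, email):
    """Return every record held for this customer, grouped by store."""
    found = {s.name: s.find(email) for s in stores}
    return {name: rows for name, rows in found.items() if rows}


def erase_everywhere(stores, email):
    """Delete the customer from every store; the counts form an audit trail."""
    return {s.name: s.erase(email) for s in stores}


# Illustrative stores; the same customer is recorded with inconsistent formatting.
stores = [
    DataStore("crm", [{"email": "ana@example.com", "tier": "gold"}]),
    DataStore("call_center", [{"email": "ANA@example.com "}, {"email": "bo@example.com"}]),
    DataStore("app", [{"email": "bo@example.com"}]),
]

held = lookup(stores, "Ana@Example.com")             # what is held, and where
audit = erase_everywhere(stores, "Ana@Example.com")  # rows deleted per store
```

The sketch only works because every store is registered in one list, which is the practical reason an inventory of silos comes before compliance: a store nobody knows about can never be searched or erased.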
3) Consider storage costs
Once an organization has collected all its data and housed it in a single location, it is a good idea to consider hosting the data in the cloud. Cloud solutions such as Amazon Web Services and Google Cloud allow you to pay as you go and avoid overspending on storage that you aren’t using, unlike traditional on-premises solutions, which also take up physical space. Some cloud solutions even provide discounts for using more storage under one account. Data storage costs can vary; if you are dealing with lots of corporate data, it can quickly become expensive.
These considerations can also help identify which data can be moved to cheaper options. One such option is cold storage, a type of storage that generally has slower access speeds and is meant for data that is seldom or never accessed. Data that needs to be accessed continuously should be placed in hot storage, where it can be accessed quickly and frequently.
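One simple way to decide between hot and cold storage is to look at how long each dataset has gone without being accessed. The Python sketch below assumes a hypothetical 90-day threshold and made-up dataset names and dates. The right cutoff depends on your access patterns and on your provider's pricing and retrieval terms.

```python
from datetime import date

COLD_AFTER_DAYS = 90  # hypothetical threshold; tune to your own access patterns


def choose_tier(last_accessed, today, cold_after_days=COLD_AFTER_DAYS):
    """Return "cold" for data idle longer than the threshold, otherwise "hot"."""
    idle_days = (today - last_accessed).days
    return "cold" if idle_days > cold_after_days else "hot"


# Illustrative inventory of datasets and when each was last read.
datasets = [
    {"name": "web_orders", "last_accessed": date(2024, 5, 28)},
    {"name": "2015_campaign_logs", "last_accessed": date(2019, 3, 1)},
]

today = date(2024, 6, 1)
plan = {d["name"]: choose_tier(d["last_accessed"], today) for d in datasets}
```

With these sample dates, `web_orders` stays in hot storage and `2015_campaign_logs` is flagged for cold storage.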
These data cleansing considerations can be daunting, and it can be helpful to talk to a technology strategy consulting partner who can help build out your data governance roadmap and figure out where to start. Want to learn how Merkle can help? Contact our experts.