One of my favorite images is a photograph of a line painted on a highway: rather than moving a fallen tree branch out of the way, the worker painted the line around the broken limb. The caption reads, “Not in my job description.”
If you are tasked with building rich, graphical data exploration visualizations and presenting insight from large volumes of data, you typically not only create those reports and dashboards, but must also load, manipulate, and optimize the raw data. As a data visualization expert, you are expected to source and massage the data, even though that really isn’t in your job description. There are tools, however, that can free up your time for meaningful work rather than the mundane tasks of managing data. Here are three ways to make your life better:
1. Data ingestion tools
When I first started working in the data management and analysis industry, my job was to analyze data and distill insights into meaningful readouts for my managers. Sound familiar? As the industry has progressed, cloud computing and tools like BigQuery and Snowflake have made data ingestion a more routine task, albeit one that is still necessary. Technology companies have created affordable products and services that make loading, transforming, and audit-tracking data feeds a very attractive option. These tools require little or no assistance from your IT department, and they can be installed and operational in weeks (not months), providing operational views into all the data flowing into your solution.
2. Identity resolution and privacy
Simply loading the data is not enough. As a data visualization expert, you need to be thinking about two very important capabilities:
- Identity resolution, to prevent duplicate records
- Compliance with the latest regional privacy laws
You need to have confidence in the stories your reports tell about the data. If they show that 10,000 people who purchased a product live in a specific zip code, you need to trust that no user is counted more than once. We can’t expect visualization experts to be wizards, so they must rely on a data management solution that handles these standard identity resolution tasks. By performing data cleansing and implementing standard processes like NCOA (National Change of Address), you can ensure that the data being reported is deduped and accurately represents the latest iteration of each individual, address, and household.
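To make the dedupe idea concrete, here is a minimal sketch in Python. The `Contact` type, the normalization in `identity_key`, and the sample feed are all hypothetical stand-ins; a production identity resolution pipeline would add fuzzy matching and NCOA-style address standardization on top of this.

```python
from dataclasses import dataclass

@dataclass
class Contact:
    name: str
    address: str
    zip_code: str
    updated: str  # ISO date of the record's last update

def identity_key(c: Contact) -> tuple:
    # Toy normalization: lowercase and trim. Real identity resolution
    # would also apply fuzzy matching and address standardization (NCOA).
    return (c.name.strip().lower(), c.address.strip().lower(), c.zip_code)

def dedupe(contacts: list) -> list:
    """Keep only the most recently updated record per resolved identity."""
    latest = {}
    for c in contacts:
        key = identity_key(c)
        if key not in latest or c.updated > latest[key].updated:
            latest[key] = c
    return list(latest.values())

feed = [
    Contact("Ann Lee", "12 Oak St", "19103", "2019-01-05"),
    Contact("ann lee", "12 Oak St ", "19103", "2019-03-10"),  # same person, newer
    Contact("Bo Chan", "9 Elm Ave", "19103", "2019-02-01"),
]
clean = dedupe(feed)  # two people, not three
```

With duplicates collapsed to the latest record per identity, a count of purchasers per zip code reflects people rather than raw rows.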
With GDPR in force since 2018 and CCPA going live in early 2020, anyone who works with PII data sources needs to build traceability into any data management solution. If a user requests that her data be removed from the database, that task should be centrally managed as the request flows through any downstream data or analytics marts.
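One way to centrally manage that erasure is a registry that fans a single deletion request out to every downstream mart and logs each action for traceability. This is only a sketch under simplifying assumptions: `DeletionRegistry`, the mart names, and the dict-backed stores are illustrative stand-ins for real data stores.

```python
class DeletionRegistry:
    """Central point for right-to-erasure requests across downstream marts."""

    def __init__(self):
        self.marts = {}      # mart name -> {user_id: record}
        self.audit_log = []  # (mart, user_id, was_present) for compliance review

    def register(self, name, store):
        self.marts[name] = store

    def forget(self, user_id):
        # Centrally managed: one request purges the user from every
        # registered mart, and every attempt is recorded for audit.
        for name, store in self.marts.items():
            was_present = store.pop(user_id, None) is not None
            self.audit_log.append((name, user_id, was_present))

registry = DeletionRegistry()
registry.register("reporting", {"u42": {"zip": "19103"}})
registry.register("analytics", {"u42": {"ltv": 120.0}, "u7": {"ltv": 55.0}})
registry.forget("u42")  # u42 disappears from both marts; u7 is untouched
```

The audit log is the traceability piece: it proves, per mart, that the request was honored.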
3. Have a standard data layer created and ready to use
You don’t want to assemble, organize, and aggregate the data every time you create a report or dashboard. Instead, a standard data layer with all the necessary tables and fields lets you visualize the data as you see fit. This standard layer speeds up the initial implementation, and with a data ingestion and ETL tool routing data into the conformed layer, you never need to worry about whether you are reporting on the freshest data. The standard data layer has the business rules, transformation logic, joins, and so on pre-built, so those of us in the data visualization community can do our jobs effectively.
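As a small illustration of what pre-built joins and business rules buy you, here is a sketch using Python’s built-in sqlite3 with an in-memory database. The table and view names are hypothetical; the point is that a dashboard queries one conformed view instead of re-joining raw tables itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, zip TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    -- The 'standard data layer': the join and aggregation are defined once,
    -- so every report reads the same conformed view.
    CREATE VIEW purchases_by_zip AS
        SELECT c.zip, COUNT(*) AS orders, SUM(o.amount) AS revenue
        FROM orders o JOIN customers c ON c.id = o.customer_id
        GROUP BY c.zip;
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "19103"), (2, "19103"), (3, "10001")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 1, 20.0), (2, 2, 35.0), (3, 3, 15.0)])

# The visualization layer only ever touches the conformed view.
rows = conn.execute("SELECT * FROM purchases_by_zip ORDER BY zip").fetchall()
# rows: [('10001', 1, 15.0), ('19103', 2, 55.0)]
```

Because the ETL tool keeps the underlying tables current, the view always reflects the freshest data without the report author re-implementing the joins.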
While there are some unicorns in the industry who enjoy data management duties as much as their expected role of visualizing and reporting on data, they are not the norm. With the advent of cloud technologies, ETL, and identity resolution tools, bringing disparate sources into a central location has never been easier. Implemented correctly, these tools are inexpensive and have cut the time spent on rudimentary data management tasks by as much as 50 percent.
Learn more about Merkle’s Rapid Audience Layer solution here. Are you heading to Tableau Conference 2019? Visit us at Booth K17.