The data surrounding the COVID-19 pandemic and the need to be able to quickly understand it has shone a bright light on data visualization.
While business users may not have the same health-critical needs as organizations responding to the pandemic, they do need understandable, meaningful insights into shifting and hard-to-understand data.
The pandemic has illuminated the need for data visualization; it did not create it.
Accurate and timely visual representation of data, however, comes at a cost. Without a proper understanding of what’s required to present the right information visually (commonly in a dashboard), it’s easy to stumble through the process or, worse, present information that leads to poor decisions based on incomplete data.
There are three main groups of challenges that face data scientists, analysts, and others who work with visualization tools:
- The practice of data visualization: Roles, responsibilities, and best practices
- Data: Collection, management, and preparation of the data itself
- Implementation: Properly setting up and using the tools and techniques available to capture and present the data
As we look at each of these challenges, one thing is clear: data visualization isn’t possible without integration. And while integration can’t solve all of data visualization’s challenges, it can solve some of the problems faced when collating data into an easily digestible format.
The Practice of Data Visualization
The practice of data visualization involves engaging the right people, with defined responsibilities, following best practices, and ensuring that there is transparency in the process so it’s clear if the data is presenting the entire picture or only part of one.
Getting the right roles and responsibilities in place is, as with any other discipline, an essential piece of accomplishing the work.
Certainly, having the right skills to create informative visuals in whatever tools your company uses – from Power BI to Tableau and many more – is key. Leveraging integrations to facilitate populating the data visualization tool also requires having the right resources. The challenge here is making sure that the proper data is available while maintaining the security and integrity of the information.
As data visualization is still a developing area of expertise, its best practices are also evolving. At a minimum, two questions need answers: how frequently does the data need to be updated, and where does it come from? Transparency of data availability is also crucial. If users are expecting a dashboard that is updated daily, but the data feeding it is only updated weekly, there is a mismatch in requirements that will need to be documented or addressed.
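The daily-versus-weekly mismatch described above can be caught with a simple check. What follows is a minimal Python sketch rather than a prescribed method; the function name `find_refresh_gaps` and the feed names and intervals are hypothetical.

```python
from datetime import timedelta

def find_refresh_gaps(expected, feeds):
    """Return the feeds that update less often than the dashboard expects.

    expected: how often users expect the dashboard to refresh (timedelta).
    feeds: mapping of feed name -> how often that source actually updates.
    """
    return {name: interval for name, interval in feeds.items() if interval > expected}

# Hypothetical feeds behind a dashboard that users expect to be daily.
feeds = {
    "sales_orders": timedelta(hours=1),
    "inventory_snapshot": timedelta(days=7),
}

for name, interval in find_refresh_gaps(timedelta(days=1), feeds).items():
    print(f"{name} updates every {interval.days} day(s): document or resolve this gap")
```

A check like this does not fix the gap; it only makes the discrepancy visible so it can be documented or addressed.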
Mitigating these risks requires adequate governance and enforcement using technology tools. To make the most of these solutions, your visualization practice needs to pay attention to data and system security, establish the rules of access and availability, and define how differences in requirements will be resolved.
Proper design and development of the integrations are at the core of striking the balance between security and exposing the data required for effective visualizations. There needs to be a balance between what’s revealed in dashboards and the obfuscation of fields that contain PII or other sensitive organizational data. Architects should also keep in mind the need to secure data both in flight and while it is stored for ingestion and display in visualizations.
In truth, you can’t have integrations without security. Access and identity management are vital components of integration development. Even so, establishing rules for source availability and the completeness of crucial data elements, even in something as simple as a Word document, provides the foundation for using technology to enforce the established governance.
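One way to obfuscate sensitive fields before they reach a dashboard is to replace their values with a keyed hash inside the integration. The sketch below assumes hypothetical field names (`email`, `ssn`) and a hypothetical secret key; it illustrates the idea and is not a complete data protection scheme.

```python
import hashlib
import hmac

# Hypothetical field names; a real list would come from your governance rules.
SENSITIVE_FIELDS = {"email", "ssn"}

def mask_record(record, secret_key, sensitive=SENSITIVE_FIELDS):
    """Return a copy of record with sensitive values replaced by a keyed hash.

    The same input always maps to the same token, so dashboards can still
    count or group by the field without seeing the raw value.
    """
    masked = {}
    for key, value in record.items():
        if key in sensitive and value is not None:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            masked[key] = digest.hexdigest()[:12]  # shortened token for display
        else:
            masked[key] = value
    return masked

row = {"customer_id": 42, "email": "pat@example.com", "region": "West"}
print(mask_record(row, secret_key=b"replace-with-a-managed-secret"))
```

A keyed hash is used here instead of a plain hash because low-entropy values can otherwise be recovered by brute force. In practice the key would live in a secrets manager, not in code.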
Marrying the protection of the data with the rules surrounding its availability and use leads to determining how the visualizations will present the data and inform the user of its completeness. If a process is needed to scrub data before it can be passed from a source system to a visualization tool and that scrubbing can only happen once a week, the user should be informed, not out of a need to change the process, but to create an understanding of the data’s relevancy at a given time.
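Informing the user can be as simple as attaching an "as of" note to the visualization. The following Python sketch assumes the integration records when the scrubbing process last ran; the function name `freshness_note` and the dates are hypothetical.

```python
from datetime import datetime, timedelta

def freshness_note(last_scrubbed_at, now, scrub_interval):
    """Build a short label telling the user how current the data is."""
    note = f"Data as of {last_scrubbed_at:%Y-%m-%d}"
    # Flag the data only when the scrub is later than its normal schedule.
    if now - last_scrubbed_at > scrub_interval:
        note += " (overdue for refresh)"
    return note

weekly = timedelta(days=7)
print(freshness_note(datetime(2020, 7, 20), datetime(2020, 7, 23), weekly))
print(freshness_note(datetime(2020, 7, 20), datetime(2020, 7, 30), weekly))
```

The first call describes data that is within its normal weekly cycle; the second adds a warning because the scrub is late.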
Unsurprisingly, the data itself presents challenges. Before designing and creating needed dashboards, data teams must define what’s needed. Will raw data be used, or will it be filtered? What are the impacts on data consistency with filtered data? What’s the plan for data collection – what will it be used for? Are the right source systems being used, or are you trending toward data collection bias by relying heavily on information from a particular division or department?
As the integrations to feed the visualizations are developed, these questions can and should be answered during the design phase. The development of your integrations will be influenced by where the data comes from, such as raw data sources or an MDM system, and by whether that data will need to be processed and cleaned.
Any data manipulation will need to be closely considered, as well. Too much filtering and manipulation can change the meaning of the information. Integration design should favor minimal processing, using authentication (AuthN) and authorization (AuthZ) tools to restrict access to the data rather than filtering it.
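To illustrate restricting access rather than filtering, the sketch below returns a dataset untouched when the caller's role is authorized and refuses otherwise. The roles, dataset names, and the `GRANTS` mapping are hypothetical; a real integration would delegate this decision to its identity and access management tooling.

```python
# Hypothetical role-to-dataset grants.
GRANTS = {
    "finance_analyst": {"revenue", "expenses"},
    "ops_manager": {"shipments"},
}

def fetch_for_dashboard(role, dataset, store):
    """Return the dataset unmodified if the role is authorized, else refuse.

    No rows or columns are filtered: the caller sees all of the data or none.
    """
    if dataset not in GRANTS.get(role, set()):
        raise PermissionError(f"role {role!r} may not read {dataset!r}")
    return store[dataset]

store = {"revenue": [100, 250, 175], "shipments": [12, 9]}
print(fetch_for_dashboard("finance_analyst", "revenue", store))
```

Because the data is never reshaped on the way through, its meaning stays intact for everyone who is allowed to see it.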
Implementation of Visualizations
As mentioned, getting your visualizations implemented requires the right resources, with the skills to design and create the dashboards, and the toolsets themselves.
As you consider the system requirements and integration details that feed the visualization, the design must define how and when the data will be accessed, and what impact those decisions will have on system load and budgets.
For instance, a full data refresh may not be possible for large data sets, but if something like that is required, what does the load on the system look like? What will a refresh do to your cloud compute times, and, in turn, what will that do to your budget?
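A common alternative to a full refresh is an incremental load that picks up only the rows changed since the last run. The sketch below assumes each source row carries an `updated_at` timestamp, which may not be true of every system; the function and field names are hypothetical.

```python
from datetime import datetime

def rows_to_load(rows, last_loaded_at=None):
    """Select the rows a refresh needs to move.

    With no watermark, every row is returned (a full refresh). With a
    watermark, only rows modified after it are returned (incremental).
    """
    if last_loaded_at is None:
        return list(rows)
    return [r for r in rows if r["updated_at"] > last_loaded_at]

source = [
    {"id": 1, "updated_at": datetime(2020, 7, 1)},
    {"id": 2, "updated_at": datetime(2020, 7, 15)},
    {"id": 3, "updated_at": datetime(2020, 7, 29)},
]

full = rows_to_load(source)
incremental = rows_to_load(source, last_loaded_at=datetime(2020, 7, 10))
print(len(full), len(incremental))  # 3 2
```

Comparing the two row counts for a real data set gives a rough first estimate of how much load and compute an incremental design might save.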
Service level agreements that include all stakeholders should also be considered, communicated, and managed. New data requests will take time to fill and will impact resources.
Mitigating some of these concerns through the thoughtful design of your integrations requires adherence to best practices. Your planning should inform implementation decisions, such as when ETL is the right integration pattern versus an API-driven approach. Is a hybrid method appropriate, and if so, what are the implications for both the data and performance?
The answers to these questions inevitably raise further ones, such as what system resources are needed, which team members are best suited to the tasks, and whether an outside consultant is better suited to execute the project.
As with earlier considerations, the need for data security will force discussions around the proper identity management (IdM) and data management tools, and when, where, and how to securely archive the resulting data sets.
Target architecture plays into these choices, as well, which will inevitably lead to discussions of project delivery, resource, and maintenance costs. Each design decision point will impact the implementation of the data visualizations, from storage through data access and integrations, all the way to dashboard implementation.
Obviously, data visualization development is far more nuanced and involved than sticking a front end on an Excel spreadsheet. The needs of the end-user will drive each step of the planning, design, and implementation process.
This means it can be challenging to see the forest for the trees when planning visualization systems. If you’re looking for expert guidance on building integration plans that support clear and influential data visualizations, without spinning your wheels on the details, give Big Compass a call. We can help you at every step of the planning and design process, and can even provide the skills to implement highly performant and secure integrations to feed data to your dashboards.
- Integration Challenges for Supporting Data Visualization - July 30, 2020