A Step-By-Step Guide to Creating Your Data Strategy.
We use the word “data” a lot, yet we rarely define rigorously what it means. There is a misconception that a data strategy is only pertinent to “massive” corporations. Any organization starting an intelligent automation program should first create a data strategy.
Data is the first input and output for all automation we create; it trains our machine learning and AI models and drives decisions that impact our safety and health. Data bridges our world with the technology world.
Smaller firms are often intimidated by data. But you don’t need a data team or fancy tools. The following is a playbook to help you formulate your data strategy. We’ve included some potential “data hacks” and called out potential roadblocks that can delay your project.
Step 1: Mobilize your Data Champions. As with any plan, it’s vital to first identify the key stakeholders, their roles, responsibilities, and areas of accountability. Equally important is establishing objective and subjective success criteria.
Step 2: Identify where the data is coming from. Before you get started, perform an audit of where your information is coming from. This is called your source data. Most organizations utilize an ERP, and this is a great starting point. It may be helpful to have IT provide a list of currently supported applications and quickly evaluate any application your department interacts with. Don’t forget about the manual data that you are collecting. This may be challenging to manage since it is usually not captured in a standardized format. But with some simple procedural changes, you can minimize manual intervention and move this information to a spreadsheet to facilitate data access.
Data Hack: Instead of creating a process to collect new data, look at any existing data filters. Your organization may limit the fields you can manage, and modifying your current filters is a simple way to broaden your data set with minimal effort.
Step 3: Identify where the data is going. Create a data inventory that includes the target data elements, their source, required data elements, schema, and metadata. The inventory could record file type (e.g., JSON, CSV, TXT) and data type (e.g., numbers, characters, strings). It is essential to identify data targets early and to classify the required file types and data types.
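A data inventory does not need special tooling. The following is a minimal sketch in Python, assuming a small list of entries kept in code or loaded from a spreadsheet. The class name, field names, and sample entries (such as CUST_NO and INV_AMT) are hypothetical and only illustrate the kind of information described in this step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryEntry:
    """One row of a data inventory (all field names are illustrative)."""
    target_element: str   # name of the data element at its destination
    source_system: str    # where the element comes from
    source_field: str     # field name in the source system
    file_type: str        # e.g. "JSON", "CSV", "TXT"
    data_type: str        # e.g. "number", "string"
    required: bool = True


# Hypothetical sample entries; replace with your own elements.
inventory = [
    InventoryEntry("customer_id", "ERP", "CUST_NO", "CSV", "string"),
    InventoryEntry("invoice_total", "ERP", "INV_AMT", "CSV", "number"),
    InventoryEntry("delivery_notes", "manual spreadsheet", "Notes", "CSV",
                   "string", required=False),
]


def required_elements(entries):
    """List the target elements that are marked as required."""
    return [e.target_element for e in entries if e.required]


def file_types_in_use(entries):
    """List the distinct file types that the targets depend on."""
    return sorted({e.file_type for e in entries})
```

Calling required_elements(inventory) gives a quick list of what must be present, and file_types_in_use(inventory) summarizes the file types you need to support.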
Step 4: Identify gaps in data collection, and determine if they can be filled by people, new processes, or tools.
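One simple way to surface gaps is to compare the elements you need against the elements your sources currently provide. The sketch below assumes both lists are available as plain names; the function name and the sample element names are hypothetical.

```python
def find_gaps(required, available):
    """Return required data elements that no current source provides."""
    return sorted(set(required) - set(available))


# Hypothetical element names for illustration only.
required = {"customer_id", "invoice_total", "consent_date"}
available = {"customer_id", "invoice_total", "region"}

gaps = find_gaps(required, available)
```

Each element in gaps then becomes a question for the team: can it be filled by people, a new process, or a tool?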
Potential Roadblock: Federal, local, and industry laws and regulations for data collection, transit, and storage of confidential information may affect your strategy. These guidelines may require your team to collect more data or prohibit you from collecting certain types of data you identified, especially in the case of PII. Ensure all audit requirements are built into the solution from the beginning to avoid significant rework. It is essential to address these items upfront and understand that they may delay your project’s implementation.
Data Hack: You can unlock additional functionality from your current tools by speaking with your IT team and software vendors. There may be aspects of your BI solution that are locked by your IT team. Speak with your software vendor about new functionality that can be added to your existing tools; sometimes, these come at no cost. It is essential to extract the most benefit from your existing tools.
Step 5: Develop a test strategy and process. Identifying, tracking, and correcting data errors as close to real-time as possible is essential. Create standard reports for common mistakes like duplicate cases, errors, and gaps in continuity.
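A standard report for these common mistakes can start small. The sketch below assumes records arrive as Python dictionaries with a key field and a numeric sequence field; the function name, field names, and sample records are hypothetical. It reports duplicate keys, missing required values, and gaps in a numeric sequence.

```python
from collections import Counter


def quality_report(records, key, required_fields, sequence_field):
    """Summarize duplicates, missing values, and sequence gaps."""
    # Duplicate cases: keys that appear more than once.
    key_counts = Counter(r[key] for r in records)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)

    # Errors: required fields that are absent or empty.
    missing = [
        (r[key], field)
        for r in records
        for field in required_fields
        if r.get(field) in (None, "")
    ]

    # Gaps in continuity: sequence numbers skipped between min and max.
    seen = {r[sequence_field] for r in records}
    gaps = (
        [n for n in range(min(seen), max(seen) + 1) if n not in seen]
        if seen else []
    )

    return {
        "duplicate_keys": duplicates,
        "missing_values": missing,
        "sequence_gaps": gaps,
    }


# Hypothetical sample data.
records = [
    {"case_id": "A1", "seq": 1, "amount": "10.00"},
    {"case_id": "A2", "seq": 2, "amount": ""},
    {"case_id": "A1", "seq": 4, "amount": "12.50"},
]
report = quality_report(records, "case_id", ["amount"], "seq")
```

Running a report like this on a schedule, as close to real time as your systems allow, keeps the error list short and current.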
Caution: Poor input data, such as garbled content, invalid relationships, incomplete data sets, context changes, and duplication, will pose challenges. It is essential to monitor for changes in the attributes of the source data. There may be updates or changes in your IT operating procedures that could affect the downstream data processing. Developing an exception handling and validation strategy integrated with your testing approach is essential.
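As one illustration of monitoring source attributes, the sketch below compares each incoming record against an expected list of fields and routes anything that has drifted to an exception list instead of passing it downstream. It assumes records are Python dictionaries; the function names and sample fields are hypothetical.

```python
def detect_schema_drift(expected_fields, record):
    """Report fields that are missing from, or unexpected in, a record."""
    expected = set(expected_fields)
    actual = set(record)
    return {
        "missing": sorted(expected - actual),
        "unexpected": sorted(actual - expected),
    }


def split_valid_and_exceptions(records, expected_fields):
    """Separate records that match the expected fields from those that do not."""
    valid, exceptions = [], []
    for record in records:
        drift = detect_schema_drift(expected_fields, record)
        if drift["missing"] or drift["unexpected"]:
            # Keep the record with its drift details for later review.
            exceptions.append((record, drift))
        else:
            valid.append(record)
    return valid, exceptions


# Hypothetical sample data: the second record lost a field and gained another.
expected_fields = ["customer_id", "amount"]
incoming = [
    {"customer_id": "C1", "amount": "10.00"},
    {"customer_id": "C2", "total": "20.00"},
]
valid, exceptions = split_valid_and_exceptions(incoming, expected_fields)
```

Keeping the rejected records alongside the reason they were rejected makes it easier to tell a one-off bad record from a change in the source system.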
Data Hack: Automation is a great way to speed up testing and data integrity. There are several tools on the market, as well as systems integrators who can help expedite the selection and implementation of these tools.
Step 6: Create a remediation plan to address any adverse impacts on the existing infrastructure, hardware, software, procedures, and protocols. The creation of this plan starts by first identifying risks associated with the project that would impact feasibility, technical performance, conversion, schedule, costs, and recovery procedures.
Step 7: Formalize your data plan and define what success looks like. Steps 1-6 above will provide you with the information required for this plan. Your plan should detail the overall approach and process needed for the data conversions. Identifying your objectives, assumptions, dependencies, and potential constraints is essential. You should also identify the deliverables necessary to complete the effort and the acceptance criteria for every deliverable. Documenting these items will be helpful when you implement continuous improvement initiatives.
A Final Word of Advice: We usually design our solutions around three main metrics: increasing data velocity, minimizing errors, and maximizing accuracy and quality. We often overlook data integrity. These parameters contribute to data integrity; however, organizations often fall short in ensuring the consistency of these metrics across the end-to-end process. Each stage introduces the potential for data corruption and degradation of integrity. We must apply additional rigor to testing whenever data moves between systems and processes. This includes increased sampling frequency and matching.
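Sampling and matching can be sketched in a few lines. The example below assumes the source and target systems can each export rows as Python dictionaries that share a key field. It draws a random sample from the source, fingerprints the compared fields with a hash, and lists the keys that are missing or different in the target. The function names, field names, and sample rows are hypothetical.

```python
import hashlib
import random


def row_fingerprint(row, fields):
    """Hash the compared fields so rows can be matched cheaply."""
    joined = "|".join(str(row.get(f, "")) for f in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def sample_and_match(source_rows, target_rows, key, fields, sample_size, seed=0):
    """Sample source rows and return keys that do not match in the target."""
    target_by_key = {row[key]: row for row in target_rows}
    rng = random.Random(seed)  # fixed seed keeps a given check repeatable
    sample = rng.sample(source_rows, min(sample_size, len(source_rows)))

    mismatches = []
    for row in sample:
        target = target_by_key.get(row[key])
        if target is None or (
            row_fingerprint(row, fields) != row_fingerprint(target, fields)
        ):
            mismatches.append(row[key])
    return sorted(mismatches)


# Hypothetical sample data: row 2 changed in transit, row 3 never arrived.
source = [
    {"id": 1, "amount": "10.00"},
    {"id": 2, "amount": "20.00"},
    {"id": 3, "amount": "30.00"},
]
target = [
    {"id": 1, "amount": "10.00"},
    {"id": 2, "amount": "20.50"},
]
mismatches = sample_and_match(source, target, "id", ["amount"], sample_size=3)
```

Raising sample_size, or running the check more often, is the code equivalent of the increased sampling frequency described above.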