Chapter 6 - Evaluating Information Quality

Once a data system exists, the key to achieving and maintaining a high level of data quality is to assess all aspects of data quality regularly, both to improve data collection and processing procedures and to correct specific problems with the data as they arise. This can be accomplished through regular assessments of the data collected, special studies of particular aspects of the data and of the effectiveness of the collection and processing methods, and quality control of key processes, which both controls process quality and yields information about data quality.

6.1 Data Quality Assessments

Principles

  • "Data quality assessments" are data quality audits of data systems and the data collection process.
  • Data quality assessments are comprehensive reviews of the data system that determine the degree to which the system follows these guidelines and that assess sources of error and other potential quality problems in the data.
  • The assessments are intended to help the data system owner to improve data quality.
  • The assessments will conclude with a report on findings and results.

Guidelines

  • Because data users do not have the same access to, or familiarity with, information about the data system that its owners have, the data system owners should perform data quality assessments.
  • Data quality assessments should be undertaken periodically to ensure that the quality of the information disseminated meets requirements.
  • Data quality assessments should be used as part of a data system redesign effort.
  • Data users, including secondary data users, should be consulted to suggest areas to be assessed, and to provide feedback on the usefulness of the data products.
  • Assessments should involve at least one person who is knowledgeable about data quality and who is not involved in preparing the data system information for public dissemination.
  • Findings and results of a data quality assessment should always be documented.

References

  • General Accounting Office, Performance Plans: Selected Approaches for Verification and Validation of Agency Performance Information, GAO/GGD-99-139 (July 1999).

6.2 Evaluation Studies

Principles

  • Evaluation studies are focused experiments carried out to evaluate some aspect of data quality.
  • Many aspects of data quality cannot be assessed by examining end-product data.
  • Evaluation studies include re-measurement, independent data collection, user surveys, collection method parallel trials (e.g., incentive tests), census matching, administrative record matching, comparisons to other collections, methodology testing in a cognitive lab, and mode studies.
  • "Critical data systems" are systems that either contain data identified as "influential" or provide input to DOT-level performance measures.

Guidelines

  • Critical data systems should have a program of evaluation studies that estimates the extent of each aspect of non-sampling error, both periodically and after a major system redesign.
  • Critical data systems should periodically evaluate bias due to missing data, coverage bias, measurement error, and user satisfaction.
  • All data systems should conduct an evaluation study when there is evidence that one or more error sources could be compromising key data elements enough to make them fail to meet data requirements.
  • All data systems should conduct an evaluation study if analysis of the data reveals a significant problem whose source is not obvious.
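
As an illustration of one evaluation study type listed above (re-measurement), the following minimal sketch compares original responses with re-measured responses for the same units and reports the share that disagree, a quantity often called the gross difference rate. The function name, the unit identifiers, and the sample values are hypothetical and are not taken from any DOT data system.

```python
from typing import Hashable, Mapping


def gross_difference_rate(
    original: Mapping[Hashable, object],
    remeasured: Mapping[Hashable, object],
) -> float:
    """Share of re-measured units whose value differs from the original."""
    # Only units present in both collections can be compared.
    common_ids = original.keys() & remeasured.keys()
    if not common_ids:
        raise ValueError("no units in common between the two collections")
    disagreements = sum(
        1 for unit_id in common_ids if original[unit_id] != remeasured[unit_id]
    )
    return disagreements / len(common_ids)


# Hypothetical example: vehicle type recorded at collection and at re-measurement.
original = {"A1": "truck", "A2": "car", "A3": "bus", "A4": "car"}
remeasured = {"A1": "truck", "A2": "truck", "A3": "bus", "A4": "car"}

rate = gross_difference_rate(original, remeasured)
print(f"gross difference rate: {rate:.2f}")
```

A high rate for a key data element would be one kind of evidence that measurement error deserves a closer look; the threshold for "high" depends on the data requirements and is not fixed here.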

References

  • General Accounting Office, Performance Plans: Selected Approaches for Verification and Validation of Agency Performance Information, GAO/GGD-99-139 (July 1999).
  • Office of Management and Budget, Statistical Policy Working Paper 31: Measuring and Reporting Sources of Error in Surveys (July 2001).
  • Lessler, J. and W. Kalsbeek. 1992. Nonsampling Error in Surveys. New York, NY: Wiley.

6.3 Quality Control Processes

Principles

  • Activities in survey collection and processing will add error to the data to some degree. Therefore, each activity needs some form of quality control process to prevent or correct error introduced during the activity.
  • The more complex or tedious an activity is, the more likely error will be introduced, and therefore, the more elaborate the quality control needs to be.
  • A second factor that will determine the level of quality control is the importance of the data being processed.
  • Data system activities that need extensive quality control are check-in of paper forms, data entry from paper forms, coding, editing, and imputation.
  • Quality control methods include 100% replication, as with key entry of critical data, sample replication (usually used in a stable continuous process), analysis of the data file before and after the activity, and simple reviews.
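
The 100% replication method mentioned above, as applied to key entry, can be sketched as a field-by-field comparison of two independent keyings of the same forms. This is only an illustration under stated assumptions: the function name, the record layout (one dictionary per form, in the same order in both passes), and the sample values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Mismatch = Tuple[int, str, object, object]


def compare_keyings(
    first_pass: Sequence[Dict[str, object]],
    second_pass: Sequence[Dict[str, object]],
) -> List[Mismatch]:
    """Return (record index, field, first value, second value) per mismatch."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must key the same number of records")
    mismatches: List[Mismatch] = []
    for index, (rec_a, rec_b) in enumerate(zip(first_pass, second_pass)):
        # Check every field that appears in either keying of the record.
        for field_name in sorted(rec_a.keys() | rec_b.keys()):
            value_a, value_b = rec_a.get(field_name), rec_b.get(field_name)
            if value_a != value_b:
                mismatches.append((index, field_name, value_a, value_b))
    return mismatches


# Hypothetical example: two operators key the same two paper forms.
first_pass = [{"id": "0001", "miles": "120"}, {"id": "0002", "miles": "45"}]
second_pass = [{"id": "0001", "miles": "120"}, {"id": "0002", "miles": "54"}]

mismatches = compare_keyings(first_pass, second_pass)
records_in_error = {index for index, _, _, _ in mismatches}
print(f"{len(records_in_error)} of {len(first_pass)} records need adjudication")
```

Each mismatch would then be resolved against the paper form. The comparison only flags disagreements; it cannot say which keying is correct.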

Guidelines

  • Each activity should be examined for its potential to introduce error.
  • The extent of quality control for each activity should be based on the potential of the activity to introduce error combined with the importance of the data.
  • Data should be collected from the quality control efforts to indicate the effectiveness of the quality control and to help determine whether it should be changed.
  • The quality control should be included in the documentation of methods at each stage.
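
One way to collect data from a quality control effort is to analyze the data file before and after an activity such as editing. The sketch below counts, for each field, how many records the activity changed. The function name, the record layout (dictionaries keyed by a record identifier), and the sample values are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, Hashable, Mapping

Record = Dict[str, object]


def changes_by_field(
    before: Mapping[Hashable, Record],
    after: Mapping[Hashable, Record],
) -> Counter:
    """Count, per field, how many records had their value changed."""
    counts: Counter = Counter()
    for record_id, old_record in before.items():
        new_record = after.get(record_id)
        if new_record is None:
            # The activity removed the record entirely.
            counts["<record dropped>"] += 1
            continue
        for field_name, old_value in old_record.items():
            if new_record.get(field_name) != old_value:
                counts[field_name] += 1
    return counts


# Hypothetical example: an edit step blanks an impossible speed and
# standardizes a state code.
before = {
    1: {"speed": 55, "state": "VA"},
    2: {"speed": 999, "state": "va"},
    3: {"speed": 60, "state": "MD"},
}
after = {
    1: {"speed": 55, "state": "VA"},
    2: {"speed": None, "state": "VA"},
    3: {"speed": 60, "state": "MD"},
}

counts = changes_by_field(before, after)
for field_name, changed in sorted(counts.items()):
    print(f"{field_name}: {changed} of {len(before)} records changed")
```

Tracked over time, counts like these could indicate whether an activity is changing more data than expected and whether its quality control should be adjusted.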

References

  • Ott, E., E. Schilling, and D. Neubauer. 2000. Process Quality Control: Troubleshooting and Interpretation of Data. New York, NY: McGraw-Hill.

6.4 Data Error Correction

Principles

  • No data system is free of errors.
  • The actions taken when evidence of data error comes to light depend on the strength of the evidence, the impact that the potential error would have on primary estimates produced by the data system, and the resources required to verify and correct the problem.

Guidelines

  • A standard process for dealing with possible errors in the data system should exist and be documented.
  • If a disseminated data file is "frozen" for practical reasons (e.g., reproducibility and configuration management) when errors in the data become known, the errors should be documented, and that documentation should accompany the data.
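
As a minimal sketch of documenting known errors so that the documentation can accompany a frozen file, the following builds a plain-text errata notice from a list of error records. The class, its fields, the file name, and the sample errors are all hypothetical; a real data system would follow its own documented process and format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class KnownError:
    variable: str          # data element affected
    description: str       # what is wrong, in plain language
    affected_records: int  # number of records known to be affected


def render_errata(file_name: str, errors: List[KnownError]) -> str:
    """Build an errata notice to distribute alongside the unchanged file."""
    lines = [f"Known errors in {file_name} (data file not modified):"]
    for number, err in enumerate(errors, start=1):
        lines.append(
            f"{number}. {err.variable}: {err.description} "
            f"({err.affected_records} records affected)"
        )
    return "\n".join(lines)


# Hypothetical example of two errors found after the file was frozen.
errors = [
    KnownError("trip_miles", "values keyed in kilometers for one site", 2),
    KnownError("state", "missing for late-arriving forms", 14),
]
notice = render_errata("survey_2001_final.csv", errors)
print(notice)
```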

References

  • Ott, E., E. Schilling, and D. Neubauer. 2000. Process Quality Control: Troubleshooting and Interpretation of Data. New York, NY: McGraw-Hill.