Guidelines for Data Quality Assessment (DQA)
What is Data Quality Assessment (DQA)?
DQA stands for Data Quality Assessment or Data Quality Audit. It is a systematic process of evaluating the quality of data that is being collected, processed, stored, and used in a program or project. The objective of DQA is to identify and address any issues or challenges related to data quality that may affect the validity, reliability, and usefulness of the data.
The DQA process typically involves a review of data collection methods, data entry processes, data management systems, data analysis procedures, and data reporting and dissemination processes. The DQA may also include a review of the quality of the data itself, including data completeness, accuracy, consistency, and timeliness.
The results of the DQA are used to identify areas for improvement in the data management and analysis processes, and to develop a plan for addressing any issues that have been identified. The DQA can also provide valuable information for decision-makers, as it can help to ensure that the data being used to support program or project decisions is of high quality and reliability.
In summary, DQA is an important component of any program or project, as it helps to ensure that the data being used is of high quality and that decisions are based on accurate and reliable information.
DQA assesses the standard of the qualitative or quantitative data that report the results of the interventions of a project/program. Generally, data should be of sufficiently high quality to support management needs for planning and decision-making. In fact, high-quality data are the cornerstone of evidence-based decision-making. Through the DQA process, one can determine the status of the quality of the data being reported.
What are Data Quality Standards?
To ensure the quality of data, a performance monitoring system must address the data quality standards. In other words, to be useful for monitoring and credible for reporting, data should reasonably meet the standards of data quality, which are as follows:
- Validity: Data should clearly and adequately represent the intended result.
- Reliability: Data should reflect stable and consistent data collection processes and analysis methods over time.
- Timeliness: Data should be available at a useful frequency, should be current, and should be timely enough to influence management decision-making.
- Integrity: Data collected should have safeguards to minimize the risk of transcription error or data manipulation.
- Completeness: Data should be "100% complete" in respect of their structure and inclusiveness.
Data that do not meet these standards (see Annex 1 for details) could result in an erosion of confidence in the data or could lead to poor decision-making. Maintaining data quality standards requires strong leadership and commitment from project management.
Purpose of DQA:
The purpose of a DQA is to ensure that project management is aware of:
· the strengths and weaknesses of indicator data, as determined by applying the five data quality standards, and the extent to which data integrity can be trusted to influence management decisions; and
· the measures necessary for resolving data quality issues.
A DQA must be conducted for each performance indicator reported by an Implementing Partner. In the case of Suchana, this includes the indicators reported in the MPDS, quarterly progress report, and annual narrative report.
When should a DQA be conducted?
A DQA must occur for an indicator within three months of its being reported and then every year thereafter. However, the Suchana MEAL team may choose to conduct DQAs more frequently if there are persistent data quality issues with the implementing partners. Usually, the following circumstances are considered grounds for conducting a DQA more frequently:
· When an indicator is identified as having a high risk of error in implementation (e.g., the indicator may involve an unclear or inherently complicated data collection methodology);
· When indicator data deviate excessively from the target (e.g., the indicator has a target of 100 and the actual reported value is 952);
· When stakeholders or implementers suggest there may be issues with indicator data;
· When staff seek to confirm that a previously identified data quality problem has been resolved; and
· When the indicator data are critically or strategically important (in respect of reporting to the donor).
Who is responsible for conducting DQA?
Ideally, Suchana MEAL team members are responsible for the content and timely completion of the DQA. The concerned TP project managers and IP project coordinators are encouraged to participate in the DQA. This promotes internal understanding and ownership of the strengths and weaknesses of the data and enables them to work together to address any uncovered programmatic issues that contribute to data limitations.
Selection of indicators for DQA:
The Suchana MEAL framework includes various types of indicators to measure results at different levels of objectives. The sources of data for these indicators also differ.
| Objective hierarchy | Types of indicators | Sources of data | Responsible for data |
|---|---|---|---|
| Goal | Impact indicators | End-line evaluation | ICDDR,B |
| Purpose | Outcome indicators | Annual survey, semi-annual survey | SCI |
| Output | Output indicators | Semi-annual survey, | TPs/IPs |
| Activities | Process indicators | MPDS/MAPDS | TPs/IPs |
The data collection and data analysis of the end-line evaluation and annual/semi-annual surveys are done under a strong data quality assurance process. A DQA is therefore unlikely to add much value to the indicators covered by this evaluation and these surveys. However, a different DQA process will be used to assess the data quality of these indicators.
On the other hand, the data for the output and process indicators are collected and reported by the frontline staff of the technical and implementing partners. These data are frequently used for decision-making. So, all these data should undergo a DQA process to attain the high quality needed for proper decisions in pursuit of the Suchana goal.
How to conduct DQA:
The process for conducting a DQA varies depending on how and by whom the data are collected. This guideline describes the method of conducting a DQA of the primary data collected by implementing partners for output and process indicators. This type of DQA includes the following five steps:
1. Preparation: The first step of conducting a DQA is selecting which indicators will undergo a DQA and notifying the relevant partners or stakeholders about the date of the DQA. Suchana's MEAL team should maintain a schedule of DQA for each indicator. Such a schedule will help them to conduct DQAs of multiple indicators at the same time.
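The schedule that the MEAL team maintains follows the rule stated earlier: the first DQA is due within three months of an indicator being reported, and follow-up DQAs are due every year thereafter. A minimal sketch of that rule, where the function name and the 90-day approximation of "three months" are assumptions for illustration:

```python
from datetime import date, timedelta

def next_dqa_due(first_reported, last_dqa=None):
    """Return the date by which the next DQA for an indicator is due."""
    if last_dqa is None:
        # First DQA: within roughly three months (90 days) of first reporting.
        return first_reported + timedelta(days=90)
    # Subsequent DQAs: every year after the previous one.
    return last_dqa + timedelta(days=365)

print(next_dqa_due(date(2020, 1, 15)))                    # → 2020-04-14
print(next_dqa_due(date(2020, 1, 15), date(2020, 4, 1)))  # → 2021-04-01
```

Keeping such due dates per indicator makes it easy to see which indicators can be batched into one field visit.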
2. Desk Review: The desk review is usually held at the head office of the implementing partner, where the DQA team reviews all available documents related to the indicator before going to the field to verify data. For example, training plans, training modules, training attendance sheets, etc. are to be reviewed for training indicators. Similarly, IGA procurement documents, the IGA distribution register, IGA selection criteria, etc. are to be reviewed for IGA indicators. The implementing partner's work plan might also prove useful to the DQA team in identifying which activity efforts or specific interventions are producing the data.
For indicators that have previously undergone a DQA, the focus should be on documents and data that have been created and collected since the previous DQA was conducted. These documents include all reports to Suchana in which performance data were reported (e.g., monthly reports, quarterly reports, annual reports, and other special reports). The DQA team should review these documents to understand the partner's improvement in maintaining data quality standards.
3. Field Review: The field review includes visiting the field offices of implementing partners (UzC offices in the case of Suchana), where data are collected, verified, analyzed, and stored. The DQA team will observe and review databases, filing systems, and data verification processes. They will also meet with partner staff (such as SCMs, FFs, and Union Coordinators) to discuss and understand data collection challenges in the field, and assess their understanding of the data quality standards for each indicator. If needed, the DQA team may visit BHHs to interview beneficiaries to verify data reported by IPs.
In Suchana, IPs work in 20 Upazilas in Sylhet and Moulavibazar districts. It is not feasible for the DQA team to personally visit each Upazila, nor to visit the field offices of all the implementing partners to assess each and every indicator. In this case, the DQA team may select a sample of UzC offices to examine in person. See Annex-2 for the DQA sampling framework.
4. Documentation: The results of a DQA are documented in the DQA checklist (Annex-3) and in a summary report (Annex-4) that highlights any uncovered data limitations, challenges, and planned mitigation efforts, and records the date of the next DQA. The DQA report for any indicator whose data are collected from multiple sources (often multiple implementing partners) should be stored in a centralized place (the Suchana Sylhet Office) so that all parties collecting and using the indicator have easy access to the information uncovered in the DQA. DQA reports should also be shared with all partners collecting data for the indicator.
5. Mitigation Plan: Once the DQA is completed, the DQA team should assess whether any mitigation actions are needed to address data quality concerns. If there are some data quality concerns, but managers feel comfortable that the data are of sufficient quality and mitigation would be too costly when compared to marginal benefits, then there may be no need for further action beyond documenting the data limitation. On the other hand, the identification of data quality concerns may call for a mitigation plan, particularly if the data will be used to inform decisions or if the data are reported externally. The DQA team, in consultation with the IPPC, should clearly document the decision and justification for an action or no action in the DQA report. The DQA report includes a section to record "actions needed to address limitations prior to the next DQA."
Annex - 1
Data Quality Standards
To be useful in managing for results and credible for reporting, Suchana should ensure that the performance data meet five data quality standards. In some cases, performance data will not fully meet all five standards, and the known data limitations should be documented. The five quality standards that cover quantitative and qualitative performance data are:
- Accuracy: Also known as validity. Accurate data should clearly and adequately represent the intended result. In other words, data accuracy refers to whether the data values stored for an indicator are the correct values. To be correct, data values must be the right values and must be represented in a consistent and unambiguous form. Another key issue is whether the data reflect a bias, such as interviewer bias, unrepresentative sampling, or transcription bias.
- Reliability: Data should reflect stable and consistent data collection processes and analysis methods over time. The key issue is whether different analysts would come to the same conclusions if the data collection and analysis processes were repeated. Program teams should be confident that progress toward performance targets reflects real changes rather than variations in data collection methods.
- Timeliness: Data should be timely enough to influence management decision-making at the appropriate levels. One key issue is whether the data are available frequently enough to influence the appropriate level of management decisions. A second key issue is whether the data are current enough when they become available.
- Integrity: Data that are collected, analyzed, and reported should have established mechanisms in place to reduce the possibility that they are intentionally manipulated for political or personal reasons. Data integrity is at the greatest risk of being compromised during data collection and analysis.
- Completeness: Data should be "100% complete" in respect of their structure and inclusiveness. They should represent the beneficiaries associated with the result and maintain homogeneity in the unit of data collection. Moreover, data should be sufficiently precise to present a fair picture of performance and enable management decision-making at the appropriate levels. One key issue is whether data are at an appropriate level of detail to influence related management decisions.
Annex - 2
DQA Sampling Framework
The main objective of the DQA sampling framework is to select the Upazilas/Unions and the groups of beneficiaries for the DQA field visit. Two-stage cluster sampling will be used in this regard. In the first stage, a cluster of Upazilas/Unions will be determined; in the second stage, a cluster of beneficiary groups will be determined from the selected cluster of Upazilas/Unions.
The DQA team will work with the relevant stakeholders (UzC, UC, APC, MEKMO, etc.) to determine the number of clusters and the number of groups within clusters. The appropriate number of sites and clusters depends on the objectives of the assessment. A large number of clusters and groups may be needed if the DQA covers various indicators that deal with various types of stakeholders under different implementing partners. Often it is not necessary to have a statistically robust sample estimate for a DQA. It is sufficient to have a reasonable estimate of the accuracy of reporting to direct system-strengthening measures and build capacity on data quality standards. A reasonable estimate requires far fewer sites and is more practical in terms of resources. In some cases, the DQA team looks for answers to the following questions to determine how extensive the field review should be and how big a sample of sites will be required for this review:
· Does the desk review raise concerns about the quality of data? Do these concerns vary across the sites and the implementing partners?
· Does the DQA team expect the same level of quality beyond the sample?
· Are the data being used for management or reporting purposes of such importance that greater time and effort should be spent on conducting the DQA?
· What level of limitations may be acceptable in respect of data quality?
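The two-stage procedure described above can be sketched as follows. This is an illustrative sketch only: the function name, the site and group names, and the sample sizes are all hypothetical, and a real selection would be agreed with the stakeholders listed above.

```python
import random

def two_stage_sample(groups_by_site, n_sites, n_groups_per_site, seed=None):
    """Two-stage cluster sample: first sample sites (Upazilas/Unions),
    then sample beneficiary groups within each selected site.
    groups_by_site maps a site name to its list of beneficiary groups."""
    rng = random.Random(seed)
    # Stage 1: sample the cluster of Upazilas/Unions.
    selected_sites = rng.sample(sorted(groups_by_site), n_sites)
    # Stage 2: sample groups within each selected site.
    return {site: rng.sample(groups_by_site[site],
                             min(n_groups_per_site, len(groups_by_site[site])))
            for site in selected_sites}

# Hypothetical data: 20 Upazilas, 10 beneficiary groups each.
sites = {f"Upazila-{i}": [f"group-{i}-{j}" for j in range(10)] for i in range(20)}
sample = two_stage_sample(sites, n_sites=5, n_groups_per_site=3, seed=1)
print(sample)  # 5 sites, 3 groups each
```

Fixing the seed makes the selection reproducible, which is useful when the sample must be documented in the DQA report.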
Annex - 3
DQA Checklist
| BASIC INFORMATION | |
|---|---|
| Name of indicator being assessed | |
| Date of the DQA | |
| Location of the DQA | |
| Names of Suchana staff involved in the DQA | |
| Names of persons met by the DQA team | |
1. ACCURACY: Data clearly and adequately represent the intended result in a consistent and unambiguous form (35 points)

| Questions | Score allocated | Score obtained | Please explain |
|---|---|---|---|
| Does the indicator measure the results of the interventions implemented? | 5 | | |
| Do the people reporting the results properly understand what they should report and when, and are they properly guided by their supervisor? | 5 | | |
| Is there any effort in place to reduce the potential for personal bias by the people collecting the data? | 5 | | |
| Is the data collection instrument (Android apps, paper-based checklist, etc.) designed properly, and can the data collector use the instrument efficiently? | 5 | | |
| Are the numbers reported for this indicator accurate? (Check related documents to verify the numbers.) Do they aggregate properly? | 5 | | |
| Are data collected in the same format in which they are reported? If converted, are they converted properly? | 5 | | |
| Compare the data reported to Suchana with MIS/MPDS/MAPDS/QR. Do they match? If not, why? | 5 | | |
2. RELIABILITY: Data are collected and reported through stable and consistent processes (20 points)

| Questions | Score allocated | Score obtained | Please explain |
|---|---|---|---|
| Has a consistent data collection process/method and instrument been used from the beginning of the project? | 10 | | |
| Is there a system in place to verify data? Are these data verified before being used for data analysis and reporting? (pictures, attendance sheets, master rolls, etc.) | 5 | | |
| Are written procedures/instructions in place for data collection, cleaning, analysis, and reporting? Did you see the documents? | 5 | | |
3. TIMELINESS: Data should be sufficiently frequent and current to inform management decision-making (10 points)

| Questions | Score allocated | Score obtained | Please explain |
|---|---|---|---|
| Are data reported at the time they should be reported? What is the average recall period for reporting data on an event/activity? | 5 | | |
| Are data reported within the desired period (monthly, quarterly, or annually, as per the DFID schedule)? | 5 | | |
4. INTEGRITY: Data should be protected from manipulation for political or personal reasons (15 points)

| Questions | Score allocated | Score obtained | Please explain |
|---|---|---|---|
| Are data safe from the possibility of unauthorized changes or manipulation? | 5 | | |
| Is there any possibility of a transfer error while exporting data from the Suchana MIS/MPDS/MAPDS/QR/master roll/attendance sheet for analysis and reporting? | 5 | | |
| Is the responsible person skilled enough in data management, analysis, and reporting? | 5 | | |
5. COMPLETENESS: Data should be sufficiently complete to present a fair picture of performance (15 points)

| Questions | Score allocated | Score obtained | Please explain |
|---|---|---|---|
| Are data reported with all necessary attributes? | 5 | | |
| Is there a method in place for detecting duplicate data? Are the reported data free from duplication? | 5 | | |
| Is there a method in place for detecting missing data? Is anything missing in the reported data? | 5 | | |
6. CONCLUSIONS (5 points)

| Questions | Score allocated | Score obtained | Please explain |
|---|---|---|---|
| Is the DQA team satisfied with the overall credibility/quality of the data? | 5 | | |
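The six sections above allocate 35 + 20 + 10 + 15 + 15 + 5 = 100 points in total, so an indicator's overall DQA score can be read directly as a percentage. A minimal sketch of that aggregation (the function name and example scores are illustrative, not from an actual assessment):

```python
# Points allocated to each checklist section, as listed above.
SECTION_MAX = {"Accuracy": 35, "Reliability": 20, "Timeliness": 10,
               "Integrity": 15, "Completeness": 15, "Conclusions": 5}

def overall_score(obtained):
    """Sum the obtained scores, checking no section exceeds its allocation."""
    for section, score in obtained.items():
        if score > SECTION_MAX[section]:
            raise ValueError(f"{section}: {score} exceeds max {SECTION_MAX[section]}")
    return sum(obtained.values())

scores = {"Accuracy": 30, "Reliability": 15, "Timeliness": 10,
          "Integrity": 12, "Completeness": 13, "Conclusions": 5}
print(overall_score(scores), "/", sum(SECTION_MAX.values()))  # → 85 / 100
```

Recording per-section subtotals alongside the total makes it easy to see which data quality standard needs attention for each indicator.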
Annex- 4:
Reporting Template
Once the DQA is complete, the team will compile the responses in one DQA checklist for each of the indicators and will prepare a summary narrative report highlighting key findings, observations, improvement areas, and action points to be undertaken for further improvement of the data. The compiled DQA checklists will be annexed to the summary report. Please include a few action photos in the report. The summary report will be prepared as per the following template:
Introduction:
Please briefly describe the data type that has been assessed, the sample site and area covered and team composition etc.
Description:
Please briefly describe the performance scores below, along with findings.
Performance scores for the data quality standards are recorded in the Accuracy through Conclusion columns.

| Indicators | Accuracy | Reliability | Timeliness | Integrity | Completeness | Conclusion | Important observations / improvement areas to be considered |
|---|---|---|---|---|---|---|---|
| Indicator 1 | | | | | | | |
| Indicator 2 | | | | | | | |
| Indicator 3 | | | | | | | |
| Indicator 4 | | | | | | | |
Challenges in Data Quality Assurance
Please briefly describe the challenges of generating high-quality data
Recommendations
Please briefly describe the key recommendations to be undertaken
Action plan (or issue log)
Based on the improvement areas, an action plan will be made as per the template below.
Summarize key issues that the Program should follow up on at various levels of the system (e.g., issues found at the site level and/or at the intermediate aggregation site level).

| SL# | Action to be undertaken | Responsible person | Timeline | Support needed (from whom) | Follow-up date | Issue reference |
|---|---|---|---|---|---|---|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |