CSV Considerations Around Data Integrity

March 3, 2016

Data integrity is a current hot topic, but not a new one, within the life sciences industries and associated product supply chains.

This article does not delve into why there have been so many recent data integrity issues within EU- and FDA-regulated industries. Instead, it summarizes the regulatory perspective and identifies current best-practice thinking on what one can do, from a compliance and quality perspective, to avoid and detect data integrity issues and broader data quality pitfalls.

If you have read regulatory guides and rules, you have seen the generic, recurring phrase “quality and integrity of the data.”

So what does that mean?

Just as pharmaceutical products must meet certain quality attributes tied to their effect on patients, namely strength, identity, safety, purity, and quality (SISPQ), so too must the associated data meet certain quality and integrity attributes, i.e., ALCOA+.

The acronym ALCOA [6] stands for the following attributes: Attributable, Legible, Contemporaneous, Original, and Accurate. Refer to the glossary at the end for definitions of terms.

ALCOA can be thought of as the task-based data quality attributes: those focused on recording the data correctly the first time, at the moment the activity is performed.
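As an illustration of what "task based" can mean for an electronic record, the following is a minimal Python sketch that checks a single entry at the time it is captured. The `Entry` fields, the `alcoa_findings` function, and the 15-minute `MAX_DELAY` are all hypothetical choices made for this example; they are not taken from any regulation or guidance. Accuracy is deliberately not checked, since it depends on the meaning of the data and cannot be inferred from the record's structure alone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)  # frozen: an entry cannot be silently altered after capture
class Entry:
    """One recorded value plus the metadata ALCOA asks for (hypothetical fields)."""
    value: str               # the recorded result
    recorded_by: str         # Attributable: who made the entry
    performed_at: datetime   # when the activity took place
    recorded_at: datetime    # when the entry was written down
    is_original: bool        # Original: first capture or certified true copy


# Assumed limit for "promptly"; a real limit would come from a site procedure.
MAX_DELAY = timedelta(minutes=15)


def alcoa_findings(entry: Entry) -> list[str]:
    """Return the ALCOA attributes this entry fails to demonstrate."""
    findings = []
    if not entry.recorded_by.strip():
        findings.append("Attributable")
    if not entry.value.strip():
        # A crude stand-in for Legible: the value must at least be present.
        findings.append("Legible")
    delay = entry.recorded_at - entry.performed_at
    if delay < timedelta(0) or delay > MAX_DELAY:
        findings.append("Contemporaneous")
    if not entry.is_original:
        findings.append("Original")
    return findings


# Usage: one compliant entry, and one that is unsigned and recorded hours late.
t0 = datetime(2016, 3, 3, 9, 0, tzinfo=timezone.utc)
good = Entry("pH 7.2", "jdoe", t0, t0 + timedelta(minutes=2), True)
late = Entry("pH 7.2", "", t0, t0 + timedelta(hours=3), True)
print(alcoa_findings(good))  # []
print(alcoa_findings(late))  # ['Attributable', 'Contemporaneous']
```

A check like this only covers what the system can see; it does not replace review by qualified people.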

The acronym ALCOA+ [7] stands for ALCOA in addition to the following attributes: Complete, Consistent, Enduring, and Available. Again, refer to the glossary at the end for definitions of terms.

ALCOA+ may be considered the data quality attributes that are focused on establishing and monitoring the support processes around data activities, continuous improvement and overall product quality.
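To make the process-level view concrete, here is a minimal Python sketch of a periodic review that looks across a set of entries rather than at one entry in isolation. The `review_batch` function and the `"step"` and `"recorded_at"` keys are hypothetical names chosen for this example. Only Complete and Consistent are checked; Enduring and Available depend on storage media and retrieval procedures and are not something a snippet like this can demonstrate.

```python
from datetime import datetime


def review_batch(records, expected_steps):
    """Flag possible ALCOA+ gaps across a set of entries.

    records: list of dicts with the hypothetical keys "step" and "recorded_at",
             in the order the entries were stored.
    expected_steps: step names the procedure requires.
    """
    findings = []

    # Complete: every required step has at least one entry.
    seen = {r["step"] for r in records}
    for step in expected_steps:
        if step not in seen:
            findings.append(f"Complete: no entry for step '{step}'")

    # Consistent: stored entries should be in chronological order.
    for earlier, later in zip(records, records[1:]):
        if later["recorded_at"] < earlier["recorded_at"]:
            findings.append(
                f"Consistent: '{later['step']}' is timestamped before '{earlier['step']}'"
            )
    return findings


# Usage: the "filter" step is missing and "mix" is timestamped out of sequence.
records = [
    {"step": "weigh", "recorded_at": datetime(2016, 3, 3, 9, 0)},
    {"step": "mix", "recorded_at": datetime(2016, 3, 3, 8, 45)},
]
for finding in review_batch(records, ["weigh", "mix", "filter"]):
    print(finding)
```

Findings from a review like this are prompts for investigation, not conclusions; an out-of-sequence timestamp may have a legitimate, documented explanation.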

So in order to achieve overall data quality and associated product quality, one must have both ALCOA and ALCOA+.  Therefore, one can infer that Product Quality is directly associated with Data Quality.

This association of ALCOA, ALCOA+ and overall data and product quality can be theoretically depicted by the diagram below.  It shows that the higher the ALCOA and ALCOA+, the higher the overall data quality and product quality.

[Figure: Overall Quality]

To further set the regulatory context, the following excerpts from the FDA Guidance for Industry – Electronic Source Data in Clinical Investigations provide a mindset that is transferable to any computer system intended to satisfy data integrity expectations:

…capturing source data in electronic form…is intended to assist in ensuring the reliability, quality, integrity, and traceability of data from electronic source to electronic regulatory submission.

… Adequate controls should be in place to ensure confidence in the reliability, quality, and integrity of the electronic source data.

Assuring data integrity requires appropriate quality and risk management systems and processes, including adherence to sound scientific principles and good (electronic) documentation practices, keeping in mind that there should be no loss of quality when an electronic system is used in place of a paper system.

Furthermore, technology alone cannot eliminate data integrity issues.  There are still people and manual processes involved that must be accounted for, monitored, and improved.

Because people and manual processes remain involved to some degree, one must take a holistic approach to data integrity, applying the necessary designs and controls across all the spheres of influence depicted below.

So where and how should one implement data integrity and data quality design considerations and controls?

The table below identifies areas of focus and consideration.

One important caveat when implementing a system and its associated controls: a team must be involved throughout the entire lifecycle. At a minimum, that team should include qualified subject matter experts (SMEs) in the following areas: System/Business Process Owner, Technical Representative, Computer System Validation, and Quality.

In summary, data integrity is a component of data quality, which is directly related to product quality. Technology alone will not solve the problem; it takes a hybrid (human and computerized) approach to address and improve overall data and product quality. That said, if one designs and implements systems with a team-based approach, focused on process/product knowledge and continuous improvement, positive results will follow in the form of increased efficiency, compliance, and quality.


Glossary

Accurate: Data is correct, including context/meaning (e.g., metadata) and edits

ALCOA:  Attributable, Legible, Contemporaneous, Original, and Accurate

ALCOA+:  ALCOA in addition to the following attributes:  Complete, Consistent, Enduring, and Available

Attributable: who acquired the data or performed an action (or modification) and when

Available: Readily accessible in human readable form for review throughout the retention period for the record

Complete:  Data includes all data (passing or otherwise) from all actions taken to obtain the required information, including metadata (e.g., audit trail) and edits

Consistent: Data is created in a repeatable and comparable manner (traceable)

Contemporaneous:  documented at the time of the activity (promptly)

Enduring: Stored on media proven for the record retention period

Integrity: The extent to which all data are complete, consistent and accurate throughout the data lifecycle. [MHRA, WHO]

Legible:  Data is permanent and easily read (by a human)

(data) Lifecycle: A planned approach to assessing and managing risks to data in a manner commensurate with potential impact on patient safety, product quality and/or the reliability of the decisions made throughout all phases of the process by which data is created, processed, reviewed, analyzed and reported, transferred, stored and retrieved, and continuously monitored until retired. [MHRA, WHO]

Original:  the first recording of data, raw or source data, or a certified true copy

(data) Quality:  [ICH Q10] The degree to which a set of inherent properties of a product, system or process fulfils requirements.

Source Data (clinical trial):  All information in original records and certified copies of original records of clinical findings, observations, or other activities in a clinical trial necessary for the reconstruction and evaluation of the trial. [FDA]


References

  1. FDA, Guidance for Industry – Computerized Systems Used in Clinical Investigations, May 2007
  2. FDA, Guidance for Industry – Electronic Source Data in Clinical Investigations, September 2013
  3. MHRA, MHRA GMP Data Integrity Definitions and Guidance for Industry, March 2015
  4. Newton, M., White, C. “Data Quality and Data Integrity:  What is the Difference?”, ispeak, June 2015
  5. WHO, “Guidance on Good Data and Record Management Practices” September 2015
  6. Woollen, Stan W. “Data Quality and the Origin of ALCOA” The Compass – Summer 2010
  7. White, Christopher H., Gonzalez, Lizzandra R. “The Data Quality Equation – A Pragmatic Approach to Data Integrity”, IVT, August 2015
