Data collection


Data collection, i.e. the measurement and recording of data, is considered the most crucial stage of the data management process.  Errors made at this stage can be harder to detect and correct than those made at later stages, and data management procedures at later stages can do little to improve the reliability and validity of the source data.

The first stage in designing the data collection strategy for a trial is to decide what data will be collected and at what time points (see Outcome Assessment).  These decisions should be driven by the scientific objectives of the research, and care should be taken to collect only the data needed to meet those objectives.  Investigators should keep the amount of data collected per participant to a minimum by checking each proposed item against the dummy tables (see Dummy tables).
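One way to apply the dummy-table check is to compare the fields on a draft data collection form with the variables the dummy tables actually need. The sketch below is illustrative only: the table names, variable names and the function audit_form_fields are all hypothetical, and a real trial would take them from its own dummy tables and forms.

```python
# Hypothetical dummy tables and the variables each one needs.
dummy_table_variables = {
    "Table 1 (baseline characteristics)": {"maternal_age", "parity", "gestational_age_weeks"},
    "Table 2 (primary outcome)": {"blood_loss_ml"},
}

# Hypothetical fields on a draft data collection form.
form_fields = {
    "maternal_age", "parity", "gestational_age_weeks",
    "blood_loss_ml", "marital_status",
}


def audit_form_fields(fields, dummy_tables):
    """Return (unused, missing).

    unused:  fields on the form that no dummy table needs.
    missing: variables a dummy table needs that the form does not collect.
    """
    needed = set().union(*dummy_tables.values())
    unused = sorted(fields - needed)
    missing = sorted(needed - fields)
    return unused, missing


unused, missing = audit_form_fields(form_fields, dummy_table_variables)
print("Collected but not used in any dummy table:", unused)
print("Needed by a dummy table but not collected:", missing)
```

Fields reported as unused are candidates for removal from the form; variables reported as missing indicate an analysis that the form cannot yet support.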

Once it has been decided which data to collect, the investigator must choose how to collect the data.  This may be by questionnaire, face-to-face interview, case note review, focus group, electronic record systems or through routinely collected data.  Investigators should be aware of Data Protection legislation in their own country.

Things to consider when writing a protocol

Illustrative example - Perinatal care trial

Figure 2 presents a summary of all the staff who will be involved in data collection. All hospitals will receive one computer that will be used only for data management and operated by the data manager. Intervention hospitals will receive a second computer for use by birth attendants. It will play no role in data collection for primary and secondary outcomes, but it will be used to gather data on certain process measures in intervention hospitals only.

Figure 2.  Staff involved in data collection.

The data collection system will be centrally coordinated at CLAP by a statistician. The team at CLAP will include one programmer, one statistical assistant, and two data clerks. The computer programmer at CLAP will develop the software for data collection and validation. The statistical assistant will carry out day-to-day data management activities (communication with data supervisor and data clerks at the hospitals, production of monitoring and validation reports, etc.). For paper forms sent to CLAP, the two data clerks will perform two independent data entries. Two data supervisors in Argentina and one data supervisor in Uruguay will implement and supervise the data collection at the country level throughout the study period. Data supervisors will usually visit hospitals weekly, although the frequency of visits may vary according to hospital performance and needs. One data manager will be hired in each hospital. In most cases, this person will be a hospital employee who works part-time for the project.
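The purpose of two independent data entries is that the two keyed versions can be compared and every disagreement checked against the paper form. The protocol does not describe the software used for this, so the following is only a minimal sketch under assumed data structures; the function compare_entries, the form identifiers and the field names are hypothetical.

```python
def compare_entries(first_entry, second_entry):
    """Compare two independent keyings of the same forms.

    Each argument maps form_id -> {field_name: value}.
    Returns a list of (form_id, field_name, first_value, second_value)
    for every disagreement, including forms or fields keyed only once
    (the absent value is reported as None).
    """
    discrepancies = []
    for form_id in sorted(set(first_entry) | set(second_entry)):
        first = first_entry.get(form_id, {})
        second = second_entry.get(form_id, {})
        for field in sorted(set(first) | set(second)):
            a, b = first.get(field), second.get(field)
            if a != b:
                discrepancies.append((form_id, field, a, b))
    return discrepancies


# Hypothetical forms keyed by two data clerks, for illustration only.
clerk_a = {
    "F001": {"blood_loss_ml": 350, "delivery_mode": "vaginal"},
    "F002": {"blood_loss_ml": 500, "delivery_mode": "vaginal"},
}
clerk_b = {
    "F001": {"blood_loss_ml": 350, "delivery_mode": "vaginal"},
    "F002": {"blood_loss_ml": 600, "delivery_mode": "vaginal"},
}

for form_id, field, a, b in compare_entries(clerk_a, clerk_b):
    print(f"{form_id} {field}: first entry {a!r}, second entry {b!r}")
```

Each reported discrepancy would then be resolved by going back to the original form rather than by preferring either keyed value.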

Clinical data will be collected in four periods, each 1 year apart:

Period I. Baseline data collection: before randomization in the preparatory phase, for primary and secondary outcomes.
Period II. Mid-intervention data collection: immediately before the implementation of the guidelines, for primary outcomes only.
Period III. Main post-intervention data collection: immediately following the maintenance component of the intervention, for primary and secondary outcomes.
Period IV. Second post-intervention data collection: 1 year after the main post-intervention data collection, for primary outcomes only.
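A schedule like this can also be written down as data, so that forms and monitoring reports can be checked against it. The sketch below encodes the four periods listed above; the structure and the function names are assumptions made for illustration, and the yearly spacing is taken to mean one year between consecutive periods.

```python
# The four collection periods listed above, in order.
SCHEDULE = [
    {"period": "I", "label": "Baseline", "outcomes": {"primary", "secondary"}},
    {"period": "II", "label": "Mid-intervention", "outcomes": {"primary"}},
    {"period": "III", "label": "Main post-intervention", "outcomes": {"primary", "secondary"}},
    {"period": "IV", "label": "Second post-intervention", "outcomes": {"primary"}},
]


def periods_collecting(outcome_type):
    """Return the periods in which the given outcome type is collected."""
    return [p["period"] for p in SCHEDULE if outcome_type in p["outcomes"]]


def years_from_baseline(period):
    """Return the years elapsed since Period I, assuming 1 year between periods."""
    order = [p["period"] for p in SCHEDULE]
    return order.index(period)


print("Primary outcomes collected in periods:", periods_collecting("primary"))
print("Secondary outcomes collected in periods:", periods_collecting("secondary"))
print("Period IV falls", years_from_baseline("IV"), "years after baseline")
```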

Questionnaire Administration
Clinical data

Data will be taken as a carbon copy of the clinical record.

Questionnaires to birth attendants
Questionnaires will be administered to all birth attendants in the participating hospitals prior to randomization (immediately after Period I) and after the end of the intervention (immediately after Period III).

Collection of Biological Samples
Measured total blood loss (ml)

Nurses, midwives and physicians who are part of the teams attending deliveries at participating hospitals will be trained in post-partum blood loss measurement.  Nurses and midwives will have the main responsibility for the measurements.  Both the country coordinators and the data collection supervisors will be in charge of training.  A pilot study will be carried out at the pilot hospital in Montevideo (Pereira Rossell) to assess the acceptability of the measuring technique.

Training study personnel in data collection
Training will take place, along with other activities, during the preparatory phase, which will take approximately 18 months.  The Data Manager will coordinate the training in data collection procedures.  The manual of operations for data collection and the data collection forms will be piloted in one hospital in Montevideo, Uruguay, and one hospital in Buenos Aires, Argentina.  Those hospitals will not be randomized, but will be similar to those that will be assigned to the intervention and control groups.

Training of Biological Sample Collectors
Nurses or midwives in the labor ward of each hospital will be trained to measure total blood loss in vaginal deliveries, and to perform this measurement routinely in all vaginal deliveries during the data collection periods. Data supervisors and country coordinators will be in charge of the training activities and will provide hospitals with the standard measuring drapes.
(CLAP Trial - go to protocol)

Illustrative example - WHO pre-eclampsia trial

At each visit and at delivery, clinical data will be collected and recorded in forms designed for the study.

Blood pressure

Blood pressure measurements will be standardized.  The equipment must be serviced locally before the study begins.  Before measurement, the subject should rest, seated, for 5 minutes, and the cuff should be placed on the right arm at the level of the heart.  Systolic and diastolic blood pressure will each be measured twice, 3 minutes apart, using a standard sphygmomanometer; the cuff should be left deflated on the subject's arm during the 3-minute wait before the second measurement.  Diastolic blood pressure will be measured at the 5th Korotkoff sound, which is the disappearance of the sounds.

At every centre, the trial coordinator will train staff to measure blood pressure according to the guidelines in the trial document A practical guide on how to measure blood pressure and test for proteinuria.  The trial coordinator will be trained before recruitment begins.  Retraining sessions will be held every three months.  Staff will be tested monthly for the reliability of their blood pressure measurements.  Test-retest procedures will be performed using a double stethoscope, and the results will be recorded on appropriate forms so that agreement between examiners can be calculated.
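The protocol does not say which agreement statistic will be calculated. As one possible approach, the sketch below computes the mean difference between two examiners and the 95% limits of agreement (the Bland-Altman method) for paired readings taken on the same subjects. The choice of method, the function name limits_of_agreement and the readings are assumptions for illustration, not part of the trial.

```python
from statistics import mean, stdev


def limits_of_agreement(examiner_a, examiner_b):
    """Return (bias, lower, upper) for paired readings in mmHg.

    bias is the mean of the paired differences (a - b); lower and upper
    are bias -/+ 1.96 standard deviations of those differences.
    """
    if len(examiner_a) != len(examiner_b):
        raise ValueError("readings must be paired")
    if len(examiner_a) < 2:
        raise ValueError("at least two pairs are needed")
    differences = [a - b for a, b in zip(examiner_a, examiner_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical diastolic readings from a double-stethoscope session.
trainer = [80, 84, 90, 76, 88]
trainee = [82, 84, 88, 78, 86]

bias, lower, upper = limits_of_agreement(trainer, trainee)
print(f"Mean difference: {bias:.2f} mmHg")
print(f"95% limits of agreement: {lower:.2f} to {upper:.2f} mmHg")
```

A mean difference near zero with narrow limits suggests the two examiners read similarly; what counts as acceptable would need to be defined by the trial itself.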

Completed data collection forms will be returned to WHO in Geneva monthly; data checking and entry will be continuous.  All data will be double-entered and cleaned, and queries will be checked immediately with the local investigators.  Prompt return of data collection forms and speedy clarification of queries will facilitate verification of data.

(WHO Multicentre Randomized Trial of Calcium Supplementation for the Prevention of Pre-eclampsia - go to protocol)

Additional resources

Checklist for data collection

This checklist has been contributed by Barbara Farrell who prepared it for the second version of the Trial Management Guide.

Never give up on data collection

This guide has been contributed by Barbara Farrell who prepared it for the second version of the Trial Management Guide.

Sample schedules for data collection

These are a selection of schedules for data collection.

Checklist for implications of Data Protection

This checklist is a modified version of one developed by Barbara Farrell who prepared it for the second version of the Trial Managers Guide.

Questionnaire design: asking questions with a purpose

This site provides guidance for constructing, formatting and piloting a questionnaire.

Further reading

Hosking JD, Newhouse MM, Bagniewska A, et al. Data collection and transcription. Controlled Clinical Trials 1995;16:66S–103S.

Knatterud GL, Foreman SA, Canner PL. Design of data forms. Controlled Clinical Trials 1983;4:423–440.

Hilner JE, McDonald A, Van Horn L, et al. Quality control of dietary data collection in the CARDIA study. Controlled Clinical Trials 1992;13:156–169.

Spilker B, Schoenfelder J. Data Collection Forms in Clinical Trials. New York: Raven Press, 1991.

Pocock SJ. Clinical Trials: A Practical Approach. Chichester: John Wiley and Sons, 1983.

Duley L, Farrell B. Clinical Trials. London: BMJ Books, 2002.

This page was last updated 4th June 2004.