4.1 Summary of the inclusion criteria for trials in the present overviews

Trials were to be included if, and only if, the following four criteria were all satisfied:

  1. The trial started to enter patients before 1.1.1985 (see section 4.3 below)
  2. The trial contained some properly randomized comparison: Trials were excluded if the stated methods of treatment allocation used might well be subject to potentially serious biases due to prior knowledge of the likely treatment allocation (e.g. widespread foreknowledge of randomization lists, or use of alternate allocation, allocation by odd/even dates of birth or record numbers, historical controls, etc). Trials were also excluded if imbalances between the treatment groups with respect to numbers of patients (either overall or in particular categories) or with respect to follow-up were demonstrated by further investigation to be due to lack of proper randomization.
  3. The trial included at least two treatment groups that provided an unconfounded* concurrently randomized comparison of any of the following (a-f):
    a. radiotherapy versus no radiotherapy;
    b. tamoxifen versus no tamoxifen (including trials of tamoxifen plus chemotherapy versus the same regimen without tamoxifen);
    c. chemotherapy versus no chemotherapy (including trials of chemotherapy plus tamoxifen versus the same regimen without chemotherapy, but excluding trials of perioperative chemotherapy);
    d. polychemotherapy versus single-agent chemotherapy;
    e. short duration versus long duration of the same chemotherapy;
    f. ovarian ablation versus no ablation.
  4. The trial was not conducted in the USSR or Japan (see section 4.3 below)
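
The four criteria above can be read as a single conjunctive test. The following Python sketch is purely illustrative: the record layout, field names and comparison labels are assumptions introduced here, not part of the original data forms, and the judgement of "proper randomization" (criterion 2) is taken as an input rather than computed.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record layout; all field names are assumptions for illustration.
@dataclass
class Trial:
    first_entry: date           # date the trial started to enter patients
    properly_randomized: bool   # criterion 2, as judged by the reviewers
    comparisons: frozenset      # unconfounded randomized comparisons offered
    country: str

# Labels for comparisons (a)-(f); perioperative chemotherapy trials are
# assumed to be tagged with a different label and so fail criterion 3.
ELIGIBLE_COMPARISONS = frozenset({
    "radiotherapy_vs_none",
    "tamoxifen_vs_none",
    "chemotherapy_vs_none",
    "poly_vs_single_agent_chemotherapy",
    "short_vs_long_chemotherapy",
    "ovarian_ablation_vs_none",
})
EXCLUDED_COUNTRIES = frozenset({"USSR", "Japan"})
ENTRY_CUTOFF = date(1985, 1, 1)

def is_eligible(trial: Trial) -> bool:
    """Return True only if all four inclusion criteria hold."""
    return (
        trial.first_entry < ENTRY_CUTOFF                    # criterion 1
        and trial.properly_randomized                       # criterion 2
        and bool(trial.comparisons & ELIGIBLE_COMPARISONS)  # criterion 3
        and trial.country not in EXCLUDED_COUNTRIES         # criterion 4
    )
```

Because the criteria are joined by "and", a trial that fails any one of them (for example, one that began entering patients on 1.1.1985 itself) is excluded regardless of the others.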

4.2 Principles of identification of trials: selection bias may be limited even without absolute completeness

In practice, it may never be possible to identify absolutely all relevant randomized trials and to obtain absolutely all the data from all such trials. Some randomized trials may, despite extensive efforts, be overlooked or otherwise unavailable, and in the many trials where national mortality records cannot be, or have not been, used to complement other sources of information several patients may be lost to mortality follow-up. So, for almost any particular result from an overview of the material that is available, absolute completeness cannot be guaranteed. The question to ask of an overview of many trials, therefore, is not whether it can be guaranteed to be complete, nor even whether absolutely all selective bias can be excluded, but instead whether or not any remaining biases due to missing trials or missing patients could reasonably be thought to cause any serious problems of interpretation. Some readers may initially suppose that no trustworthy inferences can be drawn from a review that is at all incomplete. This would be too rigid, however, for if taken literally it would mean that once some information from some randomized trial had been permanently lost then no amount of evidence from subsequent trials could ever suffice to answer the question of interest. Moreover, even though the editorial columns of many journals and the discussion sections of many papers often do not involve a complete review of the trial evidence, they have already led to many well-founded conclusions (although they have also led to the perpetuation of some differences of opinion that might have been resolved by a more systematic approach to the randomized evidence).

Clear evidence of an effect of therapy is provided when an overview yields a result with random errors small enough to make that result significantly different not just from zero but also from the size of selective bias that could plausibly be attributed to any incompleteness of the overview. Greater completeness, therefore, serves two complementary purposes: first, it generally reduces the size of selective bias that can plausibly be ascribed to incompleteness, and second, it reduces the size of the random error.
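
One way to make this criterion concrete is sketched below in Python. It is an illustration under stated assumptions, not the method used in the overview: the function name, the use of a conventional 1.96 standard-error interval, and the idea of summarizing the plausible selective bias as a single non-negative bound are all introduced here for the example, and the numbers in the usage note are hypothetical.

```python
def is_clear_evidence(estimate, std_error, plausible_bias, z=1.96):
    """Return True if the interval estimate +/- z*SE excludes zero and
    also excludes every value within +/- plausible_bias of zero.

    `plausible_bias` is an assumed non-negative bound on the size of
    selective bias attributable to incompleteness.
    """
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    # The whole interval must lie further from zero than the bias bound,
    # entirely on one side.
    return lower > plausible_bias or upper < -plausible_bias
```

With a hypothetical estimate of -0.20 and a bias bound of 0.05, a standard error of 0.10 gives an interval that excludes zero but not the bias bound, whereas a standard error of 0.05 excludes both. Shrinking either the bias bound or the standard error moves a result towards the "clear" side, which mirrors the two purposes of greater completeness described above.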

The present overview has sought to make available information that is as complete as possible, and then to address particular questions using as much of this information as is possible without serious bias. This overview is based not on all data from all the relevant randomized trials, but instead on reasonably unbiased data from a reasonably unbiased selection of these trials. (See, for example, the practical reasons for exclusion of data from the USSR and Japan that are discussed below.)

4.3 Practice of identification of trials: multiple sources of information

Several avenues of enquiry were pursued to locate as many of the relevant trials as possible. A first attempt involved discussion with trialists and scrutiny of review articles. This was supplemented by scrutiny of the systematic lists of trials that have been prepared by the UICC (Geneva), NCI (Bethesda) and UKCCCR (London), and by scrutiny of all current and previous proceedings of ASCO, AACR and UICC meetings. It was further augmented by a formal computer-aided literature search, by enquiry of the manufacturers of tamoxifen, and by discussion with at least one person from each major trial organization to seek more exact details of all the randomized trials in early breast cancer ever performed by those (or any other) trial organizations.

The list of trials identified by September 1984 was circulated among trialists invited to attend a meeting at Heathrow Airport, London, at that time.1 The smallness of the numbers of additions made then and subsequently suggests that the large majority of all patients in randomized trials of tamoxifen or chemotherapy that started before 1984 have been identified, except those from trials in the USSR and Japan, where much relevant research was found to be in progress. Visits to the USSR and Japan have helped clarify the situations in those countries, but although investigators in these countries are willing to collaborate, sufficient details are not yet available. To limit the possibility of bias, therefore, all trial results from the USSR and Japan have been excluded from the present (though, it is to be hoped, not from future) analyses.

Similarly, to limit the possibility of bias due to selectively incomplete inclusion of recent trials, all results from trials that started to randomize patients on or after 1.1.1985 have been excluded from the present report.

4.4 Methods of seeking data from individual trials

At the 1984 Heathrow meeting many investigators agreed that data on each individual patient should be sought, so that analyses of the duration of overall survival and of recurrence-free survival could be undertaken. Some trial groups provided only grouped tabulations of results, but most provided individual patient data either on special forms completed according to standard instructions (Appendix Figure 1 [not reproduced here]) or, more commonly, on magnetic tape or computer listings. Instructions defined the exact information sought and divided it into two categories: "essential minimum data", which included identifiers, allocated treatments and duration of survival or follow-up, and "optional extra data", which included information on recurrence and the date of recurrence. The "essential minimum data" columns were completed on virtually all patients and the recurrence data columns were completed on about 94% of the patients for whom individual data profiles were provided. Nine data categories were marked with an asterisk indicating that these items need not be provided if it was not convenient to do so. Because the information in some of these categories is seriously incomplete (e.g. information on contralateral breast cancers) little use will be made of it in the present report, although more complete information is being sought for future reports.

Trials of tamoxifen and trials of cytotoxic chemotherapy were sought assiduously and checked extensively, and their results are the principal object of the present report. Some types of trials (e.g. those evaluating adjuvant radiotherapy and ovarian ablation) were sought but were not as extensively checked, so the analyses of them in the present report will be brief. Other types of trials (e.g. those evaluating immunotherapy, or different surgical procedures) were not thoroughly sought, so no analysis of them is presented; again, however, more complete information on them is being sought for future reports.

Trial results can affect both trial closure and trial publication, so the exclusion of ongoing or unpublished trials could introduce the sort of data-dependent selection bias that the overview is designed to avoid. But public availability of interim results from an ongoing or unpublished trial might interfere with recruitment into the study or prejudice its subsequent publication. In order to limit this problem, the results of some ongoing trials were provided for the overview on the condition that their individual results were not to be presented separately while randomization continued. In this way, confidential results could contribute to the overview without being explicitly reported in the tabular analyses.

4.5 Methods of checking data from individual trials

The availability of data on individual patients permitted a wide range of fairly obvious consistency checks, many of which revealed some errors or omissions in at least some trials that could be rectified by correspondence with the principal investigators. For example, where investigators identified patients by sequential numbers, checks were made for any breaks in the sequence that might indicate improper exclusion of some randomized patients. Similarly, where the original treatment group sizes differed by more than one standard deviation (e.g. 110 allocated one treatment compared with only 90 allocated another treatment), checks were again made to see whether there had been any improper exclusions or undocumented changes in the treatment allocation proportions.
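
The two checks just described can be sketched in a few lines of Python. The function names are hypothetical, and the reading of "one standard deviation" is an assumption: here the difference in group sizes is compared with its standard deviation under equal (1:1) allocation, which for n patients in total is the square root of n.

```python
import math

def missing_ids(patient_ids):
    """Return sequence numbers absent between the lowest and highest ID supplied."""
    present = set(patient_ids)
    if not present:
        return []
    return [i for i in range(min(present), max(present) + 1) if i not in present]

def allocation_imbalance(n_a, n_b):
    """Difference in group sizes in SD units, assuming 1:1 allocation.

    With n = n_a + n_b patients allocated with probability 1/2 each,
    n_a is Binomial(n, 1/2), so n_a - n_b = 2*n_a - n has SD sqrt(n).
    """
    return (n_a - n_b) / math.sqrt(n_a + n_b)
```

For example, `missing_ids([1, 2, 4, 5, 8])` returns `[3, 6, 7]`, and `allocation_imbalance(110, 90)` is about 1.41, so the 110 versus 90 split mentioned above would be flagged for further enquiry under this reading. A trial with deliberately unequal allocation would need the expected proportions adjusted accordingly.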

The chief purpose of the present report is to compare one treatment with another, rather than to estimate the absolute risks of death or recurrence in women with early breast cancer. For such comparisons to be unbiased the fundamental requirement is not that follow-up should be absolutely complete but merely that there should be no systematic differences in the completeness of follow-up between treatment-allocated and control-allocated patients who are not known to be dead. Consequently, for each trial from which individual patient data were available the duration of follow-up was checked, with particular attention to the proportions of those not known to be dead whose last follow-up was before 1.1.1984. (By the time of the main data collection in 1985 this would have represented a delay of well over a year.) In the few instances where significant differences in follow-up duration were observed, attempts were made to rectify this by seeking additional information, either from the trialists or from national mortality records. The balance between groups with respect to entry date, age, menopausal status, nodal status and, where available, estrogen receptor (ER) status was also checked, and any imbalances were pursued. Other checks were instituted on the internal consistency of individual patient records.19
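
A minimal Python sketch of the follow-up check described above is given here. The data layout (pairs of a death flag and a last follow-up date) and the function names are assumptions for illustration, and the pooled two-proportion z statistic is simply one conventional way of comparing the two groups; the text does not state which test was actually used.

```python
import math
from datetime import date

FOLLOW_UP_CUTOFF = date(1984, 1, 1)

def stale_follow_up(patients):
    """Count patients not known to be dead whose last follow-up precedes the cutoff.

    `patients` is an iterable of (known_dead, last_follow_up) pairs.
    Returns (number_stale, number_not_known_dead).
    """
    alive = [last for dead, last in patients if not dead]
    stale = sum(1 for last in alive if last < FOLLOW_UP_CUTOFF)
    return stale, len(alive)

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic comparing x1/n1 with x2/n2."""
    if n1 == 0 or n2 == 0:
        raise ValueError("each group needs at least one patient")
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    # If every patient (or none) is stale in both groups, there is no difference.
    return (x1 / n1 - x2 / n2) / se if se > 0 else 0.0
```

Applying `stale_follow_up` separately to the treatment-allocated and control-allocated patients and passing the two results to `two_proportion_z` gives a rough indication of whether completeness of follow-up differs systematically between the groups. Patients known to be dead are left out of both counts, in line with the requirement stated above.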

Finally, trialists were supplied with the checked and corrected records of each individual patient (when individual patient data had been supplied) and with summary tables in standard format computed from these checked records (as in the brief trial summaries that are appended to this report). Minor corrections in many of the trial results were engendered by these checks, but the arithmetic conclusions of the overviews were not materially altered by them. This does not mean, however, that these extensive checks had little effect on the interpretation of the results. An overview can yield clear evidence of an effect of treatment only if the size of the effect is clearly different from the size of bias that could plausibly be postulated. So, even if such checks do not materially alter the estimate of the treatment effect, they can still help limit the plausible size of the bias and thereby considerably strengthen the confidence that can be attached to the principal conclusions. The principal analyses are of all events reported as occurring before September 1, 1985, which is when the main data checks were first completed. (The exhaustive mass of data checks that continued for long after that date produced numerous minor corrections, but no substantial changes in the results of any trial.)

4.6 Methods of publishing data from individual trials

The data obtained on outcome by allocated treatment in each trial belong not to the central organizers of the collaboration but to those responsible for conducting that trial. It is only the trialists who really know the strengths, limitations and unusual features of their trial, and who can vouch for the reliability of the data from it. For both these reasons, this overview is accompanied by separate short reports, prepared in collaboration with the individual trialists, describing their trial methods and the checked results from those trials that are used in the overview. The standard summary tables in those reports generally include enough data to permit analyses of time to death and to recurrence, stratified by age and by various other prognostic features. The present overview is based only on the data in the accompanying short reports - with the exception of the few trials that have, as yet, provided only confidential results - and thus derives almost entirely from results that are now publicly available. The names of the trials that did make their data available, together with their sizes and a few other brief details, are given in Appendix Table 1 [not reproduced here], which includes information on a total of about 40,000 women randomized over the years into about 100 trials.



*Only trials in which such comparisons are "unconfounded" are included. By definition, in unconfounded trials one group differs from another only in the treatment of interest: thus, the tamoxifen overview includes trials of tamoxifen plus chemotherapy versus the same chemotherapy alone, but not trials of tamoxifen plus prednisone versus no treatment. (In trials of chemotherapy, however, prednisone was considered an integral part of the chemotherapy regimen being evaluated, so trials of chemotherapy plus prednisone versus no treatment were included in the chemotherapy overview.)