Sources and Data Treatment

Sources and Data Treatment (General)

The Human Mortality Database for Latin America contains mortality series by age, sex and cause of death for total national territories, and the same categories by region within each country, from 1980 to the latest available year. The countries included are Argentina, Brazil, Colombia, Mexico and Peru. We also present population counts from censuses at the national level and, whenever possible, at the regional level, as well as the latest population estimates for inter-census years taken from “United Nations World Population Prospects, 2010 Revision”, published in 2012 by the Population Division-Population Estimates and Projections Section. The site also provides a database with titles and authors of academic publications on mortality for the countries of interest. Finally, we provide our own estimates of adult mortality under-registration following the combined method of Hill, You and Choi (2009) (see the section Methodology for Completeness of Adult Death Counts Coverage).

Our sources, the official vital records for each country, were collected in different formats. Most death records were provided by each national statistical authority; in other cases they were downloaded from the authority’s official website or from a United Nations office that keeps records sent directly by the official statistical offices, such as UNFPA (United Nations Fund for Population Activities) or WHO (World Health Organization). In rare cases, data was digitized from printed publications of the national statistical authority. Each data series reports the original sources for each year. In all cases the cause-of-death data refers to the underlying cause of death, as coded by the relevant national authority. Underlying cause of death is defined as “the disease or injury which initiated the train of morbid events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury” following the International Classification of Diseases (ICD). Each country’s particularities in the coding of causes of death are specified in its own sources and data treatment section. Geographic definitions follow those already set by the countries, from national borders to regional limits. Sub-national borders are defined as they currently exist.
We want to point out that although all series are taken from official sources, our figures do not necessarily coincide with them. Most differences are very small and result from our double-checking process (see the Data Treatment section), which led us to correct processing mistakes. In all cases we strongly preferred fidelity to the official figures over making a correction. The most common mistakes were typos or sums that did not add up. For instance, if a data point was published as 5 instead of 8, it was corrected. Likewise, if a decimal point was missing (for example, 500 published where 5.00 was meant), we corrected it. All sources, changes and data treatment are explained for each country; we encourage users to read the particularities of the country of their interest.

1. Description of the Data Tables

Each table contains what its title describes and is provided in Excel format. Each Excel file has one tab per year. For instance, death records by age, sex and province for Peru are found under the sub-title Peru in our link “Country Data”. After you click on PER, Deaths by age, sex and region, the table opens and each tab is named for the year it covers.

2. Data Treatment

Original data was provided in electronic format or digitized from a printed version. Electronic data is usually provided by the official statistical offices and in rare cases is taken from datasets built by WHO or UNFPA, in particular data published in the WHO Mortality Database and/or the United Nations Demographic Yearbook (DYB) for the corresponding year. Once the data is at hand, we process it to arrange each of the tables produced for each year. We then check that no deaths from maternal causes are recorded for males, or for women below age 15 or above age 50. We also check that no deaths from perinatal causes are recorded above age 2. In the very few cases where such records are found, they are reclassified: the former as unknown age and sex, the latter as unknown age with sex kept as originally recorded.
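The two plausibility checks can be illustrated with a minimal Python sketch. It is not the code used to build the database: the record layout, the cause-group labels "maternal" and "perinatal", and the use of None for unknown values are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class DeathRecord:
    age: Optional[int]    # completed years; None means unknown age (assumed convention)
    sex: Optional[str]    # "M", "F", or None for unknown sex (assumed convention)
    cause_group: str      # hypothetical labels, e.g. "maternal", "perinatal"
    count: int


def reclassify_implausible(rec: DeathRecord) -> DeathRecord:
    """Apply the two plausibility checks described above to one record."""
    if rec.cause_group == "maternal":
        age_out_of_range = rec.age is not None and (rec.age < 15 or rec.age > 50)
        if rec.sex == "M" or age_out_of_range:
            # Maternal cause for a male, or for a woman outside ages 15-50:
            # keep the death, but mark both age and sex as unknown.
            return replace(rec, age=None, sex=None)
    elif rec.cause_group == "perinatal":
        if rec.age is not None and rec.age > 2:
            # Perinatal cause above age 2: age becomes unknown, sex is kept.
            return replace(rec, age=None)
    return rec
```

Note that the death count itself is never changed; only the age and sex attributes are reset, so totals by cause are preserved.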

After the data has been checked for these particular causes, we prorate records with unknown age and sex, exclusively in the table that contains death records by age, sex, region and cause of death. All other tables are built up from that table, down to the table with the lowest level of disaggregation: national death counts by age and sex. Note that unknown causes of death and unknown regions are not prorated, because the use of codes for ill-defined and unknown causes of death must be taken into account when constructing cause-specific mortality rates, particularly if you plan to compare across countries.
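The text does not spell out the prorating rule. A common choice, assumed here, is proportional redistribution: within one region and cause, deaths of unknown age and sex are spread over the known age-sex cells in proportion to the known counts. The function name and the dictionary layout below are hypothetical.

```python
from typing import Dict, Tuple

Cell = Tuple[str, str]  # (age group, sex), e.g. ("0-4", "F")


def prorate_unknown(known: Dict[Cell, float], unknown: float) -> Dict[Cell, float]:
    """Spread `unknown` deaths over the known cells proportionally.

    Assumes proportional redistribution; the original text does not
    state which prorating rule is applied.
    """
    total_known = sum(known.values())
    if total_known == 0:
        raise ValueError("cannot prorate: no deaths with known age and sex")
    # Every known cell is inflated by the same factor, so the shape of
    # the age-sex distribution is unchanged and the total is preserved.
    factor = 1 + unknown / total_known
    return {cell: count * factor for cell, count in known.items()}
```

For records with unknown age but known sex, the same function could be applied separately within each sex.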

Whenever the data is found on paper rather than in electronic format, the treatment is exactly the same once the data has been carefully digitized, with all rows and columns exactly as printed. Only one additional check is done before the check on causes of death: we verify that the sums computed from the transcribed figures coincide with the printed totals, so that we can correct typos and processing mistakes.
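The summation check amounts to recomputing each total from its transcribed components and flagging any disagreement for manual review. A minimal sketch, with a hypothetical table layout (row label mapped to its component cells and its printed total):

```python
from typing import Dict, List, Tuple


def find_sum_mismatches(
    table: Dict[str, Tuple[List[int], int]]
) -> List[Tuple[str, int, int]]:
    """Return (label, computed sum, printed total) for rows that disagree."""
    mismatches = []
    for label, (cells, printed_total) in table.items():
        computed = sum(cells)
        if computed != printed_total:
            mismatches.append((label, computed, printed_total))
    return mismatches
```

A flagged row only shows that a discrepancy exists; deciding whether the error lies in a cell or in the printed total still requires going back to the source.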

3. Methodology for Completeness of Adult Death Counts Coverage

In developing countries, mortality estimates and knowledge of levels and trends of mortality are limited by the quality of data (Hill, 2003; Luy, 2010; United Nations 1997; 2002). The most common problems faced in these countries are incomplete coverage of vital registration systems, errors in age declaration for both population and death counts, and lack of information on causes of death. In recent years, collection of data for death counts has improved, but there are still limitations for studying mortality in several parts of the world. The problem is more complex when studying small areas and sub-national populations where, in addition to data limitations, statistical and demographic methods also have strong limitations. As a result, public health administrations in several countries have limited information with which to allocate resources and to evaluate public policy interventions, which limits the ability of government agencies to improve the quality of life of these populations.

The death distribution methods are widely used to estimate adult mortality (Timaeus, 1991). They compare the distribution of deaths by age with the age distribution of the living and provide age patterns of mortality in a defined reference period. There are two major approaches: the General Growth Balance method (Hill, 1987) and the Synthetic Extinct Generations method (Bennett & Horiuchi, 1981). The death distribution methods make several strong assumptions: that the population is closed to migration, that the completeness of recording of deaths is constant by age, that the completeness of recording of population is constant by age, and that ages of the living and the dead are reported without error.

The Bennett and Horiuchi (1981) method, known as the Synthetic Extinct Generations (SEG) method, is a synthetic cohort analog of Vincent’s (1951) method of extinct generations. In this method, age-specific growth rates are used to convert an observed distribution of deaths by age into the corresponding stationary population age distribution. Since in a stationary population the deaths above each age x are equal to the population aged x, the deaths in the stationary population above age x provide an estimate of the population of age x. The completeness of death registration relative to population is estimated by the ratio of the death-based estimate of population aged x to the observed population aged x.
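In continuous notation, the SEG idea can be written as follows. The symbols are introduced here for illustration and do not appear in the text above: D(y) is the registered deaths at age y, r(u) the age-specific growth rate, N(x) the observed population at exact age x, and c(x) the resulting completeness estimate.

```latex
\[
\hat{N}(x) = \int_{x}^{\infty} D(y)\,
             \exp\!\left( \int_{x}^{y} r(u)\, du \right) dy ,
\qquad
c(x) = \frac{\hat{N}(x)}{N(x)} .
\]
```

When every growth rate is zero the exponential term equals 1, and the estimate reduces to the stationary case described above: the population aged x equals the deaths above age x.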

The General Growth Balance (GGB) method (Hill, 1987) is a generalization to all closed populations of the Brass (1975) Growth Balance method. The Demographic Balancing Equation expresses the identity that the growth rate of the population is equal to the difference between the entry rate and the exit rate. This identity holds for open-ended age segments x+. The entry rate x+ minus the growth rate x+ provides a residual estimate of the death rate x+. This residual can be estimated from the population data of two censuses and compared with a direct estimate based on the recorded deaths (from the census or vital registration); comparing the two gives an estimate of the completeness of death recording relative to population.
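The balance identity and its observed-data form, as commonly written following Hill (1987), are sketched below. The notation is an assumption of this sketch: b, r and d are the entry, growth and death rates of the segment x+, N(x) is the population at exact age x, N(x+) the population aged x and over, t the interval between the two censuses, k_1 and k_2 the coverage of the first and second census, and c the completeness of death recording.

```latex
\[
b(x+) - r(x+) = d(x+), \qquad b(x+) = \frac{N(x)}{N(x+)}
\]
\[
\frac{N(x)}{N(x+)} - r^{\mathrm{obs}}(x+)
  = \frac{1}{t}\,\ln\frac{k_1}{k_2}
  + \frac{\sqrt{k_1 k_2}}{c}\;\frac{D^{\mathrm{obs}}(x+)}{N(x+)}
\]
```

Plotting the left-hand side against the observed death rate for successive ages x should give a straight line if the method's assumptions hold: the intercept reflects the change in census coverage and the slope reflects the completeness of death recording.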

Hill, You and Choi (2009) proposed that the combination of SEG and GGB might be more robust than either one individually. The combined method consists of first applying GGB to estimate any change in census coverage (k1/k2), using that estimate to adjust one or the other census to make the two consistent, and then applying SEG using the adjusted population data in place of the reported data.
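The census adjustment step can be sketched in Python. The sketch assumes the usual GGB result that the fitted intercept equals ln(k1/k2) divided by the number of years between the censuses, and it chooses, arbitrarily, to rescale the first census to the coverage level of the second. Function names and the data layout are hypothetical.

```python
import math
from typing import Dict


def coverage_ratio_from_intercept(intercept: float, years_between: float) -> float:
    """Return k1/k2, assuming the GGB intercept equals ln(k1/k2) / t."""
    return math.exp(intercept * years_between)


def adjust_first_census(
    pop_by_age: Dict[str, float], k1_over_k2: float
) -> Dict[str, float]:
    """Rescale census 1 counts to the coverage level of census 2.

    If observed counts are k1 * true counts, dividing by k1/k2 gives
    k2 * true counts, which is consistent with the second census.
    """
    return {age: n / k1_over_k2 for age, n in pop_by_age.items()}
```

The adjusted first census and the unadjusted second census would then be passed to SEG in place of the reported counts.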
This section is constantly under construction