Official statistics are disseminated in a form that precludes the possibility of direct or indirect identification of a person (subsection 35 (1) of the Official Statistics Act). For that reason, statistical disclosure control methods are used to protect small values in frequency tables, by modifying, summarising or perturbing the data. The aim of statistical disclosure control methods is to ensure that the statistical output provides valuable information while protecting the confidentiality of personal data.
The following principles are used in the dissemination of the data of the 2011 Population and Housing Census:
Controlled rounding adjusts the actual values upwards or downwards while preserving the additivity of the table as far as possible. Because of rounding, a published total may differ from the sum of its subdivisions, and the value of the same indicator may differ slightly between tables.
Rounding has been used in tables that specify place of residence on the level of local government units. These tables are supplied with a note about the use of rounding. The values of the variables ‘place of residence’, ‘sex’ and ‘age’ are not changed in the tables.
In formulating the data protection principles and selecting the specific disclosure control method, Statistics Estonia relied on the following criteria: protection of micro-data must be ensured; information loss must be minimal; it should also be possible to publish statistics on small local government units and small settlements; and the chosen method should be comprehensible to users. Two methods were considered: cell suppression and controlled rounding.
The first method (cell suppression) means that the values of all risky cells (1 or 2 persons) are hidden and some non-risky cells are also suppressed, so that the hidden values cannot be derived from the remaining cells and totals. At Statistics Estonia, this method is used for enterprise statistics and has proven to be well suited to economic statistics. However, it is not suitable for census data, because the output tables have a more complex structure and the additionally suppressed cells would result in a great loss of information.
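The need for additional (secondary) suppression can be illustrated with a minimal Python sketch. It assumes a single table row whose total is published; the function name, the sample counts and the rule for choosing the secondary cell are hypothetical and serve only as an illustration. In practice, secondary suppression is chosen by an optimisation over the whole table, not by the naive rule used here.

```python
RISKY_VALUES = {1, 2}  # per the text: cells with 1 or 2 persons are risky


def suppress_row(cells):
    """Suppress risky cells in one row whose total is published.

    None marks a suppressed cell. If exactly one cell ends up hidden,
    it could be recovered as (total - sum of visible cells), so one
    more cell is hidden. The choice here is naive (an assumption of
    this sketch): the smallest cell that is still visible.
    """
    published = [None if v in RISKY_VALUES else v for v in cells]
    hidden = [i for i, v in enumerate(published) if v is None]
    if len(hidden) == 1:
        visible = [i for i, v in enumerate(published) if v is not None]
        if visible:
            secondary = min(visible, key=lambda i: cells[i])
            published[secondary] = None
    return published


row = [14, 2, 9, 30]          # hypothetical counts
print(suppress_row(row))      # the cell with 9 is hidden as well as the 2
```

Even in this small row, protecting one risky cell costs a second, non-risky value; in tables with many dimensions and nested totals the number of additionally suppressed cells grows accordingly.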
With the second method (controlled rounding), loss of information is minimal. Rounding is carried out to base 3, meaning that every value is rounded to an adjacent number divisible by three, either downwards or upwards. This protects the values 1 and 2. At the same time, it creates very little noise and still allows data on small local government units to be published. On the most detailed level, the published values differ from the actual values by 1–2 persons. The difference may be greater for sums, but will remain below 1%. The value 0 is not changed in the rounding process; the values 1 and 2 are rounded to 0 or 3. This method is applied using the specialised tau-Argus software.
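The idea of rounding to base 3 while keeping a table additive can be sketched in Python for the simplest case: one row of cells with a single total. This is not the algorithm used by tau-Argus, which solves an optimisation problem over whole multi-dimensional tables; the function names, the brute-force search and the sample counts are assumptions made for illustration only.

```python
from itertools import product


def adjacent_multiples(value, base=3):
    """Return the multiples of `base` directly below and above `value`.

    A value that is already a multiple (including 0) gets itself twice,
    so it is never changed.
    """
    lower = (value // base) * base
    upper = lower if value % base == 0 else lower + base
    return lower, upper


def round_nearest(value, base=3):
    """Round one value independently to the nearest multiple of `base`."""
    lower, upper = adjacent_multiples(value, base)
    return lower if value - lower <= upper - value else upper


def controlled_round_row(cells, base=3):
    """Round every cell to an adjacent multiple of `base` so that the
    rounded cells add up to an adjacent multiple of the true total.

    Brute force over all up/down choices, keeping the combination with
    the smallest total absolute change. Only feasible for short rows.
    """
    options = [sorted(set(adjacent_multiples(v, base))) for v in cells]
    allowed_totals = set(adjacent_multiples(sum(cells), base))
    best = None
    for candidate in product(*options):
        if sum(candidate) not in allowed_totals:
            continue
        change = sum(abs(c - v) for c, v in zip(candidate, cells))
        if best is None or change < best[0]:
            best = (change, list(candidate))
    return best[1], sum(best[1])


cells = [1, 1, 1, 7]                      # hypothetical counts, true total 10

# Independent rounding: the cells no longer add up to the rounded total.
independent = [round_nearest(v) for v in cells]
print(independent, sum(independent), round_nearest(sum(cells)))

# Controlled rounding: cells and total stay consistent.
rounded_cells, rounded_total = controlled_round_row(cells)
print(rounded_cells, rounded_total)
```

In the independent case every cell is moved to its nearest multiple and the row stops being additive. The controlled variant lets some cells move by 2 instead of 1, which matches the deviation of 1–2 persons mentioned above, in exchange for a consistent total.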
These methods are also used by other developed countries: