Official statistics and European statistics
What is official statistics?
Official statistics are public information that reflects the social situation and changes in society. Statistics are produced for the benefit of society on the basis of a national or European Union statistical programme. Official statistics are necessary for preparing development plans and making projections, for policy design, for scientific and applied research, and for making knowledge-based decisions.
Official statistics are accessible to all and help everyone make decisions in their private or work lives.
Official statistics comply with international classifications and methodologies and with the principles of impartiality, reliability, relevance, cost-effectiveness, confidentiality and transparency. In Estonia, the producers of official statistics are Statistics Estonia and Eesti Pank (Bank of Estonia).
One year in Statistics Estonia
- 3,000 output indicators
- 70 statistical activities
- 90,000 data providers
- 420,000 answered questionnaires
- 55,000 calls and e-mails answered by customer support
- 132,000 variables collected in questionnaires
- 1.5 million website visits
- Nearly 2 million statistical database views
- Over 3,000 requests for statistics
What is European statistics?
European statistics are high-quality statistics and data on Europe. European statistics are compiled by Eurostat and national statistical authorities.
Statistics are part of everybody’s daily life. They are used by governments, politicians, businesses, academia and researchers, but also to inform the media and citizens. Professional independence of statistical offices ensures that statistics are produced free of any political and other external influence.
European statistics reliably portray reality. They are coherent and comparable over time and between regions and countries, easily accessible and clear.
Eurostat is the statistical office of the European Union. It is in charge of the development, production and dissemination of European statistics. The values of Eurostat are respect and trust, striving for excellence, driving innovation, service orientation and professional independence.
Eurostat produces European statistics in partnership with national statistical institutes and other national authorities in the EU Member States and the European Free Trade Association (EFTA) countries. This partnership is known as the European Statistical System (ESS). Member States collect data and compile statistics for national and European purposes. Eurostat leads the way in producing harmonised and comparable statistics in close cooperation with the other ESS members.
- Built on reliable, fact-based data
- Produced free of political or any other influences
- Comparable between countries and regions
- European statistics are used to monitor European policies that affect daily life
What is the European Statistics Code of Practice?
The Code of Practice defines 16 key principles covering three dimensions: the institutional environment in which the statistical authorities operate, statistical processes, and statistical outputs. A set of 84 indicators of best practices and standards, grouped under the 16 principles, provides a reference for reviewing the implementation of the Code by the national statistical institutes and Eurostat.
16 principles of the European Statistics Code of Practice
1. Professional independence
1bis. Coordination and cooperation
2. Mandate for data collection and access to data
3. Adequacy of resources
4. Commitment to quality
5. Statistical confidentiality and data protection
6. Impartiality and objectivity
7. Sound methodology
8. Appropriate statistical procedures
9. Non-excessive burden on respondents
10. Cost effectiveness
11. Relevance
12. Accuracy and reliability
13. Timeliness and punctuality
14. Coherence and comparability
15. Accessibility and clarity
Read more about the 16 principles and 84 indicators of the Code at:
1. Specifying needs
People have always been curious; curiosity is the basis of human development and rational behaviour. Many questions start with the words “how many” or “how much”. Today, questions such as the following are often put to Statistics Estonia:
- How many children in county N are going to school next year?
- How many households are there in which the partners are not officially married?
- How much do people earn on average in a month in Valga county, Ida-Viru county or in Estonia as a whole?
Answers to questions for which there is significant public interest can often be found in the statistical database on the website of Statistics Estonia.
Official statistics are mainly ordered by ministries and other state authorities. They are also ordered by industry associations, research and educational institutions, and local government authorities and their associations. In cooperation with Statistics Estonia, it is determined which indicators are needed, and analysts study which data sources can be used for data collection. Today, relevant information is often partly or fully collected in state databases. In the production of statistics, existing information is used as much as possible, including information from state databases such as the population register, the commercial register, the Estonian Education Information System and the register of buildings. The output methodology is also described.
2. Production system design
In the production of statistics, there is growing interest in information generated by automated processes, such as mobile positioning data, social networking data and satellite images. This phase entails describing what kind of data processing should be applied and how data from the data sources should be prepared for analysis.
If the studied characteristics include assessments (e.g. satisfaction) or information that cannot be obtained from databases, it is necessary to interview people or economic entities. Compiling a questionnaire is one of the more labour-intensive stages in the preparation of a survey. Personal interview methods have improved over time, but so far traditional face-to-face interviews have not been eliminated, although they are time-consuming and expensive. Telephone interviews and web interviews are increasingly used to collect data. Usually, the best results are achieved by combining several methods. Most of the surveys of economic entities are carried out online, which is the most convenient and flexible method for respondents.
3. Building the production system
All checks and rules described during the design phase need to be entered into information systems and programmed to make data acquisition and processing as automatic as possible. In addition, the tools that data processors need for data imputation and checks are built. The production system should also be tested to make sure that the whole process runs flawlessly and that respondents and data processors can perform their tasks smoothly.
4. Data collection
When setting up a research task, the population is specified, i.e. the set of persons or objects about which conclusions are to be drawn. In a sample survey, only a part of the population is selected, i.e. a sample is drawn and its size is determined. Each person or entity in the sample therefore represents a whole range of similar persons or economic entities.
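As a minimal sketch of drawing a sample, the Python example below selects units from a sampling frame by simple random sampling without replacement. The text does not say which sampling design Statistics Estonia uses, so the design, the function name `draw_simple_random_sample` and the frame of 1,000 identifiers are all assumptions made for illustration.

```python
import random

def draw_simple_random_sample(population_ids, sample_size, seed=None):
    """Select sample_size distinct units, each with equal probability."""
    if not 0 < sample_size <= len(population_ids):
        raise ValueError("sample_size must be between 1 and the population size")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population_ids, sample_size)

# Hypothetical sampling frame of 1,000 unit identifiers.
frame = list(range(1, 1001))
sample = draw_simple_random_sample(frame, 50, seed=42)

# Under this design, each sampled unit stands for
# len(frame) / len(sample) units of the population.
units_represented = len(frame) / len(sample)
```

Real surveys often use stratified or multi-stage designs, in which different units are selected with different probabilities.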
Data collection by a survey is the most expensive and time-consuming stage. If it fails because the subjects do not respond or their responses are illogical, the study will not reveal anything, as there cannot be reliable results without reliable data.
Data from databases are received through a secure channel in a pre-programmed manner and by an agreed due date.
5. Data processing
The preparation of data for analysis has become much quicker thanks to technical progress and to checks built into data collection programmes, which do not allow logically inconsistent responses (e.g. a 17-year-old respondent cannot have higher education). Most data are also coded electronically.
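A logical check of this kind could look roughly like the following Python sketch. The field names (`age`, `education`), the age limits and the function `check_response` are hypothetical; the actual rules used in the data collection programmes are not given in the text.

```python
MIN_AGE_HIGHER_EDUCATION = 18  # assumed threshold, for illustration only

def check_response(response):
    """Return a list of messages describing logically inconsistent answers."""
    errors = []
    age = response.get("age")
    education = response.get("education")
    if age is not None and not 0 <= age <= 120:
        errors.append("age out of range")
    if (education == "higher" and age is not None
            and age < MIN_AGE_HIGHER_EDUCATION):
        errors.append("higher education is implausible at this age")
    return errors

# A questionnaire can refuse to accept answers while the list is non-empty.
problems = check_response({"age": 17, "education": "higher"})
```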
However, electronically collected data are not always correct. Anyone who handles data knows that datasets include typos. Hidden errors are detected when complex checks are run and comparisons are made with other sources. Data gaps are a major problem in datasets: they interfere with data processing, especially when more sophisticated models are to be applied. To replace missing values, different imputation methods are used, in which the missing values of some objects are replaced with the values of similar objects.
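One way to replace a missing value with that of a similar object is nearest-neighbour donor imputation, sketched below in Python. This is only one of many imputation methods and is not necessarily the one used by Statistics Estonia; the record fields (`employees`, `turnover`) and the figures are invented for the example.

```python
def impute_from_nearest_donor(records, target, match_on):
    """Fill missing `target` values from the donor closest on `match_on`."""
    donors = [r for r in records if r[target] is not None]
    completed = []
    for record in records:
        if record[target] is None and donors:
            # The most similar object is the one with the smallest distance.
            donor = min(donors, key=lambda d: abs(d[match_on] - record[match_on]))
            completed.append({**record, target: donor[target], "imputed": True})
        else:
            completed.append({**record, "imputed": False})
    return completed

# Hypothetical enterprise records; the second one has a data gap.
enterprises = [
    {"employees": 10, "turnover": 100.0},
    {"employees": 12, "turnover": None},
    {"employees": 200, "turnover": 5000.0},
]
filled = impute_from_nearest_donor(enterprises, "turnover", "employees")
```

Flagging imputed values, as the `imputed` field does here, makes it possible to report an imputation rate later.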
For the sample data to represent the population, a weight, or expansion factor, has to be calculated for each object in the sample. The weight of a sample object shows how many similar population objects this sample object represents. The weight is always equal to or greater than one. In surveys of business entities, the weight of large entities is often one because they are sufficiently unique in Estonia and there are no similar entities in the population.
6. Calculation and analysis of data
The information collected during the data collection phase must be converted to a format that allows the questions raised at the beginning of the study to be answered. Quite often, average values (average gross monthly wages) or totals (total number of unemployed persons) as well as various indices are calculated. Price indices indicate price developments over time. The most complex calculation, which combines statistics from various domains, is that of the gross domestic product.
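With sample weights of the kind described in the data processing phase, population totals and averages are commonly estimated as weighted sums. The Python sketch below shows this standard calculation; the wage figures and weights are made up, and the function names are illustrative.

```python
def weighted_total(values, weights):
    """Estimate a population total as the sum of weight * value."""
    return sum(w * y for y, w in zip(values, weights))

def weighted_mean(values, weights):
    """Estimate a population mean as the weighted total over the weight sum."""
    return weighted_total(values, weights) / sum(weights)

# Hypothetical monthly wages of three sampled persons and their weights:
# the first person represents 50 similar persons, the second 30, the third 1.
wages = [1000.0, 1500.0, 4000.0]
weights = [50, 30, 1]

estimated_wage_bill = weighted_total(wages, weights)
estimated_average_wage = weighted_mean(wages, weights)
```

The sum of the weights (here 81) is the estimated size of the population that the sample represents.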
When publishing statistics, Statistics Estonia must ensure that the information in data tables does not allow any economic entity, person or other object to be identified. Various mathematical methods and confidentiality rules are applied for this purpose.
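One widely used confidentiality rule is a minimum-frequency rule: a table cell is withheld if it is based on too few contributors. The Python sketch below illustrates the idea. The threshold of three and the function `suppress_small_cells` are assumptions for the example and do not describe the actual rules applied by Statistics Estonia.

```python
def suppress_small_cells(cells, min_contributors=3):
    """Withhold values of cells with fewer than min_contributors units.

    `cells` maps a cell label to (number of contributors, value).
    Suppressed cells are returned as None.
    """
    published = {}
    for label, (contributors, value) in cells.items():
        published[label] = value if contributors >= min_contributors else None
    return published

# Hypothetical table: average wage by county, with contributor counts.
table = {"County A": (120, 1450.0), "County B": (2, 3900.0)}
safe_table = suppress_small_cells(table)
```

In practice, further cells may also have to be withheld so that a suppressed value cannot be recovered by subtraction from row or column totals.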
A large part of survey results is presented in breakdowns and tables, but sometimes further analysis is carried out and more sophisticated statistical methods are applied. Models are increasingly used in statistical activities. They help to find relationships and causes. One might ask, for example, what influences the size of income. Complex models are used to analyse time series.
7. Publication of results and statistical dissemination
Analyses are of value only if the results are made available to users. Today, electronic dissemination is the prevailing dissemination method. The advantage of electronic tables and charts is that they are interactive: users can choose tables and design graphs on the basis of their own interests and needs. A major improvement in spatial statistics is interactive maps with multiple layers. The number of printed texts has decreased considerably, but paper books and magazines have not disappeared; rather, they include more specific information in compact form.
8. Data evaluation
Quality evaluation of published statistics is based on five principles: relevance, accuracy and reliability, timeliness and punctuality, coherence and comparability, accessibility and clarity.
There are many quality indicators for measuring accuracy and reliability, the most important of which are the standard error and the coefficient of variation. The totals and mean values calculated on the basis of samples are estimates of the actual totals and mean values, which we generally do not know unless we interview all the objects in the population. The distance of the estimate from the actual value is indicated by the standard error or the coefficient of variation. The smaller these are, the more accurate the calculated estimate. Other quality indicators of statistics are the response rate, the imputation rate, and bias due to under-coverage and over-coverage.
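For a mean estimated from a simple random sample drawn without replacement, the standard error and the coefficient of variation can be computed as in the Python sketch below. The sampling design is an assumption; for other designs the formulas differ. The function name and the sample values are illustrative.

```python
import math
import statistics

def mean_with_precision(sample_values, population_size):
    """Return (mean, standard error, coefficient of variation).

    Assumes simple random sampling without replacement, so the variance
    is reduced by the finite population correction 1 - n / N.
    """
    n = len(sample_values)
    mean = statistics.fmean(sample_values)
    s = statistics.stdev(sample_values)  # sample standard deviation
    fpc = 1 - n / population_size
    standard_error = math.sqrt(fpc) * s / math.sqrt(n)
    coefficient_of_variation = standard_error / mean
    return mean, standard_error, coefficient_of_variation

# Hypothetical sample of 4 values from a population of 100 objects.
mean, se, cv = mean_with_precision([10.0, 12.0, 14.0, 16.0], 100)
```

The coefficient of variation expresses the standard error relative to the estimate itself, which makes the precision of estimates of different magnitudes comparable. If every object in the population is interviewed (n = N), the correction term makes the standard error zero.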