SciELO - Scientific Electronic Library Online





Print version ISSN 1852-4418
On-line version ISSN 1852-4222


ROSATI, Germán. Development of an imputation model for income variables with lost values using ensamble learning methods: Application to the permanent household survey (EPH). SaberEs [online]. 2017, vol.9, n.1, pp.91-111. ISSN 1852-4418.

This paper presents advances in the development of an imputation model for missing values and non-response in income variables from household surveys. It sets out the general methodological proposal and reports the results of some tests. Two imputation methods are evaluated: 1) hot deck (widely used in major surveys such as the Encuesta Permanente de Hogares and the Encuesta Anual de Hogares of the City of Buenos Aires) and 2) an ensemble of LASSO regression models, generated with the bagging algorithm. The first and second parts of the document review the main mechanisms that generate missing data and their implications for the use of imputation methods. The third section reviews several imputation methods, emphasizing their assumptions, advantages and limitations. The fourth part analyzes the theoretical and methodological foundations of LASSO and ensemble learning. Finally, the fifth section presents some results of applying this method to Encuesta Permanente de Hogares data.
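The second method named in the abstract, a bagged ensemble of LASSO regressions, can be illustrated with a minimal sketch. The abstract does not give the paper's implementation, predictors, or tuning, so everything below is an assumption made for illustration: the data are synthetic, and the function names (`soft_threshold`, `fit_lasso`, `bagged_lasso_impute`), the penalty `alpha`, and the number of bootstrap models are hypothetical. Each model is fitted by coordinate descent on a bootstrap resample of the respondents with observed income, and the imputed value is the average of the models' predictions.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_lasso(X, y, alpha, n_iter=50):
    """Coordinate descent for (1/2n) * sum (y - b0 - x.b)^2 + alpha * sum |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # sum of squares of feature j
            for i in range(n):
                pred_without_j = intercept + sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, alpha) / (norm / n) if norm > 0 else 0.0
        # The intercept is not penalized: set it to the mean residual.
        intercept = sum(
            y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)
        ) / n
    return intercept, beta


def bagged_lasso_impute(X_obs, y_obs, X_mis, alpha=0.1, n_models=20, seed=0):
    """Fit one LASSO per bootstrap resample and average their predictions."""
    rng = random.Random(seed)
    n = len(X_obs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(
            fit_lasso([X_obs[i] for i in idx], [y_obs[i] for i in idx], alpha)
        )
    imputations = []
    for x in X_mis:
        preds = [b0 + sum(xj * bj for xj, bj in zip(x, beta)) for b0, beta in models]
        imputations.append(sum(preds) / len(preds))
    return imputations


# Synthetic example (not EPH data): income depends on the first predictor only.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(80)]
y = [10 + 3 * x1 + rng.gauss(0, 0.5) for x1, _ in X]

# Pretend the last 20 respondents did not report income.
X_obs, y_obs = X[:60], y[:60]
X_mis, y_hidden = X[60:], y[60:]

imputed = bagged_lasso_impute(X_obs, y_obs, X_mis)
mae = sum(abs(a - b) for a, b in zip(imputed, y_hidden)) / len(y_hidden)
print(f"mean absolute error on held-out incomes: {mae:.2f}")
```

The sketch uses only the standard library so that it stays self-contained. With real survey data one would more likely combine an established implementation, for example scikit-learn's `Lasso` and `BaggingRegressor`, with a penalty chosen by cross-validation.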

Keywords: Regularization; LASSO; Non-response.



All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.