We show that the inconsistency of the Two-Sample Two-Stage Least Squares (TSTSLS) Intergenerational Earnings Elasticity (IGE) estimator is a two-way prediction problem involving the replication of (i) the variance of unseen parental earnings, and (ii) the endogeneity of unseen parental earnings in the equation of children's earnings. Concretely, we show that the TSTSLS estimator asymptotically recovers the OLS IGE when the first-stage R-squared, i.e., the share of explained variance of parental earnings, equals the share of explained endogeneity of parental earnings in the child's earnings equation. This condition leads to two notable outcomes with respect to previous findings in the literature: (i) perfect prediction of parental earnings is a specific instance of our condition, indicating that consistency can be attained even when parental earnings are predicted imperfectly; and (ii) exogenous instruments alone are insufficient to guarantee asymptotic equivalence between TSTSLS and OLS IGE estimates. Furthermore, our condition suggests that strong first-stage instruments might amplify TSTSLS bias if they are also strongly endogenous in the child's earnings equation. This last result provides a formal criterion for choosing first-stage predictors under the assumption that TSTSLS IGE estimates exhibit upward bias. Additionally, we theoretically study the biases of the two-sample stochastic multiple imputation and cell multiple imputation (MI) procedures, identifying conditions under which MI procedures outperform the traditional TSTSLS estimator. Finally, we validate our results through an empirical Monte Carlo exercise using administrative data from the Chilean formal private sector.
JEL Classification: J31, J61, J62.
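
The consistency condition stated in the abstract can be sketched with a short derivation. The notation below is an assumption made for illustration and is not taken from the paper: y is the child's earnings, x the parent's earnings, \hat{x} the population linear projection of x on the first-stage predictors, v the projection residual, and \varepsilon the error in the child's earnings equation. All variables are taken as demeaned, and both samples are assumed to be drawn from the same population, so that the first-stage coefficients are consistently estimated.

```latex
% Illustrative sketch under assumed notation (requires amsmath).
% Assumes Var(\hat{x}) > 0 and Cov(x, \varepsilon) \neq 0.
\begin{align*}
y &= \beta x + \varepsilon, \qquad
x = \hat{x} + v, \qquad
\operatorname{Cov}(\hat{x}, v) = 0, \\[4pt]
R^2 &= \frac{\operatorname{Var}(\hat{x})}{\operatorname{Var}(x)}, \\[4pt]
\operatorname{plim}\, \hat{\beta}_{OLS}
  &= \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}
   = \beta + \frac{\operatorname{Cov}(x, \varepsilon)}{\operatorname{Var}(x)}, \\[4pt]
% Uses Cov(\hat{x}, x) = Var(\hat{x}), which follows from Cov(\hat{x}, v) = 0.
\operatorname{plim}\, \hat{\beta}_{TSTSLS}
  &= \frac{\operatorname{Cov}(\hat{x}, y)}{\operatorname{Var}(\hat{x})}
   = \beta + \frac{\operatorname{Cov}(\hat{x}, \varepsilon)}{\operatorname{Var}(\hat{x})}, \\[4pt]
\operatorname{plim}\, \hat{\beta}_{TSTSLS} - \operatorname{plim}\, \hat{\beta}_{OLS}
  &= \frac{\operatorname{Cov}(x, \varepsilon)}{\operatorname{Var}(x)}
     \left(
       \frac{\operatorname{Cov}(\hat{x}, \varepsilon) / \operatorname{Cov}(x, \varepsilon)}{R^2} - 1
     \right).
\end{align*}
```

Under this reading, the gap between the two limits vanishes exactly when R^2 equals Cov(\hat{x}, \varepsilon) / Cov(x, \varepsilon), which plays the role of the share of explained endogeneity. Perfect prediction (\hat{x} = x) sets both shares to one and is therefore a special case. Exogenous predictors (Cov(\hat{x}, \varepsilon) = 0) give a TSTSLS limit of \beta, which differs from the OLS limit whenever Cov(x, \varepsilon) is nonzero.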