Federated learning carries out cooperative training without sharing local data, and the resulting global model generally performs better than independently trained local models. By dispensing with data sharing, federated learning preserves the privacy of local users. However, the performance of the global model can degrade when clients hold non-IID training data, because the differing distributions of local data cause the weights of local models to diverge. In this paper, we introduce a novel teacher-student framework to alleviate the negative impact of non-IID data. On the one hand, we retain the privacy-preserving advantage of federated learning; on the other hand, we exploit the accuracy advantage of centralized learning. We use unlabeled data and global models as teachers to generate a pseudo-labeled dataset, which significantly improves the performance of the global model; at the same time, the global model, acting as a teacher, provides more accurate pseudo labels. In addition, we perform a model rollback to mitigate the impact of latent noisy labels and data imbalance in the pseudo-labeled dataset. Extensive experiments verify that our teacher ensemble yields more robust training, and the empirical study shows that relying on centralized pseudo-labeled data makes the global model almost immune to non-IID data.
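As a minimal illustration of the pseudo-labeling step described above, the sketch below uses a global model as a teacher to label unlabeled samples, keeping only high-confidence predictions. This is not the paper's implementation: the linear classifier, the `pseudo_label` helper, and the confidence threshold are all illustrative assumptions standing in for the actual teacher model.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class dimension.
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def pseudo_label(global_weights, unlabeled_x, threshold=0.9):
    """Assign pseudo labels to unlabeled samples with the global model
    as teacher (here a simple linear classifier, an assumption for
    illustration), keeping only predictions whose confidence exceeds
    the threshold to limit noisy labels."""
    probs = softmax(unlabeled_x @ global_weights)
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    keep = confidence >= threshold
    return unlabeled_x[keep], labels[keep]

# Toy usage: two confident samples are kept, one ambiguous sample
# (near-uniform class probabilities) is filtered out.
W = np.array([[5., -5.], [-5., 5.]])           # teacher weights (assumed)
X = np.array([[1., 0.], [0., 1.], [0.1, 0.1]])  # unlabeled pool
kept_x, kept_y = pseudo_label(W, X, threshold=0.9)
```

Thresholding on teacher confidence is a common safeguard in pseudo-labeling; the model rollback mentioned in the abstract would additionally discard updates when the pseudo-labeled data turns out to be too noisy or imbalanced.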