Convolutional neural networks (CNNs) with complex architectures and many layers achieve high accuracy. However, such complex networks are often unsuitable for direct deployment in applications that demand real-time detection. To address this, some researchers have proposed pruning to obtain a sparse model. Current pruning methods mainly focus on static pruning with a fixed pruning rate. In this paper, we propose an iterative pruning framework that compresses the model dynamically. To maintain model performance, we incorporate knowledge distillation into the whole process. Experimental results show that our framework effectively reduces the number of model parameters without greatly reducing accuracy, and that the framework is robust to the choice of structured pruning method.
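The two ingredients named above, iterative pruning with a growing (rather than fixed) pruning rate and a distillation loss to preserve accuracy, can be sketched as follows. This is a minimal illustrative sketch in NumPy, not the paper's actual implementation; the function names, the linear rate schedule, and the use of unstructured magnitude pruning are all assumptions.

```python
import numpy as np

def magnitude_prune(weights, rate):
    """Zero out the smallest-magnitude fraction `rate` of the weights
    (unstructured magnitude pruning, used here as a stand-in for any
    pruning criterion)."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * rate)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

def iterative_schedule(final_rate, steps):
    """Dynamic (iterative) pruning: ramp the rate up over several
    iterations instead of applying one fixed cut, so the model can
    recover between pruning steps."""
    return [final_rate * (i + 1) / steps for i in range(steps)]

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened teacher and student
    outputs -- the standard knowledge-distillation objective that would
    supervise fine-tuning after each pruning step."""
    def soft(z):
        e = np.exp((z - z.max(axis=-1, keepdims=True)) / T)
        return e / e.sum(axis=-1, keepdims=True)
    p, q = soft(teacher_logits), soft(student_logits)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))
```

In a full pipeline, each rate in `iterative_schedule` would be followed by fine-tuning the pruned student against the dense teacher using `distillation_loss`, which is what keeps accuracy from dropping as sparsity grows.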