Most existing research focuses on recognizing micro-expressions themselves, and few studies address the recognition of micro-expression action units. This is because facial action units in micro-expressions have low intensity and are therefore difficult to recognize. To solve this problem, we propose a micro-expression action unit recognition algorithm based on dynamic images and spatial pyramids. First, the video is passed through a dynamic image generation module, which produces a single dynamic image that summarizes the motion information contained in all frames. Then, given the subtle movements that characterize micro-expressions, semantic features at different levels are obtained through spatial pyramids. Because micro-expressions span a small range and are concentrated in local areas of the face, a regional feature network and an attention mechanism are applied to the image features of each pyramid layer. Finally, because the correlation between action units is weak, a separate model is trained for each action unit. Experiments on the CASME and CAS(ME)² datasets show that the proposed algorithm achieves better action unit recognition performance than other advanced methods.
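The abstract does not state how the dynamic image is computed. As a minimal sketch only, the first step could look like the following, assuming the widely used approximate rank pooling formulation in which frame t of a T-frame clip receives the linear weight 2t − T − 1. The function name `dynamic_image`, the weighting scheme, and the rescaling to [0, 255] are illustrative assumptions, not details confirmed by this work.

```python
import numpy as np


def dynamic_image(frames):
    """Collapse a clip of shape (T, H, W, C) into one image of shape (H, W, C).

    Hypothetical helper: uses linear approximate rank pooling weights,
    which is an assumption and not necessarily the authors' module.
    """
    frames = np.asarray(frames, dtype=np.float64)
    num_frames = frames.shape[0]

    # Weight for frame t (1-indexed): alpha_t = 2t - T - 1.
    # Early frames get negative weights, late frames positive ones,
    # so static content cancels and temporal change is emphasized.
    t = np.arange(1, num_frames + 1)
    alpha = 2 * t - num_frames - 1

    # Weighted sum over the time axis.
    image = np.tensordot(alpha, frames, axes=(0, 0))

    # Rescale to [0, 255]; a clip with no motion yields an all-zero image.
    low, high = image.min(), image.max()
    if high > low:
        return (image - low) / (high - low) * 255.0
    return np.zeros_like(image)


# Toy clip: 4 frames of 2x2 pixels where only the top-left pixel brightens.
clip = np.zeros((4, 2, 2, 1))
clip[:, 0, 0, 0] = [0, 1, 2, 3]
result = dynamic_image(clip)
```

In this toy clip only the top-left pixel changes over time, so it is the only pixel that stands out in `result`. A clip with identical frames produces an all-zero image because the weights sum to zero.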