Accurate short-term wind speed prediction is of great significance for wind power generation. Because traditional wind speed prediction methods are insufficient at mining the nonlinear features of the data, an improved nonlinear time series prediction method is proposed that combines Variational Mode Decomposition (VMD) and a deep learning model (CNN-BiLSTM-AttNTS) with the Nutcracker Optimization Algorithm (NOA). First, NOA is used to optimize VMD and CNN-BiLSTM, respectively. Second, NOA-VMD is applied to decompose the wind speed data into different Intrinsic Mode Functions (IMFs). Then, phase space reconstruction (PSR) is used to identify the chaotic characteristics of the components. Finally, the NOA-CNN-BiLSTM-AttNTS model is built to predict wind speed. Compared with traditional machine learning methods and state-of-the-art hybrid models under the same hyperparameter and network structure settings, the proposed NOA-VMD-CNN-BiLSTM-AttNTS combined model achieves an R-squared exceeding 90%, with good prediction accuracy and generalization performance. The results can provide reference and guidance for short-term wind speed prediction.
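As an illustration of the PSR step, the following is a minimal sketch of time-delay embedding, the standard way to reconstruct a phase space from a scalar series. It is not the implementation used in this work: the function name `delay_embed`, the embedding dimension `dim`, the delay `tau`, and the toy series are all assumptions made for illustration, since the abstract does not state how these are chosen.

```python
def delay_embed(series, dim, tau):
    """Build delay vectors [x[t], x[t + tau], ..., x[t + (dim - 1) * tau]].

    `dim` (embedding dimension) and `tau` (delay in samples) are
    hypothetical parameters; the abstract does not specify them.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    # Number of complete delay vectors that fit in the series.
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series is too short for this dim and tau")
    return [[series[t + i * tau] for i in range(dim)] for t in range(n_vectors)]


# Toy stand-in for one decomposed component (not real wind speed data).
toy_component = list(range(10))
vectors = delay_embed(toy_component, dim=3, tau=2)
print(len(vectors), vectors[0], vectors[-1])
```

Each row of `vectors` is one point in the reconstructed phase space. In a pipeline like the one described, such rows could serve as input windows for the predictor, though that wiring is an assumption here.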