Cloud detection is an important step in remote sensing image processing and a prerequisite for the subsequent analysis and interpretation of remote sensing images. Compared with traditional cloud detection methods, deep learning-based methods can improve detection accuracy by learning the features of remote sensing images automatically. However, it remains difficult to distinguish accurately between clouds and snow, which are very similar in color, texture, and other characteristics. In this paper, deep features of cloud and snow are extracted from remote sensing images, and an accurate cloud and snow detection method is proposed that builds on the feature-fusion strengths of the UNet 3+ network. First, color space conversion is performed on the remote sensing images: RGB images and HIS images are both used as input to the UNet 3+ network, and features are extracted separately from each color space to enhance the color and texture differences between cloud and snow. Second, ResNet50 replaces the original feature extraction network of UNet 3+ to extract deeper image features, and the Convolutional Block Attention Module (CBAM) is added to ResNet50 to improve the network's attention to cloud and snow. Finally, a weighted cross-entropy loss is constructed to address the class imbalance caused by the high proportion of background area in the images. The proposed method is trained and tested on a cloud and snow detection dataset of remote sensing images constructed for this purpose. The results show that the proposed method has strong adaptability and moderate computational cost, effectively suppresses various kinds of interference in remote sensing images of different landforms, and accurately detects cloud and snow.
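
To illustrate the weighted cross-entropy idea mentioned above, the following is a minimal NumPy sketch, not the implementation used in this work. It assumes three classes indexed as 0 = background, 1 = cloud, 2 = snow, and it assumes inverse-frequency class weights; the abstract does not state how the weights are actually chosen. The function names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical.

```python
import numpy as np


def inverse_frequency_weights(labels, num_classes):
    """Per-class weights proportional to 1 / pixel frequency (an assumed scheme).

    Classes absent from `labels` get weight 0. The weights of the classes
    that are present are rescaled so that they sum to the number of
    present classes.
    """
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(np.float64)
    present = counts > 0
    weights = np.zeros(num_classes, dtype=np.float64)
    weights[present] = counts.sum() / counts[present]
    return weights * present.sum() / weights.sum()


def weighted_cross_entropy(logits, labels, class_weights):
    """Weighted cross-entropy over a label map.

    logits: float array of shape (H, W, C) with raw class scores.
    labels: int array of shape (H, W) with class indices in [0, C).
    Returns the weight-normalized mean of the per-pixel losses.
    """
    # Numerically stable log-softmax along the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Log-probability assigned to the true class of each pixel.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    pixel_weights = class_weights[labels]
    return float(-(pixel_weights * picked).sum() / pixel_weights.sum())


if __name__ == "__main__":
    # Toy 4x4 label map dominated by background (hypothetical data).
    labels = np.zeros((4, 4), dtype=np.int64)
    labels[0, :3] = 1  # three cloud pixels
    labels[1, 0] = 2   # one snow pixel
    weights = inverse_frequency_weights(labels, num_classes=3)
    logits = np.random.default_rng(0).normal(size=(4, 4, 3))
    print("class weights:", weights)
    print("loss:", weighted_cross_entropy(logits, labels, weights))
```

With this weighting, rare cloud and snow pixels contribute more to the loss than the abundant background pixels, which is the effect the weighted loss is intended to achieve. Other weighting schemes (for example, median-frequency balancing) would fit the same function signature.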