Deep convolutional neural networks (CNNs) have brought great improvements to single image super-resolution (SISR). However, most existing pre-trained SISR models can only reconstruct low-resolution (LR) images at a single scale, and their upsampling factors cannot be non-integers, which limits their application in practical scenarios. In this paper, we propose a multi-scale cross-fusion network (MCNet) to accomplish the super-resolution task of images at arbitrary scales. Specifically, we construct a multi-scale cross-fusion module (MSCF) that uses deep feature maps of different sizes for interactive learning, enriching spatial information and removing redundant noise. Extensive experiments on four benchmark datasets show that the proposed method obtains better super-resolution results than existing arbitrary-scale methods in both quantitative evaluation and visual comparison.