Image sensing often relies on a high-quality machine vision system with a large field of view and high resolution. It requires fine imaging optics, has high computational costs, and requires large communication bandwidth between image sensors and computing units. In this paper, we propose a novel image-free sensing framework for resource-efficient image classification, where the required number of measurements can be reduced by up to two orders of magnitude. In the proposed framework for single-pixel detection, the optical field for a target is first scattered by an optical diffuser and then two-dimensionally modulated by a spatial light modulator. The optical diffuser simultaneously serves as a compressor and an encryptor for the target information, effectively narrowing the field of view and improving the system’s security. The one-dimensional sequence of intensity values, which is measured with time-varying patterns on the spatial light modulator, is then used to extract semantic information based on end-to-end deep learning. The proposed sensing framework is shown to obtain over a 95% accuracy at sampling rates of 1% and 5% for classification on the MNIST dataset and the recognition of Chinese license plates, respectively, and the framework is up to 24% more efficient than the approach without an optical diffuser. The proposed framework represents a significant breakthrough in high-throughput machine intelligence for scene analysis with low bandwidth, low costs, and strong encryption.