Compressive sensing (CS)1–3 is a sensing technique that allows sub-Nyquist sampling rates of natural signals. CS is therefore very useful for imaging systems that require a large acquisition effort, such as capturing large or multidimensional images, or performing exhaustive scans1. A principal property of CS theory is that it provides the user with guidelines on the number of compressive samples needed to reconstruct an image of a given size, N. However, a common problem CS practitioners encounter is that it is difficult to predict, before the acquisition, what reconstruction quality will be required. If the acquisition is designed around an overly optimistic (i.e., too small) N, the resulting reconstruction may not exhibit sufficient detail. The sampling process must then be repeated from scratch with finer-detailed compressive patterns, assuming a larger N. Figure 1 demonstrates such a scenario: iteratively refined compressive imaging4 of a multi-story building. Let us assume, for example, that our task is to count the number of floors in buildings across a city. For low-rise buildings, a 64 by 64 pixel image is usually sufficient, so there is no need to always sample at the highest resolution. However, when an image of a tall building is taken, we may find the upper floors indistinguishable. We would then change the CS patterns to match the increased resolution and sample the image again, repeating this process until a satisfactory image is obtained. There are two main problems with this practice. First, the acquisition process is extremely inefficient, because the samples taken in each refinement step are agnostic to the information captured in the previous trial. Second, a large reconstruction effort is required, because repeatedly reconstructing the image with conventional iterative algorithms is extremely time consuming.
In this paper, we propose a new CS procedure that prescribes which set of samples should be added to the previously captured ones in order to improve the resolution by the desired amount. For this scenario, we developed a deep learning (DL) approach for fast and efficient image reconstruction.
With our approach, we sample the image progressively, capturing only the necessary additional finer-scale sensing patterns while still using the previously acquired coarse-detail information. As we increase the resolution, we decrease the sampling ratio (the fraction of compressive samples per pixel), since higher-resolution images are more compressible. In the example in Figure 1, if the resolution is still not sufficient to count the number of floors, we increase the resolution again while further decreasing the sampling ratio.
This scanning scenario is especially relevant for systems such as light detection and ranging (LIDAR). For LIDAR systems designed for very long-range 3D scanning (more than 10 km), it is often necessary to increase the resolution in order to detect certain hidden objects. Therefore, instead of re-scanning the scene from scratch at a higher resolution, it is much more efficient to add patterns while reusing the previously acquired information, thereby reducing the sampling time.
Notice in Figure 1 that the number of compressive samples, M, necessary to sample the image at 1024 by 1024 resolution is around \(10^{5}\), which is almost half of the total number of samples taken, denoted by \(M_{tot}\). Therefore, if we were to take the compressive samples from scratch at each resolution, the total number of samples would be almost double the number needed had we sampled at the highest resolution from the start. Instead, with our proposed method, we sample the image progressively and add only the samples needed to reach the next resolution, so that the total equals the number of samples required had we sampled at the highest resolution from the start.
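To make the sample-budget argument concrete, the short sketch below tallies the progressive and from-scratch sample counts across scales. The per-scale sampling ratios are assumed values chosen only so that the totals land in the regime described above; they are not taken from Figure 1.

```python
# Illustrative sample-budget tally for a progressive multiscale acquisition.
# The per-scale sampling ratios below are assumptions for illustration only.

resolutions     = [64, 128, 256, 512, 1024]    # image side length at each scale
sampling_ratios = [1.0, 0.75, 0.4, 0.2, 0.1]   # assumed M/N at each scale

cumulative = 0      # samples actually acquired so far (progressive scheme)
scratch_total = 0   # samples a from-scratch capture at every scale would need
for side, ratio in zip(resolutions, sampling_ratios):
    m_scale = int(ratio * side * side)  # samples needed at this resolution
    m_added = m_scale - cumulative      # only the additional finer-scale samples
    cumulative += m_added
    scratch_total += m_scale
    print(f"{side:4d}x{side:<4d}  M_scale={m_scale:7d}  added={m_added:7d}  M_progressive={cumulative:7d}")

print(f"from scratch at every scale: {scratch_total}   progressive: {cumulative}")
# The progressive total equals the sample count of the finest scale alone,
# while re-sampling from scratch at every refinement roughly doubles it.
```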
An important property of the proposed method is that it allows us to compressively sample large images. In general, compressive sampling of high-resolution images is challenging, during both the sampling stage and the reconstruction stage. In order to sample high-resolution images, the sensing patterns have to be generated on the fly, because storing pre-set patterns in memory is not feasible in practice. For example, to compress the 1024 by 1024 image in Figure 1 by a ratio of 10:1 using random binary patterns5, a sensing matrix with more than \(10^{11}\) entries is necessary, which, if stored in double precision, requires almost 1 TB of computer RAM. To solve this, the Hadamard basis6, 7 can be used, as it has a fast generative formula and thus eliminates the need to store the whole set of patterns in memory. Here, we combine the multiscale property of the Hadamard basis8 with the multiscale CS sampling concept9 to selectively use the set of Hadamard samples required for each resolution refinement. To facilitate a systematic process that is scalable to large images, we developed a simple method for choosing the multiscale samples needed to capture the 2D compressed images, and we proved the multiscale property in two dimensions (Supplement A).
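As a rough illustration of how a memory-free Hadamard sensing operator can be realized, the sketch below applies a separable 2D fast Walsh-Hadamard transform and keeps only a chosen subset of coefficients as the compressive samples. The index selection shown (random indices) is a placeholder assumption and does not reproduce the multiscale selection method of this paper.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform along axis 0 (Sylvester/natural ordering).

    Runs in O(N log N) and never forms the N x N Hadamard matrix, so the
    sensing patterns are effectively generated on the fly.
    """
    y = np.asarray(x, dtype=np.float64).copy()
    n = y.shape[0]
    assert n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = y[i:i + h].copy()
            b = y[i + h:i + 2 * h].copy()
            y[i:i + h] = a + b
            y[i + h:i + 2 * h] = a - b
        h *= 2
    return y

def hadamard_sense(image, sample_idx):
    """Return compressive Hadamard samples of a (2^k x 2^k) image.

    `sample_idx` holds the 2D coefficient indices kept at the current scale;
    the multiscale selection rule of the paper is not reproduced here.
    """
    coeffs = fwht(fwht(image).T).T        # separable 2D Hadamard transform
    return coeffs[tuple(sample_idx.T)]    # keep only the selected coefficients

# A 1024 x 1024 image compressed ~10:1 needs ~1e5 coefficients, yet peak
# memory stays at O(N) instead of the ~1e11-entry explicit sensing matrix.
rng = np.random.default_rng(0)
img = rng.random((1024, 1024))
idx = rng.integers(0, 1024, size=(10**5, 2))   # placeholder index set
samples = hadamard_sense(img, idx)
print(samples.shape)                           # (100000,)
```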
Another important property of the proposed approach is that it dramatically reduces the reconstruction time compared to classical iterative CS algorithms. This is achieved by a DL reconstruction algorithm that we developed for our scenario. Recently, several DL methods have been applied in the field of CS to reduce the reconstruction time relative to iterative minimization methods10–17, and even real-time CS has been demonstrated for low-resolution images16. Most of these methods reconstruct the image from the compressive samples with a fully connected first layer that maps the samples to the image domain. They work fairly well for low-resolution (e.g., 128 by 128 pixels), highly compressed images, or for sensing and reconstruction in patches. However, the existing methods are not designed to handle both high-resolution compressed images and a progressive change in resolution. In principle, DL methods developed for small images can be applied to large images by dividing the compressive image sampling into small, compressed patches. However, such a solution is suboptimal in the CS sense, because the number of compressive samples M needed to reconstruct a signal of length N is proportional to \(\log\left(N\right)\)1–3; the compressibility therefore increases with the signal length, and dividing the image into small parts significantly reduces it. On the other hand, if we wish to reconstruct the full-resolution image without patch-wise processing, the size of the high-resolution images imposes severe memory limitations during network training. We solve this by reconstructing the image from coarse to fine resolution, following the progressive acquisition process. This enables training on small patches that reconstruct only the high-frequency details at each scale of the compressed image, while the low-frequency details are inherited from the previous, coarser scales. Thus, our approach offers both the advantage of sampling the complete field of view of the image with spatially multiplexed Hadamard patterns and the advantage of a DL reconstruction that is not limited to patches.
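The following minimal sketch illustrates only the coarse-to-fine idea described above: each refinement stage receives the previous-scale reconstruction together with an image-domain representation of the newly added samples, and predicts the missing high-frequency detail as a residual. The architecture, layer sizes, and the back-projection input are assumptions made for illustration; they are not the network used in this paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleRefiner(nn.Module):
    """One refinement stage of a coarse-to-fine reconstruction (illustrative only).

    Because every stage operates on images rather than on a full-resolution
    measurement vector, it can be trained on small patches: the low frequencies
    are inherited from the coarser scale, and only the residual detail is learned.
    """

    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, coarse, backprojection):
        # Upsample the previous-scale reconstruction to the finer grid.
        up = F.interpolate(coarse, scale_factor=2, mode="bilinear", align_corners=False)
        # Predict high-frequency detail from the upsampled image and the
        # (assumed) back-projection of the newly added measurements.
        residual = self.body(torch.cat([up, backprojection], dim=1))
        return up + residual

# Patch-wise usage example with random tensors standing in for real data.
refiner = ScaleRefiner()
coarse_patch = torch.randn(8, 1, 32, 32)     # previous-scale patches
backproj_patch = torch.randn(8, 1, 64, 64)   # back-projected new samples, finer scale
fine_patch = refiner(coarse_patch, backproj_patch)
print(fine_patch.shape)                      # torch.Size([8, 1, 64, 64])
```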
We analyze our progressive CS method in numerical simulations and test it experimentally with a single-pixel camera (SPC) system. The SPC is very useful for applications such as infrared imaging, microscopy, ultrasonic imaging, 3D LIDAR imaging, hyperspectral imaging, and many more1, 18–25. We demonstrate around a tenfold improvement in reconstruction time and around a 1.5 dB improvement in peak signal-to-noise ratio (PSNR) over the iterative reconstruction method in simulation studies on 512 by 512 test images, as well as an 18-fold improvement in reconstruction time and around a 1.8 dB improvement in PSNR on a 2048 by 2048 example test image. To the best of our knowledge, this is the first time a full 4-megapixel compressive image has been reconstructed with a DL method.