Deep learning for digital pathology is hindered by the extremely high spatial resolution of whole slide images (WSIs). Most studies adopt patch-based methods, which require well-annotated data for training; such annotations are typically produced by laborious free-hand contouring of the WSI by experts. To alleviate the annotation burden on experts while still benefiting from scaling up the amount of training data, we develop a whole-slide training method that classifies lung cancer types from entire WSIs using only slide-level diagnoses. Our method leverages unified memory to offload excess memory consumption to host memory, enabling a classifier to be trained on entire slides of hundreds of millions of pixels. Experiments were conducted on a lung cancer dataset containing 9,662 digital slides covering the main cancer types. The proposed method achieved AUCs of 0.950 and 0.924 for adenocarcinoma and squamous cell carcinoma, respectively, on a separate testing set. Furthermore, critical regions highlighted by applying the class activation map (CAM) technique to our model show a high correspondence with cancerous areas annotated by pathologists.
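The abstract does not name a framework, but the unified-memory offloading it describes corresponds to CUDA managed memory, in which an allocation may exceed device memory and its pages migrate between host RAM and the GPU on demand. The following is a minimal sketch of that mechanism in plain CUDA, not the authors' implementation; the buffer size, the `scale` kernel, and the memory-advice hint are illustrative assumptions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Toy kernel standing in for one pass of a network over a gigapixel slide buffer.
__global__ void scale(float *data, size_t n, float factor) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    // Hypothetical tensor: a 50,000 x 50,000 x 3 float32 slide (~30 GB),
    // larger than the memory of a typical GPU.
    size_t n = 50000ULL * 50000ULL * 3ULL;
    float *slide = nullptr;

    // cudaMallocManaged creates a unified (managed) allocation: pages migrate
    // on demand between host and device, so the working set can exceed GPU RAM.
    cudaError_t err = cudaMallocManaged(&slide, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "allocation failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Optional hint: keep pages resident in host memory until the GPU touches them.
    cudaMemAdvise(slide, n * sizeof(float),
                  cudaMemAdviseSetPreferredLocation, cudaCpuDeviceId);

    // The kernel touches the whole buffer; the driver pages data in and out of
    // device memory as needed, trading bandwidth for capacity.
    size_t threads = 256;
    size_t blocks = (n + threads - 1) / threads;
    scale<<<(unsigned)blocks, (unsigned)threads>>>(slide, n, 0.5f);
    cudaDeviceSynchronize();

    cudaFree(slide);
    return 0;
}
```

In a training setting, the same idea lets activations and gradients of a whole-slide forward/backward pass spill to host memory instead of exhausting device memory, at the cost of page-migration overhead.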