Human subjects
The subjects used in this study were patients who were clinically diagnosed with COVID-19 based on their reverse transcription polymerase chain reaction (RT-PCR) test results (Extended Data Table 1). Blood samples were collected as residual coagulation test samples (with 3.2% citrate) after the completion of requested clinical laboratory tests for diagnosing COVID-19 at the University of Tokyo Hospital. The negative control group was composed of 4 healthy subjects. Blood samples from the healthy subjects were drawn multiple times on 67 different dates for preparing control samples. Likewise, the positive control group was composed of 7 hospitalized patients under no anticoagulant therapy and with no abnormality confirmed by their coagulation tests, which indicated PT-INR (prothrombin time and international normalized ratio), APTT (activated partial thromboplastin time), and fibrinogen levels of 0.88 - 1.10, 24 - 34 sec, and 168 - 355 mg/dL, respectively, and D-dimer levels of less than 1 µg/mL. Subjects under anticoagulant therapy were excluded. Clinical information and laboratory test data were obtained from the electronic medical records of the patients using a standardized data collection form. For comparison, blood samples from subjects with other diseases that elevate D-dimer levels, such as cancer, thrombosis, and sepsis were also analyzed. This study was conducted with the approval of the Institutional Ethics Committee in the School of Medicine at the University of Tokyo [no. 11049, no. 11344] in compliance with the relevant guidelines and regulations. Written informed consent for participation in the study was obtained from the patients using an opt-out process on the webpage of the University of Tokyo Hospital. Patients who refused participation in our study were excluded. Written informed consent was obtained from the healthy subjects as well. 
The demographics, clinical characteristics, and laboratory findings of patients with COVID-19 are shown in Extended Data Table 1. The demographics, clinical characteristics, and laboratory findings of patients with other diseases (e.g., cancer, non-COVID-19 thrombosis, non-COVID-19 infectious diseases) are shown in Extended Data Table 3.
Sample preparation
Single platelets and platelet aggregates were enriched from whole blood by density-gradient centrifugation to maximize the efficiency of detecting platelets and platelet aggregates, as described in our previous report32 with minor modifications. As shown in Extended Data Figure 2, for analyzing the concentration of platelet aggregates, 500 µL of blood was diluted with 5 mL of saline. Platelets were isolated by using Lymphoprep (STEMCELLS, ST07851), a density-gradient medium, based on the protocol provided by the vendor. Specifically, the diluted blood was added to the Lymphoprep medium and centrifuged at 800 g for 20 min. After the centrifugation, 500 µL of the sample was taken from the mononuclear layer. Platelets were immunofluorescently labeled by adding 10 µL of anti-CD61-PE (Beckman Coulter, IM3605) and 5 µL of anti-CD45-PC7 (Beckman Coulter, IM3548) to the blood sample to ensure the detection of all platelets or platelet aggregates in the sample. Then, 500 µL of 2% paraformaldehyde (Wako, 163–20145) was added for fixation. The operation was performed at room temperature.
Laboratory tests
Blood samples from the patients were used for routine laboratory tests, such as leukocyte count, platelet count, alanine transaminase (ALT) concentration, creatinine concentration, lactate dehydrogenase concentration, C-reactive protein concentration, and D-dimer level (Extended Data Table 1). All the D‐dimer tests were conducted on a CN6500 automatic coagulation analyzer (Sysmex, Japan) with a latex‐enhanced photometric immunoassay (LIAS AUTO D-dimer NEO, Sysmex, Japan). The laboratory reference range of the D-dimer test at the University of Tokyo Hospital was 0 - 1.0 µg/mL.
Intelligent platelet morphometry machine
The intelligent platelet morphometry machine consists of two main parts: image acquisition and digital image analysis (Figure 1a, Extended Data Figure 1a). For image acquisition, the optical frequency-division-multiplexed (FDM) microscope was used to acquire high-speed, blur-free, bright-field images of flowing cells (e.g., red blood cells, leukocytes, platelets, platelet aggregates, cell debris) with 67 x 67 pixels per image at a throughput of 100 - 250 events per sec (eps), where an event is defined as a single platelet or a platelet aggregate because red blood cells, leukocytes, and cell debris were not detected as events. This throughput was chosen to avoid clogging the microchannel, although the theoretical throughput of the machine was >10,000 eps. The FDM microscope used a broadband, spatially distributed, optical frequency comb to illuminate flowing cells in a microfluidic channel. The image-encoded time-domain signal was detected with a single-pixel photodetector, converted into a serial digital data stream by a digitizer, and recovered as bright-field images by a home-made LabVIEW program. For each blood sample, the object areas in 25,000 acquired images were plotted in a size distribution histogram.
Optical frequency-division-multiplexed microscope
The FDM microscope is a high-speed, blur-free, bright-field imaging system based on a spatially distributed optical frequency comb as the optical source and a single-pixel photodetector as the image sensor35, 36. Since the optical frequency comb is composed of multiple beams which are spatially distributed, it is capable of simultaneously interrogating the one-dimensional spatial profile of a target object (e.g., a platelet, a platelet aggregate). In addition, since each discrete beam of the optical frequency comb is tagged by a different modulation frequency, a spatial-profile-encoded image can be retrieved by performing Fourier transformation on the time-domain waveform detected by the single-pixel detector. As shown in Extended Data Figure 1b, we used a continuous-wave laser (Cobolt Calypso, 491 nm, 100 mW) as the laser source. Emitted light from the laser was split by a beam splitter, deflected and frequency-shifted by acousto-optic deflectors (Brimrose TED-150-100-488, 100-MHz bandwidth), and recombined by another beam splitter. The resultant optical frequency comb was focused by an objective lens (Olympus UPLSAPO20X, NA:0.75) onto objects (e.g., single platelets, platelet aggregates) flowing at 1 m/s in a customized hydrodynamic-focusing microfluidic channel (Hamamatsu Photonics). Light transmitted through the flowing objects was collected by an avalanche photodiode (Thorlabs APD430A/M) and processed by a home-made LabVIEW program to reconstruct the bright-field images. The line scan rate, spatial resolution, field of view, and number of pixels were 3 MHz, 0.8 μm, 53.6 μm x 53.6 μm, and 67 x 67 pixels, respectively. Fluorescence emitted from platelets labeled by anti-CD61-PE was also collected and used for triggering image acquisitions. Fluorescence emitted from leukocytes labeled by anti-CD45-PC7 was collected to identify platelet aggregates containing leukocytes.
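The frequency-tagging principle described above can be illustrated with a minimal sketch. It is not the LabVIEW reconstruction used in this study: the sampling rate, window length, number of beams, and modulation frequencies below are hypothetical values chosen so that each beam falls exactly on one Fourier bin. Each beam's transmitted amplitude is read back from the magnitude of the Fourier component at its tag frequency, which gives one line of the image.

```python
import numpy as np

def recover_line_profile(waveform, sample_rate_hz, beam_freqs_hz):
    """Return the amplitude carried by each beam's modulation frequency."""
    n = len(waveform)
    # Scale so that a unit-amplitude sinusoid yields a magnitude of 1.
    spectrum = np.fft.rfft(waveform) / (n / 2)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate_hz)
    # Pick the Fourier bin closest to each beam's tag frequency.
    bins = [int(np.argmin(np.abs(freqs - f))) for f in beam_freqs_hz]
    return np.abs(spectrum[bins])

# Hypothetical acquisition parameters (not taken from the study).
sample_rate_hz = 1.0e9
n_samples = 1000                               # 1 us window, 1 MHz bin spacing
beam_freqs_hz = 100e6 + 1e6 * np.arange(8)     # 8 beams tagged at 100..107 MHz

# Hypothetical transmission of each beam through an object (1.0 = no object).
transmission = np.array([1.0, 1.0, 0.6, 0.2, 0.2, 0.7, 1.0, 1.0])

# Single-pixel detector signal: sum of all frequency-tagged beams.
t = np.arange(n_samples) / sample_rate_hz
waveform = sum(a * np.cos(2 * np.pi * f * t)
               for a, f in zip(transmission, beam_freqs_hz))

profile = recover_line_profile(waveform, sample_rate_hz, beam_freqs_hz)
print(np.round(profile, 3))
```

In this sketch, repeating the transform on successive time windows while the object flows through the illumination would stack the recovered lines into a two-dimensional image.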
Statistical analysis
The regions of objects (i.e., platelets, platelet aggregates) in bright-field images were segmented in MATLAB to calculate the concentration of platelet aggregates in each sample. First, a 10x interpolation by the interp2 function (MATLAB) was applied to each image before segmentation. Then, the outlines of the object regions were detected by using the edge detection function with the Canny method in MATLAB. Morphological operations such as dilation, filling, and erosion were applied to fill and refine the object regions, yielding their masks and eliminating background noise. After the segmentation, the area of the object (i.e., a single platelet, a platelet aggregate) in each image was calculated by multiplying the number of pixels in the segmented region, which was extracted by the regionprops function, by the area of one pixel after interpolation (80 nm x 80 nm). All images with object areas larger than 48 µm² were considered images of platelet aggregates in all the patient and healthy subject (control) samples. The concentration of platelet aggregates was defined as the ratio of the number of acquired images containing platelet aggregates to the total number of acquired images containing single platelets and platelet aggregates (n = 25,000) in each sample. Of the 110 patient datasets, 106 contained 25,000 images, while 4 (no. 6, 10, 12, 14) contained 20,000 images due to a data-recording error; this should not affect the statistical accuracy of our data analysis because the number of acquired images is sufficiently large. The presence of leukocytes in platelet aggregates was also identified by analyzing the morphology of the platelet aggregates in their images. For confirmation, the fluorescence signals of anti-CD61-PE and anti-CD45-PC7 were also used. CD61-PE/CD45-PC7 double-positive events were counted as platelet aggregates containing leukocytes.
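The size gating described above can be sketched as follows, in Python rather than the MATLAB used in this study. The sketch assumes that the object area equals the number of segmented pixels multiplied by the area of one interpolated pixel (80 nm x 80 nm), and the pixel counts are hypothetical examples, not measured data. The function names are illustrative only.

```python
PIXEL_PITCH_UM = 0.08        # 80 nm/pixel after 10x interpolation (from the text)
AGGREGATE_AREA_UM2 = 48.0    # area threshold for a platelet aggregate (from the text)

def object_area_um2(pixel_count):
    """Area of a segmented region, assuming square pixels of 80 nm pitch."""
    return pixel_count * PIXEL_PITCH_UM ** 2

def aggregate_concentration(pixel_counts):
    """Fraction of events whose segmented area exceeds the aggregate threshold."""
    areas = [object_area_um2(c) for c in pixel_counts]
    n_aggregates = sum(area > AGGREGATE_AREA_UM2 for area in areas)
    return n_aggregates / len(areas)

# Hypothetical segmented-pixel counts for five events in one sample.
pixel_counts = [3000, 5200, 7600, 9100, 2500]
concentration = aggregate_concentration(pixel_counts)
print(f"concentration of platelet aggregates: {concentration:.2f}")
```

Under this assumption, the 48 µm² threshold corresponds to 7,500 pixels in the interpolated image.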
CD61-PE positive and CD45-PC7 negative events that had a CD61-PE signal intensity greater than a threshold value were counted as platelet aggregates excluding leukocytes. The presence of excessive platelet aggregates was determined by calculating the mean and standard deviation of the distribution of the concentration of platelet aggregates in the control samples and evaluating whether the concentration of platelet aggregates in a patient sample exceeded the threshold (mean + standard deviation). If a tighter threshold (mean + two standard deviations) was used to determine the presence of excessive platelet aggregates, the ratio of the number of patients with excessive platelet aggregates to the number of all patients was 75.5%, while the ratio of the number of patients with excessive platelet aggregates and low D-dimer levels (≤1 µg/mL) to the number of all patients with low D-dimer levels (≤1 µg/mL) was 62.8%.
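The threshold rule above can be sketched as follows. The concentrations are hypothetical values, not data from this study, and the use of the sample standard deviation (n - 1 denominator) is an assumption because the text does not state which estimator was used.

```python
from statistics import mean, stdev

def excess_threshold(control_concentrations, n_sd=1.0):
    """Threshold = mean + n_sd * standard deviation of the control samples.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return mean(control_concentrations) + n_sd * stdev(control_concentrations)

def fraction_exceeding(patient_concentrations, threshold):
    """Fraction of patient samples whose concentration exceeds the threshold."""
    n_excess = sum(c > threshold for c in patient_concentrations)
    return n_excess / len(patient_concentrations)

# Hypothetical concentrations of platelet aggregates (fractions of events).
controls = [0.02, 0.03, 0.04, 0.03, 0.03]
patients = [0.035, 0.040, 0.050, 0.100]

frac_1sd = fraction_exceeding(patients, excess_threshold(controls, n_sd=1.0))
frac_2sd = fraction_exceeding(patients, excess_threshold(controls, n_sd=2.0))
print(frac_1sd, frac_2sd)
```

Raising `n_sd` from 1 to 2 moves the threshold upward, so the fraction of patients classified as having excessive platelet aggregates can only stay the same or decrease.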
High-dimensional analysis
In both Figure 3i and Figure 3j, a fully connected classifier was trained on image data from non-COVID-19 patient blood samples under specific classes to build a visual feature extractor. The class label was represented as a one-hot vector. Images were selected by size gating and normalized to zero mean and unit variance. The images were input into VGG-16 with weights pre-trained on the ImageNet dataset. The output of the fourth pooling layer was flattened into an 8192-dimensional vector and input into the classifier. The classifier was optimized with the Adam optimizer using mean absolute error as the loss function. To improve classification accuracy for under-represented classes, the loss function was weighted by inverse class frequency. The learning rate was reduced from 0.01 by a factor of 0.31 when the validation loss stopped decreasing for more than 10 epochs. The training was terminated when there was no significant improvement for 100 epochs. The training and validation sets were randomly selected with an 80/20 ratio. After the training, the final layer was removed and the remaining layers were used as a feature extractor, which output a 48-dimensional vector. The dimension of the feature vectors was reduced by principal component analysis to the minimum number of principal components whose cumulative contribution rate was more than 0.99. For Figure 3i, 13,770 images of platelet aggregates from 16 thrombosis patients and 12,651 images of platelet aggregates from 7 infectious disease patients were used for the training and validation and displayed in the uniform manifold approximation and projection (UMAP) plot. Then, 150,116 images of platelet aggregates from 110 COVID-19 patients were added to the UMAP plot. The feature extractor was implemented in TensorFlow, using the VGG-16 model provided among the library's applications. Figure 3i was plotted with parameters n_neighbors = 1000 and min_dist = 1.
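The dimension-reduction criterion described above (keeping the fewest principal components whose cumulative contribution rate exceeds 0.99) can be sketched as follows. This is an illustration under assumptions, not the code used in this study: the feature matrix is synthetic random data standing in for the 48-dimensional feature vectors, and the function name `pca_reduce` is hypothetical.

```python
import numpy as np

def pca_reduce(features, min_cumulative=0.99):
    """Project onto the fewest principal components whose cumulative
    contribution rate (explained-variance ratio) exceeds min_cumulative."""
    centered = features - features.mean(axis=0)
    # Singular values give the variance captured by each principal component.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Index of the first component at which the cumulative rate exceeds the limit.
    first = int(np.searchsorted(cumulative, min_cumulative, side="right"))
    k = min(first + 1, len(s))
    return centered @ vt[:k].T, k

# Synthetic stand-in for 48-dimensional feature vectors (hypothetical data):
# three dominant latent directions plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3)) * np.array([10.0, 5.0, 2.0])
mixing = rng.normal(size=(3, 48))
features = latent @ mixing + 0.01 * rng.normal(size=(500, 48))

reduced, k = pca_reduce(features)
print(f"kept {k} of {features.shape[1]} dimensions")
```

The reduced vectors returned by such a function would then be the input to the UMAP embedding.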
For Figure 3j, 19,649 images of platelet aggregates from 15 patients with risk factors for arterial thrombosis (hypertension, diabetes, hyperlipidemia, smoking, and history of atherosclerosis) and 20,850 images of platelet aggregates from 18 patients with risk factors for venous thrombosis (cancer, post-surgery, long-term bed rest, obesity, and heart failure) were used for the training and validation and displayed in the UMAP plot. Patients in each group did not have risk factors for the disease of the other group. To show their relations with the diseases, 22,890 images of platelet aggregates from 22 COVID-19 patients without thrombosis-related underlying conditions were added to the UMAP plot. Figure 3j was plotted with n_neighbors = 4000 and min_dist = 1.
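The inverse-class-frequency weighting of the loss function mentioned in this section can be illustrated with the two image counts given for Figure 3j. The exact normalization is not stated in the text, so the sketch below assumes the common "balanced" convention, weight = total / (number of classes x class count). The class labels and the function name are placeholders.

```python
def inverse_frequency_weights(class_counts):
    """Per-class loss weights proportional to 1 / class frequency.

    Assumption: normalized as total / (n_classes * count), so that the
    weighted number of samples equals the unweighted total.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * count)
            for label, count in class_counts.items()}

# Image counts for the two training classes of Figure 3j (from the text);
# the labels are placeholders.
class_counts = {"group_15_patients": 19649, "group_18_patients": 20850}
weights = inverse_frequency_weights(class_counts)
for label, weight in weights.items():
    print(f"{label}: {weight:.4f}")
```

With this convention the smaller class receives the larger weight, so each class contributes equally to the total loss.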
Data availability
The source data (Source Data 1) supporting the findings of this study are available at http://doi.org/10.5281/zenodo.4700112 and are also available from the corresponding authors upon reasonable request.
Code availability
All the code used for data taking and analysis in this study is available at http://doi.org/10.5281/zenodo.4700072 and is also available from the corresponding authors upon reasonable request.