This study was approved by the Medical Research Ethics Committee of Tokyo Medical and Dental University, and written informed consent was obtained from all participants. The protocols were registered in the database of the National University Hospital Council of Japan (UMIN000031924, UMIN000032826) and disclosed.
Magnetic resonance imaging (MRI)
Images were acquired with a 3.0-T MRI scanner (Achieva 3.0T TX; Philips) using 16-channel flex coils. The cartilage data were extracted from sagittal images of the knee joint acquired with a fat-suppressed spoiled gradient echo (SPGR) sequence (repetition time, 20 msec; echo time, 1st 7 msec, 2nd 13.8 msec; matrix, 256 × 256; flip angle, 90 deg; slice thickness, 0.3 mm; field of view, 150 mm × 150 mm; actual water fat shift/bandwidth (WFS/BW), 2.002 pix/217.0 Hz; total examination time, 7 min 34 sec) (Fig. 2A). The meniscus and bone data were extracted with a proton density-weighted imaging 3D fast spin echo/turbo spin echo (PDWI 3D FSE/TSE) sequence (repetition time, 1000 msec; echo time, 35 msec; matrix, 256 × 256; flip angle, 35 deg; slice thickness, 0.3 mm; field of view, 150 mm × 150 mm; WFS/BW, 0.836 pix/519.4 Hz; total examination time, 7 min 30 sec) (Fig. 2B, Table 1).
Automatic segmentation algorithm of 3D MRI
A 3D convolutional neural network (3D-CNN) algorithm for segmentation of cartilage (Fig. 2C), meniscus (Fig. 2D), and bone was constructed based on U-Net, which contains an encoder and a decoder [19] (Fig. 3). The encoder contains four blocks, each consisting of two 3×3×3 convolution layers, a batch normalization layer, and a rectified linear unit layer. The first three blocks also have a max pooling layer with a stride of 2. The decoder contains three blocks, each consisting of an up-sampling layer, a fusion layer, and two 3×3×3 deconvolution layers. We used two 3×3×3 convolutions instead of one 5×5×5 convolution because they achieve the same receptive field with fewer parameters. The inputs were the PDWI 3D FSE/TSE MRI image for meniscus and bone segmentation and the SPGR MRI image for cartilage segmentation. The outputs were probability maps of the target regions and the background region. Two models with the same structure were trained separately, one on the PDWI 3D FSE/TSE images for bone and meniscus segmentation and the other on the SPGR images for cartilage segmentation.
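The trade-off between two stacked 3×3×3 convolutions and a single 5×5×5 convolution can be checked with simple arithmetic. The following is a minimal sketch, not the implementation used in this study; the channel width of 32 and the helper names are assumptions chosen for illustration only.

```python
def conv3d_params(kernel, c_in, c_out, bias=True):
    """Trainable parameters of one 3D convolution with a cubic kernel."""
    return kernel ** 3 * c_in * c_out + (c_out if bias else 0)


def stacked_receptive_field(kernels):
    """Receptive field along one axis of stride-1 convolutions applied in sequence."""
    field = 1
    for k in kernels:
        field += k - 1
    return field


channels = 32  # hypothetical channel width; the text does not state it

two_3x3x3 = 2 * conv3d_params(3, channels, channels)
one_5x5x5 = conv3d_params(5, channels, channels)

print("receptive field, two 3x3x3:", stacked_receptive_field([3, 3]))
print("receptive field, one 5x5x5:", stacked_receptive_field([5]))
print("parameters, two 3x3x3:", two_3x3x3)
print("parameters, one 5x5x5:", one_5x5x5)
```

Under these assumptions both options cover five voxels per axis, while the stacked pair needs fewer than half the parameters of the single larger kernel.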
The model was implemented in TensorFlow (https://www.tensorflow.org/). MRI images were input to the CNN, which output probability maps of the target regions and the background region. The ground truth for the target regions and the background region was likewise represented as probability maps with values of 0 or 1. The model was trained by maximizing the Dice coefficient between the ground-truth probability maps and those output by the CNN, using the Adam optimizer available in TensorFlow. After the CNN model was trained, the image to be segmented was input to the trained model to obtain probability maps of the target regions and the background region. Each pixel was then assigned, as its region label, the index of the probability map with the highest probability at that pixel, which yielded the segmentation of the image.
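The training objective and the labeling step can be illustrated on a toy example. This is a hedged sketch in plain Python rather than the TensorFlow code used in the study; the function names, the smoothing constant `eps`, and the example probabilities are assumptions.

```python
def soft_dice(pred, truth, eps=1e-7):
    """Dice overlap between two flattened probability maps (values in [0, 1])."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    return (2.0 * intersection + eps) / (sum(pred) + sum(truth) + eps)


def label_pixels(prob_maps):
    """Give each pixel the index of the probability map that is largest there."""
    n_pixels = len(prob_maps[0])
    n_maps = len(prob_maps)
    return [
        max(range(n_maps), key=lambda c: prob_maps[c][i])
        for i in range(n_pixels)
    ]


# Hypothetical output for 4 pixels and 3 maps (index 0 = background).
prob_maps = [
    [0.8, 0.1, 0.2, 0.3],
    [0.1, 0.7, 0.2, 0.3],
    [0.1, 0.2, 0.6, 0.4],
]
labels = label_pixels(prob_maps)
print(labels)

# Ground truth for map 1 as a 0/1 probability map, compared with the prediction.
print(soft_dice(prob_maps[1], [0, 1, 0, 0]))
```

Training would maximize `soft_dice` (or, equivalently, minimize one minus it) over the network weights; the labeling step is a per-pixel argmax over the maps.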
For neural network training, we randomly chose 10 healthy volunteers and 103 patients with knee pain who had visited our hospital between July 7, 2012, and July 24, 2018. These data were manually segmented by two authors (A.H. and H.A.), both of whom had trained as orthopedic surgeons for six years and had experience in the manual correction of over 200 knees. A.H. manually segmented the femoral cartilage, and H.A. manually segmented the tibial cartilage and meniscus. These segmentation data were converted by professional engineers (K.S. and J.Mas.) into training data for the neural network. The network was also trained to construct regions of interest (ROIs) for the femoral subchondral bone and the medial/lateral tibial plateau, using ROIs that were manually segmented on a reconstructed 3D knee model.
To validate our algorithm, 108 of the 113 subjects were randomly selected for training, and the other 5 subjects were used for a validation test in which the Dice similarity coefficient was computed [20]. Because of the small sample size, we performed the validation test three times, selecting 108 different subjects for training and 5 different subjects for each test. After completing the three validation tests, the software was trained on all 113 subjects and was then used for the cross-sectional research in this study.
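The repeated hold-out scheme can be sketched as follows. This is an illustrative sketch only: the seed, the function name, and the choice to make the three test groups mutually disjoint are assumptions, since the text states only that 5 different subjects were used for each test.

```python
import random


def repeated_holdout(subject_ids, n_test=5, n_repeats=3, seed=0):
    """Return (train, test) splits whose test groups do not overlap."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    splits = []
    for r in range(n_repeats):
        # Take the r-th consecutive block of the shuffled list as the test group.
        test = shuffled[r * n_test:(r + 1) * n_test]
        held_out = set(test)
        train = [s for s in shuffled if s not in held_out]
        splits.append((train, test))
    return splits


splits = repeated_holdout(range(113))
for train, test in splits:
    print(len(train), "training subjects;", "test subjects:", sorted(test))
```

Each split trains on 108 subjects and tests on the remaining 5; the final model would then be trained on all 113 subjects.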
Kanagawa Knee Study
The purpose of the Kanagawa Knee Study is to clarify the epidemiology and natural history of knee OA, to obtain evidence for the development of diagnosis and treatment, and to identify specific target groups for cartilage and meniscus regenerative medicine for knee OA. The main inclusion criteria are (1) current or retired employees of the Kanagawa Prefectural Office, or those who work in Kanagawa Prefecture or live in the Tokyo metropolitan area; (2) those who work at a desk for at least 4 hours per day or perform similar work during their employment; and (3) those who are able to come to the Tokyo Station area. The main exclusion criteria are (1) a history of surgery on either knee; (2) a history of consecutive hospital visits for more than 3 months for an injury to either knee; (3) a history of OA or fractures in either lower limb (from hip to foot); (4) rheumatoid arthritis or other collagen diseases; and (5) self-reported participation in strenuous sports on a daily basis, such as full marathons, triathlons, and weightlifting. The main data collected for the study include (1) a questionnaire that covers height, weight, history of knee pain, activity level, Knee Injury and Osteoarthritis Outcome Score (KOOS), and Numerical Rating Scale (NRS); (2) MRI and radiographs of the right knee; and (3) urine output.
We collected 561 datasets, including more than 50 females and 50 males per age group (30s, 40s, 50s, 60s, and 70s). The sample size was based on the study budget. We announced recruitment of these subjects at the Kanagawa Prefectural Government between September 1, 2018, and August 30, 2019. Participants joined our study voluntarily. For the first data set, we collected questionnaires, knee radiographs, and MRIs between November 3, 2018, and September 28, 2019, at the AIC Yaesu clinic in Tokyo. We plan to collect these data twice, with an interval of one year. A second data set is currently being collected. Only the first data set was analyzed in this paper.
Cartilage and meniscus extrusion measurements
MRI analyses were performed with a 3D image analysis system volume analyzer (SYNAPSE 3D, Collaborative version, FUJIFILM Corporation, Tokyo, Japan). We quantified the cartilage by projecting the femoral cartilage cylindrically and dividing it into three regions inside the ROI based on the femoral bone (Fig. 4A). The tibial cartilage was projected vertically and divided into two areas inside the ROIs at the medial and lateral tibial plateaus (Fig. 4B). Each area was automatically divided into 3×3 subregions at equal intervals [21].
Our software automatically computed the average cartilage thickness (ThC), cartilage volume (VC), and projected cartilage area ratio (PCAR) in each region and subregion. The software could also display a cartilage thickness map (Fig. 1D). PCAR represents the ratio of the projected cartilage area to the total area of the ROI. We evaluated PCAR at cartilage thickness thresholds of > 0.0 mm, > 0.5 mm, > 1.0 mm, and > 1.5 mm; the resulting values were designated PCAR0.0, PCAR0.5, PCAR1.0, and PCAR1.5, respectively. The medial meniscus coverage ratio (MMCR) was also computed automatically (Fig. 4C).
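The PCAR definition can be made concrete with a small example. The sketch below is not the SYNAPSE 3D implementation; it assumes that the projected thickness map of an ROI is available as a list of per-pixel thicknesses in millimeters with equal pixel areas, and the thickness values are invented for illustration.

```python
def pcar(thickness_mm, threshold_mm):
    """Fraction of ROI pixels whose projected cartilage thickness exceeds the threshold."""
    if not thickness_mm:
        raise ValueError("ROI contains no pixels")
    covered = sum(1 for t in thickness_mm if t > threshold_mm)
    return covered / len(thickness_mm)


# Hypothetical projected thicknesses (mm) for a 10-pixel ROI.
roi = [0.0, 0.0, 0.4, 0.8, 1.2, 1.2, 1.6, 2.0, 2.4, 2.8]

values = {f"PCAR{t:.1f}": pcar(roi, t) for t in (0.0, 0.5, 1.0, 1.5)}
for name, value in values.items():
    print(name, value)
```

Because each threshold is stricter than the previous one, PCAR can only stay the same or decrease from PCAR0.0 to PCAR1.5 within a given ROI.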
Statistical analysis
We evaluated the accuracy of automatic segmentation by calculating the Dice similarity coefficient (DSC) between manual segmentation and automatic segmentation [20]. For each validation test, the DSC was computed for the five test subjects at the femoral bone, tibial bone, femoral cartilage, tibial cartilage, medial/lateral meniscus, femoral subchondral bone ROI, and medial/lateral tibial plateau ROI. After the three validation tests, we calculated the mean DSC of each test.
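For a single structure, the DSC is twice the number of pixels shared by the manual and automatic segmentations divided by the sum of their sizes. The following minimal sketch computes it on flattened label maps and averages it over test subjects; the label values, the toy data, and the convention of returning 1.0 when a structure is absent from both segmentations are assumptions.

```python
def dice_coefficient(manual, automatic, label):
    """DSC for one structure between two flattened label maps of equal length."""
    a = {i for i, v in enumerate(manual) if v == label}
    b = {i for i, v in enumerate(automatic) if v == label}
    if not a and not b:
        return 1.0  # assumed convention: absent in both counts as agreement
    return 2.0 * len(a & b) / (len(a) + len(b))


def mean_dsc(pairs, label):
    """Mean DSC of one structure over (manual, automatic) pairs."""
    scores = [dice_coefficient(m, a, label) for m, a in pairs]
    return sum(scores) / len(scores)


# Hypothetical label maps: 0 = background, 1 and 2 = two structures.
manual = [0, 1, 1, 1, 2, 2]
automatic = [0, 1, 1, 2, 2, 2]

print(dice_coefficient(manual, automatic, label=1))
print(mean_dsc([(manual, automatic), (manual, manual)], label=1))
```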
We evaluated the correlation between each pair of quantitative values using Spearman’s rank correlation test. All statistical analyses were performed using JMP® 14 (SAS Institute Inc., Cary, NC, USA). P values < 0.05 were considered statistically significant.
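Spearman's coefficient is the Pearson correlation of the rank-transformed values, with tied values given their average rank. The sketch below shows that calculation in plain Python for illustration; it is not the JMP procedure used in this study, it does not compute a P value, and the function names and example data are assumptions.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation coefficient of two equal-length sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical monotone but nonlinear relationship between two measurements.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y))
```

Because only ranks are used, any strictly increasing relationship gives a coefficient of 1 regardless of its shape, which is why the rank-based test suits quantities that are not normally distributed.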