Artificial intelligence is making rapid advances in medicine. Already, there are machine learning algorithms that can outperform doctors in some medical fields. There’s just one big problem: experts aren’t quite sure how these algorithms work.
While designers know full well what goes into the AI systems they build and what comes out, the learning part in between is often too complex to comprehend. To their users, machine learning algorithms are effectively black boxes.
Now, researchers from the RIKEN Center for Advanced Intelligence Project in Japan are lifting the lid. They’ve developed a deep-learning system that can outperform human experts in predicting whether prostate cancer will recur within one year. More importantly, the system can acquire human-understandable features from unannotated pathology images, offering up critical clues that could help humans make better diagnoses themselves.
The team trained their system on a portion of more than 13,000 pathological images of whole prostates gathered from a hospital in Tokyo. The remaining images served as a test set for determining how well the system could perform.
On those images, human pathologists achieved a score of about 74% on a standard performance measure known as AUC, or Area Under the Curve. The RIKEN team’s algorithm scored an impressive 82%. That result suggests that adding an AI step in the clinic could help doctors decide whether a patient needs additional treatments or simply needs to be monitored.
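To make the metric concrete: AUC can be read as the probability that a randomly chosen patient whose cancer did recur gets a higher risk score than a randomly chosen patient whose cancer did not. The sketch below computes AUC this pairwise way; it is purely illustrative, and the labels and scores are made up, not the RIKEN team's data or code.

```python
# Illustrative only: pairwise definition of AUC (Area Under the ROC Curve).
# AUC = fraction of (positive, negative) pairs where the positive case
# receives the higher predicted risk. All data below is synthetic.

def auc(labels, scores):
    """Pairwise AUC: ties between a positive and a negative count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 1 = cancer recurred within one year, 0 = no recurrence (made-up data)
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.3, 0.7, 0.4, 0.6, 0.2, 0.8, 0.5]  # predicted risk per patient
print(auc(labels, scores))  # → 0.875 (0.5 is chance level, 1.0 is perfect)
```

A score of 0.5 means the model ranks patients no better than coin flips, which is why the pathologists' 74% and the algorithm's 82% are both well above chance.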
But this particular system goes one step further. Baked into the system’s code is a way for it to tell users what image features correlate positively or negatively with recurrence. And that’s crucial, because those features, as the team found, aren’t always textbook. Signs of recurrence were often found lurking outside of the cancerous regions pathologists are taught to focus on. While certainly robust, the new system won’t replace human analysis any time soon. The more likely scenario is a close partnership between human and machine. Together, the RIKEN team’s system and human pathologists scored a prognosis accuracy of 84%, higher than either alone.
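The article doesn't describe how the human and machine judgments were combined to reach that 84% figure. One common, simple way to combine two risk estimates is a weighted average, sketched hypothetically below; the function name, weights, and values are all assumptions for illustration, not the team's method.

```python
# Hypothetical sketch: one simple way to combine an AI risk estimate with
# a pathologist's risk estimate. The source does not specify the RIKEN
# team's actual combination method; all values here are made up.

def combined_risk(ai_score, pathologist_score, ai_weight=0.5):
    """Weighted average of two recurrence-risk estimates, each in [0, 1]."""
    if not 0.0 <= ai_weight <= 1.0:
        raise ValueError("ai_weight must lie in [0, 1]")
    return ai_weight * ai_score + (1.0 - ai_weight) * pathologist_score

# Equal weighting of a high AI estimate and a moderate human estimate
print(round(combined_risk(0.8, 0.6), 2))       # equal trust in both
print(round(combined_risk(0.8, 0.6, 0.7), 2))  # leaning on the AI more
```

In practice the weight would itself be tuned on held-out data, so that the blend outperforms either source alone, consistent with the 84% joint accuracy reported above.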
Another critical point is that the deep-learning system had similar accuracies when it was used in two other hospitals in different prefectures in Japan. “This is a very significant result,” says Yoichiro Yamamoto, the team leader of the project. He says it reveals the potential for very broad use of their system. Going forward, the researchers hope to extend their technique to other areas. Right now, they are adapting their system to study breast cancer and rare cancers. But that could be just the beginning. The tool’s discerning eye might perform well in non-medical fields involving large amounts of images.