Objective:
Neurologists lack a metric for measuring the distance between neurological patients. When neurological signs and symptoms are represented as neurological concepts from a hierarchical ontology and neurological patients are represented as sets of concepts, distances between patients can be represented as inter-set distances.
Methods:
We converted the neurological signs and symptoms from 721 published neurology cases into sets of concepts with corresponding machine-readable codes. We calculated inter-concept distances based on a hierarchical ontology, and we calculated inter-patient distances by semantic weighted bipartite matching. We evaluated the accuracy of a k-nearest neighbor classifier in allocating patients to 40 diagnostic classes.
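To make the distance computation concrete, the following is a minimal sketch and not the implementation used in the study. It assumes a hypothetical toy hierarchy (`PARENT`), a path-length concept distance through the lowest common ancestor, and a brute-force minimum-cost matching with an arbitrary `unmatched_penalty` for concepts left without a partner. The ontology, the concept distance, and the weighting actually used may differ.

```python
from itertools import permutations

# Hypothetical toy hierarchy (child -> parent); not the ontology used in the study.
PARENT = {
    "weakness": "motor",
    "spasticity": "motor",
    "numbness": "sensory",
    "paresthesia": "sensory",
    "motor": "finding",
    "sensory": "finding",
    "finding": None,
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def concept_distance(a, b):
    """Count the edges between a and b through their lowest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(path_a):
        if node in path_b:
            return steps_a + path_b.index(node)
    raise ValueError("concepts share no ancestor")


def patient_distance(p, q, unmatched_penalty=4):
    """Minimum-cost one-to-one matching between two concept sets.

    The total cost is averaged over the larger set. The penalty for
    unmatched concepts is an assumption made for this sketch.
    """
    small, large = sorted(p), sorted(q)
    if len(small) > len(large):
        small, large = large, small
    if not large:
        return 0.0
    # Brute force over assignments; adequate only for small concept sets.
    best = min(
        sum(concept_distance(s, t) for s, t in zip(small, chosen))
        for chosen in permutations(large, len(small))
    )
    best += unmatched_penalty * (len(large) - len(small))
    return best / len(large)


patient_1 = {"weakness", "numbness"}
patient_2 = {"spasticity", "paresthesia"}
print(patient_distance(patient_1, patient_2))
```

For realistic set sizes, the brute-force search would be replaced by a polynomial-time assignment solver; the exhaustive version is used here only to keep the sketch dependency-free.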
Results:
Mean within-diagnosis patient distance differed by diagnosis, suggesting that diagnoses differ in how similar their patients are to one another. The mean distance from one diagnosis to another also differed by diagnosis, suggesting that diagnoses differ in their proximity to other diagnoses. With a k-nearest neighbor classifier based on inter-patient distances, the risk of misclassification differed by diagnosis.
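As an illustration of how a per-diagnosis misclassification risk can be derived from a matrix of inter-patient distances, the sketch below runs a leave-one-out k-nearest neighbor vote. The leave-one-out scheme, the choice of k = 3, the function names, and the toy data are all assumptions made for this example; they are not taken from the study.

```python
from collections import Counter, defaultdict


def knn_predict(dist_row, labels, skip, k=3):
    """Majority label among the k nearest patients, excluding the patient itself."""
    neighbors = sorted(
        (i for i in range(len(labels)) if i != skip),
        key=lambda i: dist_row[i],
    )[:k]
    votes = Counter(labels[i] for i in neighbors)
    return votes.most_common(1)[0][0]


def misclassification_by_diagnosis(dist, labels, k=3):
    """Leave-one-out error rate for each diagnosis, given a distance matrix."""
    errors, totals = defaultdict(int), Counter(labels)
    for i, true_label in enumerate(labels):
        if knn_predict(dist[i], labels, skip=i, k=k) != true_label:
            errors[true_label] += 1
    return {dx: errors[dx] / totals[dx] for dx in totals}


# Hypothetical toy data: six patients placed on a line, two diagnoses.
positions = [0.0, 1.0, 2.0, 2.5, 6.0, 7.0]
labels = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(x - y) for y in positions] for x in positions]

print(misclassification_by_diagnosis(dist, labels))
```

In this toy layout, the one patient with diagnosis B who lies close to the A cluster is misclassified, so B carries a higher error rate than A.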
Conclusion:
If signs and symptoms are converted to machine-readable codes and patients are represented as sets of these codes, patient distances can be computed as inter-set distances. These patient distances give insights into how homogeneous patients are within a diagnosis (stereotypy), the distance between different diagnoses (proximity), and the risk of diagnosis misclassification (diagnostic error).