Smart AMD prognosis through cellphone: an innovative localized AI-based prediction system for anti-VEGF treatment prognosis in nonagenarians and centenarians

Age-related macular degeneration (AMD) is one of the most common causes of blindness in the world today. The most common treatment for wet AMD is intravitreal injection of agents that inhibit vascular endothelial growth factor (VEGF). This treatment usually involves multiple injections and thus multiple clinic visits, which not only increases the cost to national health services but also exposes patients to the hospital environment, which can be high risk during the current COVID crisis. In spite of these concerns, the treatment is usually effective. However, in some cases the medicine fails to produce the anticipated favourable outcome, resulting in wasted time, medication, and effort and, above all, psychological distress to the patients. Hence, early prediction of the anatomical as well as functional effectiveness of the treatment is a very desirable capability. A machine learning approach using a two-sample prediction model based on the adaptive neuro-fuzzy inference system (ANFIS) is presented that requires only the baseline measurements and the changes in visual acuity (VA) and macular thickness (MAC) after four months of treatment to estimate the values of VA and MAC at 8 and 12 months. In contrast to most AI techniques, the ANFIS approach can work with very small datasets, which makes it a suitable candidate for the presented solution. The presented model has been shown to have a very high accuracy (> 92%) and works in near-real-time scenarios. It has been converted into a smartphone App, OphnosisAMD, for convenient usage. With this App, the clinician can visualize the progression of the patient under a specific treatment and can decide on continuing or changing the treatment accordingly.
The complete AI engine developed with the ANFIS algorithm is localized to the phone through the App: the algorithm and the trained network are fully embedded in the App, so no internet or cloud connectivity is needed for it to function. This makes it well suited to remote usage, including non-connected isolated areas and secluded patients under the current COVID scenarios. With a smart AI-based App at their fingertips, doctors have ample opportunity to make a better decision based on the estimated progression: whether the same drug should be continued (good/fair prognosis) or an alternate treatment should be sought (bad prognosis). From a functional point of view, the prediction algorithm is triggered through simple entry of the relevant parameters (baseline and 4 months only).


Introduction
Age-related macular degeneration (AMD) is a disease process affecting people usually over 50 years of age. It affects the central part of the retina called the macula and can progress to result in loss of central vision. It is the commonest cause of blindness in the western world. It has two basic types: (a) dry AMD, which is often very slowly progressive, and (b) wet AMD, which is quite aggressive. In wet AMD, new blood vessels grow underneath and invade the central macular region by affecting the foveal area and can result in irreversible visual loss if left untreated. Increased vascular endothelial cell growth factor (VEGF) levels are the main drivers of neovascular response due to wet AMD.
Annual incidence (estimated from prevalence) of late AMD in the American population was 3.5 per 1000 aged ≥ 50 years, equivalent to 293 000 new cases in Americans per year. Incidence rates approximately quadrupled per decade of age. The annual incidence of neovascular AMD was 1.8 per 1000 [1]. The prevalence of late AMD in the age-group 55-59 years is 0.1%, and in the age-group ≥ 85 years it is 9.8%. By 2040, the number of individuals in Europe with late AMD is estimated to grow to between 3.9 and 4.8 million [2].
The current standard of AMD treatment is repeated intravitreal injection of anti-VEGF agents to neutralize the VEGF within the eye that drives the neovascular process. Commonly used anti-VEGF drugs are bevacizumab (Avastin™), ranibizumab (Lucentis™), and aflibercept (Eylea™). While ranibizumab and aflibercept have been FDA- and NICE-approved for use in the eye, Genentech, the company that manufactures both bevacizumab and ranibizumab, has not sought FDA approval for bevacizumab as a treatment for wet AMD. Many large head-to-head randomized controlled trials, including CATT [3], IVAN [4], LUCAS [5], GEFAL [6], and MANTA [7], have shown non-inferiority of bevacizumab to the more expensive anti-VEGF agents. Similarly, Lotery et al. [8] reported equal efficacy of ranibizumab and aflibercept. The systemic safety profile of anti-VEGF agents has been reported to be excellent [9].
The duration of adequate response to each injection can be several weeks. Treatment regimens with monitoring by OCT (Optical Coherence Tomography, Spectralis®) scans have varied from monthly visits with appropriate injections to T&E (Treat and Extend) visits. OCT scans are an excellent way to non-invasively monitor the exudative and sight-threatening manifestations of active new blood vessels and are the current standard of care in neovascular AMD management.
In addition to the clinical or wet-laboratory solutions, the AMD problem has also been tackled by the computing community in two broad classes of work: (1) detection and classification of the AMD state and (2) modelling of the biological findings, predominantly from rodents. However, to the best of our knowledge, the work presented in this paper has not been attempted before, nor has any similar work been reported in the computing literature. Several related computing techniques have been reported that can be utilized for further understanding of the AMD state and its degradation levels. The biggest group of reported works is in the field of AMD classification, which attempts to use retinal and fundus images to classify the presence and the level of progression using a number of computing techniques. Agurto et al. [10] used a combination of customized image processing algorithms to generate specific features from the images, which are then classified using the k-nearest neighbour algorithm. Kubicek et al. [11] have shown the use of fuzzy segmentation in classifying the macular lesions in OCT images. Several other machine learning algorithms have also been used for classification of the AMD grades and stages [12][13][14][15]. In these techniques, various types of retinal images are first subjected to feature extraction through some form of image pre-processing, and data-centric classifiers are then used to distinguish various degradation stages, lesions, and overall AMD progression. On the other hand, it was also shown that deep learning neural networks can obtain appreciable results for AMD grading and progression directly, without any image processing before classification.
However, when it comes to the actual treatment, the emphasis has been mostly on a specific type of intraocular injections, which were found to be very effective recently [16][17][18]. These reported studies have shown significance of the intraocular medicines as very effective treatment for AMD and have compared various types of the drugs used in these treatments. Out of these drugs, the most effective ones, such as bevacizumab and ranibizumab, have been comparatively studied as well with a large number of general aged population and have been shown to provide excellent improvement in AMD patients [19,20]. The interesting contrast that can be observed in these studies is the age-group of the population taken into consideration. The average age is in the range of 50-60 years.
The most relevant study found so far, similar to the type of work being presented in this paper, is related to the use of deep learning algorithms to understand AMD better. Mulyukov et al. [21] have used a Bayesian approach and defined informative-prior distributions around plausible values from an indirect response pharmacokinetic (PK)/pharmacodynamic model. However, the study has only shown a conservative classification of the progression or absence of progression in terms of visual acuity measurements.
Comparing all the above-reported studies with the presented work in this paper, the uniqueness of the patient group becomes quite evident, since in this work patients above 85 years of age (nonagenarians and centenarians) have been considered. This unique data set was collected at James Paget hospital UK, which is located in a town which has a large population of senior citizens with ages above 85 years. This provided a unique opportunity to be able to study this age-related health issue faced by a large percentage of this population. The scenario is equally applicable to many aging communities in the world where the nonagenarians and centenarians are in significant percentage. The proposed methodology results in a quantified understanding of how the intraocular treatment is progressing. Contrary to other studies in the literature, the age-group under study here shows a large degree of subjective variability. Hence, the treatment may not produce positive outcome even after a year-long treatment with several injections in the eye. The cost in terms of the drugs and services is only one aspect of the related overheads. The human cost in terms of clinical visits, treatment-related pain and fatigue in the eyes, unexpected allergies, and overall apprehensions add to the monetary price-tag of the treatment. The study is aimed to predict the treatment outcome in 30% of the patients' cohort, using initial observations only, so that specific decisions related to continuing, changing, or discontinuing any treatment be made objectively.
The presented dataset is part of a clinical audit. Although relatively small, it is unique in many ways. Firstly, very few reports have been found that have targeted this age-group for their specific healthcare issues. To the best of our knowledge, very few centres are dedicated to this age-group and actually perform specific research on its healthcare issues. None of the published reports have made the data public for general use. Secondly, the focused age-group is above the average age limits that most diagnostic and predictive techniques consider as a significant feature. Hence, the issues relevant to this group are unique and have not been fully explored. Thirdly, this age-group was maximally affected by the COVID restrictions, as they were unable to attend hospitals for treatment visits and missed several visits and medications in the process. With the presented technique, projection-based visit adjustment is possible, such that the missed visits could be compensated without unnecessary exposure to the hospital environment, thus reducing the COVID risk.
One concern is related to the study size being too small. As mentioned earlier, this is a specific subgroup of patients aged 90 years or older. Typically, the average age in wet AMD trials is between 77 and 82 years. This study was done on a very specialized set of patients in such a high age group that is extremely rare to find. The type and nature of the input features were also limited due to the available clinical measurements of visual acuity and macular thickness, only. All of these factors contribute to making the dataset a very challenging one to work with.
The following preprocessing steps were taken before actual model training was carried out:
a. Outlier removal
b. Normalization of individual fields (inputs)
c. Simpler model designs
These steps have been reported in a number of resources in the literature [22][23][24] for handling small datasets. They ensure that the estimates are confined within the normal data space. Data augmentation, which is usually applied either to images or to discrete data with known correlations with the outcomes, cannot be applied to this dataset. In those cases, the individual fields can be manipulated to generate more data samples; for instance, images can be slightly shifted or rotated to produce as many cloned images as needed. The dataset under study in this work does not fall into either of these categories, because the actual correlation between the input samples and the behaviour of the progression is not known.
The data is composed of the following fields with corresponding statistical spread:
The proposed system is essentially an AI-based prediction algorithm that can estimate the outcomes of relevant clinical measurements very early in the treatment, which provides ophthalmologists with foresight into the ongoing year-long treatment. The intelligent system has been developed into a phone App, which puts substantial data processing power at the fingertips of the doctors and can assist them in making better decisions related to the treatment.
For designing the predictive model, the baseline examination and the examination at month 4, after the loading phase of three monthly injections, are used. This dual-sample prediction system is based on the initial behaviour of the underlying treatment in the form of the gradients of the two variables, VA and MAC. Any prediction model would start in this manner, since a single point admits infinitely many possible trajectories and would therefore be inconclusive. The model represents the dynamics of the changes happening within the eye as the intravitreal anti-VEGF injections affect the visual acuity and central macular thickness. It is then validated with the data at 8 months and 12 months. The main objective of the predictive model is to assist physicians in deciding whether to continue or change the treatment if the projected prognosis does not show a positive impact of the medicine. To the best of our knowledge, no such model exists at this point, especially for senior citizens. In addition, the dynamics of the underlying processes are not known analytically, so no simple dynamical model could approximate the given data. Hence, a machine learning approach is proposed here in which the data patterns are learned through extensive training of model parameters using the input and output data. The technique used for this work is known as the adaptive neuro-fuzzy inference system (ANFIS) and is explained further in the following section.
Adaptive neuro-fuzzy inference system (ANFIS)
ANFIS represents a type of supervised classifier that uses the training power of the conventional multi-layer perceptron (MLP) type neural network and combines it with fuzzy set theory to develop the classification rules based on data clustering heuristics. Essentially, it is inspired by the learning and perceiving behaviours of the human brain. The fuzzy inference component of ANFIS represents the heuristic reasoning aspect of learning that our brain exhibits. This implies that we usually do not work with exact values but rather with a range of overlapping (or fuzzy) inputs, and yet come up with correct decisions. This fuzziness in data is achieved by clustering the input data into similarity groups through the Gaussian mapping of raw data into membership clusters. The neural network part is similar to the way human neurons make inter-connections to 'memorize' or 'learn' a specific sequence of impulses from the sensory organs. The strength of a specific impulse is imitated by the learning weights, which are the interconnections between the layers of the neural network nodes. Figure 1 presents an artistic view of the ANFIS structure used in this work as well as the actual block-diagram implementation of the system in the MATLAB environment. The ANFIS model proposed in this work is based on the well-known Takagi-Sugeno fuzzy inference model [25]; the first such hybrid combination appeared in the scientific community in the early 1990s [26].
The main structural elements of this system are listed below:

Rubrics
In order to present the performance of the proposed technique, sample-based plots visually present the closeness of the estimates to the actual values. In addition, three statistical rubrics of similarity were used to determine the closeness of these estimates to their original data values ($VA_8$, $MAC_8$, $VA_{12}$, and $MAC_{12}$, respectively). These measures include:
1. Correlation coefficient: $r_{xy} = \frac{S_{xy}}{S_x S_y}$, where $r_{xy}$ is the correlation coefficient between two sets $x$ and $y$, $S_x$ and $S_y$ are the standard deviations of the sets, and $S_{xy}$ is the covariance between the two sets.
2. Mean-squared error: $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, where $y_i$ is the actual data and $\hat{y}_i$ is its estimated value through the model.
3. P-value-based significance, with the null hypothesis of very small or no correlation between the original and estimated sets; typically calculated using Pearson's test.
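The first two rubrics can be computed directly from paired lists of actual and estimated values. The following is a minimal sketch in plain Python; the sample values are hypothetical and are not taken from the study data. For the third rubric, only the t statistic underlying Pearson's test is shown (with $N - 2$ degrees of freedom); converting it to a p-value would require a t-distribution routine from a statistics library.

```python
import math

def pearson_r(x, y):
    """Correlation coefficient r_xy = S_xy / (S_x * S_y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    # The 1/(N-1) factors of covariance and standard deviations cancel out.
    return s_xy / math.sqrt(s_xx * s_yy)

def mse(actual, estimated):
    """Mean-squared error between actual values and their estimates."""
    return sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual)

def t_statistic(r, n):
    """t value of Pearson's test for correlation r over n samples (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical VA values at 12 months and their model estimates.
actual = [0.30, 0.45, 0.50, 0.62, 0.70]
estimated = [0.32, 0.43, 0.52, 0.60, 0.73]

r = pearson_r(actual, estimated)
error = mse(actual, estimated)
t = t_statistic(r, len(actual))
```

A large positive `t` corresponds to a small p-value, i.e. to rejecting the null hypothesis of no correlation.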
Since the presented model is neither a binary classifier nor an ordinal classifier, the commonly used statistical measures, such as AUC, sensitivity, and specificity, cannot be calculated for this system. The mapped values are analog in nature and cannot be considered true or false, either as raw data points or as ordinal data points in the case of thresholded values.

Model development and results
The ANFIS model presented in this work uses 60% of the available clinical data for training and the remaining 40% for testing. Training implies using the randomly selected training entries and corresponding output entries in the initialized ANFIS network to train its interconnecting weights ($W_{L,k}$) to 'learn' the pattern in the data as a trained model. $W_{L,k}$ represents the value of the $k$-th weight in the $L$-th layer of the ANFIS structure. The process of model development can be divided into three phases: (1) preprocessing, (2) training, and (3) testing.

Preprocessing
The original anonymized clinical data is first preprocessed by removing the outliers and then normalized based on the highest value in that specific range of feature values. For VA, this step was skipped since the readings are between 0 and 1 naturally. Then, 60% of these features were randomly selected for training the ANFIS model. Out of these 60%, ANFIS uses 5% to validate the intermediate training outcomes.
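The preprocessing steps above can be sketched as follows. This is an illustration only: the paper does not state which outlier rule was used, so the interquartile-range fence below is an assumption, and the field names (`va`, `mac`) and sample values are hypothetical.

```python
import random

def remove_outliers(rows, column, k=1.5):
    """Drop rows whose `column` value lies outside the IQR fences (assumed rule)."""
    values = sorted(row[column] for row in rows)
    n = len(values)
    q1, q3 = values[n // 4], values[(3 * n) // 4]
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [row for row in rows if low <= row[column] <= high]

def normalize_by_max(rows, column):
    """Scale one field by its highest value so that it falls in [0, 1]."""
    peak = max(row[column] for row in rows)
    return [{**row, column: row[column] / peak} for row in rows]

def split(rows, train_fraction=0.6, seed=0):
    """Randomly split rows into training and testing subsets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical records: VA is already in [0, 1]; MAC is in micrometres.
macs = [250, 300, 320, 340, 360, 380, 400, 420, 450, 1500]
records = [{"va": 0.1 * (i % 10), "mac": m} for i, m in enumerate(macs)]

clean = remove_outliers(records, "mac")      # removes the 1500 entry
scaled = normalize_by_max(clean, "mac")      # VA is left untouched
train_rows, test_rows = split(scaled)        # roughly 60% / 40%
```

The validation subset that ANFIS holds out of the training portion is not shown; it could be taken from `train_rows` with a second call to `split`.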

Training
The randomly selected data are then supplied to the initialized ANFIS structure. The first layer of fully connected neurons is trained to cluster the input features into fuzzy memberships using the neural learning algorithm. Typically, Gaussian functions are used to overlap/convolve the data groups, and neural weights are trained with the data such that they form the variances and means of the Gaussian membership functions. The resulting set of clusters will group input values according to their nearness within the boundaries of each cluster.
The membership functions are then subjected to fuzzy rules in order to map the input data into output clusters. This stage involves establishing input cluster to output cluster maps using simple min-max allocated overlaps for each class.
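To make the two stages above concrete, the following minimal sketch evaluates a small Sugeno-style rule base with Gaussian membership functions. It is not the trained model from this work: the fuzzy sets, their means and widths, the rule outputs, and the use of `min` as the rule conjunction with a weighted-average output are all assumptions chosen for illustration.

```python
import math

def gaussian(x, mean, sigma):
    """Gaussian membership degree of x in a fuzzy set (1.0 at the mean)."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

# Hypothetical fuzzy sets on a normalized input: (mean, sigma).
SETS = {"low": (0.25, 0.2), "high": (0.75, 0.2)}

# Hypothetical rules: (set for input 1, set for input 2) -> constant output.
RULES = [
    (("low", "low"), 0.2),
    (("low", "high"), 0.4),
    (("high", "low"), 0.6),
    (("high", "high"), 0.9),
]

def infer(x1, x2):
    """Weighted average of rule outputs, weighted by rule firing strength."""
    numerator = denominator = 0.0
    for (set1, set2), output in RULES:
        # Firing strength: the weaker of the two membership degrees.
        strength = min(gaussian(x1, *SETS[set1]), gaussian(x2, *SETS[set2]))
        numerator += strength * output
        denominator += strength
    return numerator / denominator

low_case = infer(0.25, 0.25)    # dominated by the ("low", "low") rule
high_case = infer(0.75, 0.75)   # dominated by the ("high", "high") rule
```

Because the output is a convex combination of the rule outputs, it always stays between the smallest and largest rule output, which is one reason such models keep their estimates within the data space.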
Each time a forward pass is completed, i.e. the weights are applied to the inputs to produce an output, the predicted output is compared with the desired value of the output and the error values are calculated. The error gradients are then fed back into the network to fine-tune the weights for better fits. The process continues over the complete training space until the rule-base and the neural weights have converged to the final model values.
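The forward-pass and error-feedback loop can be illustrated with a deliberately reduced example. In this sketch the Gaussian membership functions are held fixed and only the rule output values are adjusted by gradient descent on the squared error; the membership centres, learning rate, and target function are hypothetical, and the actual ANFIS training also adapts the membership parameters.

```python
import math

CENTERS = (0.0, 0.5, 1.0)   # hypothetical membership centres
SIGMA = 0.25                # hypothetical common width

def firing_strengths(x):
    """Normalized Gaussian membership degrees of x (they sum to 1)."""
    raw = [math.exp(-0.5 * ((x - c) / SIGMA) ** 2) for c in CENTERS]
    total = sum(raw)
    return [value / total for value in raw]

def predict(x, outputs):
    """Forward pass: weighted sum of the rule output values."""
    return sum(w * o for w, o in zip(firing_strengths(x), outputs))

def train(samples, epochs=200, rate=0.5):
    """Feed the error back to the rule outputs after every forward pass."""
    outputs = [0.0] * len(CENTERS)
    for _ in range(epochs):
        for x, target in samples:
            weights = firing_strengths(x)
            error = target - sum(w * o for w, o in zip(weights, outputs))
            # Gradient step on 0.5 * error**2 with respect to each output.
            outputs = [o + rate * error * w for o, w in zip(outputs, weights)]
    return outputs

def mean_squared_error(samples, outputs):
    return sum((t - predict(x, outputs)) ** 2 for x, t in samples) / len(samples)

# Hypothetical nonlinear target y = x^2 sampled on [0, 1].
samples = [(i / 10, (i / 10) ** 2) for i in range(11)]

before = mean_squared_error(samples, [0.0] * len(CENTERS))
trained = train(samples)
after = mean_squared_error(samples, trained)
```

Comparing `before` and `after` shows the error shrinking as the rule outputs converge.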

Decision surfaces
Once the training is completed, decision surfaces are calculated for a range of all possible input values (with finite increments). Due to the high-dimensional nature, i.e. each output is a four-dimensional model of the input parameters (three inputs and one output), the visualization has to be done with two input variables taken at a time, thus resulting in twelve 3D plots for the four output parameters. These surfaces are shown in Fig. 2. The complexity that has been modelled into these surfaces is readily apparent, which supports confidence in the predicted output values.
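A decision surface of this kind can be tabulated by sweeping two inputs over a grid while holding the third fixed. The sketch below assumes, for illustration only, that one VA output depends on age and the two VA samples; `toy_model` is a hypothetical stand-in for the trained ANFIS output, not the model from this work.

```python
from itertools import combinations

INPUTS = ("age", "va_base", "va_4")            # assumed inputs of one output
OUTPUTS = ("VA_8", "MAC_8", "VA_12", "MAC_12")

def linspace(start, stop, count):
    """Evenly spaced grid values with finite increments."""
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]

def toy_model(age, va_base, va_4):
    """Hypothetical stand-in for a trained model output, clipped to [0, 1]."""
    value = va_4 + 0.5 * (va_4 - va_base) - 0.002 * (age - 90.0)
    return max(0.0, min(1.0, value))

def tabulate(model, fixed, name_a, axis_a, name_b, axis_b):
    """Evaluate the model on a 2D grid, keeping the remaining input fixed."""
    surface = []
    for a in axis_a:
        row = []
        for b in axis_b:
            args = dict(fixed)
            args[name_a] = a
            args[name_b] = b
            row.append(model(**args))
        surface.append(row)
    return surface

# Three input pairs per output, four outputs: twelve surfaces in total.
pairs = list(combinations(INPUTS, 2))
plot_count = len(pairs) * len(OUTPUTS)

grid = linspace(0.0, 1.0, 11)
surface = tabulate(toy_model, {"age": 92.0}, "va_base", grid, "va_4", grid)
```

Each tabulated `surface` is a plain matrix, so it can be plotted as a 3D surface or stored for later lookup.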
Testing
Once the ANFIS model is trained, it is applied to the remaining 40% of the data, which the model has not seen during the training phase. Figure 3b-e shows the original test data and the corresponding estimated values from the model. Figure 3a also shows the actual data spread in terms of boxplots that represent the values with respect to their statistical properties. Similarly, Fig. 3f-g represents the outcomes of training and testing, respectively, as boxplots. The estimated values are extremely close to the original values. This shows a converged training of the initial neural architecture and a promising model for the rest of the data.
As can be seen, in all practical scenarios the estimated outputs show a reasonable degree of agreement with the actual values. Hence, the actual values of the four output parameters ($VA_8$, $MAC_8$, $VA_{12}$, and $MAC_{12}$) can be estimated based on the four months of treatment data. This provides a very useful tool for doctors to make critical decisions in terms of how long to continue the treatment, how many more injections should be given, etc., based on the projected improvement in the patient's visual acuity and macular thickness. Table 1 summarizes the validation statistics between the actual and the estimated values.
Comparison with linear-prediction and linear-regression methods
Figure 3 can give the impression that the data are partially skewed linear and could be modelled by linear equation(s). On the contrary, the above results show that the nonlinear ANFIS model structure was able to capture the data patterns in order to predict values for the forthcoming temporal data outcomes. To investigate the feasibility of standard linear and statistical modelling approaches, the following two approaches were selected based on the simplicity of their design.

Simple linear predictor model
This model is, essentially, based on the equation of straight line defined for homogenous inputs only where VA and MAC can be calculated for any value of time (T) as follows:
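The equation itself is not reproduced in this text. One plausible reading, stated here as an assumption, is a straight line through the baseline sample (month 0) and the month-4 sample, extrapolated to a later month T. The function name and the sample values below are hypothetical.

```python
def linear_predict(base, month4, t, t0=0.0, t1=4.0):
    """Extrapolate a straight line through (t0, base) and (t1, month4) to time t."""
    slope = (month4 - base) / (t1 - t0)
    return base + slope * (t - t0)

# Hypothetical patient: VA improves from 0.30 to 0.40 over the first 4 months,
# while MAC falls from 420 to 380 micrometres.
va_8 = linear_predict(0.30, 0.40, 8.0)
va_12 = linear_predict(0.30, 0.40, 12.0)
mac_12 = linear_predict(420.0, 380.0, 12.0)
```

Such a predictor keeps extrapolating the initial gradient indefinitely, so it cannot represent a response that saturates or reverses after the loading phase.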

Multiple input regression-model
This model is based on a simple least square estimator (LSE) defined for a multi-input-single-output (MISO) system. This model includes the age, and the first two samples of VA and MAC are used in the following structure:
Fig. 2 Various decision surfaces that resulted after completion of the ANFIS training process. Each row represents one output as mapped by three input variables
Writing each equation in a concise matrix form, we use 60% of the data for calculating the unknown coefficients. Once the coefficients are calculated, they are used with the remaining 40% of the data in the same structure as shown in the above equations to calculate the expected outcomes $VA_8$, $VA_{12}$, $MAC_8$, and $MAC_{12}$. Figure 4 shows the comparison of the linear estimator, linear-regression, and ANFIS models in terms of the MSE calculated between the actual outcomes and the estimated outcomes. Also shown is the comparison of correlations between the original data and their estimates for each technique. It was observed that ANFIS outperformed the other two methods.
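Since the regression equations are not reproduced in this text, the following sketch assumes a generic MISO structure, output = c0 + c1·age + c2·(baseline sample) + c3·(month-4 sample), and fits the coefficients through the normal equations in plain Python. The structure, the helper names, and the data are assumptions for illustration.

```python
def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (m[r][n] - tail) / m[r][r]
    return solution

def fit_least_squares(features, targets):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefficients, feature):
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], feature))

# Hypothetical training rows: (age, VA at baseline, VA at month 4).
features = [(90, 0.30, 0.40), (92, 0.50, 0.45), (95, 0.20, 0.35),
            (97, 0.60, 0.70), (99, 0.40, 0.30), (101, 0.35, 0.55)]
# Hypothetical targets generated from a known linear rule, to check the fit.
targets = [0.2 - 0.001 * a - 0.3 * b + 1.2 * v for a, b, v in features]

coefficients = fit_least_squares(features, targets)
va_8_estimate = predict(coefficients, (94, 0.45, 0.50))
```

In practice the coefficients would be fitted on the training portion of the data and then applied unchanged to the held-out portion, once per output variable.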

The smart app development
The completed neuro-fuzzy system is finally converted into a user-friendly phone App for easy usage and prompt decision support. The App has been named OphnosisAMD, being the first of a series of Apps for ophthalmologists. The word Ophnosis is a combination of ophthalmology and prognosis, reflecting the exact nature of the developed tool. The use of neuro-fuzzy intelligence resulted in several decision surfaces, as shown in Fig. 2. These surfaces are, essentially, matrices in multiple dimensions and hence can be stored as look-up tables (LUTs) in the phone memory. The inputs are first preprocessed in order to scale them into the limits of the surface matrices and are then mapped onto the table for the required output values. This feature makes the whole data processing engine a local resource on the user's phone, with no need for cloud connectivity for its computing power. It also enables near-real-time results, since very little computation is done on the fly. The finished App is shown in Fig. 5. The user is asked to supply the input data: Age, VA-base, VA-4, MAC-base, and MAC-4. When the user presses the COMPUTE button, the output values are produced based on the LUT values and are displayed in the output measurement fields. In addition, the heuristic understanding of the doctor has been incorporated in making a three-way prognosis: good, fair, or bad. Essentially, an increase in the visual acuity and a decrease in the macular thickness at the 12th month in comparison with the baseline measurements would mean a good prognosis from the treatment. The exact opposite for both measurements would result in a bad prognosis, while a mixed change would result in a fair prognosis.
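The look-up and the three-way prognosis rule can be sketched as follows. This is an illustration under assumptions: the App's actual table layout and scaling are not described in the text, so the nearest-grid-point lookup and the function names are hypothetical, and cases with no change in a measurement are treated here as "fair".

```python
def nearest_index(value, low, high, size):
    """Scale a value into the table limits and return the nearest grid index."""
    clipped = min(max(value, low), high)
    return round((clipped - low) / (high - low) * (size - 1))

def lookup(table, a, b, limits_a, limits_b):
    """Read a 2D look-up table at the grid point nearest to (a, b)."""
    row = nearest_index(a, limits_a[0], limits_a[1], len(table))
    col = nearest_index(b, limits_b[0], limits_b[1], len(table[0]))
    return table[row][col]

def prognosis(va_base, va_12, mac_base, mac_12):
    """Three-way prognosis from the 12th-month estimates versus baseline."""
    if va_12 > va_base and mac_12 < mac_base:
        return "good"   # acuity increased and macular thickness decreased
    if va_12 < va_base and mac_12 > mac_base:
        return "bad"    # the exact opposite for both measurements
    return "fair"       # mixed change (ties included, by assumption)

# Hypothetical 3 x 3 table of VA estimates over normalized inputs in [0, 1].
table = [[0.1, 0.2, 0.3],
         [0.2, 0.4, 0.6],
         [0.3, 0.6, 0.9]]
estimate = lookup(table, 0.9, 0.45, (0.0, 1.0), (0.0, 1.0))
```

Because the lookup is only an index computation and a memory read, it needs no network access and returns almost immediately, which matches the local, near-real-time behaviour described above.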

Conclusion
In this paper, a smart treatment-prognosis estimation system has been presented, which is based on a machine learning approach using the adaptive neuro-fuzzy inference system. The technique has been applied to a retrospective data set related to the progress of wet AMD patients treated with specific anti-VEGF agents. Values corresponding to the visual acuity and macular thickness at the beginning and at 4 months during the treatment were used as inputs to the training system. Using these input parameters, the actual values at the 8th and 12th months were predicted. From Table 1, the mean-squared errors in the estimates for VA and MAC for both the 8th and 12th months are less than 1%. This shows the level of accuracy in the estimates compared with their actual values in the dataset. Similarly, all p-values are < 0.0001, indicating a very high correlation between the estimates and the actual values. The correlation of the estimates with the actual output for both VA and MAC was found to be 90% or above in all cases, showing the close resemblance of the estimated values to the corresponding actual values.
It is postulated in this work that the developed machine learning model can be very useful for physicians in their decision-making process for continuing or discontinuing a specific medicine based on the projected outcomes after only four months of treatment. This will not only be of great advantage for the patient, whose treatment can be streamlined at a very early stage, but will also save a large amount of money and resources in administering these injections, thus reducing unnecessary load on the public healthcare system.
Funding This work is not funded by any academic, industrial, or governmental agency.

Declarations
Conflict of interest Authors certify that THEY have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.