Predictors for Outcomes related to upper Extremity Musculoskeletal Disorders in a healthy Working Population: a methodological Approach

The aim was to investigate the association between different clinical endpoints and the presence of upper extremity work-related musculoskeletal disorders (WMSDs) in a healthy working population. Furthermore, the influence of socio-demographic, work-related and individual predictors on different endpoints was examined. Two self-completion questionnaires were administered to 70 workers and employees. In addition, a standardized physical examination and an industry test were performed in this cross-sectional study. Correlations between WMSDs and clinical endpoints were analysed with the Spearman method. Depending on the type of dependent endpoint, linear or logistic multivariate regression models were used to study the strength of associations with a pre-defined set of potential influencing factors.

only consider some of these risk factors simultaneously, which makes it difficult to make a statement about the influence of competing factors [16].
Furthermore, the study designs and research methods used to collect and process data on WMSDs are extremely heterogeneous. Our literature search revealed that out of 262 original research articles, only about 4% of the studies contained a clinical examination of the subjects with or without structured questionnaires / interviews including a control group [17].
The aim of this analysis was to investigate the association between different clinical endpoints, which are likely to be measured in the clinical routine but usually just represent surrogates for an actual health problem, and the presence of upper extremity WMSDs in a healthy working population. It was also intended to take into account potentially predisposing sociodemographic, work-related and individual characteristics ("risk factors").

Study Population
The study was conducted at the headquarters and main production site of Aesculap AG, Tuttlingen / Germany with about 3500 workers and employees. This company is the world market leader in the field of surgical instruments. The study population was divided into three groups based on their occupational activities: group I = grinding and polishing characterised by repetitive and forceful exertions ("grinding"), group II = inspection and packaging characterised by repetitive exertions without force ("packaging"), and group III = all other white-collar and blue-collar employees as a cross-section of the company as a control group ("control").

Study design and data collection
Random samples of active white-collar and blue-collar workers were drawn from the three groups using statistical software and recruited between September 2017 and March 2018. No incentives were offered. Eligibility criteria were applied according to our previous publication [17]. The participants were asked to fill in two standardised self-administered questionnaires. The first questionnaire obtained demographic and personal data, considered as predictors in the analyses, such as sex, handedness, secondary occupation, sporting and physical hobbies, age, height and weight (body mass index), volume of employment, and years of service.

Clinical endpoints and signs for WMSDs

DASH score
As the second questionnaire, the validated Disabilities of the Arm, Shoulder and Hand (DASH) outcome measure was applied to assess physical function and symptoms [18,19]. The results of this 30-item questionnaire were used to calculate a scale score ranging from 0 (no disability) to 100 (most severe disability).
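To illustrate how the 0 to 100 scale score is obtained, the following minimal Python sketch applies the scoring rule published with the DASH instrument: each item is rated from 1 to 5, the mean of the answered items is shifted by 1 and multiplied by 25, and no score is calculated when more than three of the 30 items are missing. The function name and the use of None for unanswered items are assumptions made for this example. They are not taken from the analysis code of the study.

```python
from typing import Optional, Sequence

N_ITEMS = 30      # number of items in the DASH questionnaire
MAX_MISSING = 3   # scoring rule: at most 3 items may be left unanswered


def dash_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the DASH scale score (0 = no disability, 100 = most severe).

    `responses` holds 30 item ratings from 1 to 5; None marks a missing item
    (an assumption of this sketch). Returns None when the questionnaire is
    too incomplete to be scored.
    """
    if len(responses) != N_ITEMS:
        raise ValueError("the DASH questionnaire has exactly 30 items")
    answered = [r for r in responses if r is not None]
    if any(r < 1 or r > 5 for r in answered):
        raise ValueError("item ratings must lie between 1 and 5")
    if N_ITEMS - len(answered) > MAX_MISSING:
        return None  # too incomplete, excluded from analysis
    # Mean rating shifted to start at 0 and rescaled to the 0-100 range.
    return (sum(answered) / len(answered) - 1) * 25
```

A questionnaire answered with the lowest rating throughout yields 0, one answered with the highest rating throughout yields 100, and a questionnaire with four or more missing items is not scored.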

VAS
The intensity of pain was measured on a Visual Analogue Scale (VAS). A Likert scale between 1 (no pain) and 10 (maximal pain) points was used to collect the subjective evaluation of pain at rest and pain under strain. Results were analysed as continuous parameters.

ROM
Range of motion (ROM) measurements of wrist joint mobility of both hands were performed using a goniometer in the three planes extension / flexion (E/F), supination / pronation (S/P) and ulnar / radial abduction (U/R) according to the neutral zero method [20].

Grip strength
Maximum hand force was measured with the Jamar dynamometer in three consecutive passes per side.

PPB Test
The neurophysiological Purdue Pegboard (PPB) Test was performed to determine the dexterity of the participants [21]. As a former industrial test, it now serves primarily to assess disabilities and limitations. The pegboard consists of a board with two parallel rows of 25 holes each into which cylindrical metal pegs are placed by the examinee. The test involves a total of four trials [22]. The subsets for preferred, non-preferred, and both hands require the test person to place the pins in the holes as quickly as possible, with the score being the number of pins placed in 30 sec. Purdue Pegboard trial number 4 was chosen as the representative clinical endpoint as it summarises trials numbers 1 to 3 well by adding them up, and therefore should show the potential differences more clearly.

Subjective complaints
The presence of subjective complaints on the upper extremity of a test person was assessed in the form of a yes / no type of self-reported question.
Diagnoses were made based on pathognomonic clinical signs for upper extremity pathologies, after selection by a multidisciplinary team consisting of an occupational physician, hand surgeon and occupational therapist [23]. These included pressure pain on the radial side of the wrist together with Finkelstein's test for De Quervain's tenosynovitis, lateral epicondyle pain and Maudsley's test for tennis elbow, and the combination of Hoffmann-Tinel sign and static two-point discrimination (2-PD) for finger sensibility for nerve entrapment syndromes, adding Phalen's test specific for carpal tunnel syndrome [24][25][26][27]. The presence of any of the above diagnoses in a study participant was considered to be a WMSD diagnosis, which served as gold standard for correlation analyses with other clinical endpoints and predictors.

Statistical Analysis
The study was initially powered for the proof of the hypothesis of clinical equivalence of occupational groups in regard to the DASH score [17]. This hypothesis could be confirmed by primary analysis.

Correlations
In the current analysis, correlations between variables were assessed using Spearman correlation coefficients and their p-values. Prior to the calculation of correlation coefficients, bivariate scatter plots were visually examined in order to investigate interrelationships between the clinical endpoints. Prediction ellipses were applied to the scatter plots. Because the ellipse is centred at the two-dimensional mean and expanded to cover the maximal part of the data points, it can visually indicate the strength of interrelation as well as outliers in the data. A stretched, tilted ellipse indicates highly correlated variables, whereas an ellipse that is nearly circular indicates little correlation. Multivariate regression modelling of clinical endpoints was applied in order to identify their significant predictors, based on the p-value of the effect. Linear or logistic regression was used depending on whether the endpoint variable was continuous or dichotomous.
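The geometry of such a prediction ellipse can be sketched in a few lines of Python. The example below is an illustration only and not the SAS procedure used in this study: it centres the ellipse at the bivariate mean, takes the axis directions and lengths from the sample covariance matrix, and scales them with the chi-square quantile for two degrees of freedom, which is a large-sample approximation (statistical packages typically apply an F-based scaling for small samples). The names Ellipse and prediction_ellipse are hypothetical.

```python
import math
from typing import NamedTuple, Sequence, Tuple


class Ellipse(NamedTuple):
    center: Tuple[float, float]
    semi_major: float
    semi_minor: float
    angle_deg: float  # tilt of the major axis against the x-axis


def prediction_ellipse(x: Sequence[float], y: Sequence[float],
                       level: float = 0.95) -> Ellipse:
    """Approximate prediction ellipse for paired data (chi-square scaling)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y need the same length and at least 3 points")
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")

    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample variances and covariance (denominator n - 1).
    sxx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    syy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

    # Closed-form eigenvalues of the symmetric 2x2 covariance matrix.
    half_trace = (sxx + syy) / 2
    root = math.hypot((sxx - syy) / 2, sxy)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)  # guard against rounding below zero

    # Quantile of the chi-square distribution with 2 degrees of freedom.
    scale = -2.0 * math.log(1.0 - level)
    # Direction of the major axis.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return Ellipse((mean_x, mean_y),
                   math.sqrt(lam_major * scale),
                   math.sqrt(lam_minor * scale),
                   math.degrees(angle))
```

The ratio of the two semi-axes reflects the visual rule described above: a ratio close to 1 corresponds to a nearly circular ellipse and little correlation, whereas a large ratio corresponds to a stretched ellipse and strongly correlated variables.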

Repeated measurements
Some endpoints in our selection comprise repeated measurements in the same subject, such as left- and right-hand measurements of hand force or range of motion. Additionally, the repetition scheme may have included 3 subsequent attempts with each arm (hand force) or a recording of 3 dimensions (E/F, S/P, U/R) for range of motion. Adequate use of such repeated data structures required more sophisticated repeated-measurement modelling methods, rather than just using the mean of the three attempts, in order to avoid loss of information and statistical power.

Handedness
Considering handedness was more challenging than it appears. The study population included four types: left-handed, right-handed and their corresponding mixed types. For analytic purposes, left- and right-handed subjects were put together with their corresponding mixed types.
Measurements referring to the right or left body side were analyzed as referring to the primary or to the non-primary hand in order to evaluate the effect of the handedness.
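The recoding from body side to primary and non-primary hand can be sketched as follows. This is a minimal Python illustration under stated assumptions: the four handedness labels and the function name are hypothetical, since the original category names are not given here, and mixed types are grouped with their corresponding pure type as described above.

```python
from typing import Dict

# Hypothetical labels for the four handedness types; each mixed type is
# grouped with the corresponding pure type.
PRIMARY_SIDE = {
    "right": "right",
    "mixed_right": "right",
    "left": "left",
    "mixed_left": "left",
}


def by_primary_hand(handedness: str, left_value: float,
                    right_value: float) -> Dict[str, float]:
    """Relabel a left/right measurement pair as primary/non-primary hand."""
    side = PRIMARY_SIDE[handedness]  # raises KeyError for unknown labels
    if side == "right":
        return {"primary": right_value, "non_primary": left_value}
    return {"primary": left_value, "non_primary": right_value}
```

With this recoding, a grip strength of 40 on the left and 35 on the right is stored as primary = 40 for a left-handed subject and as primary = 35 for a right-handed subject, so that the effect of handedness can be evaluated independently of body side.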

Descriptive statistics
Approximately normally distributed continuous data were described by mean values with standard deviations (SD) in brackets. Median and interquartile ranges in brackets were shown for non-normally distributed continuous data [28]. We calculated absolute and relative frequencies for categorical variables. Numerators and denominators of the calculations were always given in parentheses when reporting percentages in categorical data.

Significance level
As the study sample of 70 subjects might lack power to detect small effects, the regression results must be considered with caution. The classic 5% significance level does not seem appropriate for exploration purposes, so effects showing p-values up to 0.1 will be closely considered. This procedure follows our aim to figure out contributors to clinical endpoints for future studies rather than provide evidence for correlations in our study sample.
SAS software version 9.4 with Enterprise Guide 7.1 GUI was used for analyses.
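The reporting conventions for descriptive statistics described above can be reproduced with a short Python sketch. It is offered as an illustration only, since the study itself used SAS: the helper names are hypothetical, and the quartile definition of Python's statistics module may differ slightly from the SAS default.

```python
import statistics
from typing import Sequence


def mean_sd(values: Sequence[float], digits: int = 1) -> str:
    """Format approximately normal data as 'mean (SD)'."""
    return (f"{statistics.mean(values):.{digits}f} "
            f"({statistics.stdev(values):.{digits}f})")


def median_iqr(values: Sequence[float], digits: int = 1) -> str:
    """Format non-normal data as 'median (Q1 to Q3)'."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return (f"{statistics.median(values):.{digits}f} "
            f"({q1:.{digits}f} to {q3:.{digits}f})")


def percent(numerator: int, denominator: int) -> str:
    """Format a frequency as 'percent% (numerator/denominator)'."""
    return f"{round(100 * numerator / denominator)}% ({numerator}/{denominator})"
```

For example, percent(47, 70) returns "67% (47/70)", which corresponds to the notation used for categorical data in the results.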

Recruitment and baseline characteristics
The total population consisted of 63 persons in group I, 208 in group II and 2501 in group III. Random samples were drawn from these groups after completion of the questionnaires and proving eligibility, so that a total of 70 subjects (grinding n = 20, packaging n = 24, control n = 26) were included in the study. The subjects were predominantly men (67% (47/70)) and right-handed individuals (83% (55/70)). Only a few had a secondary occupation (9% (6/70)) and 61% (43/70) reported having sporting or physical hobbies. The three groups had comparable demographic data with regard to age (42.1 (12.2) years), body mass index (BMI) (26.2 (5.0) kg/m²), full volume of employment (91% (64/70)) and years of service at the company (16.1 (9 to 28) years). For the study population flowchart and detailed demographic data in the individual groups, please refer to our previous publication [17].

Clinical endpoints
The DASH score, clinical parameters and the Purdue Pegboard (PPB) Test score are shown in Table 1. Our mean scores are in good agreement with the normative DASH score [29]. Note that lower DASH scores are associated with a better situation. Three DASH questionnaires were excluded from the analysis because of incompleteness. We did not expect any relevant differences between the grip strength of the right and left hands; therefore, we used the parameter dominant / non-dominant hand for further analysis. When comparing grip strength with reference values from a healthy population, subdivided according to sex and age group, below-average values were found [30]. This was also the case with the PPB test [31]. (Table 2). Bilateral manifestation was present in 34% (24/70), and 14% (10/70) of the subjects had two or more different pathologies in the ipsilateral limb. In the case of a positive Hoffmann-Tinel sign, 91% (21/23) of these were located at the medial elbow as a sign of ulnar tunnel syndrome.

Correlation of WMSDs with other clinical endpoints
The relationship between WMSDs and the DASH score is shown as an example using a scatter plot with a prediction ellipse (Fig. 1) [32]. Further correlations between grip strength and ROM E/F, and between VAS at rest and under strain, DASH and subjective complaints became apparent.
A closer look at the scatter plot reveals an obvious floor effect in the DASH distribution. DASH values of subjects with WMSDs concentrate more in the 10-point range, whereas the "no WMSD" group very often has a DASH score of 0. This causes the slight tilt of the ellipse, corresponding to the Pearson correlation coefficient, which, to a first approximation, visualizes an interrelation between the DASH score and WMSDs. However, the non-normality argues against the application of the parametric Pearson correlation and calls for a switch to the non-parametric Spearman correlation. Its coefficient is indeed much more sensitive to this effect, due to its robustness towards non-normal distribution of the DASH values. Likewise, the normality assumption does not hold for the ROM and VAS distributions, so the Spearman method is to be preferred for evaluating the interactions between WMSDs and these endpoints.
Bivariate correlation coefficients between clinical endpoints and the diagnosis of WMSDs are shown in Table 3. Only two of the collected clinical endpoints were associated with a p-value below 0.1 (presented in bold letters): the DASH score and VAS under strain. Both endpoints are positively correlated with WMSDs, meaning that higher scores are associated with a WMSD diagnosis. These results also make sense from a clinical point of view since both DASH scores and VAS associate higher scores with worse outcomes.
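The difference between the two coefficients can be made concrete with a small, self-contained Python sketch. It is a didactic reimplementation rather than the SAS procedure used for Table 3: the Spearman coefficient is computed as the Pearson coefficient of the ranks, with tied values (such as the many DASH scores of 0) receiving their average rank. The function names are chosen for this example only.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the ordering of the values enters the calculation, a single extreme score changes the Pearson coefficient but leaves the Spearman coefficient untouched, which is why the rank-based method is better suited to skewed endpoints with a floor effect.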

Multivariate regression analyses of predictors
The relationship between the pre-specified predictors and the occurrence of WMSDs was investigated through a binary logistic regression (Table 4). None of these factors appeared to be a risk factor for WMSDs; the minimal p-value was 0.14. In addition to known and obvious correlations such as gender dependence for DASH with higher values in women, or higher grip strength in men, there was a negative correlation between ROM and years in service, more frequent subjective complaints of the upper extremities with increasing age, pain under strain on the VAS at a higher BMI, and higher grip strength when secondary occupation and/or physically demanding hobbies were performed [33].
A statistical effect may be observed in Table 5, with by far the lowest p-values concentrated in the analysis of grip strength and, to some extent, of the ROM. This is not really surprising as there were 3 repetitions recorded for every measurement. The additional information gave the analysis of variance more statistical power, i.e. more sensitivity for detecting effects.

Discussion
We chose to survey a population of actively employed surgical device mechanics and compared them with a group of employees believed not to be exposed to repetitive hand and arm movements to such a large extent.
The clinical endpoint values of our study population were largely consistent with the reference values of the general population, but in some cases (grip strength, PPB tests) also showed below-average values. This is surprising and contradicts the available evidence, as our cohort tends to have an above-average physical load [51]. Reasons could be mechanical support, shorter working hours and a historical shift in population reference values independent of the first two points.
The aim of our analysis was to evaluate different clinically established endpoints representing surrogates for an actual health problem, for association with upper extremity WMSDs in terms of a technical evaluation of the survey methods. We also investigated the influence of socio-demographic, work-related and individual independent predictor variables as potential risk factors for WMSDs. The methodological approach is intended to be a proposal for a standardized procedure for future cross-sectional studies of this kind.
In the bivariate analysis we found a correlation between the DASH score and WMSDs as well as VAS under strain and WMSDs. A simple clinical explanation for this could be that pathology (WMSDs) manifests itself through pain, especially when the hand is used forcefully. This aspect is a common feature of the DASH questionnaire and also manifests itself with VAS under strain. VAS at rest, ROM, grip strength, PPB Test and the indication of subjective complaints by the study participants were not suitable to detect WMSDs in our study. However, the question of subjective complaints is a central component of many studies and the basis for their interpretation.
Regarding the upper extremity, the validity of questionnaires for WMSDs has not been clarified and it is not known how an optimal questionnaire can be constructed and what information can be obtained [52]. A purely technical investigation using measurement data without clinical examination would have the advantage of resource optimization but could not be related to WMSDs either [53]. This is also supported by the relatively weak correlation between measured clinical endpoints and WMSDs in our study. Accordingly, a clinical examination based on a predetermined set of diagnostic criteria remains the gold standard for cross-sectional investigations in order to keep well-defined disorders separate from more diffuse conditions.
Although the clinical examination is time-consuming and demanding for both the subject and the examiner, it seems to be necessary so far to detect defined WMSDs.
Although the p-value has not quite reached the conventional significance level of 0.05 in our study (p = 0.056), the correlation we found between WMSDs and the DASH score should not therefore be considered non-existent and could be of interest for the design of future studies [54]. The validated DASH score, as a self-administered, region-specific outcome instrument for upper-extremity disability and symptoms, was tested against the gold standard of WMSD detection (i.e. the clinical examination). To our knowledge, there are no studies focusing on the correlation between WMSDs and the DASH score, whereas this is the case for some upper extremity pathologies other than WMSDs [55]. According to our analysis, the DASH score has the potential to replace the resource-intensive clinical examination as a screening tool. In case of conspicuous DASH scores, the latter could be used in a focused manner for diagnosis, with therapeutic and preventive measures derived from it. In comparison, the Nordic Musculoskeletal Questionnaire (NMQ), often used in cross-sectional studies, is a simple validated questionnaire that refers to complaints in 9 body parts, including the hand/wrist/elbow [56]. Its content is in no way comparable to the detailed questions of the DASH with its focus on the upper extremity, and hardly exceeds the yes/no question on subjective complaints in our study. To what extent further questionnaires are suitable for the detection of upper extremity pathologies will be the subject of future studies.
For the endpoint WMSD, the multivariate analysis of our study did not show any independent predictors significant at a 0.1 level. This is partially in contrast to previous studies, in which work-related and sociodemographic characteristics have been determined as predisposing to upper extremity disorders [34]. The former include static postures, excessive force and strain, vibration, repeated pushing, pulling and lifting, overuse of particular anatomical structures or regions, poor posture or improper positioning, awkward movements, long duration of pressure, rapid work pace, short recovery periods, low decision latitude, years of service and job satisfaction [57][58][59][60][61]. Socio-demographic characteristics predicting WMSDs include factors like sex, age, marital status, work experience, body mass index and physical activities [62][63][64][65][66][67]. It is advisable to select the characteristics of personal factors, physical body functions, environmental factors and mental body functions based on the International Classification of Functioning, Disability and Health (ICF) [68]. The scatterplot matrix with prediction ellipses has proven to be a fast graphic analysis and a preliminary stage for a detailed statistical evaluation in our study.
Multivariate analyses were also used to examine the independent predictors for other clinical endpoints. The associations between ROM and years in service, between increasing age and more frequent subjective complaints of the upper extremity, between a higher BMI and higher VAS under strain, and between the presence of a secondary occupation and/or physically demanding hobbies and higher grip strength are particularly noteworthy. These findings are relevant for future investigations, since the relationships of independent variables to each other may interfere with the identification of risk factors for WMSDs, acting as confounders. These relationships should therefore be considered when analyzing any WMSD-related outcomes.
From a statistical point of view, the study provided good pre-conditions with regard to the proportion of WMSDs in our cohort (56%). As this proportion was close to the perfect balance between WMSD-positive and WMSD-negative cases, the power for group comparison was near the maximum for the given sample size. The very low p-values for effects in the analysis of grip strength (see Table 5) probably resulted from the increase in power due to the higher number of individual measurements (2 × 3 measurements per subject, i.e. total n = 420 in 70 subjects). However, the variance analysis employed in regression models for the other endpoints, each measured only once, lacks this amount of information regarding the variance of the measurements.
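The balance argument can be quantified with a brief Python sketch. It rests on a textbook simplification that is an assumption of this example and not a calculation reported in the study: for a fixed total sample size and equal variances in both groups, the variance of a difference between two group means is proportional to 1/(p(1 - p)), where p is the proportion of subjects in one group, so the information relative to a perfectly balanced split is 4p(1 - p). The function names are hypothetical.

```python
def relative_efficiency(proportion_positive: float) -> float:
    """Information of a two-group comparison relative to a 50:50 split.

    Assumes equal variances in both groups and a fixed total sample size.
    """
    if not 0.0 <= proportion_positive <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return 4.0 * proportion_positive * (1.0 - proportion_positive)


def total_grip_measurements(subjects: int, hands: int = 2,
                            attempts: int = 3) -> int:
    """Number of individual grip strength readings across all subjects."""
    return subjects * hands * attempts
```

Under these assumptions, a WMSD proportion of 56% retains about 98.6% of the information of a perfectly balanced sample, and 70 subjects with 2 × 3 readings each contribute the 420 grip strength measurements mentioned above.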
We intentionally tried to avoid the term "significant" in regard to this analysis, in order not to refer our reported p-values to the conventional 5% significance level which may be prone to misinterpretation [54,69]. Using the concept of hypothesis testing in the scientifically accurate way, setting a significance level would require a multiplicity correction for a number of pre-defined tests. Such explicit correction would, on the other hand, take away our flexibility to follow effects and relations in our data, which contains a complex network of endpoints and their predictors. For this reason, this secondary analysis was clearly explorative. This means that the p-values shown in the tables are not referred to any significance level. They rather provide continuous information of how effects are related or ranked according to their strengths. In this respect, Table 3 in the results section has to be considered as identification of two potential candidates for appropriate surrogate measures for WMSD prevalence. However, real evidence for the adequacy of any of these candidates for WMSD detection in clinical use has to be generated by a dedicated study.
The DASH score was considered to be the surrogate endpoint of choice for our primary analysis of WMSD prevalence among medical device manufacturing employees. The current analyses confirm that DASH remains the preferred choice when assessing the WMSD status of a population.
Regarding the limitations of our study, it should be noted that cross-sectional studies always represent a snapshot and no statement can be made about the duration of an existing WMSD. In particular, it is not possible to clearly distinguish between chronic, recurrent, or acute diseases. As in other studies, we also focused on a manageable number of potential risk factors, since an increasing number of predictors increases the probability of false positive effects, especially in smaller samples. This makes it difficult to assess these effects as a whole. Due to the single investigator approach, there is a risk of systematic error for over-sensitive detection of WMSDs, which is indicated by the higher number of diagnoses compared to symptoms in our study. On the other hand, the examination was performed by the same hand surgeon, which may have led to the diagnosis at an earlier stage than in a clinical setting. However, reducing the systematic error by a multiple investigator approach would have brought an inter-observer error into play, arising as a result of different teaching backgrounds and subjective assessments. Repeating the physical examination tests by having two investigators examine the same person was not an option for the authors. A major reason for this is that most test results depend on the announcement of symptoms (pain, numbness, etc.) and subjects learn during follow-up examinations, which limits the objectivity of such study designs [53]. Even though the cross-sectional design of our study does not permit causal inference, the observed relations provide valuable evidence for further research and policy making. For further limitations with regard to the three occupational activities, we would like to refer to our previous publication [17].

Conclusions
Methods used to collect, process and interpret data on WMSDs are extremely heterogeneous, so the comparability between studies is poor. This study evaluated survey methods and assessment tools for the detection of upper extremity WMSDs and their associations in a healthy working population.
While the most frequently used questionnaires focus on subjective complaints that do not seem to be related to WMSDs, the DASH questionnaire could prove to be an efficient screening method. However, the gold standard for the detection of WMSDs, but also for the derivation of prophylactic, therapeutic and rehabilitative measures, is still the standardised physical examination based on a predetermined set of diagnostic criteria.
Our analysis has not identified any risk factors for WMSDs in the study data. Possibly, the effects of investigated risks were too small to be detected by our relatively small study sample.
In order to make epidemiological research on upper extremity WMSDs more comparable, a uniform study design in regard to endpoint selection is recommended. We hope that the methodological results of our work will help other researchers to obtain more efficient and consistent tools for the research on upper extremity WMSDs.

Authors' contributions
The following authors have made substantial contributions to the conception: OL, JM and TL; design of the work: TL and JM; statistics: OL and VB; acquisition and analysis: TL and OL; interpretation of data: OL, JM and VB; drafting and revision of the work: OL and VB. All the authors read and approved the final manuscript.