We found strong correlations between the composite scores derived by the two methods for both the RAND-36 and RAND-12. The associations of the oblique and the unweighted composite scores with other variables also had comparable effect sizes, indicating similar convergent validity.
To the best of our knowledge, this is the first study to report both the criterion validity and the convergent validity of unweighted RAND-36/12 composite scores. However, two prior studies have reported the criterion validity of the RAND-36 or RAND-12 composite scores using two other methods for constructing unweighted scores. Grassi et al. used data from the European Community Respiratory Health Survey and compared SF-36 composite scores derived from oblique PCA with those from an unweighted scoring system. The unweighted PCS was calculated as the sum of 18 items, while the MCS included 19 items. The correlation between the oblique and unweighted PCS was 0.97, and that between the oblique and unweighted MCS was 0.96. The correlation between the unweighted PCS and MCS was 0.61.
Hagell et al. used data from people with Parkinson’s disease and stroke to compare unweighted SF-12 composite scores with scores derived from the RAND-12 HSI algorithm, which produces results similar to scores based on oblique PCA. The unweighted PCS was calculated as the raw sum of six items, while the MCS was the raw sum of the six remaining items. The correlation between the weighted and unweighted PCS was 0.99, as was that between the weighted and unweighted MCS. The correlation between the unweighted PCS and MCS was 0.68.
The scoring methods in these two studies differed slightly from ours: they summed items to create raw scores, whereas we used unweighted linear combinations of subscale scores, each based on items standardized to range from 0 to 100. We think that a two-step method, which first scores the subscales and then combines them into composite scores, is more intuitive because the subscales have different numbers of items. However, the practical difference between our approach and the two other unweighted approaches to scoring the composites seems to be minor. This is not surprising, given the strong correlations between the items that contribute to the RAND-36/12 composite scores.
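The two-step logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scoring syntax used in the study: the assignment of subscales to the physical and mental composites, the use of a simple mean as the equal-weight combination, and all item values are hypothetical, and the items are assumed to be already recoded to the 0 to 100 range.

```python
from statistics import fmean

# ASSUMPTION: illustrative subscale groupings, not taken from the study.
PHYSICAL = ("physical_functioning", "role_physical", "bodily_pain", "general_health")
MENTAL = ("vitality", "social_functioning", "role_emotional", "mental_health")


def subscale_score(item_scores):
    """Step 1: average the items of one subscale (items already on 0-100)."""
    return fmean(item_scores)


def unweighted_composite(subscales, names):
    """Step 2: equal-weight mean of the named subscale scores."""
    return fmean(subscales[name] for name in names)


# Made-up respondent; note that the subscales differ in number of items.
items = {
    "physical_functioning": [100, 100, 50, 50, 100, 100, 50, 50, 100, 100],
    "role_physical": [100, 100, 0, 0],
    "bodily_pain": [80, 60],
    "general_health": [75, 50, 75, 50, 50],
    "vitality": [60, 40, 60, 40],
    "social_functioning": [75, 75],
    "role_emotional": [100, 100, 100],
    "mental_health": [80, 60, 80, 60, 60],
}

subscales = {name: subscale_score(values) for name, values in items.items()}
pcs = unweighted_composite(subscales, PHYSICAL)
mcs = unweighted_composite(subscales, MENTAL)
print(f"PCS = {pcs:.2f}, MCS = {mcs:.2f}")
```

Because each subscale is scored before it is combined, a ten-item subscale and a two-item subscale contribute equally to the composite, which a raw sum of items would not guarantee.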
We found that the correlations between the unweighted RAND-36/12 PCS and MCS were weaker than those between the scores created from oblique PCA. A reason for this is that oblique PCA produces weights for creating PCS and MCS that increase the correlation between these scores. In the unweighted approach, no constraints are imposed, and the PCS and MCS are completely free to correlate. This could be a strength favouring unweighted RAND-36/12 composite scores, as correlations approaching 0.80 may induce multicollinearity if the PCS and MCS are used as independent variables in the same model.
Regarding convergent validity, the associations between the oblique versus the unweighted RAND-36/12 composite scores and other variables had comparable magnitudes. An exception was that age was more strongly correlated with the unweighted PCS scores than with the oblique ones. This could reflect that the oblique PCS scores were based on all subscales, which were negatively, neutrally, or positively correlated with age. There also seems to be a subtle tendency for the oblique PCS and MCS to have more similar effect sizes than the unweighted PCS and MCS. This probably reflects the stronger correlations between the oblique PCS and MCS.
The strengths of this study include a sufficiently large sample from a general population and the examination of convergent validity. A limitation of the study is that weight, height, physical activity, rheumatic disease, and depression were assessed by self-report. However, the included measures have been shown to have acceptable validity [29–31].
The main implication of this study is that we can keep the calculation of the RAND-36/12 composite scores simple. This has several advantages, such as the standardization of scoring across studies and populations. In this paper, we calculated composite scores ranging from 0 to 100, but the data can easily be converted to T-scores, if needed. It might also be possible to merge datasets with composite scores derived from both the RAND-36 and RAND-12 using T-scores. It should be emphasized that our findings do not imply that weighted composite scores of HRQoL are never useful, or that prior studies using different oblique composite scores for the RAND-36/12 have led to erroneous results. However, we propose that when creating composite scores from highly correlated subscale scores, weighting is likely to be redundant. This knowledge should also be useful to consider when developing composite scores for new HRQoL instruments.
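The conversion to T-scores mentioned above is the standard linear transformation that gives a mean of 50 and a standard deviation of 10 in a reference population. A minimal sketch follows; the function name and the reference mean and standard deviation are hypothetical placeholders, and real values would have to come from an appropriate normative sample.

```python
def to_t_score(score, ref_mean, ref_sd):
    """Convert a 0-100 composite score to a T-score (mean 50, SD 10)."""
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    return 50.0 + 10.0 * (score - ref_mean) / ref_sd


# ASSUMPTION: made-up reference values, for illustration only.
REF_MEAN = 70.0
REF_SD = 20.0

t_score = to_t_score(65.0, REF_MEAN, REF_SD)
print(f"T-score = {t_score:.1f}")
```

Applying the same reference values to RAND-36 and RAND-12 composites would place both on a common metric, which is the precondition for the merging of datasets suggested above.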