Unsupervised neural network for evaluating the ability of the SF-36 instrument to differentiate individuals

Saeedeh Pourahmad,1,2 Peyman Jafari2 and Sara Ghodsi2

1Bioinformatics and Computational Biology Research Center, Shiraz University of Medical Sciences, Shiraz, Islamic Republic of Iran. 2Biostatistics Department, Shiraz University of Medical Sciences, Shiraz, Islamic Republic of Iran. (Correspondence to: S. Pourahmad).


Background: Health-related quality of life (HRQoL) and well-being refer to the positive, subjective state that is contrary to illness. HRQoL instruments include some common questionnaires, which may often be understood differently depending on the level of individuals’ knowledge.

Aims: To investigate the ability of the 36-Item Short Form Health Survey (SF-36), a well-known questionnaire, to evaluate people’s well-being.

Methods: We compared unsupervised artificial neural networks with a self-organized map learning algorithm and k-means clustering method. Understanding of the content of the questionnaire was also checked according to age group and sex. The study included 1087 people aged > 18 years (640 healthy individuals and 447 patients with chronic diseases) in Shiraz, Islamic Republic of Iran between 2011 and 2013.

Results: The eight subscale scores of the SF-36 instrument were not able to evaluate the well-being of people. The ability of all 36 items in the questionnaire was > 60% in both self-organized map and k-means methods. The self-organized map learning algorithm evaluated people better than the k-means clustering method, based on the accuracy rate in prediction. The SF-36 instrument was better understood by young people.

Conclusions: Differences in people’s health conditions may not appear on the SF-36 subscale scores; therefore, the findings from the subscale scores of SF-36 should be cautiously interpreted.

Keywords: SF-36 instrument, unsupervised artificial neural network, k-means clustering, self-organized map, learning algorithm

Citation: Pourahmad S; Jafari P; Ghodsi S. Unsupervised neural network for evaluating the ability of the SF-36 instrument to differentiate individuals. East Mediterr Health J. 2019;25(11):769-774. https://doi.org/10.26719/2019.25.11.769

Received: 03/04/17; accepted: 28/11/17

Copyright © World Health Organization (WHO) 2019. Some rights reserved. This work is available under the CC BY-NC-SA 3.0 IGO license (https://creativecommons.org/licenses/by-nc-sa/3.0/igo).


Health-related quality of life (HRQoL) and well-being refer to the positive, subjective state that is contrary to illness. These concepts are often described in terms of a multidimensional index including physical, social, emotional or psychological, intellectual, and spiritual components (1). Well-being is defined as the combination of positive–negative affect balance and satisfaction with life. Measures of HRQoL and well-being comprise measurement of overall functioning. The World Health Organization (WHO) has developed a number of questionnaires to measure these terms. These instruments have been translated into many languages (1).

An important question in this field is whether these questionnaires are able to evaluate and distinguish individuals’ well-being. HRQoL is an independent predictor of health conditions and can be used as a screening tool in clinical practice (1). Hence, it would be valuable to assess the ability of these instruments to evaluate individuals’ well-being. To investigate this ability, a wide range of statistical methods may be used, including clustering and classification approaches (2). Recently, soft computing techniques such as artificial neural networks have been applied to HRQoL research. Artificial neural networks are used to solve uncertain or vague problems and are helpful for the analysis of massive data. These methods and other similar data-mining techniques are widely used in different medical fields. Some recent works include disease diagnosis (3), imaging analysis (4,5), predicting disease status (6), identification of important risk factors for disease (7), and modelling survival of patients (8). A few studies with artificial neural networks have included finding cut-off scores for HRQoL of people with incontinence problems (9), identifying heart failure (10), HRQoL in diabetes (11), HRQoL after breast cancer surgery (12), HRQoL in Parkinson’s disease (13), and predicting the response to a standard 4-week interdisciplinary pain programme (14).

Accordingly, the main objective of the present study was to investigate the ability of the 36-Item Short Form Health Survey (SF-36) to differentiate healthy and unhealthy individuals. Two different statistical methods were compared for this purpose: an unsupervised artificial neural network with a self-organized map learning algorithm and the k-means clustering method. The accuracy rate and the area under the receiver operating characteristic (ROC) curve were used as the evaluation criteria. In addition, the present study investigated whether comprehension of the questions differed by age and sex.



The study included 1087 people (56.9% female) aged >18 years [mean age 49.3 (standard deviation 13.8) years]. There were 640 healthy individuals and 447 patients with chronic diseases. Participants were selected randomly from patients referred to clinics affiliated with Shiraz University of Medical Sciences, Islamic Republic of Iran, during 2011–2013. The validity and reliability of the translated Persian language version of the SF-36 questionnaire (Cronbach’s α 0.7–0.9 in diverse studies) have been demonstrated in previous research (15). In the current study, one of the authors (PJ) was responsible for clarifying the possible questions of participants about the instrument and purposes of the research. The participants all gave signed informed consent. In addition, the study was approved by the Ethics Committee of Shiraz University of Medical Sciences. Healthy people were selected randomly from the parents of high-school students without any physical or psychological diseases according to their own admission.

HRQoL instrument

The SF-36 includes 36 items and covers 8 domains: physical functioning; role limitations due to physical problems; bodily pain; general health perceptions; vitality; social functioning; role limitations due to emotional problems; and perceived mental health. Most of the items are scored on a Likert response scale, although the questionnaire also has multiple-choice questions. Higher total scores represent better HRQoL. All the domains were transformed to a 0–100 scale. The health status of participants (healthy/unhealthy) was considered as the output and the 36 items of SF-36 were used as the predicting variables.
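The article does not state the formula behind the 0–100 transformation. A minimal sketch, assuming the usual linear rescaling of a raw domain score between its lowest and highest possible values, is given here; the function name and the example range are hypothetical.

```python
def to_0_100(raw_score, lowest_possible, highest_possible):
    """Linearly rescale a raw domain score to the 0-100 range."""
    if highest_possible <= lowest_possible:
        raise ValueError("highest_possible must exceed lowest_possible")
    if not lowest_possible <= raw_score <= highest_possible:
        raise ValueError("raw_score is outside the possible range")
    span = highest_possible - lowest_possible
    return 100.0 * (raw_score - lowest_possible) / span


# Hypothetical domain: 10 items, each scored 1-3, so raw scores span 10-30.
example = to_0_100(20, 10, 30)
```

Under this assumption, a raw score halfway between the two extremes maps to 50, and the extremes map to 0 and 100.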

Statistical analysis

Artificial neural networks are a simulated version of human biological neural systems and generally consist of layers. Each layer is composed of smaller linked units called neurons. Typically, 3 layers are considered for a network: an input layer, an intermediate layer (there may be >1 such layer) and an output layer. The number of neurons in the input layer is equal to the number of predicting variables. The number in the output layer depends on the output structure or target variable. Each neuron in a layer connects to 1 or more neurons of the next layer with different weights. The magnitude of the weights (w_(i,j)) represents the influence of 2 neurons on each other. These weights and some values named bias (b_i) are treated as the system parameters. The network parameters are estimated in the learning process. Indeed, the network learns the relations among inputs and outputs by updating the initial weights based on the learning algorithm. The training process stops when the mean square error between the system outputs and the target outputs is minimized (16). The arrangement of the various elements of the network, including neurons, layers and links, is called the topology.
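The roles of the weights, the bias and the mean square error can be illustrated with a small sketch. It assumes a single neuron with no activation function, and the function names and numbers are made up for illustration; it is not the network used in the study.

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias (no activation function applied)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def mean_square_error(outputs, targets):
    """Average squared difference between network outputs and target outputs."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


# Made-up values: two inputs feeding one neuron.
out = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.1)
mse = mean_square_error([1.0, 0.0], [0.0, 0.0])
```

Learning then amounts to adjusting the weights and bias so that the error measure decreases.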

Learning in neural networks is based on 2 approaches, supervised and unsupervised. In the supervised method, the input and output data are both given to the system, while in the unsupervised method, only the inputs are at hand. The network attempts to discover the pattern of the input data.

The self-organized map is a well-known learning algorithm in unsupervised artificial neural networks. It works based on winner neuron logic. The model in this type of network is produced by a learning algorithm that automatically orders the inputs on a 1D or 2D grid according to their mutual similarity. In a self-organized map, the winner neuron is determined and then updated in iterative steps. In each step, the neighbouring neurons of the winner neuron are also updated based on the Kohonen rule (17). The self-organized map learns to recognize clusters of similar inputs in such a way that neighbouring neurons in the layer respond to similar inputs. Each neuron $i$ in a self-organized map is represented by a d-dimensional weight vector (Equation 1):

$m_i = [m_{i1}, m_{i2}, \ldots, m_{id}]$

In each training step, one sample $x$ from the input data set is chosen. Then, the distances between $x$ and all the weight vectors of the self-organized map are computed. The neuron whose weight vector is closest to the input vector is called the winner neuron ($m_c$) (Equation 2):

$\|x - m_c\| = \min_i \|x - m_i\|$, where $\|\cdot\|$ represents the distance value
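The winner search can be sketched in a few lines. The sketch assumes the Euclidean distance (one of the two distance functions considered in the study); the function names and toy weight vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def find_winner(x, weights):
    """Return the index c of the weight vector closest to the input x."""
    return min(range(len(weights)), key=lambda i: euclidean(x, weights[i]))


# Toy data (made up for illustration): three neurons, 2-dimensional inputs.
weights = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
winner = find_winner([0.9, 1.2], weights)
```

In this toy case the second neuron (index 1) is the winner, because its weight vector lies nearest to the input.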

After finding the winner neuron, the weight vectors are updated so that the winner neuron is moved closer to the input vector in the input space. Also, the topological neighbours of the winner neuron are treated similarly.

The self-organized map update rule of the weight vector of unit i is (Equation 3):

$m_i(t+1) = m_i(t) + \alpha(t)\, h_{ci}(t)\, [x(t) - m_i(t)]$

Here, $x(t)$ is the input vector chosen at time $t$ and $h_{ci}(t)$ is the neighbourhood function that defines the kernel around the winner neuron $c$ (Equation 4):

$h_{ci}(t) = \exp\left(-\dfrac{\|r_c - r_i\|^2}{2\sigma^2(t)}\right)$

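Also, $\sigma(t)$ is the neighbourhood radius at time $t$, $r_c$ and $r_i$ are the positions of neurons $c$ and $i$ on the map grid, and $\alpha(t)$ is the learning rate at time $t$. The learning rate may be a linear function such as $\alpha(t) = \alpha_0 (1 - t/T)$, where $\alpha_0$ is the initial learning rate and $T$ is the training length, or a function inversely proportional to time, such as

$\alpha(t) = A/(t + B)$, with $A$ and $B$ as suitable constants (17).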
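One training step of the update rule can be sketched as follows. The sketch assumes a Gaussian neighbourhood function and a linear learning-rate decay of the form $\alpha_0(1 - t/T)$, with $T$ taken as the number of training steps; the function names and toy values are hypothetical, and this is not the MATLAB implementation used in the study.

```python
import math


def gaussian_neighbourhood(r_c, r_i, sigma):
    """h_ci = exp(-||r_c - r_i||^2 / (2 * sigma^2)) for grid positions r_c, r_i."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(r_c, r_i))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def linear_learning_rate(t, alpha0, total_steps):
    """alpha(t) = alpha0 * (1 - t / T), with T assumed to be the training length."""
    return alpha0 * (1.0 - t / total_steps)


def update_weights(weights, positions, x, winner, alpha, sigma):
    """Move every weight vector towards x, scaled by alpha and its neighbourhood value."""
    updated = []
    for m_i, r_i in zip(weights, positions):
        h = gaussian_neighbourhood(positions[winner], r_i, sigma)
        updated.append([m + alpha * h * (xk - m) for m, xk in zip(m_i, x)])
    return updated


# Toy example (made-up values): two neurons on a 1D grid, winner is neuron 0.
weights = [[0.0, 0.0], [1.0, 1.0]]
positions = [(0,), (1,)]
new_weights = update_weights(weights, positions, x=[1.0, 0.0],
                             winner=0, alpha=0.5, sigma=1.0)
```

The winner has a neighbourhood value of 1 and moves the full fraction $\alpha$ of the way towards the input, whereas its neighbour moves by the smaller fraction $\alpha\, h_{ci}$.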

The basic characteristics of the network include map size and topology, weight initialization, type of training algorithm (batch or sequential), learning rate, neighbourhood and distance functions, and radius. In the present research, the self-organized map used to evaluate the well-being of people with and without health conditions was a 3-layer network (with 1 intermediate layer) in which linear and hex top topologies were compared with each other. In addition, cosine and Euclidean distance functions were applied and the learning rate was set at 0.02. For the other characteristics, the default values in MATLAB version 7.1 were used. The ability of the self-organized map to investigate the well-being of people was also compared with the results of the k-means clustering method, which is a well-known clustering approach in classical statistics. MATLAB version 7.1 was used for both methods.
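For comparison, the idea behind k-means clustering can be sketched as follows. The sketch assumes the standard Lloyd iteration with Euclidean distance and user-supplied starting centroids; it is an illustration with made-up data, not the MATLAB routine used in the study.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data (made up): two well-separated groups of 2-dimensional points.
points = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
labels, centres = kmeans(points, centroids=[points[0], points[3]])
```

With 2 clusters, as in the study, each participant would receive one of two labels, which can then be compared with the known healthy or unhealthy status.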

In the present research, the input layer of the self-organized map model included individuals’ answers to the SF-36 questionnaire. These inputs were converted to weighted inputs on the intermediate layer. During the learning process, these weighted inputs were connected to 1 of the 2 neurons in the output layer, which were labelled as healthy or unhealthy.


A total of 1087 participants answered all the items in SF-36. Table 1 describes the participants based on their sex and type of disease. The scores of the 8 subscales of SF-36 were considered as the predicting variables (inputs) at first. However, these scores were approximately equal for all individuals in the study sample (data not shown). Therefore, the 36 items were used as the predicting variables. k-means clustering and self-organized map approaches were used to categorize participants into 2 groups, with or without health conditions, according to their answers to the items. Table 2 summarizes the performance of both methods compared with the true status of the people (unhealthy or healthy status as the target output). k-means correctly identified 64.9% and the self-organized map 62.4% of unhealthy individuals. For healthy people, these values were 67% and 80.6%, respectively (Table 2 and Table 3).
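The performance measures used to compare the two methods follow from a 2 × 2 table of predicted against true status. A minimal sketch, assuming the unhealthy group is treated as the positive class; the counts are made up for illustration and are not those of Table 2.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard measures from a 2x2 table, with 'unhealthy' as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),   # unhealthy correctly identified
        "specificity": tn / (tn + fp),   # healthy correctly identified
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Made-up counts for illustration only (not the study's Table 2).
metrics = diagnostic_metrics(tp=60, fn=40, tn=80, fp=20)
```

Because clustering is unsupervised, each cluster first has to be matched to the healthy or unhealthy label before such a table can be built.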

For the self-organized map method, different network structures with respect to the topology, number of intermediate layers and neurons in each layer, and distance function were examined, and the result of the best one according to the accuracy rate is shown in Table 3. Accordingly, the accuracy (the proportion of true results) of the self-organized map in predicting the true status of individuals (healthy or unhealthy) based on the SF-36 instrument was higher than that of k-means clustering (73.1% vs 66.1%). In addition, positive and negative predictive values were both higher for the self-organized map. However, k-means clustering was more sensitive than the self-organized map in predicting unhealthy status. The selected network had 2 intermediate layers with 5 neurons in each, the Euclidean distance function, and hex top topology. Table 4 represents the ROC curves for both methods. The area under the ROC curve was higher for k-means clustering than for the self-organized map (0.794 vs 0.699). In other words, k-means clustering had a 79.4% chance of distinguishing between unhealthy and healthy individuals, compared with 69.9% for the self-organized map. To determine the effect of age and sex on answering the items, the performance of these methods was compared according to age and sex groups (Table 5). Both methods were more accurate in individuals aged 25–35 years. In addition, k-means clustering was more accurate than the self-organized map in men.
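The reported accuracy rates can be checked against the reported sensitivity and specificity, since overall accuracy is the group-size-weighted mean of the two. A short sketch using only figures quoted in the text (447 unhealthy and 640 healthy participants); the function name is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy as the group-size-weighted mean of sensitivity and specificity."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Figures quoted in the text: 447 unhealthy and 640 healthy participants.
kmeans_accuracy = accuracy_from_rates(0.649, 0.670, 447, 640)
som_accuracy = accuracy_from_rates(0.624, 0.806, 447, 640)
```

Rounded to one decimal place, these give 66.1% for k-means clustering and 73.1% for the self-organized map, in agreement with the accuracy rates reported above.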


To the best of our knowledge, this is the first study to evaluate the ability of SF-36 to investigate the well-being of healthy subjects versus patients with a specific disease. Therefore, we could find no comparable studies in the literature. Some studies have investigated the application of artificial neural network methods in HRQoL (7–14); however, their main objectives were different from those of the present study. Indeed, to our knowledge, none has evaluated the sensitivity and specificity of HRQoL instruments.

Our findings revealed that the SF-36 instrument is moderately able to evaluate the well-being of individuals with and without a health condition (sensitivity, specificity and the accuracy rate were generally >60%). An interesting result based on a preliminary analysis (not reported in the present study) was that the subscale scores were not informative enough to be used to evaluate people’s health condition. Accordingly, considerable caution is warranted when using the SF-36 subscale scores for clustering people in different groups. This fact was confirmed by previous studies based on differential item functioning (18,19) and was investigated by different approaches in the present study.

The self-organized map neural network and k-means clustering had similar sensitivity, but the former had significantly higher specificity. Furthermore, the results of the 2 methods revealed that SF-36 differentiated younger people (aged < 35 years) more accurately. However, the accuracy was higher for women using the self-organized map method and for men using the k-means clustering method.

Despite the strengths of the present study in methodology and application, it had a potential limitation. To obtain a sufficient sample size, people with different chronic diseases were included in the unhealthy group. This may have led to some heterogeneity within the unhealthy group. Hence, diseases should be assessed separately in future studies. Moreover, evaluating the performance of other HRQoL instruments may be valuable, and other classification and clustering methods, such as decision trees, regression models, and hierarchical clustering methods, are suggested.


The main objective of the present study was to evaluate the ability of SF-36 to differentiate people with and without health conditions. Two clustering methods were compared in terms of sensitivity and specificity values. The results indicated that the subscale scores of SF-36 were not able to evaluate health condition. Instead, better performance was achieved based on all 36 questions in the SF-36 instrument. Our results reveal that differences in health conditions may not appear on the SF-36 subscale scores; therefore, such scores should be interpreted with caution.


This article was extracted from Sara Ghodsi’s Master of Science thesis. The authors are thankful to Z. Sharafi, S. Rafatti and M. Safe for their help in data gathering. Also, we would like to thank Dr. N. Shoukrpour from the Research Consultant Center, Shiraz University of Medical Sciences for editing this manuscript.

Funding: Grant number 92-6894 from Shiraz University of Medical Sciences Research Council.

Competing interests: None declared.

Using unsupervised neural networks to evaluate the ability of the SF-36 questionnaire as an instrument for differentiating individuals


Background: Health-related quality of life and well-being refer to the positive, subjective state that is contrary to illness. Instruments for measuring health-related quality of life include common questionnaires, which can often be interpreted differently depending on the individual’s level of knowledge.

Aims: To examine the ability of the SF-36 short-form general health (quality of life) questionnaire, a well-known tool, to evaluate individuals’ well-being.

Methods: We compared unsupervised artificial neural networks using a self-organized map learning algorithm with the k-means clustering method. Understanding of the content of the questionnaire was also checked according to age group and sex. The study included 1087 people aged over 18 years (640 healthy individuals and 447 patients with chronic diseases) in Shiraz, Islamic Republic of Iran, between 2011 and 2013.

Results: The scores of the eight subscales of the SF-36 questionnaire were not able to evaluate individuals’ well-being. The ability of the 36 items of the questionnaire was greater than 60% with both the self-organized map and k-means clustering methods. The self-organized map learning algorithm evaluated individuals better than k-means clustering, based on the accuracy rate in prediction. The SF-36 questionnaire was better understood by young people.

Conclusions: Differences in individuals’ health conditions may not appear in the subscale scores of the SF-36 questionnaire; the results should therefore be interpreted with caution.

An unsupervised neural network for evaluating the ability of the 36-item Short Form Health Survey (SF-36) instrument to differentiate individuals according to their health status

Saeedeh Pourahmad, Peyman Jafari, Sara Ghodsi


Background: Health-related quality of life and well-being refer to the positive, subjective state that is contrary to illness. Instruments for measuring health-related quality of life include some common questionnaires, which may often be understood differently depending on the level of individuals’ knowledge.

Aims: This study aimed to investigate the ability of the 36-item Short Form Health Survey (SF-36), as a well-known questionnaire, to evaluate people’s well-being.

Methods: An unsupervised artificial neural network with a self-organized map (SOM) learning algorithm and the k-means clustering method were used and compared. Understanding of the content of the questionnaire was also checked according to age group and sex.

Results: The results of this study, conducted on 1087 people aged over 18 years (640 healthy people and 447 patients with a chronic disease in Shiraz, Islamic Republic of Iran, from 2011 to 2013), showed that the scores of the eight subscales of the SF-36 instrument were not able to evaluate people’s well-being, whereas the ability of the 36 items was more than 60% with both methods. In addition, the self-organized map (SOM) evaluated people better, based on the accuracy rate in prediction. Moreover, young people understood this instrument better.

Conclusions: Differences in people’s health conditions may not appear in the subscale scores; therefore, caution is needed when interpreting findings taken from the subscales of the 36-item Short Form Health Survey (SF-36).


  1. Michalos AC (editor). Encyclopedia of quality of life and well-being research. Springer Reference; 2014.
  2. Dumuid D, Olds T, Lewis LK, Martin-Fernández JA, Katzmarzyk PT, Barreira T, et al. Health-related quality of life and lifestyle behavior clusters in school-aged children from 12 countries. J Pediatr. 2017 Apr;183:178-183.e2. http://dx.doi.org/10.1016/j.jpeds.2016.12.048 PMID: 28081885
  3. Arabasadi Z, Alizadehsani R, Roshanzamir M, Moosaei H, Yarifard AA. Computer aided decision making for heart disease detection using hybrid neural network-Genetic algorithm. Computer Methods Programs Biomed. 2017 Apr;141:19–26. https://doi.org/10.1016/j.cmpb.2017.01.004
  4. Arunprasath T, Rajasekaran M P, Kannan S, George SM. Performance evaluation of PET image reconstruction using radial basis function networks. In: Suresh LP, Dash SS, Panigrahi BK, editors. Artificial intelligence and evolutionary algorithms in engineering systems. Springer India; 2015:481–9.
  5. Payan A, Montana G. Predicting Alzheimer’s disease: a neuroimaging study with 3D convolutional neural networks. 2015; arXiv:1502.02506. https://arxiv.org/pdf/1502.02506.pdf
  6. Licastro F, Ianni M, Ferrari R, Campo G, Buscema M, Grossi E et al. A new risk chart for acute myocardial infarction by a innovative algorithm. Curr Pharmacogenomics Pers Med. 2014;12(3):159–66. http://dx.doi.org/10.2174/1875692113666150116232904
  7. Pourahmad S, Khosravi B, Mohamadianpanah M. Effective attributes in colorectal cancer relapse using artificial neural network and cox proportional hazards regression. Ann Colorectal Res. 2014 Jun 30;2(2):e22329. http://dx.doi.org/10.17795/acr-22329
  8. Khosravi B, Pourahmad S, Bahreini A, Nikeghbalian S, Mehrdad G. Five years survival of patients after liver transplantation and its effective factors by neural network and cox poroportional hazard regression models. Hepat Mon. 2015 Sep 1;15(9):e25164. http://dx.doi.org/10.5812/hepatmon.25164 PMID:26500682
  9. Corcos J, Behlouli H, Beaulieu S. Identifying cut-off scores with neural networks for interpretation of the incontinence impact questionnaire. Neurourol Urodyn. 2002;21(3):198–203. PMID:11948712
  10. Behlouli H, Feldman D, Ducharme A, Frenette M, Giannetti N, Grondin F et al. Identifying relative cut-off scores with neural networks for interpretation of the Minnesota Living with Heart Failure questionnaire. Conf Proc IEEE Eng Med Biol Soc. 2009;2009:6242–6. http://dx.doi.org/10.1109/IEMBS.2009.5334659 PMID:19965089
  11. Rao MR, Sridhar GR, Madhu K, Rao AA. A clinical decision support system using multi-layer perceptron neural network to predict quality of life in diabetes. Diabetes Metab Syndr Clin Res Rev. 2010 Jan–Mar;4(1):57-9. https://doi.org/10.1016/j.dsx.2009.04.002
  12. Shi HY, Tsai JT, Chen YM, Culbertson R, Chang HT, Hou MF. Predicting two-year quality of life after breast cancer surgery using artificial neural network and linear regression models. Breast Cancer Res Treat. 2012 Aug;135(1):221–9. http://dx.doi.org/10.1007/s10549-012-2174-6 PMID:22836876
  13. Borchani H, Bielza C, Martı P, Larranaga P. Markov blanket-based approach for learning multi-dimensional Bayesian network classifiers: an application to predict the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39). J Biomed Inform. 2012 Dec;45(6):1175–84. http://dx.doi.org/10.1016/j.jbi.2012.07.010 PMID:22897950
  14. Salgueiro M, Basogain X, Collado A, Torres X, Bilbao J, Doñate F et al. An artificial neural network approach for predicting functional outcome in fibromyalgia syndrome after multidisciplinary pain program. Pain Med. 2013 Oct;14(10):1450–60. http://dx.doi.org/10.1111/pme.12185 PMID:23915306
  15. Montazeri A, Goshtasebi A, Vahdaninia M, Gandek B. The Short Form Health Survey (SF-36): translation and validation study of the Iranian version. Qual Life Res. 2005 Apr;14(3):875–82. PMID:16022079
  16. Hagan MT, Demuth HB, Beale MH. Neural network design. Boston: PWS Publishing; 1996.
  17. Kohonen T. Self-organizing maps. Springer Science & Business Media; 2001.
  18. Horner-Johnson W, Krahn GL, Suzuki R, Peterson JJ, Roid G, Hall T et al. Differential performance of SF-36 items in healthy adults with and without functional limitations. Arch Phys Med Rehabil. 2010 Apr;91(4):570–5. http://dx.doi.org/10.1016/j.apmr.2009 PMID:20382289
  19. Yu YF, Yu AP, Ahn J. Investigating differential item functioning by chronic diseases in the SF-36 health survey: a latent trait analysis using MIMIC models. Med Care. 2007 Sep;45(9):851–9. PMID:17712255