PREDICTING AUTISM SEVERITY CLASSIFICATION BY MACHINE LEARNING MODELS
by
Qi Zhang
A Thesis Presented to the
FACULTY OF THE USC KECK SCHOOL OF MEDICINE
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
BIOSTATISTICS
May 2024
Copyright 2024 Qi Zhang
TABLE OF CONTENTS
List of Tables
List of Figures
Abstract
Chapter 1 Introduction
1.1 Autism Spectrum Disorder
1.2 Assessment Tools in Autism Spectrum Disorder: ADOS-2 and SPARK
1.3 Research Objectives
Chapter 2 Machine Learning Algorithms
2.1 Fundamentals of Machine Learning
2.2 Machine Learning in Autism
Chapter 3 Materials and Methods
3.1 Sample Characteristics
3.2 Statistical Methods
Chapter 4 Exploratory Data Analysis and Logistic Regression
4.1 Exploratory Data Analysis
4.2 Logistic Regression
Chapter 5 Machine Learning
5.1 Pre-processing and Data Splitting
5.2 Methods
5.3 Results of Training and Validation
5.4 Results of Testing
Chapter 6 Conclusion and Discussion
References
List of Tables
Table 1: Statistical Characteristics
Table 2: Results for the Logistic Regression Model
Table 3: Results for Validation set
Table 4: Results for Test set
List of Figures
Figure 1: Flowchart of Participant Selection
Figure 2: Autism Severity Level Proportions by Module
Figure 3: Flowchart of Data Splitting
Figure 4: ROC curves for Elastic Net Regression, KNN, LDA, and SVM
Figure 5: ROC curves for Random Forest
Abstract
Autism is a common neurodevelopmental disorder that affects both patients and their families.
Using machine learning to analyze and predict the severity of autism can support early diagnosis,
and machine learning techniques are already widely applied in the medical field. The goal of this
study is to employ several machine learning models to investigate the different severity levels of
autism spectrum disorder and to identify the most effective model for accurately predicting the
severity of symptoms. This study uses the SPARK (Simons Foundation Powering Autism Research for
Knowledge) cohort as its data source and covers a range of machine learning methods, including
Linear Discriminant Analysis, Support Vector Machine, Random Forest, and K-Nearest Neighbors. By
constructing models to estimate the severity of autism, this research provides evidence to support
early diagnosis and intervention and suggests directions for future research.
Chapter 1 Introduction
1.1 Autism Spectrum Disorder
Autism spectrum disorder (ASD), often referred to simply as autism, is a neurodevelopmental
disorder related to brain development. It manifests as a combination of social and communication
impairments that severely affect the way an individual interacts, communicates, learns and behaves.
Specifically, people with autism may experience difficulties with conversation, understanding the
perspectives or actions of others, repetitive speech patterns (such as echoing), and an intense and
focused interest in specific topics. Intellectual functioning varies greatly among autistic people:
some have co-occurring intellectual disability, language impairment, or behavioral disorders, while
others show exceptional skills, most commonly in memory, music, and mental arithmetic (Hirota et
al., 2023).
According to the World Health Organization (WHO), characteristics of autism can be detected
in early childhood, but diagnosis is often much later. The WHO reports that approximately 1% of
children worldwide have autism. Research into the prevalence of autism began as early as the 1960s
and 1970s, with initial estimates of 0.5-0.9 cases per 10,000 people. However, autism prevalence
increased substantially in the early 2000s (Talantseva et al., 2023).
1.2 Assessment Tools in Autism Spectrum Disorder: ADOS-2 and SPARK
The scientific assessment of an individual's autism spectrum disorder symptoms is critical to
understanding and diagnosing the disorder. There are several methods for assessing ASD symptoms,
such as the Childhood Autism Rating Scale (CARS). This scale contains 15 main items and focuses
more on psychometrics to assess autism symptoms, emphasizing restricted and repetitive behaviors.
The Social Responsiveness Scale (SRS) focuses more on children's social functioning (Huang et al.,
2024). In addition, the Autism Diagnostic Observation Schedule, Second Edition (ADOS-2) stands out
as a play- and interaction-based assessment. The ADOS-2 is a complex autism spectrum disorder
assessment tool developed by Lord et al. in 2012. It was designed to provide a more objective assessment of
ASD symptoms by reducing respondent bias typically associated with self-report questionnaires and
clinical interviews. The adaptability of the ADOS-2 to diverse populations further enhances its
diagnostic utility. Phillips et al. (2022) highlight that the ADOS-2 has been
effectively adapted and validated for use in the assessment of deaf children and adolescents,
reflecting its broad applicability and confirming its diagnostic integrity across diverse populations.
This study will utilize the ADOS-2 to diagnose and analyze ASD symptoms. The ADOS-2 is
applicable to a wide range of populations, from minimally verbal one-year-olds to fluent adults. The
ADOS-2 is based on real-time observations, provides clinicians with a highly accurate picture of
current ASD-related symptoms, and has demonstrated strong predictive validity (Adamou, 2021).
In addition, this study will incorporate data from SPARK (Simons Foundation Powering Autism
Research for Knowledge), a large cohort that includes individuals with autism and their
family members and provides extensive data on the autism population. SPARK contains not only
ADOS-2 scores, but also a variety of other scales and assessment tools, such as the Social
Communication Questionnaire (SCQ), which was developed in 1999 as an autism screening
questionnaire. The SCQ is completed by a parent or caregiver and is used to assess certain social,
communication, and atypical language use behaviors in patients (Chesnut et al., 2017). The SRS,
which continuously measures the severity of autism symptoms, is also completed by parents or
caregivers. It contains 65 items in five content areas that address social deficits: social
awareness, social cognition, social communication, social motivation, and autistic mannerisms (Chan
et al., 2017). In addition, restricted and repetitive patterns of behavior or interests (RRB) are
one of the main symptoms of autism spectrum disorder. The Repetitive Behavior Scale-Revised (RBS-R)
score is an important indicator of RRB, with a higher score indicating more severe repetitive
behavior (Hooker et al., 2019). Developmental Coordination Disorder (DCD) is a neuromotor disorder commonly
seen in children, often resulting in difficulties with fine or gross motor skills that severely impact
daily activities and academic performance. The Developmental Coordination Disorder
Questionnaire (DCDQ) is completed by parents or caregivers and is designed to identify motor skill
challenges in children between the ages of 5 and 15 (Zwicker et al., 2012). These questionnaires are
housed in SPARK and are used to analyze and understand ASD.
1.3 Research Objectives
In the field of autism spectrum disorder (ASD) research, the widespread application of 'big
data' techniques has lagged behind other disciplines. However, there is a growing body of work
focusing on the use of machine learning to diagnose ASD, explore genetic factors, and develop
effective treatment strategies. Given the high incidence and diversity of ASD, an increasing number
of researchers are turning to machine learning over traditional statistical methods for data analysis
(Hyde et al., 2019).
The primary aim of this study is to utilize various machine learning models to investigate the
differing severity levels of ASD and to identify the most effective method for accurately predicting
autism symptom severity. The ADOS-2 (Autism Diagnostic Observation Schedule, Second Edition) is
the current gold standard for assessing the severity of autism, but only a limited number of
autistic individuals have this information, which restricts a comprehensive understanding of autism
severity. We first conduct an exploratory analysis and use logistic regression to assess the
heterogeneity of ADOS-2 autism severity by age, sex, race, and module. We then select an optimal
machine learning model to impute autism severity from other available mental and behavioral
questionnaires and medical conditions, with the objective of extending this evaluation to more than
one hundred thousand other autistic individuals in the SPARK database. This approach would allow
the severity of all individuals with autism in the database to be categorized, which in turn
supports the identification of factors associated with autism severity, informs intervention and
prevention efforts, and furthers research in the field. Additionally, this research will identify key predictors influencing
the severity of ASD. The analysis not only helps us to classify the degree of autism in patients but
also reveals which factors have the most predictive value. By examining these factors in depth, we
hope to provide vital insights for early detection, intervention, and prevention of autism.
Chapter 2 Machine Learning Algorithms
2.1 Fundamentals of Machine Learning
Machine learning is a branch of artificial intelligence that enables computers to learn from data
and improve at performing tasks without being explicitly programmed. It relies on different algorithms to process and
analyze data. Depending on the desired outcome of the algorithm, machine learning approaches can
be broadly divided into two main types: supervised and unsupervised learning. In supervised
learning, algorithms are trained using datasets that come with explicit labels. These labels are
defined by humans and belong to a finite set of categories. On the other hand, unsupervised learning
algorithms do not utilize predefined labels. The primary task in unsupervised learning is to develop
classification labels autonomously, without relying on human-defined categories (Nasteski, 2017).
2.2 Machine Learning in Autism
The applications of machine learning are vast, ranging from everyday uses like
recommendation systems and voice recognition to more intricate scientific research such as
genomics and drug discovery. Recent advancements in artificial intelligence and machine learning
have spurred predictive efforts in autism at early stages. For instance, Crippa et al. adopted a
supervised machine learning approach to assist in accurate classification by analyzing simple upper
limb movements in low-functioning children with ASD aged 2 to 4 (Crippa et al., 2015).
Additionally, Moradi and colleagues developed a machine learning algorithm that addressed
clustering issues, demonstrating its practicality through the application to the Autism Brain Imaging
Data Exchange (ABIDE) database. They predicted the severity of symptoms based on cortical
thickness measurements from 156 ASD patients across four different locations (Moradi et al., 2016).
Chapter 3 Materials and Methods
3.1 Sample Characteristics
The data used in this study were from the SPARK (Simons Foundation Powering Autism Research for
Knowledge) database v10, which was updated in July 2023. The SPARK project is run by the Simons
Foundation Autism Research Initiative (SFARI) and aims to assemble a community of 100,000
individuals with autism within the United States and their families (more details can be found on the
project website: https://www.sfari.org/resource/spark/). Community members' participation in the
study includes providing health and behavioral information, sending saliva samples for genetic
analysis, and having the opportunity to learn about genetic findings related to autism, as well as
having the option to participate in future research projects.
SFARI works with 31 university-affiliated research clinics in 26 U.S. states to coordinate
research activities. Researchers can submit applications for access to de-identified genetic and
phenotypic data through SFARI Base. Currently, approved researchers have access to phenotypic
data for 328,973 individuals, including more than 100,000 individuals with autism spectrum
disorder. These data include 22,811 adults with ASD, 109,161 children under the age of 18 with ASD,
and 166 participants with ASD of unknown age. Among this population, only 1,489 individuals, all
associated with specific clinics, had ADOS-2 (Autism Diagnostic Observation Schedule, Second
Edition) information. These 1,489 records are the focus of this analysis because only they carry
the ADOS-2 severity information required as the outcome.
3.2 Statistical Methods
The aim of this study is to use a data-driven approach to reveal the different severity levels of
Autism Spectrum Disorder (ASD) and to explore effective machine learning techniques for
predicting symptom severity. We will combine exploratory data analysis, logistic regression, and
machine learning methods to delve into multidimensional data from ASD patients and build
predictive models. The analysis process will be performed using the R programming language
(version 4.3.2) and analyses using machine learning will rely on the caret package, which provides
a complete set of functions for model training and outcome evaluation to ensure that the best models
are selected to accurately predict symptom severity in ASD patients.
Chapter 4 Exploratory Data Analysis and Logistic Regression
In this chapter, our preliminary examination of the ASD dataset through exploratory data
analysis (EDA) is presented. EDA is instrumental in uncovering the distribution of the autism
severity and identifying key variables related to autism severity classification, setting the stage for
more complex analyses such as logistic regression and machine learning techniques.
4.1 Exploratory Data Analysis
Figure 1: Flowchart of Participant Selection
We obtained data from the SPARK database for 328,973 study participants covering 1,333
variables. These data were carefully screened and 1,489 individuals who completed the ADOS-2
questionnaire were selected. After removing duplicate records from the database, 1,485 study
subjects were retained. Because the goal of the study was to categorize the severity of autism
based on related mental and behavioral surveys, we excluded patients who did not complete any of
these surveys, leaving 1,178 patients for subsequent analysis. Figure 1 displays a flowchart of the
process of selecting the 1,178 participants with data on the 276 predictor variables from the
328,973 participants in the SPARK cohort.
Descriptive statistics stratified by autism severity classifications are provided in Table 1. Key
variables were recoded for analytical clarity. Patient ages were calculated from the year of
registration up to 2023. Ages ranged from 4 to 26 years, and the distribution of ages was centered
around 10 years. Racial categories were transformed into categorical variables, with
designations from 1 to 7 representing various ethnic backgrounds. Notably, most of the sample
identified as white, followed by African American and Asian individuals.
In the ADOS-2 assessment, the severity of autism is categorized into three levels: little-to-no,
mild-to-moderate, and moderate-to-severe, based on ADOS-2 scores. The Calibrated Severity Score
(CSS) is standardized on a scale of 1-10, where scores of 1-3 usually correspond to a little-to-no
autism severity classification, 4-5 to a mild-to-moderate classification, and 6-10 to a
moderate-to-severe classification. CSS also includes subscores for
Social Affect (SA) and Restricted and Repetitive Behaviors (RRB), aiding in a more accurate
reflection of different autism dimensions (Gotham et al., 2009). In our database, most patients fell
into the moderate-to-severe category. Due to our research focus on severity classification,
individuals without a specified severity level were omitted from the analysis.
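The score-to-category rule described above can be written down in a few lines. The following is a minimal Python sketch for illustration only; the thesis analysis itself was carried out in R, and the function name `severity_from_css` is hypothetical.

```python
def severity_from_css(css: int) -> str:
    """Map an ADOS-2 Calibrated Severity Score (1-10) to the three-level
    severity label described in the text (1-3, 4-5, 6-10)."""
    if not 1 <= css <= 10:
        raise ValueError(f"CSS must be between 1 and 10, got {css}")
    if css <= 3:
        return "little-to-no"
    if css <= 5:
        return "mild-to-moderate"
    return "moderate-to-severe"


# Example: label a handful of scores.
for score in (2, 5, 8):
    print(score, severity_from_css(score))
```

Scores outside the 1-10 range raise an error rather than being silently assigned to a category, which mirrors the decision to omit individuals without a specified severity level.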
The Autism Diagnostic Observation Schedule (ADOS-2) is organized into modules, each of
which targets individuals of different ages and language proficiency levels. Module 1 is for children
with limited language skills, usually those who do not use phrases. Module 2 is for individuals who
use phrases but are not fluent. Module 3 is for children and adolescents who are fluent in the
language. In addition, there is a toddler module, specifically designed to assess toddlers between the
ages of 12 and 30 months, focusing on the early signs of ASD. In our sample, about 34% of patients
were assessed with Module 3, about 31% with Module 1, and approximately 20% with Module 2. The least
represented group is the Toddler Module, comprising only 15.2% of the sample. The proportion of
autism severity categories among patients varies across ADOS-2 modules: the p-value of the
Chi-squared test for the module variable is 0.014, which indicates a statistically significant
difference in the distribution of autism severity across modules. This difference suggests that
module may influence subsequent modeling and strategic choices. Figure 2 shows the proportion of
the different autism severity levels within each module. In every module, over 85% of the patients
fall into the moderate-to-severe level, and the remaining two levels account for a very small
percentage.
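As an illustration of the Chi-squared test on the module variable, the sketch below recomputes Pearson's statistic from the module-by-severity counts reported in Table 1. It is written in Python with the standard library only and is not the thesis's R code; the helper names are hypothetical, and the closed-form tail probability used here is valid only because the table has an even number of degrees of freedom.

```python
import math

# Observed counts from Table 1: rows are ADOS-2 modules, columns are
# (little-to-no, mild-to-moderate, moderate-to-severe).
observed = {
    "Module 1": (5, 37, 322),
    "Module 2": (16, 15, 202),
    "Module 3": (21, 39, 342),
    "Toddler": (6, 13, 160),
}


def chi_squared_independence(table):
    """Return (statistic, degrees of freedom) for Pearson's chi-squared test."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi2_upper_tail_even_dof(x, dof):
    """Upper-tail probability of a chi-squared variable (even dof only)."""
    if dof % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(dof // 2))


stat, dof = chi_squared_independence(observed)
p_value = chi2_upper_tail_even_dof(stat, dof)
print(f"chi-squared = {stat:.2f}, df = {dof}, p = {p_value:.3f}")
```

With these counts the statistic is close to 16 on 6 degrees of freedom, and the p-value rounds to the 0.014 reported for the module variable.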
Figure 2: Autism Severity Level Proportions by Module
Younger patients tend to exhibit more severe symptoms of autism. The sex distribution was similar
across severity levels, with a male-to-female ratio of roughly 3.5:1 (78.1% male), which is
consistent with the male predominance generally reported for autism. Although white individuals
were the most frequently observed group at each severity level, followed by African American
individuals, statistical analyses revealed no significant racial differences in autism severity.
The Chi-squared tests assessing the association between autism severity classification and the
categorical variables of sex and race resulted in p-values of 0.691 and 0.302, respectively. Hence,
there is no statistically significant evidence that the distribution of autism severity differs by
sex or race.
Furthermore, the assessment scores such as the Restricted and Repetitive Behaviors Calibrated
Severity Score (RRB CSS), Social Affect Calibrated Severity Score (SA CSS), and Total Calibrated
Severity Score (Total CSS) increased with ASD severity. However, the SCQ Score, for which a higher
score indicates more severe social communication challenges, did not differ significantly across
ASD severity groups (p = 0.353). The DCDQ Score, for which a lower score indicates more severe
motor impairment, was marginally higher in the mild-to-moderate group, though this difference was
not statistically significant (p = 0.182). In contrast, the RBS-R Score, for which a higher score
denotes more severe repetitive behaviors, decreased with increasing ASD severity (p = 0.015). This
trend diverges from expected patterns and warrants further in-depth analysis.
Table 1: Statistical Characteristics
                                           Autism Severity Classification
Variable                  Total            little-to-no    mild-to-moderate   moderate-to-severe   p-value
                          (N=1178, 100%)   (N=48, 4.07%)   (N=104, 8.83%)     (N=1026, 87.10%)
Age [mean (sd)]           10.88 (3.37)     12.17 (3.57)    11.12 (3.06)       10.79 (3.38)         0.016
Total CSS [mean (sd)]     7.45 (1.95)      2.44 (0.77)     4.52 (0.50)        7.99 (1.40)          <0.001
RRB CSS [mean (sd)]       7.68 (2.18)      3.88 (2.68)     6.16 (2.09)        8.01 (1.92)          <0.001
SA CSS [mean (sd)]        7.18 (1.99)      2.90 (1.22)     4.42 (1.09)        7.66 (1.58)          <0.001
SCQ Score [mean (sd)]     20.42 (6.70)     21.87 (6.17)    20.06 (7.27)       20.38 (6.66)         0.353
DCDQ Score [mean (sd)]    38.04 (11.86)    37.21 (11.36)   40.70 (11.54)      37.77 (11.91)        0.182
RBS-R Score [mean (sd)]   32.69 (19.24)    41.66 (21.95)   34.60 (19.90)      32.00 (18.91)        0.015
Sex [N (%)]                                                                                        0.691
  Female                  258 (21.90)      10 (20.8)       26 (25.0)          219 (21.4)
  Male                    920 (78.10)      38 (79.2)       78 (75.0)          804 (78.6)
Race [N (%)]                                                                                       0.302
  Asian                   62 (6.85)        0 (0.0)         4 (4.8)            58 (7.4)
  African Americans       97 (10.72)       4 (10.8)        9 (10.8)           84 (10.7)
  Native Americans        21 (2.32)        0 (0.0)         1 (1.2)            20 (2.5)
  Native Hawaiian         4 (0.44)         1 (2.7)         0 (0.0)            3 (0.4)
  White                   690 (76.24)      30 (81.1)       68 (81.9)          592 (75.4)
  Other                   31 (3.43)        2 (5.4)         1 (1.2)            28 (3.6)
Module [N (%)]                                                                                     0.014
  Module 1                364 (30.90)      5 (10.4)        37 (35.6)          322 (31.4)
  Module 2                233 (19.78)      16 (33.3)       15 (14.4)          202 (19.7)
  Module 3                402 (34.13)      21 (43.8)       39 (37.5)          342 (33.3)
  Module Toddler          179 (15.20)      6 (12.5)        13 (12.5)          160 (15.6)
4.2 Logistic Regression
This study developed a logistic regression model to analyze the factors influencing severity
classification in autism spectrum disorder. Given the small proportion of autistic patients with
little-to-no and mild-to-moderate severity, we combined these two levels for subsequent analysis
and modeling. The outcome of the logistic regression is therefore the combined severity level,
categorized into two groups: little-to-moderate and moderate-to-severe.
To explore the relationship between severity classification and age, sex, race, language level,
module, and questionnaire scores, we used the glm function in R to fit a generalized linear model
with a binomial distribution. The model used SCQ score, RBS-R score, DCDQ score, language level,
age, module, sex, and race as independent variables to examine whether these factors could be
combined for prediction and for the construction of machine learning models.
Table 2: Results for the Logistic Regression Model
Independent variable    Coefficient   Standard Error   z-Value   p-Value
SCQ Score               5.959e-03     2.129e-02        0.280     0.77955
RBS-R Score             -1.740e-02    6.904e-03        -2.550    0.01078 *
DCDQ Score              -2.283e-02    1.157e-02        -1.973    0.04846 *
Language level          -5.374e-01    1.964e-01        -2.736    0.00622 **
Age                     -4.952e-02    5.185e-02        -0.955    0.33949
Module2                 8.351e-01     4.321e-01        1.933     0.05325 .
Module3                 8.001e-01     4.396e-01        1.820     0.06877 .
Module Toddler          -3.372e-01    4.462e-01        -0.756    0.44983
Race Asian              3.887e-01     7.268e-01        0.535     0.59279
Race Native American    1.479e+01     6.082e+02        0.024     0.98060
Race Native Hawaiian    1.518e+01     1.319e+03        0.012     0.99082
Race White              1.477e-01     3.750e-01        0.394     0.69364
Race Other              1.123e+00     1.104e+00        1.017     0.30908
Sex Male                -3.476e-03    3.080e-01        -0.011    0.99099
Significance codes: ‘**’ p < 0.01, ‘*’ p < 0.05, ‘.’ p < 0.1
Table 2 provides preliminary insights into demographics and survey total scores associated with
the severity classification of autism. The coefficients for the module variables suggest a moderate
relationship with the severity classification of autism, albeit with limited statistical
significance. Relative to the reference level of Module 1, the estimated coefficients for Module 2
and Module 3 are 0.8351 and 0.8001, respectively, suggesting a positive association with the
severity level of autism compared to the baseline. However, these associations are not significant,
with p-values of 0.05325 and 0.06877, slightly exceeding the conventional significance level of
0.05. On the other hand, the Toddler Module has a coefficient estimate of -0.3372, indicating a
negative association relative to the baseline, though this difference also lacks statistical
significance (p-value = 0.44983). There were no significant differences in the categorization of
autism severity levels between patients of different races or sexes, with p-values well above the
significance level of 0.05.
Furthermore, the RBS-R Score and the DCDQ Score exhibit a negative association with the severity
of autism, which means that lower scores on these assessments correspond to higher levels of autism
severity. This relationship is statistically significant, with both p-values below the significance
level of 0.05. Conversely, a higher SCQ Score aligns with greater autism severity, but this
association does not reach statistical significance (p-value = 0.77955).
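Logistic regression coefficients are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to read. The short Python sketch below converts a few of the Table 2 coefficients. It assumes that the moderate-to-severe group is the modeled event, which matches the direction of the interpretation above but is not stated explicitly, and the function name `odds_ratio` is hypothetical.

```python
import math

# Coefficients (log-odds) copied from Table 2.
coefficients = {
    "RBS-R Score": -1.740e-02,
    "DCDQ Score": -2.283e-02,
    "Language level": -5.374e-01,
    "Module2": 8.351e-01,
    "Module3": 8.001e-01,
}


def odds_ratio(log_odds: float) -> float:
    """Convert a logistic regression coefficient to an odds ratio."""
    return math.exp(log_odds)


for name, beta in coefficients.items():
    print(f"{name:15s} coefficient = {beta:+.4f}  odds ratio = {odds_ratio(beta):.3f}")
```

Under that assumption, the Module 2 coefficient corresponds to odds of the more severe classification about 2.3 times those of Module 1, and each one-unit increase in language level multiplies the odds by about 0.58, holding the other variables fixed.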
The language level and age variables are also negatively related to the severity of autism,
although only the language level coefficient is statistically significant (p-value = 0.00622; for
age, p-value = 0.33949). This means that higher language levels or an older age are indicative of a
less severe manifestation of autism. However, the trend of decreasing autism severity is not
consistent throughout development: autism severity may decrease, stabilize, or increase during the
school years (Waizbard-Bartov et al., 2023). Communication difficulties have always characterized
ASD and are one of the hallmark features of its diagnosis, and research findings suggest that the
severity of autistic behaviors has a significant impact on real-life language processing (Bavin et
al., 2014).
Chapter 5 Machine Learning
5.1 Pre-processing and Data Splitting
Figure 3 displays a flowchart of the process of dividing the selected 1,178 participants into
training, validation, and test sets for model development.
Figure 3: Flowchart of Data Splitting
Given the multiple questionnaires included in the SPARK database, each questionnaire
corresponded to questions with varying degrees of missing values. We computed the percentage of
missing values for each question for these 1,178 patients and removed variables with more than 40%
missing values. We retained 276 predictor variables; a total of 418 participants with ASD had no
missing values on these variables, and all of them had categorical information on autism severity. We
recoded several categorical predictor variables into ordinal formats to reduce degrees of freedom
and simplify the model. For example, for the highest level of parental education, we converted from
a categorical variable to a numerical variable to reflect the ordinal levels of educational attainment.
We also collapsed more than twenty different types of congenital defects with scarce case occurrence
into a single variable to indicate whether a patient had any congenital defects. We used the
'dummyVars' function from the caret package in R to create dummy variables for categorical data.
The purpose of this was to convert non-numerical features into a numerical format that algorithms
can process, while avoiding the erroneous interpretation of numerical relationships between
categorical data by the algorithm.
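The missing-value screening described above (drop variables with more than 40% missing values, then keep participants with no remaining missing values) can be sketched as follows. This is an illustrative Python analogue rather than the R code used in the thesis; the function names and the toy column names (`scq_total`, `rbsr_total`, `rare_item`) are hypothetical.

```python
def drop_sparse_columns(rows, max_missing=0.40):
    """Keep only columns whose share of missing (None) values is at most max_missing."""
    columns = list(rows[0].keys())
    kept = [
        col for col in columns
        if sum(row[col] is None for row in rows) / len(rows) <= max_missing
    ]
    return [{col: row[col] for col in kept} for row in rows]


def complete_cases(rows):
    """Keep only rows with no missing values in the remaining columns."""
    return [row for row in rows if all(value is not None for value in row.values())]


# Toy data: 'rare_item' is missing for 80% of rows, 'rbsr_total' for 20%.
rows = [
    {"scq_total": 20, "rbsr_total": 31, "rare_item": None},
    {"scq_total": 18, "rbsr_total": None, "rare_item": None},
    {"scq_total": 25, "rbsr_total": 40, "rare_item": 1},
    {"scq_total": 22, "rbsr_total": 28, "rare_item": None},
    {"scq_total": 19, "rbsr_total": 35, "rare_item": None},
]
filtered = drop_sparse_columns(rows)
usable = complete_cases(filtered)
print(sorted(filtered[0].keys()), len(usable))
```

Filtering columns before rows matters: a sparsely answered item would otherwise eliminate most participants from the complete-case set.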
In the preprocessing phase, we identified and removed zero-variance predictor variables and
near-zero-variance predictor variables. A zero-variance predictor variable means that this variable
has the same value in all samples, while near-zero-variance means that the variable has almost no
variation. These variables contribute little or nothing to the model's predictions while
increasing computational complexity and the risk of overfitting, and thus were removed. Finally, we
centered and scaled the predictor variables, which means we adjusted each variable to have a mean
of zero and standardized the variables to a common scale. This can reduce the
disproportionate influence of variables with a wide range of values on the results and helps to
improve the accuracy of certain algorithms, such as SVMs.
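A minimal Python sketch of the two preprocessing steps above is given below, for illustration only. It removes exactly constant columns and then centers and scales the rest by the sample standard deviation. The near-zero-variance rule is not reproduced here because its thresholds are not given in the text, and the column names are hypothetical.

```python
import statistics


def drop_zero_variance(columns):
    """Remove columns in which every value is identical."""
    return {name: values for name, values in columns.items() if len(set(values)) > 1}


def center_and_scale(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Toy data: 'constant_flag' has zero variance and is dropped.
columns = {
    "age": [8.0, 10.0, 12.0, 14.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0],
}
kept = drop_zero_variance(columns)
scaled = {name: center_and_scale(values) for name, values in kept.items()}
print(scaled)
```

In a train/validation/test workflow, the mean and standard deviation would normally be estimated on the training set and reused for the other sets; the sketch omits this for brevity.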
After completing the data preprocessing, we obtained a refined dataset. This dataset contained
418 individuals and 276 predictor variables, as well as our outcome variable, the severity
classification of autism. Selecting the predictor variables, removing variables with many missing
values, and transforming and recoding the data were intended to preserve the integrity of the
dataset and support the accuracy of model analysis.
To support reliable model training and evaluation, we used stratified random sampling to split
the data into training, validation, and test sets in the ratio of 70%, 15%, and 15%. We used the
createDataPartition function, which performs stratified sampling based on the distribution of the
target variable, the severity classification of autism, so that the proportion of each category
in every subset is similar to that of the original dataset. After this processing, we obtained a
training set containing 294 samples, and validation and test sets containing 62 samples each. In each
subset, about 16% of patients had mild autism and about 83% had severe autism. This splitting
strategy aims to provide a stable and reliable benchmark to evaluate and compare the performance of
different models.
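A stratified 70/15/15 split can be sketched as follows. This is a simplified stand-in for createDataPartition, not a reimplementation of it, so the subset sizes it produces differ slightly from those reported above. The class sizes (69 mild, 349 severe) are illustrative values chosen to be roughly in line with the reported proportions, and the function name `stratified_split` is hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split sample indices into train/validation/test within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, valid, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_valid = round(fractions[1] * len(indices))
        train += indices[:n_train]
        valid += indices[n_train:n_train + n_valid]
        test += indices[n_train + n_valid:]   # remainder goes to the test set
    return train, valid, test


# Illustrative class sizes (assumed, not reported as such in the text)
labels = ["mild"] * 69 + ["severe"] * 349
train, valid, test = stratified_split(labels)
```

Because the shuffle and the cut points are applied per class, each subset keeps a mild-to-severe ratio close to that of the full dataset.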
5.2 Methods
To construct predictive models for the severity classification of ASD, we selected the feature-rich
caret (Classification And REgression Training) package (Kuhn, 2008) as our primary tool. This
package supports model training and parameter tuning, streamlining the model selection and
evaluation processes.
Since this study is a case-control study with a binary outcome variable, we faced the issue of
case-control imbalance. There were 152 mild patients and 1,026 severe patients. For this reason, we
performed stratified sampling based on the distribution of the target variable and used the ROSE
(Random Over-Sampling Examples) technique (Menardi and Torelli, 2014) for data balancing. ROSE
generates synthetic samples with a smoothed bootstrap approach, which is intended to improve the
predictive power of the model for the minority class and reduce the risk of over-fitting.
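The smoothed bootstrap idea behind ROSE can be sketched as follows: pick a class with equal probability, pick one observed row from that class, and draw a new point from a kernel centered on it. This is a minimal sketch for numeric predictors only, not the ROSE package itself. The Gaussian kernel and the Silverman-type bandwidth rule are assumptions, and the function name `rose_sample` is hypothetical.

```python
import random
from statistics import stdev


def rose_sample(X, y, n_new, seed=0):
    """Draw n_new synthetic rows; each class is chosen with probability 1/2."""
    rng = random.Random(seed)
    classes = sorted(set(y))
    d = len(X[0])
    by_class = {c: [x for x, label in zip(X, y) if label == c] for c in classes}

    # Per-class, per-feature kernel bandwidths (assumed Silverman-type rule)
    bandwidth = {}
    for c, rows in by_class.items():
        n = len(rows)
        scale = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
        bandwidth[c] = [scale * stdev(col) for col in zip(*rows)]

    X_new, y_new = [], []
    for _ in range(n_new):
        c = rng.choice(classes)              # balance classes in expectation
        center = rng.choice(by_class[c])     # bootstrap one observed row
        X_new.append([rng.gauss(v, h) for v, h in zip(center, bandwidth[c])])
        y_new.append(c)
    return X_new, y_new


# Toy imbalanced data: 3 minority rows and 9 majority rows (hypothetical)
X = [[0.0, 1.0], [0.2, 0.8], [0.1, 1.1]] + [[2.0 + 0.1 * i, 3.0 - 0.1 * i] for i in range(9)]
y = ["mild"] * 3 + ["severe"] * 9
X_bal, y_bal = rose_sample(X, y, n_new=294)
```

Because the class of each synthetic row is drawn at random, the resulting class counts are close to, but usually not exactly, equal.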
The study compared several learning methods in order to select the best-performing model.
Linear Discriminant Analysis (LDA) seeks the linear boundary that best separates
two or more classes. Elastic net regression is a regularized regression
technique that combines both L1 and L2 penalties. Random Forest, an ensemble
learning algorithm, improves prediction accuracy and stability by building multiple decision trees
and aggregating their predictions. Support Vector Machine (SVM) is a supervised learning algorithm
used for classification and regression tasks; it classifies by finding a hyperplane that maximizes the
margin between different classes and is an effective method for high-dimensional data classification.
K-Nearest Neighbors (KNN) is an instance-based learning method where classification decisions
are determined by the sample's nearest observations; a sample's class is decided by the majority vote
of its K closest neighbors.
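For reference, the combined L1 and L2 penalty of the elastic net can be written as below. The parameterization with a mixing weight $\alpha$ and an overall strength $\lambda$ follows a common convention and is an assumption here; the symbols $\ell$, $\beta_0$, $\beta$, $\alpha$, and $\lambda$ are not defined in the text.

```latex
\hat{\beta} \;=\; \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n} \ell\!\left(y_i,\; \beta_0 + x_i^{\top}\beta\right)
  \;+\; \lambda\left[\alpha\,\lVert\beta\rVert_1
  \;+\; \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}\right]
```

Here $\ell$ is the log-likelihood of the chosen model (for a binary outcome, the logistic log-likelihood). Setting $\alpha = 1$ gives the lasso penalty and $\alpha = 0$ gives the ridge penalty; intermediate values mix the two.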
For the training set, we evaluated each model's Receiver Operating Characteristic (ROC) curve,
sensitivity, and specificity. The ROC curve summarizes how well a classifier separates the classes
across thresholds, while sensitivity and specificity reflect the model's ability to correctly
identify positive and negative cases, respectively. We also calculated the Kappa statistic, which
measures the agreement between predicted and observed classes beyond what chance would produce, and
balanced accuracy, the mean of sensitivity and specificity, which is better suited than raw
accuracy when cases and controls are imbalanced. We used the same
evaluation method on both the validation set and the test set.
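The metrics above can all be derived from a 2x2 confusion matrix, as the sketch below shows. The counts in the example are not reported in the text; they are reconstructed under the assumption that the validation set holds 52 positive and 10 negative samples, and they are chosen so that the sensitivity and specificity match the elastic net row of Table 3. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, balanced accuracy, and Cohen's kappa."""
    n = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    balanced_accuracy = (sensitivity + specificity) / 2
    # Agreement expected by chance, from the row and column totals
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Reconstructed counts (assumption): 35 of 52 positives and 7 of 10 negatives correct
m = binary_metrics(tp=35, fn=17, tn=7, fp=3)
```

With these counts the accuracy is 42/62, and the kappa is much lower than the accuracy because most of the observed agreement is already expected from the class imbalance alone.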
5.3 Results of Training and Validation
Before applying ROSE, the training set consisted of 49 mild-to-moderate autism patients (17%),
and 245 severe autism patients (83%). After ROSE, the training set was balanced with 146 mild
cases (49.7%) and 148 severe cases (50.3%), effectively addressing the imbalance issue during the
model training.
Table 3 displays the performance metrics of each method on the validation set, allowing a
side-by-side comparison of the models.
Table 3: Results for Validation set
Method                  AUC     Kappa    Accuracy (95% CI)        Balanced Accuracy  Sensitivity  Specificity
LDA                     0.331    0.0658  0.3387 (0.2233, 0.4701)  0.3231             0.3462       0.3000
Elastic Net Regression  0.756    0.2383  0.6774 (0.5466, 0.7906)  0.6865             0.6731       0.7000
Random Forest           0.669    0       0.1613 (0.0802, 0.2767)  0.5000             0.0000       1.0000
SVM                     0.767    0.1622  0.6452 (0.5134, 0.7626)  0.6269             0.6538       0.6000
KNN                     0.581   -0.0192  0.6129 (0.4807, 0.734)   0.4865             0.6731       0.3000
Based on the results in Table 3, the elastic net regression shows the best overall performance on
the validation set, with an AUC of 0.756 and a kappa of 0.2383. Although a kappa value of 0.2383
does not represent a high level of absolute agreement, it is the highest
among the five methods in this study. The kappa statistic accounts for chance agreement present in
the data, making it a more reliable performance metric than mere accuracy, especially in the presence
of imbalanced sample distributions.
As an evaluation metric that is unaffected by class distribution, AUC is an important standard
for model assessment. Figure 4 illustrates that the elastic net regression model achieved an AUC of
0.756 on the validation set, well above the Linear Discriminant Analysis (LDA) at
0.331 and K-Nearest Neighbors (KNN) at 0.581. This indicates that it separates positive from
negative cases more consistently across varying thresholds. In terms of accuracy and
balanced accuracy, the elastic net regression is also the strongest of the five models, with an
accuracy of 0.6774 and a balanced accuracy of 0.6865. This indicates reasonable performance by the
model in identifying both minority (case) and majority (control) classes, which is particularly
crucial in the context of highly imbalanced positive and negative samples.
Figure 4: ROC curves for Elastic Net Regression, KNN, LDA, and SVM
The elastic net regression's comparatively strong performance on the validation set, with an AUC of
0.756 and a kappa of 0.2383, underscores the importance of considering these metrics when
evaluating model performance amidst imbalanced samples. The AUC reflects the model's ability to
discriminate between positive and negative cases across all possible classification thresholds, while
the kappa provides an accuracy assessment corrected for random guessing, which is particularly
vital for datasets with imbalanced classes.
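One way to see why the AUC summarizes performance across all thresholds is its rank interpretation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way. The scores and labels are hypothetical, and the function name `auc_from_scores` is an illustration rather than the routine used in the study.

```python
def auc_from_scores(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # positive ranked above negative
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and true labels (1 = positive)
scores = [0.9, 0.8, 0.3, 0.6, 0.2]
labels = [1, 1, 1, 0, 0]
auc = auc_from_scores(scores, labels)
```

Under this reading, an AUC of 0.5 corresponds to random ranking, and an AUC below 0.5 means that negative cases tend to be scored above positive ones.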
In addition to elastic net regression, SVM also performed relatively well. SVM achieved a
high AUC of 0.767 on the validation set. Moreover, its kappa of 0.1622 ranks as the highest
following that of elastic net regression. Its accuracy and balanced accuracy exceed the 0.5
threshold, recorded at 0.6452 and 0.6269, respectively. Figure 5 shows the ROC (Receiver
Operating Characteristic) curve for the Random Forest, illustrating its classification capability.
Figure 5: ROC curves for Random Forest
Furthermore, we observed signs of overfitting in the Random Forest model. While it
demonstrated perfect performance on the training set, with an AUC of 1.00 and sensitivity of 1.00,
there was a stark decline in its performance on the validation set, where the AUC dropped to 0.669,
and the sensitivity fell to 0, indicating a failure to correctly identify any positive cases. This
significant fluctuation in performance, particularly the drop in sensitivity from perfect to none, is a
clear indicator of overfitting. Overfitting suggests that the model has excessively learned the
idiosyncrasies and noise present in the training data rather than capturing underlying patterns that
generalize to new, unseen data. Consequently, this leads to poor performance on novel datasets,
where the model is unable to make accurate predictions.
5.4 Results of Testing
Table 4 presents the performance metrics of the various methods on the test set. The performance
on the test set shows some differences compared to the validation set. Both SVM and elastic net
regression experienced a decline in performance, with their AUC, accuracy, and balanced accuracy
hovering around 0.5, suggesting that the predictive capabilities of these models are comparable to
random guessing. Notably, the LDA method outperforms the others, with the highest AUC value of
0.638, indicating a relatively better ability to discriminate between positive and negative classes.
Additionally, it has the highest Kappa value among all the methods, at 0.1622. The accuracy and
balanced accuracy of LDA also exceed 0.6, recorded at 0.6452 and 0.6269, respectively. On the other
hand, the Random Forest exhibits a sensitivity of 0 and a specificity of 1, with a Kappa value of
0 and an accuracy of only 0.1613, further confirming overfitting issues within the model on the test
set.
Table 4: Results for Test set
Method                  AUC     Kappa    Accuracy (95% CI)        Balanced Accuracy  Sensitivity  Specificity
LDA                     0.638    0.1622  0.6452 (0.5134, 0.7626)  0.6269             0.6538       0.6000
Elastic Net Regression  0.512   -0.0024  0.5645 (0.4326, 0.6901)  0.4981             0.5962       0.4000
Random Forest           0.468    0.0000  0.1613 (0.0802, 0.2767)  0.5000             0.0000       1.0000
SVM                     0.515    0.1009  0.6290 (0.4969, 0.7484)  0.5769             0.6538       0.5000
KNN                     0.394   -0.0356  0.5161 (0.3856, 0.645)   0.4692             0.5385       0.4000
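The 95% confidence intervals reported for accuracy in Tables 3 and 4 can be approximated with an exact binomial (Clopper-Pearson) interval. The text does not state which interval method was used, so the choice of method here is an assumption, as are the function names. The sketch solves for the bounds by bisection using only the standard library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo=0.0, hi=1.0, iters=60):
    """Root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, level=0.95):
    """Clopper-Pearson interval for k successes out of n trials."""
    alpha = 1 - level
    # Lower bound: the p at which P(X >= k) equals alpha / 2
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals alpha / 2
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


# 42 correct out of 62 corresponds to an accuracy of 0.6774 (counts are inferred)
lower, upper = exact_ci(42, 62)
```

With only 62 samples per evaluation set, intervals of this kind are wide, which is worth keeping in mind when comparing models whose accuracies differ by a few percentage points.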
Chapter 6 Conclusion and Discussion
In this study, we employed a variety of machine learning methods, including Linear
Discriminant Analysis (LDA), elastic net regression, random forest, Support Vector Machines
(SVM), and k-Nearest Neighbors (k-NN). The performance of these methods varied, but generally,
none of the models performed ideally. Elastic net regression and SVM achieved a more balanced
performance between sensitivity and specificity. However, the random forest model was less
satisfactory; it demonstrated good specificity on the test set but low sensitivity, resulting in an overall
low kappa score, indicating a tendency to overfit.
The predictive factors of the model encompassed a wide range of categories, including medical
diagnoses, school feedback, special education needs, family history of disorders, cognitive
impairments, and language levels of patients, as well as numerous specific indicators from the RBS-R Score and SCQ Score. Although age and language levels were normalized, potential differences
between modules still might exist. These discrepancies may contribute to the combined predictive
factors' ineffectiveness in constructing a robust model, inconsistent with individual module
performances observed previously.
Previous regression analysis has suggested that the module used to assess a patient was not
significantly related to their autism severity level, whereas certain scores have shown significant
associations with the severity levels. Simultaneously, language level should also be standardized in
accordance with CSS scores and classifications, ensuring uniformity in assessment metrics across
language skills. Although some predictors were statistically correlated with severity, they may not
be sufficient for effective prediction of severity, or their interrelationships might be more intricate.
The imbalance in the classification of autism severity could also lead to the model’s overfitting
toward the dominant category. Future improvements might include refining the classification
method, for instance, by employing a three-category system to delineate the spectrum of severity
more finely. Initially, due to a large number of cases, we attempted ROSE. Additionally, other sample
balancing methods, such as up-sampling, down-sampling and Synthetic Minority Oversampling
Technique (SMOTE) (Chawla et al., 2002), could be explored in subsequent studies. Investigating
different sample balancing techniques might help improve model performance.
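For comparison with ROSE, the core idea of SMOTE can be sketched as follows: each synthetic minority sample is placed at a random point on the line segment between a minority observation and one of its nearest minority neighbors. This is a minimal sketch for numeric features only, with a hypothetical function name and toy data; it is not a substitute for an established implementation.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base row (excluding the row itself)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()                 # position along the segment
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy minority class on the corners of the unit square (hypothetical)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_rows = smote(minority, n_new=10, k=2)
```

Unlike the smoothed bootstrap, which adds kernel noise around single observations, this interpolation keeps every synthetic point inside the region spanned by existing minority samples.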
In summary, the challenges faced in this research include dealing with the imbalance in data
classification, differences between modules, and potential limitations inherent in the predictors
themselves. Future work could enhance the predictive capability of the model by adopting more
sophisticated classification approaches, exploring various sample balancing techniques, and
analyzing the interrelations of predictors more thoroughly.
References
Hirota, Tomoya, and Bryan H. King. 2023. “Autism Spectrum Disorder: A Review.” JAMA.
American Medical Association. https://doi.org/10.1001/jama.2022.23661.
Talantseva, Oksana I., Raisa S. Romanova, Ekaterina M. Shurdova, Tatiana A. Dolgorukova,
Polina S. Sologub, Olga S. Titova, Daria F. Kleeva, and Elena L. Grigorenko. 2023. “The Global
Prevalence of Autism Spectrum Disorder: A Three-Level Meta-Analysis.” Frontiers in Psychiatry.
Frontiers Media S.A. https://doi.org/10.3389/fpsyt.2023.1071181.
Huang, Chien Yu, Kuan Shu Chen, Kuan Ying Lee, Chien Ho Lin, and Kuan Lin Chen. 2024.
“Different Autism Measures Targeting Different Severity Levels in Children with Autism
Spectrum Disorder.” European Archives of Psychiatry and Clinical Neuroscience 274 (1): 27–33.
https://doi.org/10.1007/s00406-023-01673-z.
Phillips, Helen, Barry Wright, Victoria Allgar, Helen McConachie, Jennifer Sweetman, Rebecca
Hargate, Rachel Hodkinson, et al. 2022. “Adapting and Validating the Autism Diagnostic
Observation Schedule Version 2 for Use with Deaf Children and Young People.” Journal of Autism
and Developmental Disorders 52 (2): 553–68. https://doi.org/10.1007/s10803-021-04931-y.
Adamou, Marios, Sarah L. Jones, and Stephanie Wetherhill. 2021. “Predicting Diagnostic
Outcome in Adult Autism Spectrum Disorder Using the Autism Diagnostic Observation Schedule,
Second Edition.” BMC Psychiatry 21 (1). https://doi.org/10.1186/s12888-020-03028-7.
Chesnut, Steven R., Tianlan Wei, Lucy Barnard-Brak, and David M. Richman. 2017. “A Meta-Analysis of the Social Communication Questionnaire: Screening for Autism Spectrum Disorder.”
Autism. SAGE Publications Ltd. https://doi.org/10.1177/1362361316660065.
Chan, Wai, Leann E. Smith, Jinkuk Hong, Jan S. Greenberg, and Marsha R. Mailick. 2017.
“Validating the Social Responsiveness Scale for Adults with Autism.” Autism Research 10 (10):
1663–71. https://doi.org/10.1002/aur.1813.
Hooker, Jessica L., Deanna Dow, Lindee Morgan, Christopher Schatschneider, and Amy M.
Wetherby. 2019. “Psychometric Analysis of the Repetitive Behavior Scale-Revised Using
Confirmatory Factor Analysis in Children with Autism.” Autism Research 12 (9): 1399–1410.
https://doi.org/10.1002/aur.2159.
Zwicker, Jill G., Cheryl Missiuna, Susan R. Harris, and Lara A. Boyd. 2012. “Developmental
Coordination Disorder: A Review and Update.” European Journal of Paediatric Neurology.
https://doi.org/10.1016/j.ejpn.2012.05.005.
Hyde, Kayleigh K., Marlena N. Novack, Nicholas LaHaye, Chelsea Parlett-Pelleriti, Raymond
Anden, Dennis R. Dixon, and Erik Linstead. 2019. “Applications of Supervised Machine Learning
in Autism Spectrum Disorder Research: A Review.” Review Journal of Autism and Developmental
Disorders. Springer New York LLC. https://doi.org/10.1007/s40489-019-00158-x.
Nasteski, Vladimir. 2017. “An Overview of the Supervised Machine Learning Methods.”
HORIZONS.B 4 (December): 51–62. https://doi.org/10.20544/horizons.b.04.1.17.p05.
Crippa, Alessandro, Christian Salvatore, Paolo Perego, Sara Forti, Maria Nobile, Massimo
Molteni, and Isabella Castiglioni. 2015. “Use of Machine Learning to Identify Children with
Autism and Their Motor Abnormalities.” Journal of Autism and Developmental Disorders 45 (7):
2146–56. https://doi.org/10.1007/s10803-015-2379-8.
Moradi, Elaheh, Budhachandra Khundrakpam, John D. Lewis, Alan C. Evans, and Jussi Tohka.
2017. “Predicting Symptom Severity in Autism Spectrum Disorder Based on Cortical Thickness
Measures in Agglomerative Data.” NeuroImage 144 (January): 128–41.
https://doi.org/10.1016/j.neuroimage.2016.09.049.
Gotham, Katherine, Andrew Pickles, and Catherine Lord. 2009. “Standardizing ADOS Scores for
a Measure of Severity in Autism Spectrum Disorders.” Journal of Autism and Developmental
Disorders 39 (5): 693–705. https://doi.org/10.1007/s10803-008-0674-3.
Waizbard-Bartov, Einat, and Meghan Miller. 2023. “Does the Severity of Autism Symptoms
Change over Time? A Review of the Evidence, Impacts, and Gaps in Current Knowledge.”
Clinical Psychology Review. Elsevier Inc. https://doi.org/10.1016/j.cpr.2022.102230.
Bavin, Edith L., Evan Kidd, Luke Prendergast, Emma Baker, Chery Dissanayake, and Margot
Prior. 2014. “Severity of Autism Is Related to Children’s Language Processing.” Autism Research
7 (6): 687–94. https://doi.org/10.1002/aur.1410.
Kuhn, Max. 2008. “Building Predictive Models in R Using the caret Package.” Journal of
Statistical Software 28 (5): 1–26. https://doi.org/10.18637/jss.v028.i05.
Menardi, Giovanna, and Nicola Torelli. 2014. “Training and Assessing Classification Rules with
Imbalanced Data.” Data Mining and Knowledge Discovery 28: 92–122.
https://doi.org/10.1007/s10618-012-0295-5.
Chawla, Nitesh V., Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. “SMOTE:
Synthetic Minority Over-Sampling Technique.” Journal of Artificial Intelligence Research 16.
https://doi.org/10.1613/jair.953.
Asset Metadata
Creator
Zhang, Qi (author)
Core Title
Predicting autism severity classification by machine learning models
School
Keck School of Medicine
Degree
Master of Science
Degree Program
Biostatistics
Degree Conferral Date
2024-05
Publication Date
04/17/2024
Defense Date
04/16/2024
Publisher
Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag
autism,autism severity level,machine learning,OAI-PMH Harvest
Format
theses (aat)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Shu, Chang (committee chair), Mancuso, Nicholas (committee member), Street, Kelly (committee member)
Creator Email
hannah042377@gmail.com,qizhang7@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113880184
Unique identifier
UC113880184
Identifier
etd-ZhangQi-12826.pdf (filename)
Legacy Identifier
etd-ZhangQi-12826
Document Type
Thesis
Rights
Zhang, Qi
Internet Media Type
application/pdf
Type
texts
Source
20240418-usctheses-batch-1142 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu