
Understanding Opioid Use Disorder (OUD) using tree-based classifiers

posted 01/03/2020


Adway S. Wadekar



•  We developed a method to identify adults likely to develop Opioid Use Disorder.

•  Early initiation of marijuana is a dominant predictor of Opioid Use Disorder.

•  The method considers demographic, socioeconomic, and health-related features.

•  Public domain datasets can aid in understanding addiction disorders.



Opioid Use Disorder (OUD), defined as a physical or psychological reliance on opioids, is a public health epidemic. Identifying adults likely to develop OUD can help public health officials plan effective intervention strategies. The aim of this paper is to develop a machine learning approach to predict adults at risk for OUD and to identify interactions between various characteristics that increase this risk.


In this approach, a data set was curated from the responses to the 2016 edition of the National Survey on Drug Use and Health (NSDUH). Using this data set, tree-based classifiers (decision tree and random forest) were trained, with downsampling used to handle class imbalance. Predictions from the tree-based classifiers were also compared to the results from a logistic regression model. The results from the three classifiers were then interpreted together to highlight the individual characteristics, and the interplay among them, that pose a risk for OUD.
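The abstract does not say how the downsampling was carried out. As a minimal sketch, assuming plain random undersampling of the majority class down to the size of the minority class, the idea looks like this. The function name `downsample_majority` and the toy data are hypothetical and stand in for the NSDUH responses, which are not reproduced here.

```python
import random


def downsample_majority(rows, labels, seed=0):
    """Return a class-balanced subset by randomly dropping majority-class rows.

    Assumption: binary labels (1 = has the condition, 0 = does not).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Whichever class is smaller is kept whole; the other is sampled down.
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Synthetic toy data (not NSDUH): 1000 rows, 1% of them positive.
labels = [1 if i % 100 == 0 else 0 for i in range(1000)]
rows = [[i] for i in range(1000)]

X_bal, y_bal = downsample_majority(rows, labels)
print(len(y_bal), sum(y_bal))
```

A classifier trained on the balanced subset sees both classes equally often, so it cannot reach a low training error simply by predicting the majority class for everyone. The trade-off is that most majority-class rows are discarded.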


Random forest predicted adults at risk for OUD accurately, with an average area under the Receiver-Operating-Characteristics curve (AUC) over 0.89, even though the prevalence of OUD was only about 1%. This was a slight improvement over logistic regression. Logistic regression identified statistically significant characteristics, while random forest ranked the predictors in order of their contribution to OUD prediction. Early initiation of marijuana (before 18 years) emerged as the dominant predictor. Decision trees revealed that early marijuana initiation especially increased the risk if individuals: (i) were between 18–34 years of age, or (ii) had incomes less than $49,000, or (iii) were of Hispanic and White heritage, or (iv) were on probation, or (v) lived in neighborhoods with easy access to drugs.
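To see why AUC is a more informative yardstick than raw accuracy at a prevalence of about 1%, consider the sketch below. It computes AUC from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and all labels and scores are made up for illustration; they are not results from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up scores for six cases: 6 of the 9 positive/negative pairs are
# ordered correctly, so the AUC is 6/9.
toy_auc = roc_auc([0, 0, 1, 0, 1, 1], [0.1, 0.4, 0.35, 0.8, 0.7, 0.9])

# A model that calls everyone "not at risk" when prevalence is 1%:
rare_labels = [1] + [0] * 99
constant_scores = [0.0] * 100
accuracy = sum(1 for y in rare_labels if y == 0) / len(rare_labels)
constant_auc = roc_auc(rare_labels, constant_scores)

print(round(toy_auc, 3), accuracy, constant_auc)
```

The constant model is right 99% of the time yet has an AUC of 0.5, the same as chance, because it never ranks an at-risk case above anyone else. The pairwise loop is quadratic in the number of cases, which is fine for illustration but not for survey-sized data.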


Machine learning can accurately predict adults at risk for OUD and identify interactions among the factors that heighten this risk. Curbing early initiation of marijuana may be an effective prevention strategy against opioid addiction, especially in high-risk groups.
