COGS 137: Practical Data Science in R

Case Study 01: Biomarkers of Recent Usage

Group case study analyzing research data on marijuana usage to determine which biomarkers best indicate recent use. Used measurements from blood, oral fluid, and breath to examine differences across marijuana dosages and compounds.

Conducted a case study of marijuana usage detection methods using statistical analysis in R and RStudio. Using packages such as tidyverse and ggplot2, we examined multiple biological matrices (blood, oral fluid, and breath) to identify the most effective indicators of recent marijuana consumption. We applied Random Forest modeling and evaluated the sensitivity and specificity of various marijuana compounds, with a focus on potential roadside detection by law enforcement. Blood samples emerged as the most reliable basis for detecting recent marijuana use, closely followed by oral fluid measurements. These results offer guidance for forensic and medical professionals who need an accurate, timely assessment of recent cannabis consumption.

Due to research privacy protocol, the code cannot be displayed.
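
As a stand-in for the project code, here is a minimal sketch of how sensitivity and specificity could be computed for a concentration cutoff in base R. The values are made up for illustration and are not the study data; the column names `thc_ng_ml` and `recent_use`, the helper `sens_spec`, and the cutoffs themselves are all hypothetical.

```r
# Illustrative only: synthetic values, NOT the study data.
# Hypothetical columns: thc_ng_ml (measured concentration),
# recent_use (ground truth: TRUE if the sample follows recent use).
samples <- data.frame(
  thc_ng_ml  = c(0.0, 0.4, 1.2, 2.5, 5.1, 7.8, 0.2, 3.3),
  recent_use = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)
)

# Sensitivity and specificity of the rule
# "flag as recent use if value >= cutoff".
sens_spec <- function(values, truth, cutoff) {
  predicted <- values >= cutoff
  tp <- sum(predicted & truth)    # true positives
  fn <- sum(!predicted & truth)   # false negatives
  tn <- sum(!predicted & !truth)  # true negatives
  fp <- sum(predicted & !truth)   # false positives
  c(sensitivity = tp / (tp + fn), specificity = tn / (tn + fp))
}

# Evaluate a few hypothetical cutoffs and collect the results in one table.
cutoffs <- c(0.5, 1, 2, 5)
results <- t(sapply(cutoffs, function(k) {
  sens_spec(samples$thc_ng_ml, samples$recent_use, k)
}))
results <- data.frame(cutoff = cutoffs, results)
print(results)
```

Sweeping several cutoffs this way makes the trade-off visible: a lower cutoff catches more recent users but also flags more non-recent ones, and the same table can be built per compound and per sample type for comparison.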