Prediction of Primary Open-Angle Glaucoma in Individuals With Ocular Hypertension using Genetic Risk Factors
Abstract
Primary open-angle glaucoma (POAG) is a leading cause of irreversible vision loss and is often preceded by ocular hypertension. The prevalence and severity of POAG may be reduced by treating high-risk individuals before disease onset. However, universal treatment of individuals with ocular hypertension is neither medically nor economically justified, because ocular hypertension is highly prevalent but converts to POAG at a relatively low rate.
To support targeted prevention, the Ocular Hypertension Treatment Study (OHTS) introduced a risk calculator based on clinical predictors. This work extends that framework by incorporating genetic risk factors to improve the precision of individualized risk estimates. Using Cox proportional hazards models, we developed and compared clinical-only and combined clinical–genetic risk prediction models for 5-year POAG incidence. Model outputs are implemented in an interactive risk calculator designed to support patient-specific decision-making.
By integrating genetic information with established clinical predictors, this work demonstrates how statistical modeling can support precision prevention, improve clinical decision-making, and reduce unnecessary treatment.
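As a rough illustration of how a fitted Cox model yields a 5-year risk, the sketch below applies the standard relation risk = 1 - S0(5)^exp(lp), where S0(5) is the baseline 5-year survival and lp is the linear predictor. The function name, predictor names, coefficients, and baseline survival are hypothetical placeholders chosen for illustration, not values from this study.

```python
import math

def five_year_risk(covariates, coefs, baseline_survival_5y):
    """Absolute 5-year risk under a Cox proportional hazards model.

    covariates and coefs are dicts keyed by predictor name; covariates are
    assumed to be centered the same way as when the model was fitted.
    """
    # Linear predictor: sum of coefficient * covariate value.
    lp = sum(coefs[name] * value for name, value in covariates.items())
    # S(5 | x) = S0(5) ** exp(lp); risk is the complement of survival.
    return 1.0 - baseline_survival_5y ** math.exp(lp)

# Hypothetical coefficients and patient values (illustration only).
coefs = {"iop_centered": 0.09, "genetic_score": 0.35}
clinical_only = {"iop_centered": 2.0, "genetic_score": 0.0}
with_genetics = {"iop_centered": 2.0, "genetic_score": 1.0}

risk_clinical = five_year_risk(clinical_only, coefs, baseline_survival_5y=0.93)
risk_combined = five_year_risk(with_genetics, coefs, baseline_survival_5y=0.93)
print(f"clinical-only: {risk_clinical:.3f}, with genetic score: {risk_combined:.3f}")
```

With a positive coefficient, a higher genetic score raises the estimated risk for an otherwise identical clinical profile, which is how a combined model can separate patients that a clinical-only model treats as equivalent.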
Leveraging Predictor Correlations for Faster and Smarter Variable Selection: The Correlation-Based Adaptive Lasso (CBAL)
Abstract
Regularization methods such as the adaptive Lasso are widely used for variable selection; however, their performance depends on how the penalty weights are estimated, which is usually based on ridge regression. This work introduces the Correlation-Based Adaptive Lasso (CBAL), a novel weighting strategy that leverages correlations among predictors and between predictors and the outcome. The weight for predictor Xj is defined as the sum of the absolute correlations between Xj and all other predictors, divided by the absolute correlation between Xj and the response variable Y. This weighting penalizes redundant predictors more heavily and enhances interpretability. In simulations under varying signal-to-noise ratios and sample sizes, CBAL achieved lower mean squared error and better variable selection than the adaptive Lasso and SCAD, especially in low-signal conditions. An application to the Boston Housing dataset confirmed comparable predictive accuracy with reduced computation time, demonstrating CBAL’s simplicity, efficiency, and correlation-aware design for high-dimensional regression.
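The weight definition above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the function names and the small `eps` guard against a zero predictor-response correlation are hypothetical additions, and the toy data are invented. In an adaptive Lasso, such weights would multiply each coefficient's L1 penalty term.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def cbal_weights(columns, y, eps=1e-12):
    """Weight for X_j: sum_{k != j} |cor(X_j, X_k)| / |cor(X_j, Y)|.

    columns is a list of predictor columns; eps (an assumption of this
    sketch) avoids division by zero when a predictor is uncorrelated with y.
    """
    weights = []
    for j, xj in enumerate(columns):
        redundancy = sum(
            abs(pearson(xj, xk)) for k, xk in enumerate(columns) if k != j
        )
        relevance = abs(pearson(xj, y))
        weights.append(redundancy / max(relevance, eps))
    return weights

# Toy data (invented): x1 tracks y closely, x2 is a noisier copy of x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
print(cbal_weights([x1, x2], y))
```

A predictor that is strongly related to the response receives a smaller weight, and therefore a lighter penalty, than an equally redundant predictor that is only weakly related to it.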
Examining Changes in Iowa High School Standardized Test Proficiencies
Abstract
This poster examines changes in the proficiency of middle and high school students from 2013 to 2023, as measured by the state of Iowa’s yearly standardized test, with emphasis on differences between the periods before and after the COVID-19 pandemic. Time series analysis revealed a downward trend in the percentage of students proficient in a given subject. Proficiency also differed by county-level median household income: school districts in counties with lower median income had, on average, more students proficient in both subjects. Overall, the analysis shows that test proficiencies have changed over time and that these changes may be associated with delayed effects of the COVID-19 pandemic.
Function-Targeted Adaptive Shrinkage for Small-Sample Dose-Toxicity Estimation
Abstract
A primary objective of phase I dose-finding (DF) trials is to characterize the dose-toxicity profile for dose-limiting toxicity and to estimate the maximum tolerated dose (MTD). However, early-phase studies often enroll few participants, leading to unstable estimates of the profile and, consequently, of the MTD. To address these challenges, we propose Function-Targeted Adaptive Shrinkage with smooThing (FAST), a Bayesian shrinkage framework with two hierarchical specifications for the smoothing parameter. FAST avoids fixed parametric forms by adaptively shrinking nonparametric fits toward prespecified parametric subspaces while allowing flexible, data-driven deviations. To accommodate small-sample DF settings, we develop a numerically stable MCMC algorithm, incorporate curvature smoothing, and impose linear constraints at the boundaries. This framework stabilizes estimation under sparse dose allocations and mitigates the impact of misspecifying the dose-toxicity relationship. Extensive simulations were conducted under three true dose-toxicity functions, three sample sizes (N ∈ {18, 24, 30}), and 23 scenarios with or without skipped doses during escalation. The FAST approach consistently achieved the lowest root mean squared error (RMSE) with minimal variance and the lowest integrated squared error (ISE), regardless of parametric model specification. Compared with Bayesian P-splines, the proposed methods demonstrated superior robustness, improved RMSE and ISE, and reduced sensitivity to model misspecification. The two hierarchical specifications of the FAST approach further provided a diagnostic tool for detecting potential model misspecification in early cohorts. These findings highlight adaptive nonparametric shrinkage as a promising strategy for reliable dose-toxicity estimation in small-sample DF trials.
Tele-stroke Improves EVT Access for Acute Stroke Patients: A Propensity Score Analysis
Abstract
Stroke is a leading cause of death and disability in the United States, and timely treatment is essential for recovery. Endovascular therapy (EVT) increases disability-free survival by 71% for eligible patients, but its benefits are highly time-sensitive, and only half of Americans live within 60 miles of an EVT-capable center. Conventional hospital transport often routes patients to the nearest hospital or, in some systems, to comprehensive stroke centers based on heuristic rules.
Tele-stroke connects a remote neurologist with patients to guide diagnosis, treatment options, and transfers. Using a causal modeling framework, we estimated the effect of tele-stroke on treatment timing, EVT access, and recovery outcomes. We conducted a propensity-matched cohort study of acute stroke patients with large-vessel occlusion in the Get With The Guidelines-Stroke Registry (GWTG-S, American Heart Association). Propensity scores included patient (NIH Stroke Scale, age, sex) and hospital factors (rurality, stroke volume, designation). Tele-stroke was associated with faster and more frequent access to EVT, supporting its expansion as a cost-effective strategy to optimize acute stroke care.
Predicting Baseball Offensive Performance Using Contact Network Analysis
Abstract
Major League Baseball analysts often discuss how learning from teammates affects a player’s performance, and they praise the effect an experienced player can have on the success of his teammates. In this poster, I present findings from an undirected, weighted contact network of Major League Baseball offensive players, in which two players are connected if they have played as teammates. Edges are weighted by the length of time spent as teammates. From a graph that cumulatively includes every year up to the given year, I extracted seven measures of connectedness per player per year. I then fitted a model that estimates an offensive player’s weighted on-base average (wOBA) using those connectedness measures as predictors. The evidence suggests that the quality and length of player relationships may be much more valuable for a player’s performance than the quantity of relationships.
Random Forest Prediction Set: A Conformal Prediction Approach Using Out-of-Bag Estimates
Abstract
Accurate point predictions are often insufficient for reliable decision-making, as many applications, especially in high-stakes scenarios, require a measure of uncertainty. Conformal prediction addresses this need by providing a distribution-free way to construct prediction sets with formal coverage guarantees. A key challenge in applying conformal prediction is that, to obtain valid coverage guarantees, we often hold out a calibration set for computing conformity scores, which incurs additional data and computational overhead. In this work, we present a principled approach to incorporating conformal prediction into random forests. We build on the fact that random forests naturally generate out-of-bag (OOB) samples, which supply the required conformity scores without an additional held-out calibration set. Our OOB conformal approach leverages the computational efficiency of random forests while maintaining the coverage guarantee of conformal prediction. We empirically demonstrate that our OOB-based method produces well-calibrated prediction sets, highlighting its potential as a practical and interpretable tool for uncertainty quantification in random forests. Our results show that the proposed OOB method typically produces smaller prediction sets than competing approaches while maintaining marginal coverage close to the nominal confidence level.
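The idea of calibrating on OOB predictions can be sketched as follows. This is a minimal illustration, not the authors' exact procedure: it assumes the forest's OOB class probabilities have already been computed, and it uses one common conformity score (one minus the probability assigned to the true class). The function names and toy numbers are hypothetical.

```python
import math

def oob_conformal_threshold(oob_probs, labels, alpha=0.1):
    """Score threshold from OOB class probabilities of the training points.

    oob_probs[i][c] is the OOB-estimated probability of class c for point i.
    The score 1 - p(true class) is an assumed, commonly used choice.
    """
    scores = sorted(1.0 - probs[y] for probs, y in zip(oob_probs, labels))
    n = len(scores)
    # Finite-sample corrected rank of the (1 - alpha) quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    # If the rank exceeds n, no finite threshold suffices: keep every class.
    return scores[k - 1] if k <= n else float("inf")

def prediction_set(test_probs, threshold):
    """All classes whose score for this test point is within the threshold."""
    return [c for c, p in enumerate(test_probs) if 1.0 - p <= threshold]

# Toy OOB probabilities for four training points and two classes (invented).
oob_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = [0, 1, 0, 1]
threshold = oob_conformal_threshold(oob_probs, labels, alpha=0.5)
print(prediction_set([0.75, 0.25], threshold))
```

Because every training point contributes an OOB score, no data are set aside for calibration, which is the practical advantage the abstract describes.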
Logistic Modeling of Thyroid Cancer Recurrence using Initial Treatment Response
Abstract
In the United States, cancer is the leading cause of death for people under 65 years old, highlighting the ongoing need for targeted research. This study focuses on thyroid cancer, which is projected to account for approximately 44,000 new cases this year. Although thyroid cancer is considered one of the more treatable cancers, its mortality has shown a slight upward trend, unlike many other cancer types, whose mortality rates are declining.
We investigate the predictive effect of initial treatment outcome on recurrence using logistic regression. Our findings indicate that initial treatment outcome is a strong and statistically significant predictor of recurrence when controlling for age, gender, radiotherapy history, and risk classification. We also explore the influence and interrelationships of additional explanatory variables to provide a more comprehensive understanding of recurrence risk across thyroid cancer patients.
Mapping Ovarian Cancer in Iowa
Abstract
Iowa has the second-highest age-adjusted rate of cancer in the United States, and ovarian cancer is a leading cause of cancer deaths among women; disease mapping approaches can therefore aid prevention education and clinical resource allocation. In this project, we first describe the epidemiology of ovarian cancer in Iowa from 2004 to 2022 at the state level. Second, we determine whether any of Iowa’s 99 counties are more "at risk" for ovarian cancer incidence and mortality relative to the statewide rate. We accomplish this by fitting a Bayesian Poisson lognormal regression model with a county-level spatial random effect that follows an intrinsic conditional autoregressive (ICAR) process. Finally, we assess associations between county-level behavioral predictors (excessive drinking, smoking, obesity) and county-level ovarian cancer incidence.
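For readers unfamiliar with this model class, one common way to write a Poisson lognormal model with an ICAR spatial effect is sketched below. The notation is an assumption of this sketch rather than the authors' specification: y_i is the observed count in county i, E_i the expected count under the statewide rate, x_i the county-level predictors, n_i the number of neighboring counties, and j ~ i denotes that counties j and i are adjacent. Whether the fitted model also includes the unstructured term ε_i is not stated in the abstract.

```latex
\begin{aligned}
y_i \mid \theta_i &\sim \operatorname{Poisson}(E_i\,\theta_i), \qquad i = 1,\dots,99,\\
\log \theta_i &= \beta_0 + x_i^{\top}\beta + \phi_i + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma_\varepsilon^2),\\
\phi_i \mid \phi_{-i} &\sim \mathcal{N}\!\left(\frac{1}{n_i}\sum_{j \sim i}\phi_j,\; \frac{\sigma_\phi^2}{n_i}\right).
\end{aligned}
```

Under this formulation, θ_i is the county's relative risk compared with the statewide rate, so a posterior for θ_i concentrated above 1 would flag a county as more "at risk".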