Machine Learning Deep Dive

The problem

Universities often identify at-risk students too late: after they've already failed courses or dropped out. Early intervention requires predictive signals that human advisors might miss across hundreds of students. A single advisor managing 200 students can't manually track GPA trajectories, failure patterns, and credit progress every week. The model can.

The approach

The system uses a supervised classification model, a Random Forest ensemble, to predict whether a student is academically at risk from their academic performance metrics.

Why Random Forest?

Handles non-linear relationships between features. A student with a 3.5 GPA but 3 failed courses is still at risk. A plain logistic regression would struggle with that kind of threshold effect unless the features were engineered by hand; a Random Forest picks it up directly.
Provides feature importance rankings, making predictions interpretable for advisors. "This student is at risk primarily because of low GPA (34%) and three failed courses (26%)" is actionable. "This student is at risk because the model said so" is not.
Resists overfitting by averaging 100 decision trees (estimators), each trained on a bootstrap sample of the data.
No feature scaling required: works directly with raw GPA and count values. Saves a preprocessing step and a class of bugs.
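
As a rough illustration of the points above, here is a minimal sketch using scikit-learn's RandomForestClassifier. The two-feature toy data and the simplified label rule are assumptions made for this example only; they are not the project's training set.

```python
import random

from sklearn.ensemble import RandomForestClassifier

rng = random.Random(42)

# Toy data (assumption): two raw features per student, [GPA, courses failed].
X = [[round(rng.uniform(0.5, 4.0), 2), rng.randint(0, 5)] for _ in range(500)]
# Toy label (assumption): at risk (1) if GPA < 2.0 OR 3+ failed courses.
y = [1 if gpa < 2.0 or failed >= 3 else 0 for gpa, failed in X]

# 100 trees with bootstrap sampling; raw values go in without a scaler.
model = RandomForestClassifier(n_estimators=100, bootstrap=True, random_state=42)
model.fit(X, y)

# High GPA but three failed courses: the failure threshold still applies.
print(model.predict([[3.5, 3]]))

# Global importance of each feature across the whole forest (sums to 1).
print(dict(zip(["gpa", "courses_failed"], model.feature_importances_)))
```

Note that feature_importances_ describes the model as a whole, not an individual student.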

Training data

The training set consists of 500 synthetic student profiles. Each feature is sampled uniformly over a range modeled after actual university patterns:

Feature	Range	Distribution
GPA	0.5 to 4.0	Uniform
Courses taken	1 to 12	Uniform integer
Courses failed	0 to 5	Uniform integer
Avg grade points	0.5 to 4.0	Uniform
Credits completed	3 to 60	Uniform integer
Semesters enrolled	1 to 8	Uniform integer
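
The table above could be turned into a generator along these lines. This is a sketch, assuming each feature is drawn independently and GPA-style values are rounded to two decimals; the field names and the seed are hypothetical.

```python
import random


def generate_profile(rng):
    """Draw one synthetic student using the ranges from the table."""
    return {
        "gpa": round(rng.uniform(0.5, 4.0), 2),
        "courses_taken": rng.randint(1, 12),       # randint is inclusive on both ends
        "courses_failed": rng.randint(0, 5),
        "avg_grade_points": round(rng.uniform(0.5, 4.0), 2),
        "credits_completed": rng.randint(3, 60),
        "semesters_enrolled": rng.randint(1, 8),
    }


rng = random.Random(42)  # fixed seed (assumption) so the dataset is reproducible
profiles = [generate_profile(rng) for _ in range(500)]
print(len(profiles), profiles[0])
```

With independent draws, a profile can end up with more failed courses than courses taken. A real dataset would not have that property, which is one more reason to retrain on real data before deployment.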

Labeling criteria

A student is at risk if any of these apply:

GPA below 2.0
3 or more courses failed
Average grade points below 1.5
GPA below 2.5 AND 2+ courses failed
Fewer than 15 credits completed after 4+ semesters

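These rules encode what an experienced academic advisor would flag manually. Because every label comes from these rules, the Random Forest is in effect learning to reproduce them from examples. Generalizing to cases the rules don't cover would require training on real student outcomes.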
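
The labeling criteria above translate almost line for line into code. The sketch below assumes a profile is a dict with hypothetical field names; only the thresholds come from the list.

```python
def is_at_risk(profile):
    """Return True if any of the five labeling rules fires."""
    return (
        profile["gpa"] < 2.0
        or profile["courses_failed"] >= 3
        or profile["avg_grade_points"] < 1.5
        or (profile["gpa"] < 2.5 and profile["courses_failed"] >= 2)
        or (profile["credits_completed"] < 15 and profile["semesters_enrolled"] >= 4)
    )


# A strong GPA does not cancel out three failed courses.
example = {
    "gpa": 3.5,
    "courses_failed": 3,
    "avg_grade_points": 3.2,
    "credits_completed": 30,
    "semesters_enrolled": 3,
}
print(is_at_risk(example))  # True
```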

Results

              precision    recall  f1-score   support

 Not At Risk       0.88      0.88      0.88        17
     At Risk       0.98      0.98      0.98        83

    accuracy                           0.96       100

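96% accuracy on the held-out test set (100 of the 500 profiles). The class imbalance (17 not-at-risk vs. 83 at-risk in the test set) reflects the labeling criteria: the rules are sensitive, so most generated profiles trip at least one flag. For context, always predicting "at risk" would already score 83% on this split, so the per-class precision and recall say more than accuracy alone.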
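
The report can be sanity-checked by hand. The confusion counts below are not printed in the report; they are inferred here as the only integer counts consistent with its rounded values and supports, so treat them as an assumption.

```python
# Inferred counts (assumption): rows are true classes, split by predicted class.
tn, fp = 15, 2  # 17 not-at-risk students: correctly cleared, wrongly flagged
fn, tp = 2, 81  # 83 at-risk students: missed, correctly flagged


def precision_recall(hits, false_alarms, misses):
    """Precision and recall for one class from its own point of view."""
    return hits / (hits + false_alarms), hits / (hits + misses)


risk_p, risk_r = precision_recall(tp, fp, fn)
safe_p, safe_r = precision_recall(tn, fn, fp)
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(f"Not At Risk  precision={safe_p:.2f} recall={safe_r:.2f}")  # 0.88, 0.88
print(f"At Risk      precision={risk_p:.2f} recall={risk_r:.2f}")  # 0.98, 0.98
print(f"accuracy={accuracy:.2f}")                                  # 0.96
```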

For production deployment, the next step would be retraining on real anonymized university data, where the class balance is typically reversed (most students are not at risk).

Feature importance

Feature	Importance	Interpretation
GPA	33.96%	Strongest single predictor of academic success
Courses failed	26.32%	Direct indicator of academic difficulty
Avg grade points	20.94%	Captures grade trajectory beyond cumulative GPA
Credits completed	8.49%	Progress indicator: slow progress signals risk
Courses taken	5.21%	Course load context
Semesters enrolled	5.07%	Time-in-program context

GPA and failure count together account for about 60% of the total feature importance (33.96% + 26.32% = 60.28%). This matches advisor intuition: those are the two things they'd check first when reviewing a student's standing.
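
A short sketch of how the table above could be ranked and summarized for an advisor-facing message. The percentages are copied from the table; the message wording is hypothetical. These are global importances for the whole model, not per-student attributions.

```python
importances = {
    "GPA": 33.96,
    "Courses failed": 26.32,
    "Avg grade points": 20.94,
    "Credits completed": 8.49,
    "Courses taken": 5.21,
    "Semesters enrolled": 5.07,
}

# Sort features from most to least important.
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
top_two_share = sum(value for _, value in ranked[:2])

print(f"Top two features: {top_two_share:.2f}% of total importance")  # 60.28%
summary = ", ".join(f"{name} ({value:.0f}%)" for name, value in ranked[:2])
print(f"Main drivers of the model: {summary}")
```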

Integration architecture

The ML model runs as an independent Flask microservice, decoupled from the Java backend:

Staff clicks "Risk Check" on a student in the Java web app
StudentServlet gathers the student's academic metrics from MySQL (GPA, enrollment count, failed courses, etc.)
MLClient utility class sends an HTTP POST to http://localhost:5000/predict with the metrics as JSON
Flask API loads the pre-trained model from disk (student_risk_model.pkl), runs inference, and returns a JSON response with prediction, confidence score, and a human-readable recommendation
JSP renders the result with color-coded risk status, confidence percentage, risk probability meter, and specific concerns
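
The Flask side of this flow might look roughly like the sketch below. The JSON field names, the 0.5 decision threshold, the recommendation text, and the StubModel class are all assumptions made so the example runs without the real student_risk_model.pkl file.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical field names for the metrics sent by the Java side.
FEATURES = ["gpa", "courses_taken", "courses_failed",
            "avg_grade_points", "credits_completed", "semesters_enrolled"]


class StubModel:
    """Stand-in for the pickled Random Forest (assumption for this sketch)."""

    def predict_proba(self, rows):
        # One [P(not at risk), P(at risk)] pair per row, like scikit-learn.
        return [[0.1, 0.9] if row[0] < 2.0 else [0.8, 0.2] for row in rows]


# A real service would load the trained model once at startup instead,
# for example by unpickling student_risk_model.pkl.
model = StubModel()


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400

    row = [float(payload[name]) for name in FEATURES]
    p_risk = model.predict_proba([row])[0][1]
    at_risk = p_risk >= 0.5
    return jsonify({
        "prediction": "At Risk" if at_risk else "Not At Risk",
        "confidence": round(max(p_risk, 1 - p_risk), 2),
        "risk_probability": round(p_risk, 2),
        "recommendation": ("Schedule an advisor meeting." if at_risk
                           else "No action needed; continue monitoring."),
    })

# To serve locally on the port used above: app.run(port=5000)
```

Loading the model once at import time, rather than on every request, keeps per-request latency down to the inference itself.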

This microservice pattern means:

The ML model can be retrained, updated, or replaced without touching the Java codebase
Python and Java codebases evolve independently
The model can be A/B tested by spinning up a second Flask service on a different port
ML inference (CPU-bound for a Random Forest) scales independently from the Java app (I/O-bound)

It's the same pattern used at companies running production ML: model serving is its own service, not a library import.

Graceful degradation

If the Flask service is unavailable (down, slow, or unreachable), the app doesn't crash. The MLClient has a 5-second timeout and catches connection errors. When inference fails, the Risk Check page shows "Risk assessment service unavailable" instead of a 500 error. The rest of the app (enrollments, grades, dashboards) keeps working.
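
The real MLClient is a Java class. The sketch below shows the same timeout-and-fallback idea in Python, purely for illustration. The function name, the fallback dict shape, and the injectable opener parameter are assumptions of this example.

```python
import json
import urllib.request

FALLBACK = {"available": False, "message": "Risk assessment service unavailable"}


def check_risk(metrics, url="http://localhost:5000/predict", timeout=5.0,
               opener=urllib.request.urlopen):
    """POST the metrics as JSON; return a fallback dict instead of raising."""
    request = urllib.request.Request(
        url,
        data=json.dumps(metrics).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            result = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and urllib's
        # URLError/HTTPError; ValueError covers a malformed JSON body.
        return dict(FALLBACK)
    if not isinstance(result, dict):
        return dict(FALLBACK)
    return {"available": True, **result}


# If nothing is listening on port 5000, this prints the fallback dict.
print(check_risk({"gpa": 1.5, "courses_failed": 3}, timeout=1.0))
```

The caller only ever sees a dict, so the page can branch on the available flag rather than wrapping every call in its own error handling.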

In distributed systems, every network call is a potential failure point. Designing for graceful degradation from day one is cheaper than retrofitting it after a production outage.

Future enhancements

Train on real anonymized university data for higher real-world accuracy
Add time-series features: GPA trend across semesters, grade improvement/decline patterns
Implement model versioning and A/B testing infrastructure
Add batch prediction for end-of-semester risk reports across the full roster
Explore gradient boosting (XGBoost) for potential accuracy improvements
Add explainability via SHAP values for per-prediction feature attribution

AI-powered Student Management System