Projects
COVID-19 Diagnostic Prediction
Objective
Predict SARS-CoV-2 positivity using clinical and biological data through a binary classification workflow.
Approach
- Data cleaning and preprocessing
- Feature engineering
- Binary classification with scikit-learn pipelines
- Hyperparameter tuning
- Recall-oriented evaluation to reduce false negatives
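The steps above can be sketched as a single scikit-learn pipeline tuned for recall. This is a minimal illustration, not the project's actual code: the synthetic data, the injected missing values, the choice of logistic regression, and the `C` grid are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the clinical/biological data (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    weights=[0.85, 0.15], random_state=0,
)

# Inject ~5% missing values so the imputation step has something to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing and model live in one pipeline, so each CV fold
# fits the imputer and scaler on its own training part only.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
])

# Hyperparameter search scored on recall rather than accuracy.
search = GridSearchCV(
    pipeline,
    param_grid={"clf__C": [0.01, 0.1, 1.0, 10.0]},  # illustrative grid
    scoring="recall",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X_train, y_train)

test_recall = recall_score(y_test, search.predict(X_test))
print("best params:", search.best_params_)
print("held-out recall:", round(test_recall, 3))
```

Scoring the search on recall alone can favor models that over-predict the positive class, so it is usually worth checking precision alongside it.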
Evaluation
- Confusion matrix
- Precision / Recall analysis across decision thresholds
- Cross-validation
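As a rough sketch of the threshold analysis listed above, the example below picks the highest decision threshold that still reaches a target recall, then reports the resulting confusion matrix. The data is synthetic and the 0.90 recall target is an assumed value chosen for illustration; in practice the threshold would be selected on a validation set rather than on the test set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (assumption).
X, y = make_classification(
    n_samples=800, n_features=10, weights=[0.85, 0.15], random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]

# precision/recall have one more entry than thresholds; the extra last
# point (recall = 0) has no threshold, so it is dropped with [:-1].
precision, recall, thresholds = precision_recall_curve(y_test, scores)

target_recall = 0.90  # assumed target, not a value from the project
meets_target = recall[:-1] >= target_recall

# Thresholds are increasing and recall is non-increasing, so the last
# threshold that meets the target is the highest admissible one.
chosen_threshold = thresholds[meets_target][-1]
precision_at_threshold = precision[:-1][meets_target][-1]

y_pred = (scores >= chosen_threshold).astype(int)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

print("threshold:", round(float(chosen_threshold), 3))
print("precision at threshold:", round(float(precision_at_threshold), 3))
print("TN, FP, FN, TP:", tn, fp, fn, tp)
```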
Key Takeaways
- Importance of recall in healthcare-related classification tasks
- Trade-off between precision and sensitivity
- Value of threshold analysis beyond standard accuracy
Illustrative Results
The final evaluation shows why model assessment must match the application context. In a medical screening setting, reducing false negatives can matter more than maximizing overall accuracy.
Selected Visuals


These visualizations illustrate the final classification behavior of the model and the threshold-dependent trade-off between precision and recall in a healthcare-oriented setting.
Customer Churn Prediction & Business Insights
Objective
Predict customer churn and provide actionable insights to support retention strategies.
Approach
- Exploratory Data Analysis (EDA)
- Feature engineering
- Classification models (baseline + optimized models)
- Model evaluation with business-oriented metrics
- Interpretability with SHAP and variable importance
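The interpretability step names SHAP. As a lighter stand-in that needs only the libraries in the stack, the sketch below ranks features with scikit-learn's permutation importance, which is a different technique from SHAP: it measures how much a score drops when one feature is shuffled. The data is synthetic, and the feature names and the random forest model are hypothetical choices made for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical feature names; the real churn dataset may differ.
feature_names = [
    "tenure_months", "monthly_charges", "support_calls",
    "contract_length", "num_products", "days_since_login",
]

# Synthetic stand-in data (assumption).
X_raw, y = make_classification(
    n_samples=1000, n_features=6, n_informative=4,
    n_redundant=0, random_state=2,
)
X = pd.DataFrame(X_raw, columns=feature_names)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=2
)

model = RandomForestClassifier(n_estimators=200, random_state=2)
model.fit(X_train, y_train)

# Shuffle each feature on held-out data and record the mean drop in F1.
result = permutation_importance(
    model, X_test, y_test, scoring="f1", n_repeats=10, random_state=2
)
importance = pd.Series(
    result.importances_mean, index=feature_names
).sort_values(ascending=False)

print(importance.round(3))
```

Permutation importance gives a global ranking only. Per-customer explanations, which SHAP provides, would need the `shap` package itself.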
Business Perspective
- Identification of high-risk customers
- Analysis of key drivers of churn
- Translation of model outputs into actionable retention strategies
Evaluation
- Precision, Recall, F1-score
- Confusion matrix
- ROC / Precision-Recall curves
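One possible way to compare a baseline against a trained model on the metrics listed above is `cross_validate` with several scorers at once. The sketch below assumes synthetic data, a random-guess `DummyClassifier` as the baseline, and logistic regression as the candidate; none of these are confirmed choices from the project.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with roughly 20% churners (assumption).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=3,
)

models = {
    # Baseline that guesses according to the class frequencies.
    "baseline": DummyClassifier(strategy="stratified", random_state=3),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

metrics = ["precision", "recall", "f1", "roc_auc", "average_precision"]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=3)

# Mean cross-validated score per model and metric.
summary = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=metrics)
    summary[name] = {m: float(scores[f"test_{m}"].mean()) for m in metrics}

for name, row in summary.items():
    print(name, {m: round(v, 3) for m, v in row.items()})
```

Here `average_precision` summarizes the Precision-Recall curve and `roc_auc` summarizes the ROC curve, so both curve types from the list are represented by a single number each.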
Key Takeaways
- Importance of interpretability in business decision-making
- Trade-off between model performance and actionability
- Value of data-driven insights for customer retention
Selected Visuals



These visuals highlight the interpretability, threshold optimization, and model comparison process used to translate churn prediction into actionable business insights.
Upcoming Projects
Credit Risk Scoring — Statistical & ML Models
- Default risk prediction
- Class imbalance handling
- Model calibration and threshold tuning
- Interpretability and regulatory perspective
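Since this project is still planned, the following is only a sketch of how the imbalance, calibration, and threshold items might fit together. It assumes synthetic data, class weighting for imbalance, sigmoid calibration via `CalibratedClassifierCV`, and made-up misclassification costs (`cost_fn`, `cost_fp`) to drive the threshold choice.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in with ~10% defaults (assumption).
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=4,
)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=4
)

# Class weighting handles imbalance but distorts predicted probabilities,
# so the model is wrapped in a calibrator fitted with internal 5-fold CV.
base = LogisticRegression(max_iter=1000, class_weight="balanced")
calibrated = CalibratedClassifierCV(base, method="sigmoid", cv=5)
calibrated.fit(X_train, y_train)

proba = calibrated.predict_proba(X_valid)[:, 1]
brier = brier_score_loss(y_valid, proba)  # lower means better calibrated

# Hypothetical costs: a missed default is 10x worse than a false alarm.
cost_fn, cost_fp = 10.0, 1.0
thresholds = np.linspace(0.05, 0.95, 19)
costs = [
    cost_fn * np.sum((proba < t) & (y_valid == 1))
    + cost_fp * np.sum((proba >= t) & (y_valid == 0))
    for t in thresholds
]
best_threshold = float(thresholds[int(np.argmin(costs))])

print("Brier score:", round(brier, 4))
print("cost-minimizing threshold:", round(best_threshold, 2))
```

The threshold is chosen on a validation split here; a separate test split would be needed for an unbiased estimate of the final cost.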
End-to-End ML Pipeline — Prediction & Monitoring
- Reproducible ML pipeline
- Training → prediction → monitoring
- Performance tracking
- Data drift detection (planned)
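Data drift detection is listed as planned, so no method is specified. One common option is the Population Stability Index (PSI), sketched below with NumPy only. The helper name `population_stability_index`, the 10-bin setting, and the simulated data are assumptions for illustration.

```python
import numpy as np


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one feature."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)

    # Interior bin edges from reference quantiles; the outer bins are
    # open-ended, so out-of-range current values are still counted.
    interior = np.quantile(reference, np.linspace(0, 1, n_bins + 1))[1:-1]

    ref_bins = np.searchsorted(interior, reference, side="right")
    cur_bins = np.searchsorted(interior, current, side="right")

    ref_frac = np.bincount(ref_bins, minlength=n_bins) / len(reference)
    cur_frac = np.bincount(cur_bins, minlength=n_bins) / len(current)

    # Clip to avoid log(0) when a bin is empty.
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)

    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))


rng = np.random.default_rng(5)
training_feature = rng.normal(0.0, 1.0, size=5000)   # reference window
stable_feature = rng.normal(0.0, 1.0, size=5000)     # same distribution
shifted_feature = rng.normal(1.0, 1.0, size=5000)    # mean shifted by 1

psi_stable = population_stability_index(training_feature, stable_feature)
psi_shifted = population_stability_index(training_feature, shifted_feature)

print("PSI (stable):", round(psi_stable, 4))
print("PSI (shifted):", round(psi_shifted, 4))
```

A frequently quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, though these cutoffs are conventions rather than statistical guarantees.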
Technical Stack Across Projects
Languages & Tools
Python, Git, Conda
Libraries
pandas, NumPy, scikit-learn, matplotlib, seaborn
Core Skills
- Data preprocessing and feature engineering
- Model training and evaluation
- Pipeline design
- Cross-validation and tuning
- Model interpretation