Hi, I'm Khawaja

Data Analyst Intern @ Impiricus

I use code, queries and curiosity to turn raw data into real impact. Recently graduated from DePauw University with a BA in Computer Science with a concentration in Data Science.

Khawaja Ahmed
$150K+

Identified Revenue

Through rigorous data analysis in previous internship

3

Internships

Stealth AI/Tenzer Technology/DePauw University

15+

Data Science Projects

Links below

6x

Dean's List

Semester GPA > 3.50

Education

BA in Computer Science

DePauw University, May 2025

Minor in Data Science

DePauw University, May 2025

Skills Highlight

Data Engineering

ETL, Data Pipelines, Azure

Data Science

ML Models, Analytics, Visualization

Programming

Python, SQL, PySpark, JavaScript

Tools

Git, Docker, Jupyter, Power BI

Work Experience

Data Engineering Intern

Stealth AI

Dec 2024 - Feb 2025

San Francisco

  • Architected an Azure data pipeline processing 10K+ financial transaction records daily, reducing KPI computation time by 40% and identifying $150K+ in recoverable revenue.
  • Established PySpark validation frameworks that flagged 5K+ monthly anomalies, improving downstream analytics reliability by 35% for the finance team.
  • Optimized cloud infrastructure with incremental ETL and dynamic scaling, cutting cloud costs by 20% while maintaining 99.9% SLA for 50K+ daily patient records.
  • Resolved cross-hospital data inconsistencies using SCD2 dimensional modeling, increasing reporting accuracy by 28% and presenting actionable insights to CEO/CTO.
Azure, PySpark, ETL, Data Pipelines, Healthcare Analytics

Data Analysis & AI Intern

Tenzer Technology Center

Feb 2024 - May 2024

Chicago

  • Spearheaded the development of a reinforcement learning system for financial transaction forecasting that outperformed baseline models by ~11%, using the RAGAS framework to evaluate the LLM.
  • Designed A/B testing frameworks for user experience optimization, resulting in a 22% improvement in click-through rate and a ~6% decrease in bounce rate.
  • Implemented CI/CD pipeline with automated testing that reduced deployment time from days to hours, enabling rapid iteration across 3 development cycles.
  • Delivered critical production fixes that resolved performance bottlenecks, accelerating system response time by 15% under peak loads.
Python, Reinforcement Learning, CI/CD, Model Optimization, Performance Tuning

Data Science Intern

DePauw University

Oct 2023 - Jan 2024

Greencastle

  • Built an automated sports analytics platform using Python and pandas that integrated 5+ data sources, delivering insights that improved team performance by 23% across 12+ competitions.
  • Designed custom Git-versioned ETL & EDA workflows processing 300+ player statistics daily, reducing analysis latency by 80%.
  • Developed statistical research for predictive models with 85% accuracy that identified 7 high-potential recruits.
Python, pandas, Sports Analytics, ETL, Statistical Modeling

My Projects

Here are some of my recent projects that showcase my skills in Data Engineering, Data Science, and Analytics.

Transportation Time Series Data Dashboard

Developed forecasting models that analyze transit patterns to predict future demand, and translated the statistical findings into actionable recommendations for optimizing resource allocation and reducing operational costs.

Time Series Analysis, Stakeholder Reports, Python/R, Data Visualization, Machine Learning, Statistical Forecasting, Predictive Modeling, ARIMA/Prophet, Data Cleaning, Business Intelligence

Healthcare Revenue Cycle Management Optimization

Conducted statistical analysis to optimize the revenue cycle management pipeline, identifying $2.4M in potential recapture opportunities and delivering actionable recommendations that reduced claim denials by 22% while improving the overall financial metrics of the healthcare facility.

Machine Learning, ETL, EDA, Statistical Analysis, Hypothesis Testing, Operational Optimization, Data Visualization, Python, SQL, Data Analysis

Lyft Driver Supply Optimization

Built an interactive platform using real-time ride-hailing data (92M+ trips) to forecast 30-minute demand with 87% accuracy, uncovering $3.2M/month in missed revenue. ML-driven strategies improved driver-rider matching, boosting market balance by 9%, cutting incentive spend by 4%, reducing wait times by up to 10%, and increasing driver earnings by 14%.

Predictive Modeling, Geospatial Analysis, Time Series Forecasting, Supply-Demand Gap, ML, A/B Testing, ROI Analysis, KPI Development

Agtech Agricultural Pipeline

Engineered a scalable ETL pipeline that ingested and normalized IoT sensor data from 200+ field devices using Python, SQL, and Apache Airflow, enabling real-time crop yield predictions.

Python, pandas, NumPy, SQL, Tableau, Data Visualization, EDA, ETL, Data Analysis, API Integration

Statistical Analysis of Graduate Debt

Conducted in-depth statistical analysis comparing average debt of graduates from small private universities with large private universities using R/RStudio, revealing significant patterns in student loan burdens.

R, Hypothesis Testing, Statistical Analysis, Data Visualization, Summary Statistics

Reinforcement Learning Game

Developed a game with an AI opponent that progressively improves using reinforcement learning. Players take turns removing 1-4 sticks from a pile, with the player who takes the last stick winning. The AI agent learns optimal strategies through gameplay.

Artificial Intelligence, Python, Game Development, Q-Learning
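
To illustrate how an agent can learn this stick game, here is a minimal tabular Q-learning sketch trained by self-play. It is an illustration under stated assumptions, not the project's actual code: the function names (`train`, `best_move`, `legal_moves`), the pile size of 21, the reward scheme (+1 for taking the last stick), and the hyperparameters are all hypothetical choices. Because the two players alternate, the value of a move is taken as the negative of the opponent's best value in the resulting position.

```python
import random

MAX_TAKE = 4  # players remove 1-4 sticks per turn, as in the project description


def legal_moves(sticks):
    """Moves available with `sticks` left: take 1..4, never more than remain."""
    return range(1, min(MAX_TAKE, sticks) + 1)


def train(pile_size=21, episodes=20000, alpha=0.5, epsilon=0.2, seed=0):
    """Learn Q[(sticks, take)] by self-play (all defaults are assumed values)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(1, pile_size + 1) for a in legal_moves(s)}
    for _ in range(episodes):
        # Random starting pile so that every pile size gets visited.
        sticks = rng.randint(1, pile_size)
        while sticks > 0:
            moves = list(legal_moves(sticks))
            if rng.random() < epsilon:
                action = rng.choice(moves)  # explore
            else:
                action = max(moves, key=lambda a: q[(sticks, a)])  # exploit
            remaining = sticks - action
            if remaining == 0:
                target = 1.0  # taking the last stick wins
            else:
                # The opponent moves next, so their best outcome is our loss.
                target = -max(q[(remaining, a)] for a in legal_moves(remaining))
            q[(sticks, action)] += alpha * (target - q[(sticks, action)])
            sticks = remaining
    return q


def best_move(q, sticks):
    """Greedy move for the current pile according to the learned table."""
    return max(legal_moves(sticks), key=lambda a: q[(sticks, a)])


if __name__ == "__main__":
    table = train()
    for pile in (6, 7, 13, 21):
        print(pile, "sticks -> take", best_move(table, pile))
```

For this rule set the known winning strategy is to leave the opponent a multiple of five sticks, so a trained table can be checked against `sticks % 5` for any pile that is not already a multiple of five.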

My Skills

Data Engineering

SQL, Azure Data Factory, Databricks, ADLS Gen2, Apache Airflow, ETL Pipelines, Data Modeling, Data Governance

Data Science

Python (pandas, NumPy, scikit-learn, matplotlib), R/RStudio (ggplot2, tidyverse), Statistical Modeling, Hypothesis Testing

Machine Learning

Reinforcement Learning, Regression Analysis, NLP

Programming

Python, Java, C++, Swift, C#, Assembly

DevOps & Tools

Git/GitHub, Linux, Azure Cloud, CI/CD, Excel, Tableau

Get in Touch

Have a question or want to work together? Feel free to contact me!

Contact Information

I'm currently seeking full-time roles in Data Engineering, Data Science, or Analytics where I can continue learning and building impactful systems. Whether you have a question or just want to say hi, I'll try my best to get back to you!

Email

ahmed.khawaja.hussain@gmail.com

Location

New Jersey

Phone

(702) 348-9820

Connect With Me

Let's Connect