Pooja Choudhary

Berlin

Summary

ML Engineer with 7+ years building production-ready data systems, specialized in developing machine learning solutions backed by robust data infrastructure. Expert in deploying ML models that scale to 5M+ daily predictions through a deep understanding of data pipelines, MLOps, and production systems. Proven track record of translating complex business requirements into scalable ML applications across finance, healthcare, and marketing domains.

Overview

8 years of professional experience
1 Certification

Work History

Senior Associate Data & ML Engineer

Publicis Sapient
08.2023 - Current

Production ML Model Development

  • Developed and deployed lead scoring ML system using Random Forest algorithm, processing 2M+ customer interactions daily with 92% accuracy, improving sales conversion by 15% and generating measurable revenue impact
  • Built campaign optimization ML engine using XGBoost ensemble models with real-time inference capability, delivering 12% improvement in marketing ROI through intelligent budget allocation
  • Implemented customer segmentation ML pipeline using clustering algorithms on behavioral data from 1M+ users, enabling personalized marketing campaigns with 18% increase in engagement rates

NLP & Conversational AI

  • Architected NLP-based customer service bot using BERT fine-tuning and intent classification, achieving 89% accuracy on customer queries and reducing support workload by 25%
  • Developed semantic search system using sentence transformers and vector databases, improving content discovery relevance by 35% over keyword-based search

MLOps & Infrastructure

  • Established end-to-end MLOps pipeline with automated model training, validation, and deployment using MLflow and Databricks, reducing model deployment time from weeks to hours
  • Built model monitoring framework with drift detection and performance tracking, maintaining production model reliability with 99.5% uptime SLA
  • Mentored junior engineers on ML pipeline design and model evaluation best practices

Cross-Functional Collaboration: Led technical discussions with product and business stakeholders, translating ML capabilities into actionable business strategies

Knowledge Sharing: Created internal ML engineering documentation and conducted training sessions for 15+ team members on production ML practices

Data Engineer

Wefox
02.2023 - 05.2023

ML-Focused Data Solutions

  • Engineered Customer360 ML platform using Python, Airflow, and Spark, integrating 15+ insurance data sources for ML model training, enabling 60% faster feature engineering for predictive models
  • Built real-time data APIs with FastAPI and AWS Lambda serving ML applications, achieving
  • Implemented data quality frameworks specifically designed for ML workflows, ensuring model training data reliability with automated validation and monitoring

Senior Consultant - Data & Analytics

KPMG India
01.2022 - 01.2023

Enterprise ML Infrastructure

  • Designed cloud-native ML platforms on Databricks for Fortune 500 clients, building scalable infrastructure supporting ML workloads processing 100TB+ enterprise data
  • Developed automated feature engineering pipelines creating ML-ready datasets from complex multi-source data, reducing model development time by 40%
  • Built streaming ML infrastructure with Kafka processing 5M+ events/day, enabling real-time model inference for fraud detection and risk analytics

Data Analyst

Optum (UnitedHealth Group)
05.2020 - 01.2022

Healthcare Predictive Analytics

  • Built predictive analytics models for healthcare claims processing using PySpark and statistical modeling, analyzing 50M+ records to identify care optimization opportunities
  • Developed automated anomaly detection systems for claims data, reducing manual review workload by 35% while maintaining 94% accuracy in fraud identification
  • Applied advanced statistical analysis on large healthcare datasets, providing insights that informed clinical decision-making processes

Data Engineer

Accenture
01.2018 - 04.2020

ML Data Pipeline Development

  • Designed ML-ready data pipelines processing enterprise data across multiple domains, implementing feature stores and validation frameworks supporting 20+ production models
  • Collaborated with data science teams on model deployment infrastructure, building robust ETL workflows that ensured model training data quality and consistency

Education

Bachelor of Technology - Computer Science & Engineering

Maharaja Surajmal Institute of Technology
Delhi
06.2017

Diploma - Computer Engineering

Kasturba Polytechnic For Women
Delhi
06.2014

Skills

Machine Learning & AI

  • Algorithms: XGBoost, Random Forest, Logistic Regression, KNN Clustering
  • Deep Learning: TensorFlow, PyTorch
  • NLP: Hugging Face Transformers, BERT, GPT integration, Text Classification
  • Model Evaluation: Cross-validation, A/B testing, Performance optimization

MLOps & Production ML

  • Model Management: MLflow, Model versioning, Experiment tracking
  • Deployment: Docker, Kubernetes, FastAPI, Real-time inference systems
  • Monitoring: Model drift detection, Performance tracking, Automated retraining
  • CI/CD: GitHub Actions, Automated ML pipelines

Data Engineering & Infrastructure

  • Big Data: Apache Spark, PySpark, Kafka, Azure Data Factory, Airflow, Snowflake
  • Cloud ML: Databricks ML, Azure ML, AWS SageMaker, Model serving at scale
  • Feature Engineering: Advanced transformations, Real-time feature stores
  • Data Quality: Validation frameworks, Anomaly detection, ML-ready datasets

Programming & Tools

  • Languages: Python, SQL
  • ML Libraries: Scikit-learn, Pandas, NumPy, Matplotlib, Seaborn, Streamlit
  • Infrastructure: Docker, Kubernetes, Git, REST APIs

Certification

  • Databricks Certified Machine Learning Professional
  • Machine Learning Specialization (DeepLearning.AI)
  • Advanced Machine Learning Operations by Databricks
  • Salesforce Certified AI Associate
  • Databricks Certified Associate Data Engineer
  • Databricks Certified Spark 3.0 Associate Developer
  • Databricks Lakehouse Accreditation

Languages

English
Advanced (C1)
German
Intermediate (B1)
