Rahul Vishnoi

Brussels

Summary

Experienced in stakeholder collaboration and proficient in translating product requirements for technical teams. Expert in big data technologies with a proven track record of architecting and implementing solutions involving data pipelines, data lakes, and data warehouses. Skilled in advanced infrastructure setup for large-scale data workloads, managing petabytes of data. Demonstrates strong capabilities in real-time and batch processing frameworks, exploratory data analysis, data governance, data security, and data modeling.

Overview

10 years of professional experience

Work History

Sr. Manager - Data Engineering

Janio Asia
Brussels
12.2020 - Current

Janio.Asia Customer 360 Development

  • Led the "One Customer View" project at Janio.Asia, aimed at consolidating data from multiple business units to create a unified customer database.
  • Managed data synchronization from various sources, including MySQL, PostgreSQL, SFTP, Kafka, and third-party APIs, ensuring high scalability and seamless integration.
  • Reduced operational costs and maintenance overhead, enhancing overall organizational productivity.
  • Role involved full system architecture design, EC2 server installations, Superset and Redshift cluster setups, and comprehensive system testing.
  • Technologies: Spark Scala, Debezium, Databricks, Athena, Redshift, S3, EC2, Spark Streaming Kafka, Airflow, Superset, Metabase.

Databricks Infrastructure Setup and Management

  • Configured and managed scalable Databricks clusters, integrated seamlessly with existing IT infrastructure to support sophisticated data processing and analysis.

Data Warehouse Construction Using Delta Lake

  • Initiated and led the development of a state-of-the-art data warehouse using the Delta Lake format to improve data reliability, quality, and accessibility.
  • Designed the warehouse to support both batch and real-time data processing for more dynamic and actionable analytics.
  • Utilized Delta Lake's ACID transaction features to ensure data integrity across large datasets and multiple processing tasks.
  • Enhanced data ingestion, storage, and retrieval processes by integrating the warehouse with the existing data ecosystem.
  • Technologies: Delta Lake, Databricks, Spark, Redshift, S3, Airflow.

Data Engineering Manager

Paisabazaar
Gurgaon
10.2018 - 12.2020

Paisabazaar.com Analytics Engine

  • Led the development of an Analytics Engine to enhance real-time analytics and reporting across multiple channels at Paisabazaar.
  • Integrated Debezium CDC and established a data lakehouse architecture to support scalable analytics and real-time data synchronization.
  • Achievements include enhanced order tracking, improved conversion rates, and more effective competitor analysis.
  • Technologies: Hadoop, MapReduce, Druid, ORC, Zeppelin, Spark, Scala, Debezium CDC, Hortonworks cluster installation.

Bureau Reports Query Optimization

  • Directed the optimization of bureau report queries, improving data retrieval efficiency and report speed.
  • Developed and implemented query optimization techniques and indexing strategies to enhance system performance.
  • Resulted in a faster, more efficient reporting system aiding timely decision-making.

Advantages:

  • Real-time tracking of customers' orders.
  • Increased conversion rates through better identification of successful sales transactions.
  • An enhanced inventory management system.
  • Easy comparison of competitors' offers.
  • Consumer behaviour learning and retargeting.

Staff Engineer

Shopclues
Gurgaon
12.2015 - 10.2018

Real-Time Data Processing System

  • Developed and optimized a real-time analytics system using Apache Hadoop, HBase, Kafka, and Open Replicator.
  • Role: System design, streaming implementation, performance optimization.
  • Technologies: Hadoop, HBase, Kafka, Aerospike, Java.

Email Personalization System

  • Built a system for personalized email and push notifications based on user behavior, enhancing engagement and conversion rates.
  • Role: Architecture design, coding, testing, team management.

Advantages:

  • Makes conversion easier
  • Fewer emails yield more customers
  • Builds a passionate audience

Education

Master of Computer Applications - Computer Science

IIT BHU
Varanasi
06.2013

Skills

  • AWS
  • GCP
  • Redshift
  • MySQL
  • PostgreSQL
  • Athena
  • DMS
  • Scala
  • Spark
  • Kafka
  • Databricks
  • Debezium
  • Hadoop
  • Hive
  • HBase
  • Apache Ranger
  • Apache Superset
  • Metabase
  • Airflow
  • Data architecture
  • Real-time analytics
  • Cloud infrastructure
  • Data warehousing
  • ETL processes
  • Database management
  • Team leadership
  • Interpersonal skills
  • Agile project management
  • Cross-functional collaboration
  • Strategic planning
  • Problem solving
  • Project management
  • Reporting management
  • Data analytics
  • Data-driven decision making
  • Documentation and reporting
  • Multitasking abilities
  • Analytical skills
  • System architecture
  • Data integration
  • Data governance
  • Data pipeline development
  • Big data processing
  • Stakeholder management

References

  • Jagmal Singh, ex-CTO of Janio Asia, Paisabazaar

Timeline

Sr. Manager - Data Engineering

Janio Asia
12.2020 - Current

Data Engineering Manager

Paisabazaar, Gurgaon
10.2018 - 12.2020

Staff Engineer

Shopclues, Gurgaon
12.2015 - 10.2018

Master of Computer Applications - Computer Science

IIT BHU