About Me

Currently working as a Staff Data Engineer at CVS Health, leveraging cloud technologies to deliver an awesome Retail Pharmacy experience 😎

Let's talk about my past:

During my time at Lyft as a Software Engineer, I used cloud and database tools such as AWS and Airflow to develop data pipelines in Python and SQL. My day to day combined working with cross-functional partners, gathering stakeholder requirements, writing solid technical documents, shaping strategies, proposing solutions, coding, deploying, monitoring, and enhancing systems.

I love volunteering and helping others out. Recently we organized HackNYU, the official hackathon of New York University; I was part of the organizing committee, which managed 400+ participants. I am also a member of IEEE. During my undergrad we hosted events, talks, and seminars at the university level, and I served as Chairperson of Nirma University’s IEEE Student Branch in 2018 and as Treasurer in 2017.

I am always open to opportunities that resonate with my interests and let me contribute to the bigger picture. If you could use a curious and enthusiastic mind like mine, I would love to connect.


Technologies I am skilled in:
Languages & Frameworks:
 Python, SQL, Java, C++, HTML, CSS, JavaScript, React.JS
Platforms:
 AWS, Azure, GCP, Snowflake, Teradata, DOMO
Databases & Query Engines:
 Trino, Spark, Hive, MongoDB, PostgreSQL, MySQL, T-SQL
Tools:
 Airflow, Kubeflow, Kubernetes, Docker, Kafka, Jenkins, Flask, FastAPI, Dash, Streamlit, Splunk,
 Grafana, Datadog, Tableau, Mode, Power BI, PagerDuty, Git, REST APIs, Postman, SSIS, SSMS,
 Jupyter Notebook, JIRA, Rally

Work Experience

CVS Health
Staff Data Engineer - Retail Patient Safety Team

January 2023 – Present
Technology: Python, SQL, Azure, GCP, Kubernetes, Kubeflow, Airflow, Snowflake

  • Ran agile scrum calls and maintained status reports and the team backlog to achieve OKRs for retail data engineering initiatives.
  • Led architecture reviews and alignment discussions with cross-functional teams to help set timelines for deliverables.
  • Mentored 6+ junior data engineers in data engineering best practices and offered technical guidance on projects.
  • Spearheaded development of a common framework now used by 4+ teams across the data organization for a faster SDLC.
  • Built interactive data apps using Streamlit and Dash to improve data accessibility and surface visual insights.
  • Architected automation workflows with Flask API endpoints for real-time interactions with generative AI LLM models (see the sketch after this list).
  • Designed an MLOps development process for 14+ data scientists to develop and productionize ML models as Kubeflow notebooks.
  • Developed pipelines to process incremental updates on 150M+ user records in MongoDB to support live microservices.
  • Designed a near-real-time pipeline to capture 20M+ daily IoT-device events on top of 10B+ historical events.
  • Led Hadoop edge node and Teradata deprecation as part of a cost-saving initiative to migrate data to Snowflake.
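
To give a flavor of the Flask-based LLM automation mentioned above, here is a minimal sketch of an endpoint brokering a real-time model call. The `call_llm` helper and the `/v1/generate` route are hypothetical placeholders, not the production client or routing.

    # A minimal sketch of a Flask endpoint brokering a real-time LLM call.
    # `call_llm` and the `/v1/generate` route are hypothetical placeholders.
    from flask import Flask, jsonify, request

    app = Flask(__name__)


    def call_llm(prompt: str) -> str:
        # Placeholder: swap in the actual Generative AI SDK call here.
        return f"(model response for: {prompt})"


    @app.route("/v1/generate", methods=["POST"])
    def generate():
        payload = request.get_json(silent=True) or {}
        prompt = payload.get("prompt", "")
        if not prompt:
            return jsonify({"error": "prompt is required"}), 400
        return jsonify({"completion": call_llm(prompt)})


    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)
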

Lyft
Software Engineer - Transit, Bikes & Scooters Team

July 2021 – December 2022
Technology: Python, SQL, AWS, Airflow, Trino, Spark, Hive

  • Redesigned the ETL process to incorporate new event and snapshot data sources for the hardware repair dataset.
  • Led a cost-saving migration of 80+ Airflow DAGs from Hive to Spark.
  • Developed datasets to track SIM card activity with cross-functional teams, saving $1M+ in subscription charges.
  • Enhanced monitoring utilities by adding Airflow log links to PagerDuty alerts to reduce troubleshooting time during on-call (see the sketch after this list).
  • Contributed new features to a custom ETL framework that helps data scientists build lightweight ETL pipelines easily.
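
As a rough illustration of the PagerDuty log-link enhancement above, a failure callback along these lines can attach the Airflow task's log URL to the alert. The `send_to_pagerduty` helper is a hypothetical stand-in for the actual PagerDuty Events API integration.

    # Sketch of an Airflow on_failure_callback that attaches the task log link
    # to the alert, so on-call can jump straight to the failing run.
    # `send_to_pagerduty` is a hypothetical stand-in for the Events API client.
    import json


    def send_to_pagerduty(summary: str, details: dict) -> None:
        # Placeholder: replace with the real PagerDuty Events API v2 call.
        print(json.dumps({"summary": summary, "custom_details": details}))


    def notify_on_failure(context: dict) -> None:
        ti = context["task_instance"]
        send_to_pagerduty(
            summary=f"{ti.dag_id}.{ti.task_id} failed (run {ti.run_id})",
            details={"log_url": ti.log_url},
        )


    # Wire it up per task or via default_args:
    # default_args = {"on_failure_callback": notify_on_failure}
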

HBO WarnerMedia
Data Engineer Intern - HBO Max Team

January 2021 – April 2021
Technology: Python, SQL, AWS, Airflow, Snowflake, Datadog

  • Integrated a data quality framework with Slack to provide configuration-driven alert messaging for data pipelines.
  • Designed a monitoring tool for Lambda function failures using CloudWatch metrics that sends alerts to Slack via SNS topics (see the sketch after this list).
  • Implemented a POC for alerting and anomaly detection on critical metrics using Datadog for production releases.
  • Supported an ETL process using Lambda functions to snapshot Looker dashboard data into Snowflake.
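
For context on the Lambda failure monitoring above, a CloudWatch alarm on the built-in AWS/Lambda Errors metric can be pointed at an SNS topic that forwards to Slack. The function name and topic ARN below are hypothetical examples, not the actual resources.

    # Sketch: CloudWatch alarm on Lambda errors, notifying an SNS topic that
    # forwards to Slack. Function name and topic ARN are hypothetical.
    import boto3

    cloudwatch = boto3.client("cloudwatch")

    cloudwatch.put_metric_alarm(
        AlarmName="etl-loader-errors",
        Namespace="AWS/Lambda",
        MetricName="Errors",
        Dimensions=[{"Name": "FunctionName", "Value": "etl-loader"}],
        Statistic="Sum",
        Period=300,                 # evaluate over 5-minute windows
        EvaluationPeriods=1,
        Threshold=0,
        ComparisonOperator="GreaterThanThreshold",
        TreatMissingData="notBreaching",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:lambda-failure-alerts"],
    )
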

NYU IT
Software Engineer Intern - Enterprise Data Management Team

March 2020 – May 2021
Technology: Python, Flask, React.JS, AWS, Snowflake

  • Created Flask API endpoints for an API request web portal that delivers information about various databases.
  • Designed a data pipeline to migrate data from an S3 bucket to the Snowflake data warehouse using Lambda and CloudWatch.
  • Developed configuration-driven CDC methods to capture incremental data from upstream data sources (see the sketch after this list).
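
As a rough sketch of what configuration-driven CDC can look like, the snippet below builds an incremental extract query from a per-table config with a watermark column; the table names and config shape are hypothetical.

    # Sketch of configuration-driven CDC: each source table declares a watermark
    # column, and the extract query only pulls rows newer than the last value.
    # Table names and config shape are hypothetical.
    CDC_CONFIG = {
        "orders": {"watermark_column": "updated_at"},
        "customers": {"watermark_column": "modified_ts"},
    }


    def build_incremental_query(table: str, last_watermark: str) -> str:
        col = CDC_CONFIG[table]["watermark_column"]
        return (
            f"SELECT * FROM {table} "
            f"WHERE {col} > '{last_watermark}' "
            f"ORDER BY {col}"
        )


    # Example: build_incremental_query("orders", "2021-05-01 00:00:00")
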

HBO WarnerMedia
Cloud Technology Intern - Data Analytics and Platform Support Team

June 2020 – August 2020
Technology: Python, SQL, AWS, Airflow, PostgreSQL, Grafana

  • Designed a custom Grafana dashboard to monitor daily workloads and provide insights on resource utilization.
  • Developed a rule-based alert system to notify the data engineering and data science teams about 100+ critical data sources.
  • Integrated PostgreSQL queries against the Airflow metadata database to report DAG execution run times and job failures (see the sketch after this list).
  • Analyzed CloudWatch logs to monitor Step Functions and Glue jobs for the data science and data engineering teams.
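
The DAG run-time reporting above worked against the Airflow metadata database; a query along these lines (executed here with psycopg2, connection details hypothetical) pulls recent run durations and states. Column names can vary slightly across Airflow versions.

    # Sketch: query the Airflow metadata DB (PostgreSQL) for recent DAG run
    # durations and states. Connection details are hypothetical; the schema
    # may differ slightly across Airflow versions.
    import psycopg2

    QUERY = """
        SELECT dag_id,
               execution_date,
               EXTRACT(EPOCH FROM (end_date - start_date)) AS runtime_seconds,
               state
        FROM dag_run
        WHERE start_date >= NOW() - INTERVAL '1 day'
        ORDER BY runtime_seconds DESC NULLS LAST;
    """

    with psycopg2.connect("dbname=airflow user=airflow host=localhost") as conn:
        with conn.cursor() as cur:
            cur.execute(QUERY)
            for dag_id, execution_date, runtime, state in cur.fetchall():
                print(dag_id, execution_date, runtime, state)
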

MAQ Software
Software Engineer Intern

January 2019 – May 2019
Technology: Python, SQL, SSIS, SSMS, Power BI

  • Developed data pipelines to handle transactional sales data by integrating 15+ source tables.
  • Designed a metrics and KPI dashboard in Power BI to give data insights to the client and senior management.

Skills & Proficiency

Python, SQL, Kubernetes, Docker, AWS, Azure, GCP, HTML, CSS, JavaScript, C++, Java