Dipanwita Das

M.Sc. Data Science Student | Analytics & Forecasting Specialist

LinkedIn | GitHub | Email | Phone

[email protected]

+91 8910916460

Kolkata, IN

About

M.Sc. Data Science student with hands-on experience in regression, forecasting, and large-scale survey analysis. Uses Python, R, and SQL to build ETL pipelines, develop predictive models, and produce data-driven insights. Seeking an analytics or forecasting role to apply statistical modeling and machine learning to business problems.

Work Experience

Analytics Intern

SoftOfficePro

Feb 2025 - May 2025

Kolkata, West Bengal, IN

Led survey analytics and forecasting work, building ETL pipelines and predictive models to support data processing and outreach planning.

  • Engineered a Python-based ETL pipeline to process over 120K ODK survey records, automating multi-select variable handling and generating labeled datasets, which reduced manual effort by 57%.
  • Developed SPSS-compatible outputs and automated reports, including a completeness dashboard, to monitor data quality and streamline reporting processes.
  • Designed and implemented regression models to forecast respondent turnout, informing targeted outreach planning.
  • Built and deployed an end-to-end ETL pipeline for a large-scale census project that cleaned and processed field data, automatically flagged duplicates, and assigned records for quality assurance.
  • Authored ODK XLSForm logic in Excel, including skip logic, validations, and calculations, to improve data collection efficiency and accuracy.

Education

M.Sc. Data Science

Christ University, Pune Lavasa Campus

8.44/10

Aug 2023 - Jun 2025

Pune, Maharashtra, IN

Statistics

University of Calcutta

7.856/10

Mar 2020 - Jun 2023

Kolkata, West Bengal, IN

Projects

Forecasting U.S. Energy Production (Time Series Modeling)

Jan 2024 - Jun 2024

Developed and evaluated advanced time series models to predict U.S. energy production, focusing on accuracy and model tuning.

SDG Goal 4: Power BI Dashboard for Education Insights

Jan 2024 - Jun 2024

Designed and implemented an interactive Power BI dashboard to visualize global education metrics, enabling data-driven insights.

Retail Sales Prediction (Regression)

Sep 2023 - Dec 2023

Developed a regression model to predict retail store sales based on various influencing factors, achieving high accuracy.

Skills

Statistical Analysis & Modeling

  • Regression
  • Hypothesis Testing
  • Forecasting
  • Exploratory Data Analysis (EDA)
  • Residual Diagnostics

Machine Learning

  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • Random Forest
  • SARIMA
  • Holt-Winters
  • LSTM

Programming Languages

  • Python
  • R
  • SQL

Data Visualization

  • Matplotlib
  • Seaborn
  • Power BI

Soft Skills

  • Analytical Thinking
  • Problem Solving