❄️ Snowflake Developer & Data Engineer

Manisha Sankalamaddi

Snowflake SnowPro Core Certified data professional with 3+ years of experience designing data pipelines, building cloud-based data warehouse solutions, and developing advanced SQL transformations for scalable analytics environments across Snowflake, AWS, and big data platforms.


Projects

Taxi HDFS to Snowflake Pipeline

Built an end-to-end NYC Taxi data pipeline using HDFS, PySpark, Snowflake, S3, and Streamlit. Ingested raw trip data, transformed it into analytics-ready datasets, loaded reference data from S3, and created reporting views.

HDFS PySpark Snowflake S3 Streamlit
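The transformation step above (cleaning raw trip records and rolling them up into analytics-ready datasets) can be sketched in plain Python; this is a stdlib stand-in for the PySpark logic, and the column names (`pickup_zone`, `fare_amount`) are illustrative, not the project's exact schema.

```python
import csv
import io
from collections import defaultdict

def summarize_trips(raw_csv: str) -> dict:
    """Aggregate cleaned taxi trips into per-zone fare totals.

    Plain-Python stand-in for the PySpark transformation step:
    drop malformed rows, apply a business rule, then aggregate.
    Column names (pickup_zone, fare_amount) are illustrative.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(raw_csv)):
        try:
            fare = float(row["fare_amount"])
        except (KeyError, ValueError):
            continue  # skip malformed records
        if fare <= 0:
            continue  # business rule: discard non-positive fares
        totals[row["pickup_zone"]] += fare
    return dict(totals)
```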

AWS S3 to Snowflake Products Pipeline

Built an AWS S3 to Snowflake data pipeline for product data ingestion, transformation, and task-based automation, using storage integrations, external stages, raw staging tables, curated analytics tables, and scheduled refresh tasks.

AWS S3 Snowflake External Stages Tasks

GitHub Events Snowflake Pipeline

A free-access GitHub Events to Snowflake pipeline built with Python and Snowflake. It fetches GitHub event data through authenticated REST API calls, lands it as NDJSON, loads it into RAW tables, and transforms it into CURATED tables.

Python Snowflake REST API NDJSON
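The NDJSON landing step described above can be sketched as follows; each fetched event becomes one JSON line, trimmed to the fields the RAW table loads. The field list here is an assumption, not the project's exact schema.

```python
import json

def to_ndjson(events: list, keep: tuple = ("id", "type", "created_at")) -> str:
    """Serialize GitHub event payloads to newline-delimited JSON.

    Sketch of the landing step: one JSON object per line, trimmed
    to a fixed field list (illustrative, not the exact schema).
    """
    lines = []
    for event in events:
        trimmed = {k: event.get(k) for k in keep}
        lines.append(json.dumps(trimmed, sort_keys=True))
    return "\n".join(lines) + "\n"
```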

Streamlit Cortex Dashboard

Interactive Streamlit sales analytics dashboard built with Python and Snowflake, enhanced using Cortex Code CLI for prompt-driven code updates, UI improvements, and development acceleration.

Streamlit Snowflake Cortex Python

Technical Skills

❄️

Snowflake

Snowflake Cloud Data Platform, Snowflake SQL, Stored Procedures, Streams, Tasks, External Stages, Zero-Copy Cloning, Incremental Loads, Query Optimization, Performance Tuning

⚙️

Data Engineering & Warehousing

ETL/ELT, Data Migration, Data Transformation, Data Modeling, Dimensional Modeling, Star Schema, Data Warehousing, Data Validation, Data Reconciliation, Data Quality

☁️

Cloud Platforms & Big Data

AWS S3, AWS Glue, Amazon Redshift, Amazon Athena, GCP BigQuery, Hadoop, HDFS, Apache Spark, PySpark, Spark SQL

💻

Programming, BI & DevOps

SQL, Python, Pandas, NumPy, Power BI, Tableau, KPI Reporting, Dashboard Development, Azure DevOps, Git, CI/CD, Docker, Kubernetes

Professional Experience

Data Analyst – Snowflake Developer — Integrated Proteins

Oct 2025 – Present

  • Designed and developed Snowflake-based ELT pipelines to migrate operational data from source systems into external stages, staging tables, and curated warehouse tables
  • Built and maintained advanced Snowflake SQL stored procedures for data ingestion, transformation, business-rule processing, and recurring load execution
  • Implemented incremental load workflows using Snowflake Streams and Tasks, enabling efficient processing of newly added and changed records
  • Automated recurring data workflows through Snowflake Tasks with CRON scheduling for timely delivery of refreshed datasets
  • Created reporting-ready Snowflake views and transformed data layers used as the backend for Power BI dashboards
  • Used Zero-Copy Cloning for backup, testing, and environment replication, preserving production data integrity
  • Performed data validation, reconciliation, deduplication, and query optimization in Snowflake
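The Streams-and-Tasks incremental pattern above boils down to MERGE semantics: each change record from the stream carries a key, an action, and the latest row image. A minimal Python sketch of that apply step, with a hypothetical record shape (`key`, `action`, `row`):

```python
def apply_changes(target: dict, changes: list) -> dict:
    """Apply a batch of change records to a target table.

    Illustrates the MERGE semantics behind a Streams-and-Tasks
    incremental load: deletes remove the key, inserts and updates
    both upsert the latest row image. The record shape
    ({'key', 'action', 'row'}) is illustrative only.
    """
    merged = dict(target)  # leave the input untouched
    for change in changes:
        if change["action"] == "DELETE":
            merged.pop(change["key"], None)
        else:  # INSERT or UPDATE both upsert the latest row image
            merged[change["key"]] = change["row"]
    return merged
```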

Data Analyst — Sodexo

Jul 2024 – Oct 2025

  • Built and maintained AWS-based data migration and transformation workflows using Amazon S3, AWS Glue, Amazon Redshift, and Athena
  • Developed ETL pipelines to ingest raw data, apply transformations, and load structured datasets into Amazon Redshift
  • Wrote advanced SQL queries in Redshift and Athena for transformation, reconciliation, validation, and trend analysis
  • Used Python (Pandas, NumPy) to clean, standardize, preprocess, and automate recurring data preparation tasks
  • Created and maintained curated reporting tables and analytical datasets that improved consistency of KPIs
  • Supported incremental and scheduled data refresh workflows, ensuring timely movement of updated data
  • Applied strong data quality controls including source-to-target validation, reconciliation, and deduplication
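The source-to-target validation mentioned above typically compares row counts and a content checksum between the two sides. A minimal sketch under that assumption; real reconciliation jobs would also compare per-column aggregates:

```python
import hashlib

def reconcile(source_rows: list, target_rows: list) -> dict:
    """Source-to-target validation: compare row counts and an
    order-insensitive content checksum of the two datasets.

    Minimal sketch of a reconciliation check; production jobs
    would also compare per-column sums, min/max, and null counts.
    """
    def checksum(rows):
        # Hash each row, sort the digests so row order is irrelevant
        digests = sorted(
            hashlib.md5(repr(r).encode()).hexdigest() for r in rows
        )
        return hashlib.md5("".join(digests).encode()).hexdigest()

    return {
        "count_match": len(source_rows) == len(target_rows),
        "content_match": checksum(source_rows) == checksum(target_rows),
    }
```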

Data Engineer — RSoft Systems & Services Pvt. Ltd.

Jun 2021 – Jun 2023

  • Ingested data from databases, flat files, APIs, and system-generated files into Hadoop/HDFS for centralized storage and distributed processing
  • Built ingestion workflows using ETL jobs and Python scripts to land source data into raw HDFS layers
  • Used PySpark and Spark SQL to read data from HDFS, clean and standardize fields, apply business rules, and transform raw data into structured datasets
  • Processed both historical and incremental datasets in Hadoop and Spark, converting high-volume raw data into curated outputs
  • Wrote Spark SQL and SQL-based logic for joins, aggregations, validation, reconciliation, and deduplication across large-scale datasets
  • Moved transformed datasets from the Hadoop/Spark layer into AWS, including Amazon S3, for cloud-based analytics access
  • Prepared curated datasets for downstream Power BI and Tableau reporting with strong data quality and consistency

Education & Certifications

Master of Science, Computer Science — University of Missouri–Kansas City

Aug 2023 – May 2025 · GPA: 3.74 / 4.0

Certifications

  • Snowflake SnowPro Core Certified (Credential ID: 179071720)

Let's Connect