Data Scientist/Data Analyst
Experienced Research Assistant with a demonstrated history of working in the higher education industry. Skilled in R, exploratory data analysis, KPIs, R Shiny dashboards, and Microsoft Excel. Strong research professional with a Master’s degree in Industrial Engineering (specialization in Industrial Statistics) from the Ira A. Fulton Schools of Engineering at Arizona State University.
Arizona State University | MS in Industrial Engineering | Specialization in Industrial Statistics
Veermata Jijabai Technological Institute | BTech in Production Engineering
Enterprise Marketing Hub April 2016 – December 2017
Knowledge and Insights Assistant
- Summarized alumni data to generate actionable insights, such as interests and communication preferences, using R libraries like dplyr, reshape2, and ggplot2 along with the apply family of functions. Also built custom functions to detect outliers and missing values.
- Published an interactive web application using shiny and ggplot2 that showed media channel interests for each alumni segment and age group combination. The app has been used for identifying target media channels for each audience. Link to app
- Implemented factor analysis to reduce 166 observed variables to 25 broader categories. Calculated percentiles and plotted each category for each target audience, which allowed us to identify the interests of our target audiences.
- Used natural language processing on survey data to analyze which benefits alumni are aware of at their current membership level. Split responses into bigrams and used term frequency as the metric for analyzing them. Presented findings as network diagrams and published them as interactive web applications using R Shiny and ggplot2. Link to app
- Presented summary reports as high-quality, reproducible documents generated with R Markdown and PowerPoint.
- Trained and certified to work with FERPA-protected data.
House Price Prediction in King County, WA Using Regression January 2017 – May 2017
- Built a second-order polynomial regression model in R to predict house prices in King County, WA, with a prediction accuracy of 79.43%.
- Data pre-processing involved reducing the categorical variables zip code and year built from 57 and 115 levels to 4 and 6 levels, respectively.
- This was achieved by categorizing zip code into regions (North, South, East, West) and year built into broader categories like 1901-1920, 1921-1940 and so on.
- Applied log transformations to address violations of regression assumptions such as non-constant variance and non-normality.
- Used backward elimination to identify the significant factors that contributed to better prediction.
Database Management System for Inventory Management at Blue Bell January 2017 – May 2017
- Built a database management system using MySQL and MS Excel to support data-driven decisions at Blue Bell Corporations, a large apparel manufacturer, following its large investment in automated manufacturing operations.
- Created a simple graphical user interface to keep tabs on performance metrics such as best sellers, premium customers, top trends, customer orders, sales, inventory, and production.
- Built an enhanced E-R model that minimized data redundancies and data anomalies. Further improved performance through normalization and by accounting for functional dependencies.
Skills: R, MySQL, Exploratory Data Analysis, KPIs, R Shiny Dashboards, Predictive Modelling, Data Mining, Machine Learning, MS Office, Descriptive and Inferential Statistics