R Programmer/Statistician
Summary
I am an R programmer with 10+ years of professional experience. I also program in Python, SQL, and SAS. I currently hold a role in which I perform statistical analysis.
Education
Worcester Polytechnic Institute, Applied Statistics, MS
University of Connecticut, Applied Math, MS
University of Connecticut, Applied Math, BS
ACADEMIC ACHIEVEMENTS:
Master's Thesis: Computational Fluid Dynamics (CFD): solved a set of Navier-Stokes equations for a 2D horizontally stratified fluid flow. A survey of numerical algorithms was conducted to determine the most accurate approximation. Fortran 90 was used to provide the approximation and Matlab was used to produce the graphs.
Senior Thesis: Non-Linear Dynamics: a computational model coded in both FORTRAN and C was used to analyze a three-dimensional system of non-linear differential equations. The magnitude and frequency of the forcing function were varied along with the initial position to determine the effect on the outcome of the system. Portions of the thesis were submitted to the Department of Energy.
Master's Thesis: Analysis of Array Complete Genome Hybridization using Nimblegen data for cancer genomics. Utilized R and Bioconductor to perform the analysis and plotting. Developed a C++ application to provide users with a tool to analyze and explore the data.
Experience
InVentiv Health, Eli Lilly: (Jan 2012 – present)
Automated SAS routines for repeated measures analysis in R. Deployed the programs in a web environment to enable scientists to evaluate tumor growth results.
Applied Monte Carlo techniques in R to evaluate the impact of dosing levels and number of animals. This provided results to determine the impact of time and animals used on ED50 error.
Established routines for a next-gen sequencing pipeline. Evaluated RNA count results. Used both R and SAS to evaluate data.
Mastery level of R for data analysis and plotting. Utilized ggplot and Rmarkdown for professional presentation of results.
Provided a web environment using RStudio and Shiny for presentation of gene expression results. Allowed easy analysis of large spreadsheets of data with both summary and graphical output.
AT&T Research: (Florham Park NJ) (Nov 2010 – Jan 2012)
Developed a research prototype system for analysis of hourly Optima data using Oracle, R and FastRWeb.
Implemented a system using R via StatServer to provide quality analysis of the forecasting process and resulting forecasts. The development effort included Java, JavaScript, Oracle and R.
Implemented a system using Splus via StatServer for forecasting of engineering mobility demand.
Working to expand the system to provide analysis of Optima hourly time series.
Smith Hanley: Becton Dickinson (NJ & NC) (Sep 2009 – Oct 2010)
Performed analysis of insulin degradation using R image analysis routines.
Assisted with a clinical trial to analyze wearability of various insulin delivery devices.
Performed analysis of various insulin pumps and delivery devices.
Integrated SAS and R routines to automate Deming regression analysis, using SQL in SAS.
Expanded the scope of the international data collection methods for hospital QC and tracking of BD products.
Abbott Laboratories (Worcester, MA) (Nov 2007 – Sep 2009)
Installed and configured a web environment (Apache/Tomcat), using JSP and beginning to use the Spring framework for Java development. Installed 64-bit R on the HPC grid as a computational backend, interfaced to the SAS clinical environment for SAS web deployment. This enabled all analyses (in R and SAS) to be deployed on the web. Used Unix scripting. This established an enterprise data mining system.
Wrote SAS routines to evaluate fitting errors to determine optimal animal dosing configuration and animal counts.
Wrote SAS routines using Monte Carlo simulation to evaluate assay precision profiles.
Developed routines in R to produce optimal gene signatures from gene expression. The procedures can also be used for chemical structure analysis for QSAR. The R functions allowed dynamic cross-validation and model selection based on user input parameters. Allowed data mining for QSAR.
Automated analysis and graphic results for RNAi high-throughput results. This was done to compare various hit selection methods against this and other HTS data to evaluate selection techniques.
Developed new kinase endpoints for chemical compound analysis. Functions written in R generate data and intermediate results stored in a database for flexible retrieval. Visualizations performed in Spotfire.
Used VBA to automate analysis templates for plate-level analytics.
Used Matlab to compute large generalized inverses of the model matrix.
Developed routines in R and protocols in Pipeline Pilot to analyze domain applicability.
Used Pipeline Pilot webport to deploy web-based solutions based on analysis written in R.
Pfizer Inc (Groton, CT) (Jan 2003 – Nov 2007)
Worked in the clinical group to support data analysis for worldwide use of Lipitor. Cleaned data and loaded results into clinical systems.
Supported late-stage Safety Sciences with SAS data analysis of ECGs for FDA filings.
Developed a Java-based web environment for large-scale visualization of statistical models to support the SAR efforts of computational chemistry. Utilized Java RandomForest programs along with Swing and JavaScript for improved interactivity. This provided large-scale data mining using RandomForest.
Enhanced the Affymetrix computational platform to improve throughput and computational accuracy. Enhanced R routines and interfaced to Platform LSF. Data mining of gene data.
Utilized linear model theory on a large number of R-groups to advance predictions on virtual libraries. Over the next year, this will be integrated into a large-scale application to allow viewing of various libraries via the Pfizer global virtual library system and custom Java viewers.
Deployed repeated measures analysis in SAS under Zope to analyze micro-dialysis data. Python integration.
Converted the original Perl-CGI R-StatServer to Zope. In addition, used R to perform repeated measures analysis instead of SAS and developed an interactive user environment. This was an initial step in establishing an enterprise server application using Zope along with R and SAS to integrate all aspects of pharmaceutical discovery including Computational Chemistry, Biomarkers, Pharmacogenomics, Safety and Manufacturing. This platform can utilize the LSF grid and other advanced computational features of Pfizer’s infrastructure. Python and Perl.
Productionalized an existing set of Matlab routines and established a Matlab GUI. This allowed the user community to perform their own analysis.
Using RandomForest under R, deployed an R-Group visualization within Spotfire and established a modeling environment to support hit-to-lead and library advancement for closed-loop chemistry. Migrated the application to utilize Java-based programs to allow larger training and prediction sets.
Converted an Excel application to Matlab for analysis of the effect of Cyclodextrin in increasing oral bioavailability. The conversion enables scientists to analyze large amounts of data within hours. The Matlab application was enhanced under collaboration between Pfizer and Northwestern University.
Excipient screen blending analysis of Filler/Disintegrant interaction with API. Process originally done by hand using SAS and Splus; currently automating using MS Excel to JMP, with final results automated to MS Word.
Performed validation of Partial Least Squares (PLS) equations used to provide estimates for the material assessment protocol of direct compression results. Combined results from various lab equipment to determine the accuracy of PLS estimates for the blended materials used to create the tablet. Used both JMP and SAS to perform validation and generate results.
Created an SPC application for prototype manufacturing. The final implementation was done with Excel and ActiveX controls available from Quality America and the R statistics package. This was done to support the Six Sigma effort within Pfizer.
Assisted with a Gage R&R analysis of material hardness.
Neurogen Corporation (Branford, CT) (Apr 2001 – Oct 2002)
Supported the Oracle enterprise informatics system. Data integration and support for HTS, Computational Chemistry, PK and toxicogenomics, utilizing PL/SQL, Perl, JMP, R, SAS, Java (JSP, JDBC) and Spotfire across multiple platforms. Conducted individual and group sessions to deploy software and ensure proper use and understanding of information content. Unix scripting.
My responsibilities in the biology area included data integration and automated validation for both dose response and primary screening. QC/QA data analysis and visualization. Data integration and automation of various labs, using a direct mix of Perl and Oracle via a web page to provide this information.
Support and validation of Computational Chemistry modeling and data content. This was done using R and a SAS interface to Oracle with Perl as the connection mechanism and JMP on the client PC.
Established a data tracking and input validation mechanism for PK data. Automated an Excel spreadsheet so that the PK dosing could be tracked and integrated directly into Oracle, using VBA and Oracle stored procedures.
Provided advanced statistical analytical support for gene expression analysis, utilizing Perl, BioPerl, Oracle and JMP. The expression analysis was generated utilizing a predefined false-positive cut-off. I also developed an analytical environment using Oracle and JMP that provided dynamic visualization and data mining. This allowed various chemical compounds to be investigated simultaneously.
Non-linear regression analysis of dose response data. Non-linear estimates of dose response to estimate chemical conformations at the ligand-binding site. Using R and Oracle.
Evaluated Oracle Clinical to determine if there could be a cost reduction with the current outsourcing of clinical studies.
Prior professional experience: 1982-2001:
FLEET FINANCIAL GROUP:
Oracle & Perl system development on UNIX and VAX.
JOHNSON & JOHNSON:
Developed a profitability management system in COGNOS and Oracle.
FLEET FINANCIAL GROUP:
Optimized a data warehouse to support finance and planning.
TRAVELERS INSURANCE CORPORATION:
Integrated SAS and FOCUS to provide automated analysis.
THE HARTFORD INSURANCE GROUP:
Supported Research group using SAS, Oracle and FOCUS.
KAMAN AEROSPACE:
Supplied QC metrics and analysis for Quality group.
HASBRO TOY COMPANY:
Developed global planning systems using Oracle and FOCUS on UNIX.
SHAWMUT NATIONAL CORPORATION:
Developed and supported the needs of various departments using FOCUS and mainframe platforms.
Skills
- R, SAS, SQL, Python
Specialties
- Mathematics programming, Statistics
Spoken Languages
- English