Data Scientist
- GitHub: https://github.com/jcrosskey
Summary
I am enthusiastic about acquiring, exploring, visualizing, and analyzing data, and about using data analysis to inform business decisions. I am currently looking for a position in the Greater New York area.
Education
- Department of Mathematics, Indiana University, Bloomington, IN
Doctor of Philosophy in Statistics, GPA 3.96, 6/2011
- School for the Gifted Young, University of Science and Technology of China, Hefei, China
Bachelor of Science in Mathematics, GPA 3.47, 6/2006
Experience
Postdoctoral Associate, Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN (8/2013–1/2016)
- Designed and implemented algorithms to error-correct PacBio sequencing data and to assemble (meta)genomes from NGS and PacBio sequencing data.
- Developed a protein annotation program based on UniProt, and created a pipeline to analyze biological functions of genomes (UniFam).
- Installed, parallelized, benchmarked, tested, and maintained bioinformatics programs on high-performance computing systems such as Titan and genepool.
- Created pipelines and modified existing bioinformatics programs to analyze large metagenomic and metaproteomic microbial datasets.
- Conducted phylogenomic and statistical analyses of biological data.
- Related skills: C++, Python, R, make, gcc, gdb, lldb, Valgrind, bash, OpenMP, MPI, PBS and SGE schedulers, Linux, Git, Cytoscape, HMM; familiarity with public bioinformatics resources
Postdoctoral Fellow, National Institute for Mathematical and Biological Synthesis, Knoxville, TN (7/2011–7/2013)
- Established a novel mechanistic statistical model for protein evolution.
- Related experience: phylogeny, population genetics, stochastic modeling, R programming, non-linear optimization
Graduate Research Assistant, Indiana University, Bloomington, IN (9/2007–6/2011)
- Constructively solved a conjecture in phylogenetics using graph theory and combinatorics.
- Proved the identifiability of the most commonly used statistical models in phylogenetics.
- Constructed a two-pathway model for crossovers during meiosis; compared the model with existing models.
- Mathematical and Statistical Modeling Workshop, Raleigh, NC: Developed a pharmacokinetic model to predict the effect of, and optimize the dosing strategy for, a new antibacterial compound in development at GlaxoSmithKline.
Teaching Assistant
- Taught or assisted with courses including college algebra, calculus, mathematical analysis, numerical analysis, mathematical statistics, mathematical modeling, and phylogenomics.
Skills
- R (6 years of experience)
- Python (3 years)
- C++ (3 years)
- Bash
- SQL
- MongoDB
- D3.js
Specialties
- Data visualization, scripting, statistical data analysis
Spoken Languages
- English (Fluent), Mandarin Chinese