Data Scientist

Resume posted by jjcrosskey in Finance.
Desired salary: $100,000.00
Desired position type: Full-Time
Location: Verplanck New York, United States



I am enthusiastic about acquiring, exploring, visualizing, and analyzing data, and about using data analysis to inform business decisions. I am currently looking for a job in the Greater New York area.


  • Department of Mathematics, Indiana University, Bloomington, IN

    Doctorate of Philosophy in Statistics, GPA 3.96, 6/2011

  • School for the Gifted Young, University of Science and Technology of China, Hefei, China

    Bachelor of Science in Mathematics, GPA 3.47, 6/2006


Postdoctoral Associate, Computer Science and Mathematics Division, Oak Ridge National Lab, Oak Ridge, TN (8/2013–1/2016)

  • Designed and implemented algorithms to error-correct PacBio sequencing data and to assemble (meta)genomes from NGS and PacBio sequencing data.
  • Developed a protein annotation program based on UniProt, and created a pipeline to analyze biological functions of genomes (UniFam).
  • Performed installation, parallelization, benchmarking, testing, and maintenance of bioinformatics programs on high-performance computing systems such as Titan and genepool.
  • Created pipelines and modified existing bioinformatics programs to analyze large metagenomic and metaproteomic microbial datasets.
  • Conducted phylogenomic and statistical analysis of biological data.
  • Related skills: C++, Python, R, make, gcc, gdb, lldb, Valgrind, bash, OpenMP, MPI, PBS and SGE schedulers, Linux, Git, Cytoscape, HMM; familiarity with public bioinformatics resources

Postdoctoral Fellow, National Institute for Mathematical and Biological Synthesis, Knoxville, TN (7/2011–7/2013)

  • Established a novel mechanistic statistical model for protein evolution.
  • Related experience: phylogeny, population genetics, stochastic modeling, R programming, non-linear optimization

Graduate Research Assistant, Indiana University, Bloomington, IN (9/2007–6/2011)

  • Constructively solved a conjecture in phylogenetics using graph theory and combinatorics.
  • Proved the identifiability of the most commonly used statistical models in phylogenetics.
  • Constructed a two-pathway model for crossovers during meiosis; compared the model with existing models.
  • Mathematical and Statistical Modeling Workshop, Raleigh, NC: Developed a pharmacokinetic model to predict the effect of, and optimize the dosing strategy for, a new antibacterial compound in development at GlaxoSmithKline.

    Teaching Assistant

    • Taught or assisted with courses including college algebra, calculus, mathematical analysis, numerical analysis, mathematical statistics, mathematical modeling, and phylogenomics.


  • R (6 years of experience)
  • Python (3 years)
  • C++ (3 years)
  • Bash
  • SQL
  • MongoDB
  • D3.js


    Data visualization, Scripting, Statistical data analysis

Spoken Languages

    English (Fluent), Mandarin Chinese