Cornell University (BA, Physics)
University of Arizona (PhD, Planetary Science)
LinkedIn
GitHub
Below is a compilation of results and highlights from my 9 years of data analysis, software engineering, and machine learning, some of which have been featured in the media (NBC, BBC, radio).
Analysis of the lunar gravity dataset reveals an anomalous signal originating from subsurface structures the size of the Grand Canyon. I used Bayesian statistics and Markov chain Monte Carlo (MCMC) in Python to constrain the depth, thickness, and width of the structures. Our analysis resulted in a 60% increase in model precision, peer-reviewed publications in high-impact journals, and widespread media coverage.

Publication 1 (Nature Geoscience)
Publication 2 (Icarus)
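To illustrate the kind of Bayesian MCMC fit described above, here is a minimal sketch of a random-walk Metropolis sampler. It is not the published analysis code: the forward model (a buried horizontal line mass), the flat prior bounds, the step sizes, and the synthetic profile are all assumptions chosen to keep the example short, and only two parameters (depth and an amplitude term) are sampled.

```python
import math
import random


def forward(x, depth, amp):
    """Toy forward model (assumption): anomaly of a buried horizontal line mass."""
    return amp * depth / (x * x + depth * depth)


def log_posterior(params, xs, obs, sigma):
    """Gaussian log-likelihood plus flat priors with hypothetical bounds."""
    depth, amp = params
    if not (0.1 < depth < 50.0 and 0.0 < amp < 1000.0):
        return -math.inf  # outside the prior support
    misfit = sum((o - forward(x, depth, amp)) ** 2 for x, o in zip(xs, obs))
    return -0.5 * misfit / sigma ** 2


def metropolis(xs, obs, sigma, start, step, n_steps, seed=0):
    """Random-walk Metropolis sampler; returns the full chain of (depth, amp)."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_posterior(current, xs, obs, sigma)
    chain = []
    for _ in range(n_steps):
        # Propose an independent Gaussian perturbation of each parameter.
        proposal = [c + rng.gauss(0.0, s) for c, s in zip(current, step)]
        lp = log_posterior(proposal, xs, obs, sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp - current_lp)):
            current, current_lp = proposal, lp
        chain.append(tuple(current))
    return chain


def synthetic_profile(depth, amp, sigma, seed=1):
    """Noisy synthetic profile along a 61-point track (made-up numbers)."""
    rng = random.Random(seed)
    xs = [float(x) for x in range(-30, 31)]
    obs = [forward(x, depth, amp) + rng.gauss(0.0, sigma) for x in xs]
    return xs, obs


def posterior_means(chain, burn_in):
    """Mean depth and amplitude after discarding the burn-in samples."""
    kept = chain[burn_in:]
    n = len(kept)
    return sum(d for d, _ in kept) / n, sum(a for _, a in kept) / n


if __name__ == "__main__":
    xs, obs = synthetic_profile(depth=10.0, amp=200.0, sigma=0.5)
    chain = metropolis(xs, obs, 0.5, start=(8.0, 150.0),
                       step=(0.1, 1.5), n_steps=20000)
    print(posterior_means(chain, burn_in=5000))
```

The spread of the retained samples, not just their mean, is what constrains each parameter; a real analysis would also check convergence and use a physically motivated forward model.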
The traditional way of identifying meteor impact sites (craters) is to count them by hand. I designed, implemented, and optimized an algorithm using A/B testing that automatically identifies craters in lunar gravity maps, increasing the efficiency of data analysis by more than 50%.

Publication (JGR: Planets)
GitHub Repository
I significantly enhanced (five-fold) a multi-platform ETL data and image processing pipeline as part of a NASA collaboration, using a variety of programming languages. This pipeline, capable of parallelized image alignment and sophisticated processing of raw images to generate terrain and solar angle maps, facilitates the analysis of martian rover images spanning multiple martian years. The results of our analysis, using nonlinear regression methods, greatly constrained the mineral composition of the martian surface. The pipeline has remained in active use at NASA for over 7 years.
Publication 1
Publication 2
Publication 3
Unfortunately, the code has not yet been cleared for public release.
As part of a NASA collaboration, I designed and developed “new_cv24,” a novel GUI application capable of loading and processing Photometry QUBs – extensive data files consisting of ~60 Mars rover images. The software lets users overlay images, select regions of interest, and assemble information from all images into an ASCII file, greatly enhancing data analysis capabilities. This not only consolidated various data analysis subroutines into a single, streamlined interface, but also significantly improved users' operational efficiency.

Used in the photometric analysis publications
Unfortunately, the code has not yet been cleared for public release.
The Moon has countless craters (left image) whose depths vary significantly. Here, I built machine learning models (linear regression, decision trees, random forests, and gradient boosting) to predict the depth of a lunar crater from its diameter and location. I used Apache Spark as the main platform for the machine learning. After the project was finalized, I used Docker to package the pipeline into a distributable application. The model achieved a reasonable R^2 of 0.7 and an RMSE of 0.4 log(m), indicating that the model is reasonably reliable in estimating the depths of lunar craters.
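To make the two quality metrics above concrete, here is a minimal sketch of how R^2 and RMSE can be computed when the target is the logarithm of crater depth. It uses plain Python rather than Apache Spark, a single-feature least-squares line rather than the project's models, base-10 logarithms (an assumption; the base is not stated above), and made-up diameter and depth values.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Hypothetical crater diameters and depths in meters (illustrative only).
diameters_m = [1000.0, 2000.0, 5000.0, 10000.0, 20000.0, 40000.0]
depths_m = [210.0, 380.0, 1050.0, 1700.0, 3300.0, 4100.0]

# Work in log space so that errors are relative rather than absolute.
log_d = [math.log10(d) for d in diameters_m]
log_h = [math.log10(h) for h in depths_m]

slope, intercept = fit_line(log_d, log_h)
predicted = [slope * x + intercept for x in log_d]
r2, rmse_log = r2_and_rmse(log_h, predicted)

print(f"R^2 = {r2:.3f}, RMSE = {rmse_log:.3f} in log10(m)")
```

Under the base-10 assumption, an RMSE of 0.4 in log space corresponds to a typical multiplicative error of about 10^0.4, or roughly a factor of 2.5 in depth.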
A study using deep learning to detect craters in the lunar gravity dataset, run in the Amazon Web Services (AWS) environment. Transfer learning with pretrained MobileNetV2 weights was used. The resulting crater-prediction accuracy was 91%, an improvement of 5% over the A/B algorithm described above. As only one instance of training was needed, subsequent crater predictions are much faster than with the A/B algorithm, resulting in a further 50% increase in the efficiency of data analysis.
Analysis of the lunar gravity data shows a lack of craters on the nearside. We model heat and topographical diffusion to the farside highlands as hypothetical processes that could have produced the deficit. The results reveal the potential thermal erasure of a structure at a scale (500 km wide x 3 km high) that is the largest of its kind in the entire Solar System.
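As a rough illustration of what topographic diffusion does to a large structure, here is a minimal one-dimensional sketch that solves dh/dt = kappa * d2h/dx2 with an explicit finite-difference scheme. It is not the model used in the study: the diffusivity, grid, time step, and Gaussian starting shape are hypothetical values picked only so the example runs quickly.

```python
import math


def gaussian_bump(n_cells, dx_km, height_km, sigma_km):
    """Initial topography: a Gaussian ridge centered in the domain."""
    center = 0.5 * (n_cells - 1) * dx_km
    return [height_km * math.exp(-0.5 * ((i * dx_km - center) / sigma_km) ** 2)
            for i in range(n_cells)]


def diffuse(h, kappa, dx, dt, n_steps):
    """Explicit (forward-time, centered-space) diffusion with no-flux edges."""
    r = kappa * dt / dx ** 2
    if r > 0.5:
        raise ValueError("unstable: need kappa * dt / dx**2 <= 0.5")
    h = list(h)
    last = len(h) - 1
    for _ in range(n_steps):
        new = h[:]
        for i in range(len(h)):
            # Mirror the edge cells so no material leaves the domain.
            left = h[i - 1] if i > 0 else h[i]
            right = h[i + 1] if i < last else h[i]
            new[i] = h[i] + r * (left - 2.0 * h[i] + right)
        h = new
    return h


# Hypothetical setup: 2000 km domain, 10 km cells, 3 km high ridge.
initial = gaussian_bump(n_cells=200, dx_km=10.0, height_km=3.0, sigma_km=125.0)
# kappa in km^2/Myr and dt in Myr are illustrative, not measured, values.
relaxed = diffuse(initial, kappa=1000.0, dx=10.0, dt=0.04, n_steps=500)

print(f"peak height: {max(initial):.2f} km -> {max(relaxed):.2f} km")
```

The ridge lowers and widens while its total volume is conserved, which is the sense in which diffusion can erase relief without removing material.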
