Weigang Liang, PhD




Cornell University (BA, Physics)
University of Arizona (PhD, Planetary Science)

LinkedIn
Github

Data Science and Machine Learning Portfolio

Below is a compilation of results and highlights from my 9 years of experience in data analysis, software engineering, and machine learning. Some of this work has been featured in the media (NBC, BBC, radio).


MCMC Analysis of Anomalies in Lunar Gravity Data

Analysis of the lunar gravity dataset reveals an anomalous signal originating from subsurface structures the size of the Grand Canyon. I used Bayesian statistics and Markov chain Monte Carlo (MCMC) in Python to constrain the depth, thickness, and width of the structures. Our analysis increased model precision by 60% and led to peer-reviewed publications in high-impact journals, as well as widespread media coverage.
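To illustrate the general idea of constraining a structure's depth with MCMC, here is a minimal Metropolis sampler in plain Python. It is a sketch under stated assumptions, not the analysis from the publications: the forward model (a buried horizontal line mass), the fixed amplitude, the flat prior, the noise level, and the synthetic data are all hypothetical choices made for this example.

```python
import math
import random

def forward_model(x_km, depth_km, amplitude=100.0):
    """Toy anomaly of a buried horizontal line mass: A * z / (x^2 + z^2).
    The amplitude is an arbitrary illustrative value."""
    return amplitude * depth_km / (x_km ** 2 + depth_km ** 2)

def log_posterior(depth_km, xs, observed, sigma):
    # Flat prior on 0 < depth < 100 km (an assumption for this sketch).
    if not 0.0 < depth_km < 100.0:
        return -math.inf
    # Gaussian likelihood with known noise level sigma.
    resid = sum((obs - forward_model(x, depth_km)) ** 2
                for x, obs in zip(xs, observed))
    return -0.5 * resid / sigma ** 2

def metropolis(xs, observed, sigma, n_steps=20000, step=0.5, start=30.0, seed=0):
    """Random-walk Metropolis sampler for the depth parameter."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, xs, observed, sigma)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, xs, observed, sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

def make_synthetic(true_depth=20.0, sigma=0.2, seed=1):
    """Synthetic profile across the structure (made-up data, not lunar data)."""
    rng = random.Random(seed)
    xs = [float(x) for x in range(-100, 101, 10)]
    obs = [forward_model(x, true_depth) + rng.gauss(0.0, sigma) for x in xs]
    return xs, obs

xs, obs = make_synthetic()
samples = metropolis(xs, obs, sigma=0.2)
posterior = sorted(samples[2000:])  # discard burn-in, then sort for quantiles
median = posterior[len(posterior) // 2]
lo = posterior[int(0.16 * len(posterior))]
hi = posterior[int(0.84 * len(posterior))]
print(f"depth = {median:.1f} km (68% interval {lo:.1f} to {hi:.1f} km)")
```

The width of the 68% interval is what "constraining" a parameter means in practice: a tighter interval corresponds to a more precise estimate of the depth.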

Python pandas NumPy matplotlib Jupyter Markov Chain Monte Carlo

Publication 1 (Nature Geoscience)
Publication 2 (Icarus)


Automated Pattern Recognition Algorithm for Lunar Gravity Data

The traditional way of identifying meteorite impact sites (craters) is to count them by hand. I designed and implemented an algorithm, optimized using A/B testing, that automatically identifies craters in lunar gravity maps, increasing the efficiency of data analysis by more than 50%.

A/B Testing Algorithm Development Pattern Recognition Feature Engineering Automated Detection MATLAB

Publication (JGR: Planets)
Github Repository


Photometric Analysis of Mars Rover Image Data

As part of a NASA collaboration, I enhanced a multi-platform ETL data and image processing pipeline five-fold, using a variety of programming languages. The pipeline performs parallelized image alignment and processes raw images to generate terrain and solar angle maps, which facilitates the analysis of Martian rover images spanning multiple Martian years. Our analysis, which used nonlinear regression methods, greatly constrained the mineral composition of the Martian surface. The pipeline has consistently served NASA’s data processing needs and has remained in active use for over 7 years.

ETL Machine Learning Nonlinear Regression Model Validation Data Pipeline Design Cross-functional Collaboration MATLAB Perl

Publication 1
Publication 2
Publication 3
Unfortunately, the code has not yet been cleared for public release.


Graphical User Interface (GUI) Development to Analyze Mars Rover Data

As part of a NASA collaboration, I designed and developed “new_cv24,” a GUI application that loads and processes Photometry QUBs, large data files consisting of ~60 Mars rover images. The software lets users overlay images, select regions of interest, and assemble information from all images into an ASCII file. It consolidates several data analysis subroutines into a single, streamlined interface, which significantly improved users' operational efficiency.

Data Visualization GUI Design Image Processing MATLAB Data Extraction and Manipulation

Used in the photometric analysis publications
Unfortunately, the code has not yet been cleared for public release.


Crater Depth Prediction using Supervised Learning, Apache Spark, and Docker

The Moon has countless craters (left image) whose depths vary significantly. Here, I built machine learning models (linear regression, decision trees, random forests, and gradient boosting) to predict the depth of a lunar crater from its diameter and location. I used Apache Spark as the platform for all model training, and once the project was finalized, I used Docker to package the pipeline into a distributable application. The model achieved an R^2 of 0.7 and an RMSE of 0.4 log(m), indicating that it gives reasonable estimates of lunar crater depths.
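For readers unfamiliar with the two metrics quoted above, the sketch below shows how R^2 and RMSE can be computed for a depth model fitted in log space. It deliberately swaps in a one-feature ordinary least-squares fit in plain Python in place of the Spark models used in the project, and it runs on synthetic data: the power-law coefficients, the scatter, the use of log10, and the 80/20 split are assumptions made only for illustration.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r2_and_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)

# Synthetic craters: log10(diameter in m) and log10(depth in m).
# The relation and the 0.3 dex scatter are made up for this example.
rng = random.Random(42)
log_diameter = [rng.uniform(3.0, 5.0) for _ in range(500)]
log_depth = [0.8 * d - 0.3 + rng.gauss(0.0, 0.3) for d in log_diameter]

# Hold out the last 20% of craters to evaluate on unseen data.
train_x, test_x = log_diameter[:400], log_diameter[400:]
train_y, test_y = log_depth[:400], log_depth[400:]

slope, intercept = fit_line(train_x, train_y)
predicted = [slope * x + intercept for x in test_x]
r2, rmse = r2_and_rmse(test_y, predicted)
print(f"slope = {slope:.2f}, R^2 = {r2:.2f}, RMSE = {rmse:.2f} log10(m)")
```

Because the target is a logarithm of depth, the RMSE is a multiplicative error: under the log10 assumption, an RMSE of 0.4 would correspond to predictions that are typically within a factor of about 10^0.4 ≈ 2.5 of the true depth.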

Apache Spark Docker Supervised Learning Linear Regression Decision Trees Random Forests Gradient Boosting

Project Code
Docker Image

Crater Detection using Deep Learning/Neural Networks and AWS (S3, SageMaker)

In this study, I used deep learning to detect craters in the lunar gravity dataset within the Amazon AWS environment. I applied transfer learning using pretrained MobileNetV2 weights. The resulting crater prediction accuracy was 91%, an improvement of 5% over the A/B algorithm described above. Because the model only needs to be trained once, subsequent crater predictions are much faster than with the A/B algorithm, resulting in a further 50% increase in the efficiency of data analysis.

AWS Deep Learning Neural Networks S3 SageMaker Transfer Learning Convolutional Neural Network (CNN)

Markdown Code


Numerical Modeling of Heat and Topographical Diffusion on the Moon

Analysis of the lunar gravity data shows a lack of craters on the nearside. We model heat and topographical diffusion to the farside highlands as hypothetical processes that could have produced the deficit. The results reveal the potential thermal erasure of a structure at a scale (500 km wide × 3 km high) that is the largest of its kind in the entire Solar System.
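As a rough illustration of topographic diffusion, the sketch below relaxes a one-dimensional ridge with an explicit finite-difference scheme for dh/dt = kappa * d2h/dx2. It is not the model from this project: the diffusivity, grid spacing, time step, duration, Gaussian starting shape, and no-flux boundaries are hypothetical values chosen so that the example runs quickly and stays numerically stable.

```python
import math

def diffuse(profile, kappa, dx, dt, n_steps):
    """Forward-time, centered-space (FTCS) diffusion with no-flux boundaries."""
    r = kappa * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for kappa*dt/dx^2 > 0.5")
    h = list(profile)
    n = len(h)
    for _ in range(n_steps):
        new = h[:]
        for i in range(1, n - 1):
            new[i] = h[i] + r * (h[i + 1] - 2.0 * h[i] + h[i - 1])
        # No-flux ends: material cannot leave the domain.
        new[0] = h[0] + r * (h[1] - h[0])
        new[-1] = h[-1] + r * (h[-2] - h[-1])
        h = new
    return h

# Hypothetical setup: a 3 km high Gaussian ridge roughly 500 km across.
dx_km = 10.0
xs = [-1000.0 + dx_km * i for i in range(201)]
initial = [3.0 * math.exp(-x ** 2 / (2.0 * 125.0 ** 2)) for x in xs]

# kappa in km^2/Myr and dt in Myr are illustrative, not measured values.
final = diffuse(initial, kappa=50.0, dx=dx_km, dt=0.8, n_steps=1000)

print(f"peak height: {max(initial):.2f} km -> {max(final):.2f} km")
print(f"total volume change: {sum(final) - sum(initial):.2e}")
```

The ridge gets lower and wider while its total volume stays the same, which is the sense in which diffusion can erase a feature from the topography without removing any material.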

Problem Solving Numerical Modeling Data Manipulation


Github Repository