Machine Learning Engineer with 7 years of experience in direct, customer-facing engagements and in shipping data science and machine learning products. Led teams in adopting engineering best practices for code review and CI/CD pipelines, and drove the transition to enterprise technologies to cut operating costs, reduce production bugs, and accelerate feature processing.
- Deployed the first ACH categorization model for Cash App Banking, reducing uncategorized transactions by 50% and increasing tracked payroll deposit volume by 30%. The improved income categorization enabled more profitable loans by sizing loan offers to expected ability to repay.
- Led 8 data scientists in building machine learning models to detect and stop fraud.
- Drove the team's adoption of engineering best practices: implemented peer review for code changes and automated correctness checking, built CI/CD pipelines for code and model deployment, and added metrics for test coverage and code health, reducing the number of P0 production bugs from 2 in the first year to 0.
- Deployed the first in-product, real-time account takeover prevention model at Intuit. It launched in production in TurboTax and alerted security to a possible breach within its first week of running. Back-testing showed it would have detected 95% of the previous year’s stolen tax return downloads.
- Drove the migration of machine learning models from Intuit’s on-prem data center to AWS without interruption of services, saving $1.5M per year in operating costs.
- Improved the TurboTax Online account takeover model, cutting wrongly challenged users by 90%, stopping 10x as many fraudsters, and shortening feature processing time from 2 hours to under a second.
- Led a team of 3 engineers in investigating the latest deep learning techniques for vehicle re-identification and in developing, within 6 months, a system that enabled clients to automatically detect the same vehicle across multiple security camera videos. Handed the new system over to the customer’s internal development team and provided training.
- Worked as part of a team of 3 scientists to develop an embedding technique to train a convolutional neural network on unlabeled, open-source image data. Built a system using TensorFlow that learned to embed images and text into a joint vector space, allowing customers to perform content-based image retrieval on a corpus of 100M untagged images.
- Designed and implemented a recommender system evaluation framework in Python and Spark and leveraged it to develop a Python-snippet recommender using word embeddings.
Python, Scala, SQL, shell script, C++, LaTeX
SageMaker, NumPy, SciPy, Matplotlib, TensorFlow, Pandas, Spark, Git, Linux, Vim