Bio

Mitchell Naylor is an applied machine learning professional with extensive experience in natural language processing (NLP), statistical modeling, and deep learning. Mitch currently works as a senior applied researcher at GitHub, where he leads a team of machine learning scientists developing new experiences and improving existing features within GitHub Copilot. Mitch is also an author of Applied Causal Inference, which was released in August 2023.

Mitch lives in Chattanooga, TN. Outside of work, he enjoys rock climbing, hiking, music, and traveling.

Tools

Technology: Python, R, Docker, Git, Flask, cloud (GCP, Azure, and AWS), Dash, Shiny, RMarkdown, SQL

Methods: NLP, large language models (LLMs), language modeling, model evaluation, deep learning, Bayesian statistics, causal inference, data visualization, experimental design, data manipulation

Libraries: PyTorch, Hugging Face (Transformers, Tokenizers, Datasets), pandas, NumPy, scikit-learn, XGBoost, spaCy, matplotlib, plotly, ggplot2, tidyverse

Experience

GitHub | Senior Applied Researcher | April 2024 - Present

Lead a team of 4 ML researchers in collaboration with product and engineering teams to develop new features and experiences across the GitHub Copilot ecosystem
Fine-tune large language models (LLMs) to perform various software engineering tasks
Evaluate new LLMs for novel tasks within GitHub Copilot
Develop and improve offline evaluation capabilities to measure the quality of generated code
Correspond with external LLM providers to identify potential collaboration opportunities
Key projects: Copilot Code Review (Applied Science tech lead), LLM fine-tuning for code completion

Azra AI | Lead Data Scientist | October 2019 - March 2024

Served as technical leader on a team whose models identify high-risk clinical findings for hundreds of thousands of patients in the US annually
Built, improved, and monitored NLP model pipelines in cloud-based production environments
Used PyTorch and Hugging Face to build deep learning models for text classification and named entity recognition
Designed and optimized Docker containers for efficient deployment in cloud-based production environments
Communicated impact and findings to stakeholders of varied technical and clinical backgrounds
Led applied research initiatives, with a focus on bringing promising prototypes into production
Assessed cutting-edge ML algorithms and brought promising research prototypes into production
Developed best practices for robust evaluation of model performance and minimizing sample bias in modeling data
Contributed to strategic planning for the development of new products and relevant ML needs
Streamlined processes for repeatability and scalability, including developing Python packages for internal use
Note: Azra AI was previously part of Digital Reasoning and spun out as a standalone company in January 2022

Asurion | Data Scientist | March 2018 - October 2019

Applied methods from NLP, statistical inference, and operations research to solve a diverse set of business problems
Communicated findings to stakeholders ranging from highly technical to C-level

GEICO | Product Modeling Analyst III | June 2016 - March 2018

Built predictive models for customer behavior under various pricing scenarios
Conducted robust statistical analyses in support of the pricing department

Open Source + Publications

Implemented MEGA in Hugging Face’s transformers library

Using Machine Learning to Accelerate Identification of Pancreatic Incidentalomas | Poster appeared at AONN+ 2023

Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff | Interpretable Machine Learning in Healthcare (IMLH) at ICML 2021 | Paper link

PsychBERT: A Mental Health Language Model for Social Media Mental Health Behavioral Analysis | IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2021 | Paper link + link to pretrained model

Non-Author Textbook Contributions

Code contributions for Transformers for Machine Learning: A Deep Dive (Kamath, Graham & Emara; Chapman & Hall, 2022); Machine Translation - Transformer vs. LSTM

Code contributions for Explainable Artificial Intelligence: An Introduction to Interpretable Machine Learning (Kamath & Liu; Springer 2021); Demo of Explainable Models
