Mitchell Naylor


Bio

Mitchell Naylor is an applied machine learning professional with extensive experience in natural language processing (NLP), statistical modeling, and deep learning. Mitch currently works as a senior applied researcher at GitHub, where he leads a team of machine learning scientists to develop new experiences and improve existing features within GitHub Copilot. Mitch is also an author of Applied Causal Inference, which was released in August 2023.

Mitch lives in Chattanooga, TN. Outside of work, he enjoys rock climbing, hiking, music, and traveling.

Tools

  • Technology: Python, R, Docker, Git, Flask, cloud (GCP, Azure, and AWS), Dash, Shiny, RMarkdown, SQL
  • Methods: NLP, large language models (LLMs), language modeling, model evaluation, deep learning, Bayesian statistics, causal inference, data visualization, experimental design, data manipulation
  • Libraries: PyTorch, Hugging Face (Transformers, Tokenizers, Datasets), pandas, NumPy, scikit-learn, XGBoost, spaCy, matplotlib, plotly, ggplot2, tidyverse

Experience

GitHub | Senior Applied Researcher | April 2024 - Present
  • Lead a team of four ML researchers in collaboration with product and engineering teams to develop new features and experiences across the GitHub Copilot ecosystem
  • Fine-tune large language models (LLMs) to perform various software engineering tasks
  • Evaluate new LLMs for novel tasks within GitHub Copilot
  • Develop and improve offline evaluation capabilities to measure the quality of generated code
  • Correspond with external LLM providers to identify potential collaboration opportunities
  • Key projects: Copilot Code Review (Applied Science tech lead), LLM fine-tuning for code completion
Azra AI | Lead Data Scientist | October 2019 - March 2024
  • Served as technical leader on a team whose models identify high-risk clinical findings for hundreds of thousands of patients in the US annually
  • Built, improved, and monitored NLP model pipelines in cloud-based production environments
  • Used PyTorch and Hugging Face to build deep learning models for text classification and named entity recognition
  • Designed and optimized Docker containers for efficient deployment in cloud-based production environments
  • Communicated impact and findings to stakeholders of varied technical and clinical backgrounds
  • Led applied research initiatives, assessing cutting-edge ML algorithms and bringing promising research prototypes into production
  • Developed best practices for robust evaluation of model performance and minimizing sample bias in modeling data
  • Contributed to strategic planning for the development of new products and relevant ML needs
  • Streamlined processes for repeatability and scalability, including developing Python packages for internal use
  • Note: Azra AI was previously part of Digital Reasoning and spun out as a standalone company in January 2022
Asurion | Data Scientist | March 2018 - October 2019
  • Applied methods from NLP, statistical inference, and operations research to solve a diverse set of business problems
  • Communicated findings to stakeholders ranging from highly technical to C-level
GEICO | Product Modeling Analyst III | June 2016 - March 2018
  • Built predictive models for customer behavior under various pricing scenarios
  • Conducted robust statistical analyses in support of the pricing department

Education

Georgia Institute of Technology | M.S. Analytics | Completed December 2020

University of Tennessee | B.S. Business Analytics | Completed May 2016

Open Source + Publications

Author of Applied Causal Inference, released August 2023

Implemented MEGA in Hugging Face’s Transformers library

Using Machine Learning to Accelerate Identification of Pancreatic Incidentalomas | Poster appeared at AONN+ 2023

Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff | Interpretable Machine Learning in Healthcare (IMLH) at ICML 2021 | Paper link

PsychBERT: A Mental Health Language Model for Social Media Mental Health Behavioral Analysis | IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2021 | Paper link + link to pretrained model

Non-Author Textbook Contributions

Code contributions for Transformers for Machine Learning: A Deep Dive (Kamath, Graham & Emara; Chapman & Hall, 2022); Machine Translation - Transformer vs. LSTM

Code contributions for Explainable Artificial Intelligence: An Introduction to Interpretable Machine Learning (Kamath & Liu; Springer 2021); Demo of Explainable Models