Led the software development of an updated PIV functional testing tool, designed to fully align with NIST SP 800-73-5 (2024) and SP 800-78-5 (2024) standards, including support for RSA 3072-bit cryptography where applicable. This project established a clear division of responsibilities between our tool and the GSA Card Conformance Tool (CCT), ensuring robust, standards-based testing for PIV card interface conformance. Recognizing that existing test tools are based on earlier standards (SP 800-85A-4 and SP 800-85B-4 DRAFT, aligned with SP 800-73-4) and that no published test methods yet exist for SP 800-73-5, the team authored new and modified tests to meet updated functional requirements in anticipation of future standardization. The implementation approach leveraged the GSA CCT's UI, reporting, and utilities, with codebase management in Exponent's enterprise GitHub organization, secure test data storage, and project management in Azure DevOps. The tool's architecture retained the CCT's interface and reporting features, replacing core logic with SP 800-85A-aligned tests. Development followed IT SDLC and QMS best practices, including sprint-based implementation, testing on both emulated and physical cards, and thorough documentation of results.
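To illustrate the RSA 3072-bit support referenced above, a minimal sketch follows, assuming Python with the pyca/cryptography library; it shows the kind of sign/verify round trip a functional test might exercise and is not the tool's actual code.

    # Minimal sketch (assumption: Python + pyca/cryptography) of an
    # RSA-3072 sign/verify round trip of the kind a PIV functional test
    # might perform; illustrative only, not the tool's implementation.
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
    assert key.key_size == 3072  # RSA 3072 per SP 800-78-5, where applicable

    message = b"PIV data object"
    # PKCS#1 v1.5 with SHA-256 is one commonly used PIV signature suite.
    signature = key.sign(message, padding.PKCS1v15(), hashes.SHA256())
    # verify() raises InvalidSignature on failure; no exception means pass.
    key.public_key().verify(signature, message, padding.PKCS1v15(), hashes.SHA256())
    print("RSA-3072 round trip OK; signature length:", len(signature), "bytes")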
Successfully executed applied machine learning projects across multiple industries. Provided critical feedback to a publication misapplying a clustering algorithm based on t-SNE. Advised on a custom loss function to penalize false negatives for a deep learning model in TensorFlow for a prototype medical diagnostic device. Trained and evaluated machine learning models to predict which smartcard issuance sites were at risk of high issuance-failure rates. Extracted features from images and trained machine learning models to regress pig weight from facial images for a precision agriculture product, including custom evaluation metrics. Performed a cluster analysis on text logs associated with a medical device failure. Engineered features from images and 3D spatial models to train machine learning models that categorize component wear from drone imagery for a utilities company. Engineered features from time-series data to develop machine learning classification models for categorizing the outcomes of a robotic medical procedure. Performed a cluster analysis of images of diverse scenes, extracting features including colors, textures, and objects, to quantify representativeness. Piloted AI services on AWS for named entity recognition to redact PHI/PII from text notes in medical records.
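As an illustration of the custom loss mentioned above, here is a minimal sketch assuming TensorFlow/Keras; the weight value and function name are illustrative, not those used on the project.

    # Minimal sketch (assumption: TensorFlow/Keras) of a custom loss that
    # penalizes false negatives more heavily than false positives, in the
    # spirit of the loss advised for the diagnostic prototype.
    import tensorflow as tf

    def fn_weighted_bce(fn_weight=5.0):
        """Binary cross-entropy with extra penalty on false negatives.

        fn_weight > 1 makes missing a true positive (a false negative)
        cost more than a false positive; the value here is illustrative.
        """
        def loss(y_true, y_pred):
            y_true = tf.cast(y_true, y_pred.dtype)
            eps = tf.keras.backend.epsilon()
            y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
            # The first term is large when a true positive is predicted
            # as negative; fn_weight up-weights exactly that error.
            per_example = -(fn_weight * y_true * tf.math.log(y_pred)
                            + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
            return tf.reduce_mean(per_example)
        return loss

    # Usage with a hypothetical model:
    # model.compile(optimizer="adam", loss=fn_weighted_bce(fn_weight=5.0))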
Managed a number of AI research projects related to AI products and the competitive landscape. Performed a literature review of published work on mood/emotion labeling for music, aimed at determining the tag schemas used to train machine learning models, including the methodologies, advantages, and limitations of each approach.
Collected and summarized secondary research on the evolving news-creation landscape, including how publishers are updating their practices and business models in response to changing news creation and consumption patterns, with a focus on independent publishers and news influencers.
Collected and summarized secondary research on the market landscape of alternative content sources for factual information that are popular among Gen Z users.
Performed desk research, conducting a targeted review of industry reports, collations of pre-existing public surveys, and published academic papers, including systematic reviews published in the past five years for the U.S. market. The review 1) provided a top-level summary of the landscape of internet usage for health information, defining the key health domains and subcategories identified; 2) characterized the underlying user journeys; and 3) provided an overview of potential data sources for web scraping or API queries to obtain relevant quantitative data (social media and non-social sites). We abstracted and reported descriptive statistics and the health domains and subcategories that arose from the literature search to facilitate planning for future survey design and data collection.
Performed desk research on consumer perceptions of digital safety, cybersecurity (e.g., security against bad actors and hacking), and GenAI. The review focused on academic literature in English from the past three to five years, selected for relevance, and was supplemented by sources such as non-academic surveys, government sources, and preliminary analyses of opinions across social media, blogs, and online forums. Conducted a comparative review of in-product messaging from generative AI competitors related to reassurance and transparency moments about digital safety, cybersecurity, and data privacy.
Extended previous work to generate labels for 1,000 responses across 8 question types (8,000 total). This work provided additional labels for the 8,000 responses, including superordinate and subordinate category labels to further refine the annotations generated previously, as well as a second set of labels categorizing each response with positive or negative sentiment. In addition to annotation, Exponent provided data visualizations showing the occurrence of labels with respect to respondent demographics.
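A minimal sketch of the kind of label-occurrence visualization delivered, assuming pandas and matplotlib; the column names and toy data are hypothetical.

    # Minimal sketch (assumptions: pandas + matplotlib; data and column
    # names are hypothetical) of label occurrence broken out by a
    # respondent demographic.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.DataFrame({
        "age_group": ["18-29", "30-44", "18-29", "45-64", "30-44", "45-64"],
        "category":  ["price", "quality", "quality", "price", "price", "quality"],
        "sentiment": ["negative", "positive", "positive", "negative", "positive", "negative"],
    })
    # Cross-tabulate category labels against the demographic of interest.
    counts = pd.crosstab(df["category"], df["age_group"])
    counts.plot(kind="bar", stacked=True)
    plt.ylabel("Number of responses")
    plt.title("Label occurrence by age group")
    plt.tight_layout()
    plt.show()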
Conducted a literature review to synthesize publications on the value proposition and key use cases for conversational AI technology and/or multimodal interfaces that include natural language input. The study outcomes provided a core understanding of the unique value of conversational interfaces, the use cases for which the technology is best suited, and a prioritization of the potential applications of conversational AI.
Applied statistical learning and AI methods to address traditional business problems in new ways while incorporating the principles of security, explainability, and reduced algorithmic bias. Dustin's team links electronic health records (EHR) with various supplemental data sources to administer the databases underlying pharmacoepidemiologic studies of post-marketing safety and effectiveness. Additionally, he performs statistical programming in the execution of these studies, particularly when complex multivariate cohort definitions, diagnoses, and medical codes are involved.
Applied AI and Machine Learning approaches to evaluate data in a number of commercial and defense settings. The commercial work includes analysis of data from healthcare, medical device, utilities, and environmental science clients. The defense work focuses on the management of large databases and analyses for Department of Defense personnel data. This work includes fundamental research on supply chain security, identity management, insider threat, and cyber threat detection and mitigation.
Conducted a literature and data review, data collection, data analysis, and an evaluation of statistical techniques for quantifying risk mitigation for suppliers along supply chains. Quantified aspects of supply chain risk, including integrity, resilience, and trust, using knowledge graph representations and graph-theoretic methods. Demonstrated supply chain robustness quantification on examples of real-world supply chains, including quantification of risk due to insider threats.
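A minimal sketch of the graph-theoretic approach, assuming Python with networkx; the example supply chain and node names are hypothetical, and the connectivity measure shown is one simple robustness proxy among several.

    # Minimal sketch (assumption: Python + networkx; the supply chain is
    # hypothetical) of graph-theoretic robustness quantification: how many
    # node-disjoint sourcing paths exist, and what a single-supplier loss does.
    import networkx as nx

    G = nx.DiGraph()
    G.add_edges_from([
        ("raw_A", "supplier_1"), ("raw_A", "supplier_2"),
        ("supplier_1", "assembler"), ("supplier_2", "assembler"),
        ("assembler", "product"),
    ])

    # Node connectivity from raw material to product counts node-disjoint
    # supply paths; here "assembler" is a single point of failure.
    print("disjoint paths:", nx.node_connectivity(G, "raw_A", "product"))

    # Removing one supplier tests whether redundancy preserves sourcing.
    H = G.copy()
    H.remove_node("supplier_1")
    print("still reachable:", nx.has_path(H, "raw_A", "product"))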
Helped establish the Long-Term Care Data Cooperative (ltcdatacooperative.org), where I am the Lead Data Scientist building a first-of-its-kind cloud architecture to aggregate and harmonize electronic health record data from 10,000+ nursing home facilities across the U.S. to enable public health surveillance, care coordination, and research on a nationwide scale. Navigated a complex ecosystem of health data aggregators, pharmaceutical companies, regulatory agencies, long-term care provider networks, academic research groups, and related cloud computing and software vendors. Coordinated a team of data scientists, information security and technology experts, epidemiologists, and public health professionals to build the cloud platform on Amazon Web Services (AWS) to provide researchers with data access.
Led data ingestion and cloud architecture design for a large dataset of electronic health records (EHR). Conducted several pharmacoepidemiologic studies including a retrospective cohort study, a case-crossover analysis, and a trial replication study.
Integrated with a client's software and database infrastructure to assist with data cleaning and transformation prior to loading into a structured database. Utilized the client's task management and version control systems to integrate with multiple client teams. Assisted in incorporating Python code into the client's production environment.
Led smart card durability laboratory testing tasks for several clients, including commercial card manufacturers and, through our long-standing relationship with the DoD Defense Manpower Data Center (DMDC), the Common Access Card (CAC) program. I am involved in all stages of the CAC card lifecycle, beginning with ISO/IEC 7816 and ISO/IEC 10373 testing on blank and personalized cardstock from multiple vendors. I have personally spent hundreds of hours performing ISO/IEC 10373 smart card tests in Exponent's mechanical, chemical, electrical, and identity labs. Additionally, I contribute to Exponent's historical data analysis monitoring issuance and field failures of the CAC. I lead our digital smart card effort, performing ISO/IEC 14443 testing, and have personally performed over 100 hours of ISO/IEC 14443 testing on smart cards and secure identity documents. In addition to the CAC program, I have conducted secure identity document physical and digital testing for several clients.
Ingested and maintained a data warehouse preserving historical records and updating the status of active personnel for over 20 years. Data transfer is performed over a secure network connection from the client's relational database servers to our in-house SAS database workstation. Within this relational database structure, various queries extract rich subsets of the data for specific analyses. Our team of data scientists and statisticians performs recurring monthly analyses in both SAS and Python to track trends over time, giving key insight into the agency's personnel records and field-deployed equipment. These analyses include tracking of hardware failure modes in the field over time and predictive analytics with machine learning techniques to reduce overall failure rates. Our contributions have resulted in millions of dollars in saved costs and a deeper understanding of the client's multifaceted data assets through data-driven reporting and data visualizations.
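A minimal sketch of the Python side of the recurring trend analysis, assuming pandas; the figures and column names are hypothetical placeholders, not client data.

    # Minimal sketch (assumption: pandas; all values and column names are
    # hypothetical) of a monthly failure-rate trend with a rolling average
    # of the kind used to track field trends over time.
    import pandas as pd

    records = pd.DataFrame({
        "month": pd.period_range("2023-01", periods=6, freq="M"),
        "failures": [14, 18, 11, 22, 16, 19],
        "cards_issued": [4000, 4200, 3900, 4500, 4100, 4300],
    })
    records["failure_rate"] = records["failures"] / records["cards_issued"]
    # A 3-month rolling mean smooths month-to-month noise in the trend.
    records["rate_3mo_avg"] = records["failure_rate"].rolling(3).mean()
    print(records)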
Aggregated medical data from 100+ hospitals for thousands of events where a medical device malfunctioned and further analysis was required for FDA compliance. This data included patient records, medical device logs and investigation reports, hospital training records, and device testing reports; collecting it required interfacing with the client's database endpoints. Data files were exported from the client's enterprise data management system and transmitted to Exponent via secure network file transfer. Each data type had a custom ETL pipeline, including time-series analysis of log files and text analysis of interviews and investigation reports. After the data was collected and processed, a multivariate analysis was conducted to extract new insights into the failure investigations, including machine learning modeling, text mining and natural language processing, and application of advanced statistical methods. For example, a correlation analysis was performed to understand the relationship between recurring errors in device log files and the propensity for a device to cause harm to a patient. Additionally, a cluster analysis was performed using NLP techniques to discover patterns in the text of investigation reports that may be related to hidden similarities in device failures. The analysis resulted in a report that elucidated underlying patterns and correlations not initially apparent to investigators.
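A minimal sketch of the NLP cluster analysis described above, assuming scikit-learn with TF-IDF features and k-means; the report snippets and parameters are fabricated placeholders, and the project pipeline may have differed.

    # Minimal sketch (assumption: scikit-learn; report snippets are
    # placeholders) of clustering investigation-report text: TF-IDF
    # features, then k-means to surface groups of similar failure narratives.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    reports = [
        "battery depleted earlier than expected during procedure",
        "unexpected shutdown traced to battery fault",
        "sensor calibration drift observed before alarm",
        "alarm triggered by out-of-range sensor reading",
    ]
    X = TfidfVectorizer(stop_words="english").fit_transform(reports)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    for label, text in zip(km.labels_, reports):
        print(label, text)  # reports sharing a label form a candidate pattern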