Led the software development of an updated PIV functional testing tool, designed to fully align with NIST SP 800-73-5 (2024) and SP 800-78-5 (2024) standards, including support for RSA 3072-bit cryptography where applicable. This project established a clear division of responsibilities between our tool and the GSA Card Conformance Tool (CCT), ensuring robust, standards-based testing for PIV card interface conformance. Recognizing that existing test tools are based on earlier standards (SP 800-85A-4 and SP 800-85B-4 DRAFT, aligned with SP 800-73-4) and that no published test methods yet exist for SP 800-73-5, the team authored new and modified tests to meet updated functional requirements in anticipation of future standardization. The implementation approach leveraged the GSA CCT's UI, reporting, and utilities, with codebase management in Exponent's enterprise GitHub organization, secure test data storage, and project management in Azure DevOps. The tool's architecture retained the CCT's interface and reporting features, replacing core logic with SP 800-85A-aligned tests. Development followed IT SDLC and QMS best practices, including sprint-based implementation, testing on both emulated and physical cards, and thorough documentation of results.
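To illustrate the RSA 3072-bit support referenced above, a minimal sketch follows, assuming Python with the pyca/cryptography library; it shows the kind of sign/verify round trip a functional test might exercise and is not the tool's actual code.

    # Minimal sketch (assumption: Python + pyca/cryptography) of an
    # RSA-3072 sign/verify round trip of the kind a PIV functional test
    # might perform; illustrative only, not the tool's implementation.
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
    assert key.key_size == 3072  # RSA 3072 per SP 800-78-5, where applicable

    message = b"PIV data object"
    # PKCS#1 v1.5 with SHA-256 is one commonly used PIV signature suite.
    signature = key.sign(message, padding.PKCS1v15(), hashes.SHA256())
    # verify() raises InvalidSignature on failure; no exception means pass.
    key.public_key().verify(signature, message, padding.PKCS1v15(), hashes.SHA256())
    print("RSA-3072 round trip OK; signature length:", len(signature), "bytes")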
Successfully executed applied machine learning projects across multiple industries. Provided critical feedback to a publication misapplying a clustering algorithm based on t-SNE. Advised on a custom loss function to penalize false negatives for a deep learning model in TensorFlow for a prototype medical diagnostic device. Trained and evaluated machine learning models to predict which smartcard issuance sites were at risk of high issuance-failure rates. Extracted features from images and trained machine learning models to regress pig weight from facial images for a precision agriculture product, including custom evaluation metrics. Performed a cluster analysis on text logs associated with a medical device failure. Engineered features from images and 3D spatial models to train machine learning models that categorize component wear from drone imagery for a utilities company. Engineered features from time-series data to develop machine learning classification models for categorizing the outcomes of a robotic medical procedure. Performed a cluster analysis of images of diverse scenes, extracting features including colors, textures, and objects, to quantify representativeness. Piloted AI services on AWS for named entity recognition to redact PHI/PII from text notes in medical records.
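As an illustration of the custom loss mentioned above, here is a minimal sketch assuming TensorFlow/Keras; the weight value and function name are illustrative, not those used on the project.

    # Minimal sketch (assumption: TensorFlow/Keras) of a custom loss that
    # penalizes false negatives more heavily than false positives, in the
    # spirit of the loss advised for the diagnostic prototype.
    import tensorflow as tf

    def fn_weighted_bce(fn_weight=5.0):
        """Binary cross-entropy with extra penalty on false negatives.

        fn_weight > 1 makes missing a true positive (a false negative)
        cost more than a false positive; the value here is illustrative.
        """
        def loss(y_true, y_pred):
            y_true = tf.cast(y_true, y_pred.dtype)
            eps = tf.keras.backend.epsilon()
            y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
            # The first term is large when a true positive is predicted
            # as negative; fn_weight up-weights exactly that error.
            per_example = -(fn_weight * y_true * tf.math.log(y_pred)
                            + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
            return tf.reduce_mean(per_example)
        return loss

    # Usage with a hypothetical model:
    # model.compile(optimizer="adam", loss=fn_weighted_bce(fn_weight=5.0))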
Managed a number of AI research projects related to AI products and the competitive landscape. Performed a literature review of published work on mood/emotion labeling for music, aimed at determining the tag schemas used to train machine learning models, including the methodologies, advantages, and limitations of each approach.
Collected and summarized secondary research on the evolving news-creation landscape, including how publishers are updating their practices and business models in response to changing news creation and consumption patterns, with a focus on independent publishers and news influencers.
Collected and summarized secondary research on the market landscape of alternative content sources for factual information that are popular among Gen Z users.
Performed desk research, conducting a targeted review of industry reports, collations of pre-existing public surveys, and published academic papers, including systematic reviews published in the past five years for the U.S. market. The review 1) provided a top-level summary of the landscape of internet usage for health information, defining the key health domains and subcategories identified; 2) characterized the underlying user journeys; and 3) provided an overview of potential data sources for web scraping or API queries to obtain relevant quantitative data (social media and non-social sites). We abstracted and reported descriptive statistics and the health domains and subcategories that arose from the literature search to facilitate planning for future survey design and data collection.
Performed desk research on consumer perceptions of digital safety, cybersecurity (e.g., security against bad actors and hacking), and GenAI. The review focused on academic literature in English from the past three to five years, selected for relevance, and was supplemented by sources such as non-academic surveys, government sources, and preliminary analyses of opinions across social media, blogs, and online forums. Conducted a comparative review of in-product messaging from generative AI competitors related to reassurance and transparency moments about digital safety, cybersecurity, and data privacy.
Extended previous work to generate labels for 1,000 responses across 8 question types (8,000 total). This work provided additional labels for the 8,000 responses, including superordinate and subordinate category labels to further refine the annotations generated previously, as well as a second set of labels categorizing each response with positive or negative sentiment. In addition to annotation, Exponent provided data visualizations showing the occurrence of labels with respect to respondent demographics.
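A minimal sketch of the kind of label-occurrence visualization delivered, assuming pandas and matplotlib; the column names and toy data are hypothetical.

    # Minimal sketch (assumptions: pandas + matplotlib; data and column
    # names are hypothetical) of label occurrence broken out by a
    # respondent demographic.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.DataFrame({
        "age_group": ["18-29", "30-44", "18-29", "45-64", "30-44", "45-64"],
        "category":  ["price", "quality", "quality", "price", "price", "quality"],
        "sentiment": ["negative", "positive", "positive", "negative", "positive", "negative"],
    })
    # Cross-tabulate category labels against the demographic of interest.
    counts = pd.crosstab(df["category"], df["age_group"])
    counts.plot(kind="bar", stacked=True)
    plt.ylabel("Number of responses")
    plt.title("Label occurrence by age group")
    plt.tight_layout()
    plt.show()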
Conducted a literature review to synthesize publications on the value proposition and key use cases for conversational AI technology and/or multimodal interfaces that include natural language input. The study outcomes provided a core understanding of the unique value of conversational interfaces, the use cases for which the technology is best suited, and a prioritization of the potential applications of conversational AI.
Applied statistical learning and AI methods to address traditional business problems in new ways while incorporating the principles of security, explainability, and reduced algorithmic bias. Dustin's team links electronic health records (EHR) with various supplemental data sources to administer the databases underlying pharmacoepidemiologic studies of post-marketing safety and effectiveness. Additionally, he performs statistical programming in the execution of these studies, particularly when complex multivariate cohort definitions, diagnoses, and medical codes are involved.
Applied AI and Machine Learning approaches to evaluate data in a number of commercial and defense settings. The commercial work includes analysis of data from healthcare, medical device, utilities, and environmental science clients. The defense work focuses on the management of large databases and analyses for Department of Defense personnel data. This work includes fundamental research on supply chain security, identity management, insider threat, and cyber threat detection and mitigation.
Conducted a literature and data review, data collection, data analysis, and an evaluation of statistical techniques for quantifying risk mitigation for suppliers along supply chains. Quantified aspects of supply chain risk, including integrity, resilience, and trust, using knowledge graph representations and graph-theoretic methods. Demonstrated supply chain robustness quantification on examples of real-world supply chains, including quantification of risk due to insider threats.
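A minimal sketch of the graph-theoretic approach, assuming Python with networkx; the example supply chain and node names are hypothetical, and the connectivity measure shown is one simple robustness proxy among several.

    # Minimal sketch (assumption: Python + networkx; the supply chain is
    # hypothetical) of graph-theoretic robustness quantification: how many
    # node-disjoint sourcing paths exist, and what a single-supplier loss does.
    import networkx as nx

    G = nx.DiGraph()
    G.add_edges_from([
        ("raw_A", "supplier_1"), ("raw_A", "supplier_2"),
        ("supplier_1", "assembler"), ("supplier_2", "assembler"),
        ("assembler", "product"),
    ])

    # Node connectivity from raw material to product counts node-disjoint
    # supply paths; here "assembler" is a single point of failure.
    print("disjoint paths:", nx.node_connectivity(G, "raw_A", "product"))

    # Removing one supplier tests whether redundancy preserves sourcing.
    H = G.copy()
    H.remove_node("supplier_1")
    print("still reachable:", nx.has_path(H, "raw_A", "product"))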
Helped establish the Long-Term Care Data Cooperative (ltcdatacooperative.org), where I am the Lead Data Scientist building a first-of-its-kind cloud architecture to aggregate and harmonize electronic health record data from 10,000+ nursing home facilities across the U.S. to enable public health surveillance, care coordination, and research on a nationwide scale. Navigated a complex ecosystem of health data aggregators, pharmaceutical companies, regulatory agencies, long-term care provider networks, academic research groups, and related cloud computing and software vendors. Coordinated a team of data scientists, information security and technology experts, epidemiologists, and public health professionals to build the cloud platform on Amazon Web Services (AWS) to provide researchers with data access.
Led data ingestion and cloud architecture design for a large dataset of electronic health records (EHR). Conducted several pharmacoepidemiologic studies including a retrospective cohort study, a case-crossover analysis, and a trial replication study.
Integrated with a client's software and database infrastructure to assist with data cleaning and transformation prior to loading into a structured database. Utilized the client's task management and version control systems to integrate with multiple client teams. Assisted in incorporating Python code into the client's production environment.
Led smart card durability laboratory testing tasks for several clients, including commercial card manufacturers and, through our long-standing relationship with the DoD Defense Manpower Data Center (DMDC), the Common Access Card (CAC) program. I am involved in all stages of the CAC card lifecycle, beginning with ISO/IEC 7816 and ISO/IEC 10373 testing on blank and personalized cardstock from multiple vendors. I have personally spent hundreds of hours performing ISO/IEC 10373 smart card tests in Exponent's mechanical, chemical, electrical, and identity labs. Additionally, I contribute to Exponent's historical data analysis monitoring issuance and field failures of the CAC. I lead our digital smart card effort, performing ISO/IEC 14443 testing, and have personally performed over 100 hours of ISO/IEC 14443 testing on smart cards and secure identity documents. In addition to the CAC program, I have conducted secure identity document physical and digital testing for several clients.
Ingested and maintained a data warehouse preserving historical records and updating the status of active personnel for over 20 years. Data transfer is performed over a secure network connection from the client's relational database servers to our in-house SAS database workstation. Within this relational database structure, various queries extract rich subsets of the data for specific analyses. Our team of data scientists and statisticians performs recurring monthly analyses in both SAS and Python to track trends over time, giving key insight into the agency's personnel records and field-deployed equipment. These analyses include tracking of hardware failure modes in the field over time and predictive analytics with machine learning techniques to reduce overall failure rates. Our contributions have resulted in millions of dollars in saved costs and a deeper understanding of the client's multifaceted data assets through data-driven reporting and data visualizations.
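A minimal sketch of the Python side of the recurring trend analysis, assuming pandas; the figures and column names are hypothetical placeholders, not client data.

    # Minimal sketch (assumption: pandas; all values and column names are
    # hypothetical) of a monthly failure-rate trend with a rolling average
    # of the kind used to track field trends over time.
    import pandas as pd

    records = pd.DataFrame({
        "month": pd.period_range("2023-01", periods=6, freq="M"),
        "failures": [14, 18, 11, 22, 16, 19],
        "cards_issued": [4000, 4200, 3900, 4500, 4100, 4300],
    })
    records["failure_rate"] = records["failures"] / records["cards_issued"]
    # A 3-month rolling mean smooths month-to-month noise in the trend.
    records["rate_3mo_avg"] = records["failure_rate"].rolling(3).mean()
    print(records)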
Aggregated medical data from 100+ hospitals for thousands of events where a medical device malfunctioned and further analysis was required for FDA compliance. This data included patient records, medical device logs and investigation reports, hospital training records, and device testing reports; collecting it required interfacing with the client's database endpoints. Data files were exported from the client's enterprise data management system and transmitted to Exponent via secure network file transfer. Each data type had a custom ETL pipeline, including time-series analysis of log files and text analysis of interviews and investigation reports. After the data was collected and processed, a multivariate analysis was conducted to extract new insights into the failure investigations, including machine learning modeling, text mining and natural language processing, and application of advanced statistical methods. For example, a correlation analysis was performed to understand the relationship between recurring errors in device log files and the propensity for a device to cause harm to a patient. Additionally, a cluster analysis was performed using NLP techniques to discover patterns in the text of investigation reports that may be related to hidden similarities in device failures. The analysis resulted in a report that elucidated underlying patterns and correlations not initially apparent to investigators.
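A minimal sketch of the NLP cluster analysis described above, assuming scikit-learn with TF-IDF features and k-means; the report snippets and parameters are fabricated placeholders, and the project pipeline may have differed.

    # Minimal sketch (assumption: scikit-learn; report snippets are
    # placeholders) of clustering investigation-report text: TF-IDF
    # features, then k-means to surface groups of similar failure narratives.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    reports = [
        "battery depleted earlier than expected during procedure",
        "unexpected shutdown traced to battery fault",
        "sensor calibration drift observed before alarm",
        "alarm triggered by out-of-range sensor reading",
    ]
    X = TfidfVectorizer(stop_words="english").fit_transform(reports)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    for label, text in zip(km.labels_, reports):
        print(label, text)  # reports sharing a label form a candidate pattern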