Hi, this is Muyu Pan. I am an undergraduate student at The Pennsylvania State University, Schreyer Honors College, pursuing a B.S. in Computer Science with a minor in Mathematics. I have the privilege of conducting research under the mentorship of Prof. Rui Zhang and Prof. Mahfuza Farooque.
My research interests cover AI Safety (Hallucination Detection and Mitigation) and Applied AI (LLMs for decision-making, Robotic Learning, and AI for Engineering). I am currently working on using model-internal signals to mitigate hallucination through reinforcement learning with GRPO. In the long run, I aim to develop safe and robust machine learning frameworks that bridge the gap between academic innovation and real-world utility. I am always happy to chat and discuss potential collaborations. Feel free to contact me via email at mfp5696 AT psu.edu.
Publications
Analysis of Automated Theorem Proving through Abstraction of Logical Expressions
Published in National Conference on Undergraduate Research (NCUR), 2025
Fine-Tuned Large Language Models for Logical Translation: Reducing Hallucinations with Lang2Logic
Published in 2025 IEEE International Symposium on Networks, Computers and Communications (ISNCC), 2025
Awards
- 2025: Penn State Computer Science and Engineering Department and Schreyer Honors College Conference Travel Grant ($1000)
- 2023: Schreyer Honors Scholar
Research Projects
AI Safety
Time: 2025 - Present
Location: Penn State University
Advisor: Prof. Rui Zhang
Skills: Group Relative Policy Optimization (GRPO) Reinforcement Learning, Uncertainty Quantification (UQ), Retrieval-Augmented Generation (RAG)
I am currently developing a safety framework that uses model-internal signals to detect and mitigate hallucinations in Large Language Models. Using reinforcement learning with Group Relative Policy Optimization (GRPO), the system dynamically adjusts model outputs to align with factual correctness, with the goal of more robust and safe AI decision-making agents.
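As a rough illustration of the group-relative idea behind GRPO (a minimal sketch, not the project's code): several answers are sampled for the same prompt, each receives a scalar reward, and each reward is normalized against its own group's mean and standard deviation. The reward values and the function name below are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar per sampled answer to the same prompt.
    `eps` avoids division by zero when every answer gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt,
# e.g. 1.0 = factually grounded, 0.0 = hallucinated.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Answers scoring above the group average get a positive advantage and those below get a negative one, so no separate value model is needed to provide a baseline.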
Hallucination Guardrail for NCATS Summarizer
Time: 2025
Location: Penn State University
Advisor: Prof. David Koslicki
Skills: NVIDIA NeMo, Guardrails AI, Knowledge graph
I developed a hallucination guardrail system for the NCATS (National Center for Advancing Translational Sciences) Translator Summarizer. This system is designed to verify and constrain the outputs of Large Language Models when summarizing complex biomedical data, ensuring that the generated insights are factually grounded and free from fabrications, thereby increasing trust in AI-assisted translational science.
Retraining the Shepherd Diffusion Model for Drug Design
Time: 2025
Location: Penn State University
Supervisor: Philip Hanoian
Skills: PyTorch, Diffusion Models, Model Training
This project explores the application of the Shepherd diffusion model to the specialized domain of boron-containing drug design. By adapting the diffusion architecture to account for the unique electrostatic and shape properties of boron, I aim to generate novel molecular structures that optimize pharmacophore matching, facilitating the discovery of bioisosteric drug candidates with enhanced therapeutic potential.
Lang2Logic: Reducing Hallucinations in Logical Translation
Time: 2024
Location: Penn State University
Advisor: Prof. Mahfuza Farooque
Skills: NLP, Supervised Fine-Tuning (SFT), SciPy
To address the unreliability of Large Language Models in formal reasoning, I developed ‘Lang2Logic’, a framework that translates natural language into Conjunctive Normal Form (CNF) for automated theorem proving. I fine-tuned LLMs and constrained their outputs using a custom-defined grammar and symbolic computation libraries. The resulting pipeline significantly reduces hallucinations and enables the expansion of SAT solver domains from logical verification to finance, healthcare, and other fields.
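To make the CNF target concrete, here is a minimal sketch of the symbolic step only: converting an already-parsed propositional formula into clauses a SAT solver can consume. It is not the Lang2Logic implementation; the nested-tuple representation and every function name are hypothetical choices made for this illustration.

```python
# Formulas are nested tuples (a hypothetical representation for this sketch):
#   ("var", "p"), ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g)

def eliminate_implications(f):
    """Rewrite a -> b as (not a) or b."""
    op = f[0]
    if op == "var":
        return f
    if op == "not":
        return ("not", eliminate_implications(f[1]))
    a, b = eliminate_implications(f[1]), eliminate_implications(f[2])
    return ("or", ("not", a), b) if op == "implies" else (op, a, b)


def push_negations(f):
    """Move negations down to variables (expects no 'implies' nodes)."""
    op = f[0]
    if op == "var":
        return f
    if op == "not":
        g = f[1]
        if g[0] == "var":
            return f
        if g[0] == "not":  # double negation
            return push_negations(g[1])
        dual = "or" if g[0] == "and" else "and"  # De Morgan
        return (dual, push_negations(("not", g[1])), push_negations(("not", g[2])))
    return (op, push_negations(f[1]), push_negations(f[2]))


def distribute(f):
    """Distribute 'or' over 'and' so the result is a conjunction of clauses."""
    op = f[0]
    if op in ("var", "not"):
        return f
    a, b = distribute(f[1]), distribute(f[2])
    if op == "and":
        return ("and", a, b)
    if a[0] == "and":
        return ("and", distribute(("or", a[1], b)), distribute(("or", a[2], b)))
    if b[0] == "and":
        return ("and", distribute(("or", a, b[1])), distribute(("or", a, b[2])))
    return ("or", a, b)


def literals(f):
    """Collect the literals of a single clause as strings like 'p' or '~p'."""
    if f[0] == "or":
        return literals(f[1]) | literals(f[2])
    return {"~" + f[1][1]} if f[0] == "not" else {f[1]}


def to_clauses(f):
    """Return the CNF of f as a set of clauses (frozensets of literals)."""
    def split(g):
        if g[0] == "and":
            return split(g[1]) | split(g[2])
        return {frozenset(literals(g))}
    return split(distribute(push_negations(eliminate_implications(f))))


# "If p and q, then r" becomes the single clause (~p or ~q or r).
p, q, r = ("var", "p"), ("var", "q"), ("var", "r")
rule = ("implies", ("and", p, q), r)
rule_clauses = to_clauses(rule)
```

This naive distribution can grow exponentially on large formulas, so it is meant only to show what the CNF output looks like, not how a production pipeline would generate it.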
Nanophaser: Bioinformatics Python Package
Time: 2024
Location: Penn State University
Advisor: Prof. István Albert
Skills: Python, Bioinformatics, Algorithm Design
Addressing the limitations of existing genomic tools, I developed Nanophaser, a specialized Python package for analyzing DNA sequencing data. This tool was optimized for MinION sequencing technology to identify genetic variants and perform variant phasing at multi-allelic loci. By creating a robust pipeline for haplotype classification, I provided researchers with a more accurate method for characterizing complex immune system genes.
Neural Networks for Vaccine Discovery
Time: 2023 - 2024
Location: Penn State University
Advisor: Prof. Vivek Kapur
Skills: Python, Neural Networks, Data Analysis
To reduce the wet-lab costs of vaccine research, I used the NetMHCIIpan-4.3 Artificial Neural Network (ANN) to predict immune-triggering peptides in Tuberculosis antigens, focusing on their binding patterns with MHC II. I processed large-scale sequencing data and validated the model’s predictions against experimental results using R² correlation and Jaccard similarity, demonstrating the effectiveness of computational methods in identifying potential vaccine candidates.
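The two validation metrics mentioned above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the project's analysis code: R² is taken here to mean the squared Pearson correlation between predicted and measured scores, and the peptide labels and numbers are placeholders rather than real data.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


# Placeholder peptide labels: predicted binders vs. experimentally confirmed ones.
predicted = {"pep1", "pep2", "pep3"}
confirmed = {"pep2", "pep3", "pep4"}
overlap = jaccard(predicted, confirmed)

# Placeholder scores: predicted binding score vs. measured response.
fit = r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Jaccard similarity compares which peptides are selected, while R² compares how well the continuous scores track each other, so the two metrics answer different questions about the same predictions.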
