Graduate student Isabella Antonuccio-Amato will be defending her thesis titled “Large Language Models for Generating and Evaluating Education Finance Reports.”
Isabella Antonuccio-Amato thesis defense
- Date: Tuesday, April 28
- Time: Noon to 1 p.m.
- Location: BARC 1122
- Current major: M.S. in data science
- Thesis committee chair: Dr. Jim Dewey
- Committee members: Dr. Abdulaziz Alhamadani, Dr. Kathleen Hardesty, and Dr. Susan LeFrancois
Abstract
This study investigates the use of large language models (LLMs) for both generating and evaluating structured education finance reports. A multi-stage LLM pipeline was developed to produce state-level FY2025 reports modeled after State of the States (SOS) publications, incorporating source retrieval, drafting, validation, and iterative revision. To enable direct comparison with human-authored reports, a set of FY2024 reports was also generated using a single-prompt approach.
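The multi-stage pipeline described above (retrieval, drafting, validation, revision) might be organized along these lines. This is a hypothetical sketch only; all function names and the stub logic are illustrative assumptions, not the thesis's actual implementation.

```python
def retrieve_sources(state: str) -> list[str]:
    # Placeholder: a real pipeline would fetch finance documents for the state.
    return [f"{state} FY2025 budget summary"]

def draft_report(state: str, sources: list[str]) -> str:
    # Placeholder: a real pipeline would prompt an LLM with the sources.
    return f"Report for {state} based on {len(sources)} source(s)."

def validate(report: str) -> bool:
    # Placeholder check: a real validator would verify claims against sources.
    return report.startswith("Report")

def revise(report: str) -> str:
    # Placeholder: a real pipeline would feed validation feedback back to the LLM.
    return report + " [revised]"

def generate_report(state: str, max_revisions: int = 3) -> str:
    # Retrieve, draft, then loop validation and revision until checks pass
    # or the revision budget is exhausted.
    sources = retrieve_sources(state)
    report = draft_report(state, sources)
    for _ in range(max_revisions):
        if validate(report):
            break
        report = revise(report)
    return report
```

The loop structure is the point: drafting is followed by repeated validate/revise passes, mirroring the iterative revision stage named in the abstract.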
LLM-generated reports and human-authored SOS reports were compared by identifying and counting individual claims within each document. For each pair of reports, evaluators recorded the number of total, shared, and unique claims, and assessed whether each claim was factually correct. Evaluations were performed using multiple LLMs (GPT, Gemini, Mistral), with report order reversed to test for consistency and positional bias. A subset of 11 states was also evaluated by human annotators to provide a baseline for comparison.
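The per-pair bookkeeping described above (total, shared, and unique claims) reduces to simple set arithmetic once claims have been extracted. A minimal sketch, assuming claims arrive as normalized strings; in the study itself, extraction and matching were performed by LLMs or human annotators, which is exactly where order sensitivity can arise:

```python
def compare_claims(claims_a: list[str], claims_b: list[str]) -> dict[str, int]:
    """Tally total, shared, and unique claims for a pair of reports."""
    a, b = set(claims_a), set(claims_b)
    shared = a & b
    return {
        "total_a": len(a),
        "total_b": len(b),
        "shared": len(shared),
        "unique_a": len(a - shared),
        "unique_b": len(b - shared),
    }
```

Note that this set-based tally is symmetric: swapping the two reports merely mirrors the keys. An LLM evaluator, by contrast, can return different counts depending on which report it reads first, which is why the study reverses report order to probe for positional bias.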
Results indicate that LLM-based evaluations are sensitive to document ordering, with the first-presented report consistently receiving higher claim counts across models, demonstrating positional bias. Compared to human annotators, LLMs consistently identify fewer total, unique, and factual claims, while also tending to overestimate overlap between documents.
These findings highlight limitations in using LLMs as standalone evaluators and emphasize the need for careful evaluation design. This study contributes a structured framework for evaluating LLM performance in applied research settings, and highlights the importance of human oversight when using LLMs for scholarly work.
For more information, please contact Isabella Antonuccio-Amato.