gAImble

Project Author
Issue Date
2025-05-05
Authors
Wolff, William
Embargo
First Reader
Luman, Douglas J.
Additional Readers
Green, Morgan
Keywords
LLM reasoning
Distribution
Abstract
Since the rise in popularity of large language models such as GPT, Gemini, and Claude, we have been seeing an advance toward a new age of technology. Despite what is now known about large language models, open questions remain about how advanced they are. This project tests how well large language models can reason in a risk/reward scenario, with blackjack as the chosen scenario. The average player wins 42% of the time. Using basic blackjack strategy as a baseline for evaluation, we examine whether three large language models (GPT, Gemini, and Claude) can reason and perform more than 2% better or worse than the average player. For a model's play to be treated as reasoning, the model cannot make an illegal move more than 3% of the time. Furthermore, this project examines whether the models can reason in more than 20% of the total runs carried out.
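The thresholds stated in the abstract can be illustrated with a minimal sketch. This is not the project's actual code. The field names, the `evaluate` function, and the reading of "2%" as percentage points relative to the 42% average win rate are all assumptions made for illustration. How a run is judged to involve reasoning is not specified in the abstract, so it is represented here only as a count.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract; their exact interpretation is an assumption.
AVERAGE_WIN_RATE = 0.42   # average player win rate
WIN_RATE_MARGIN = 0.02    # "more than 2% better or worse", read as percentage points
MAX_ILLEGAL_RATE = 0.03   # illegal moves allowed at most 3% of the time
MIN_REASONED_RATE = 0.20  # reasoning expected in more than 20% of runs


@dataclass
class RunSummary:
    """Hypothetical per-model tallies; names are illustrative only."""
    games_played: int
    games_won: int
    moves_made: int
    illegal_moves: int
    runs_total: int
    runs_with_reasoning: int


def evaluate(summary: RunSummary) -> dict:
    """Compare one model's tallies against the abstract's thresholds."""
    win_rate = summary.games_won / summary.games_played
    illegal_rate = summary.illegal_moves / summary.moves_made
    reasoned_rate = summary.runs_with_reasoning / summary.runs_total

    # Classify the win rate relative to the average player.
    diff = win_rate - AVERAGE_WIN_RATE
    if diff > WIN_RATE_MARGIN:
        verdict = "better"
    elif diff < -WIN_RATE_MARGIN:
        verdict = "worse"
    else:
        verdict = "comparable"

    return {
        "win_rate": win_rate,
        "verdict": verdict,
        "legal_enough": illegal_rate <= MAX_ILLEGAL_RATE,
        "reasons_enough": reasoned_rate > MIN_REASONED_RATE,
    }


if __name__ == "__main__":
    # Made-up tallies, used only to show the call shape.
    example = RunSummary(1000, 450, 2000, 20, 1000, 300)
    print(evaluate(example))
```

The tallies in the usage example are invented placeholders, not results from the project.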
Description
Chair
Major
Computer Science
Department
Computer and Information Science