Home
Workflows
Tags
Blog
Premium
About
Home
/
Workflows
/
Automate LLM Testing with GPT-4 Judge & Google Sheets Tracking
Automate LLM Testing with GPT-4 Judge & Google Sheets Tracking
by Adam Janes
•
Updated: Last update 2 months ago
•
Source:
n8n.io
Loading workflow viewer...
Tags
Engineering
AI Summarization
Getting Started
Free to Download
Details
Content
How it works
The workflow loads a list of test cases from a Google Sheet (previous results stored from an LLM)
For each test case, we execute a call to an LLM judge in parallel (using HTTP Request + Webhook nodes)
The judge uses the Input, Output, and Reference Answer fields from the spreadsheet to mark each LLM response as Pass/Fail
The results are logged into a separate sheet in the same Sheets file.
Set up steps:
Add your credentials for Google Sheets and OpenRouter (or replace the OpenRouter node with your favourite chat model).
Make a copy of the example Sheet to populate it with you own test data.
Run the workflow with the Execute Workflow button next to the Manual Trigger node.
Related Workflows
Benchmark LLM Performance on Legal Documents with Google Sheets and OpenRouter
by Adam Janes
Engineering
AI Summarization
Simple Eval for Legal Benchmarking
by Adam Janes
Engineering
AI Summarization
Auto-Generate Meeting Attendee Research with GPT-4o, Google Calendar, and Gmail
by Adam Janes
Personal Productivity
AI Summarization