Capstone Project
TECHNOLOGY CATEGORY
Machine Learning / AI
INDUSTRY SPONSOR
Microsoft
PROBLEM SPACE
Open Source Tech
DATE COMPLETED
March 15, 2024
Evaluation Copilot
Problem
The rapid integration of Large Language Models (LLMs) into app development introduces challenges in understanding and trusting AI-generated responses.
Approach
Our team combined user-centered design with technological innovation, starting with extensive user research to understand developers’ needs and pain points. We iteratively developed a series of prototypes following an Agile framework, incorporating feedback from user testing sessions and findings from the latest research in the field.
Solution
The “Evaluation Copilot” is a web app that demystifies LLM evaluation metrics for developers, offering an intuitive platform to test, understand, and refine AI-generated text. It provides clear, actionable feedback on how to improve prompts for better LLM responses, ensuring developers can enhance AI reliability and effectiveness in their applications.