
    Fine-tuning an LLM Judge to Reduce Hallucination


    Webinar

    July 17 at 8:00am PT

    What to expect?

    In this webinar, we explore leveraging out-of-domain data to improve the fine-tuning of Mistral AI language models for detecting factual inconsistencies, also known as hallucinations.

    Inspired by Eugene Yan’s article on bootstrapping hallucination detection, we use the Factual Inconsistency Benchmark (FIB) dataset and initially fine-tune a Mistral-based model solely on this dataset, achieving limited success. We then pre-finetune on Wikipedia summaries from the Unified Summarization Benchmark (USB) before applying task-specific fine-tuning on FIB.
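
    A minimal sketch of this two-stage recipe, assuming Hugging Face's trl SFTTrainer; the dataset identifiers, model name, and checkpoint paths below are illustrative placeholders, not the exact artifacts from the webinar:

        from datasets import load_dataset
        from trl import SFTTrainer

        # Stage 1: pre-finetune the base model on out-of-domain
        # Wikipedia summaries (USB). Dataset id is hypothetical;
        # rows are assumed to carry a "text" field.
        usb = load_dataset("kundank/usb", split="train")
        trainer = SFTTrainer(
            model="mistralai/Mistral-7B-v0.1",
            train_dataset=usb,
        )
        trainer.train()
        trainer.save_model("mistral-usb-prefinetuned")

        # Stage 2: task-specific fine-tuning on FIB, starting from
        # the stage-1 checkpoint rather than the raw base model.
        fib = load_dataset("r-three/fib", split="train")  # hypothetical id
        trainer = SFTTrainer(
            model="mistral-usb-prefinetuned",
            train_dataset=fib,
        )
        trainer.train()
        trainer.save_model("mistral-fib-judge")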

    This approach significantly improves performance. Our methodology incorporates Weights & Biases Weave to automate model evaluation, demonstrating that pre-finetuning on related but out-of-domain data can effectively bootstrap the detection of factual inconsistencies, reducing the need for extensive task-specific data collection.
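
    To give a flavor of how Weave automates this, here is a minimal evaluation sketch; the project name, example rows, and judge function are assumptions for illustration, standing in for the fine-tuned Mistral judge:

        import asyncio
        import weave
        from weave import Evaluation

        weave.init("hallucination-judge")  # hypothetical project name

        # Tiny stand-in dataset: each row pairs a candidate summary
        # with a ground-truth consistency label.
        examples = [
            {"summary": "The court ruled in favor of the plaintiff.", "label": "consistent"},
            {"summary": "The court ruled against the plaintiff.", "label": "inconsistent"},
        ]

        @weave.op()
        def judge(summary: str) -> str:
            # Placeholder for the fine-tuned judge model; it should
            # return "consistent" or "inconsistent" for a summary.
            return "consistent"

        @weave.op()
        def accuracy(label: str, output: str) -> dict:
            # Weave matches scorer arguments to dataset columns by
            # name and passes the model's return value as `output`.
            return {"correct": label == output}

        evaluation = Evaluation(dataset=examples, scorers=[accuracy])
        asyncio.run(evaluation.evaluate(judge))

    Each call is logged in Weave, so per-example judgments and aggregate scores can be inspected and compared across finetuning runs.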

    This technique offers a promising strategy for enhancing the accuracy and applicability of natural language inference models in production environments.

    Attend live to join the Q&A with the speakers, or watch on-demand.

    Register Now

    Our Speakers

    Sophia Yang
    Head of Developer Relations
    Mistral AI
    Thomas Capelle
    ML Engineer
    Weights & Biases
Looking for your ticket? Contact the organizer