AI Safety: An Introduction by Jan Betley

Start: 2025-04-10T15:00:00+02:00
Location: MIMUW

Schedule

Thu, 10 Apr 2025 at 3:00 pm (UTC+02:00)

Location

MIMUW | Warsaw, MZ

When and where?
Place: Room 2070, MIMUW
Time: 15:00
Date: April 10th, 2025
As artificial intelligence rapidly advances, ensuring its safety is becoming one of the most critical challenges for humanity’s future.
Jan Betley researches AI safety, focusing on self-awareness and implicit knowledge in Large Language Models (LLMs). His work explores how LLMs understand their own behaviors and goals, and how specific training methods can lead to emergent misalignment. These findings inform how we approach AI risk. As an experienced researcher in this domain, he will invite us to think through how AI can fail.
The Effective Altruism UW Student Group helps students find ways to do the most good. We explore the ideas of Effective Altruism: using evidence and compassion to find the best ways to improve the world. AI safety is one of the key challenges in that effort. Come to the meeting to learn more!