Papers We Love Milano: Inside the Mind of an LLM

Schedule

Thu Oct 10 2024 at 07:00 pm to 09:00 pm

UTC+02:00

Location

eDreams ODIGEO Tech Hub | Milano, LO

Emanuele Fabbiani of Xtream joins us again for a new talk about the internals of LLMs!
About this Event

Elevator Pitch

Recent advances in Large Language Model (LLM) explainability have yielded intriguing results. This talk will explore the most recent breakthroughs: how we discovered Llamas think in English and altered Claude’s belief system, inducing it to give the highest importance to the Golden Gate Bridge. Finally, we'll examine the implications of these findings for AI privacy and security.

Abstract

In 2022 and 2023, explaining the inner workings of large language models (LLMs) seemed like a daunting task. However, recent studies in 2024 have led to significant breakthroughs in this area. This talk will explore three of the most important discoveries in LLM explainability.

1. Llamas “Think” in English [1]

Researchers from EPFL have revealed that models from the Llama 2 family of LLMs use English as their internal representation, regardless of the input or output language. This behaviour may account for certain biases in the models' style when used with non-English languages.

2. Monosemantic Features in Claude 3 and GPT-4 [2]

Researchers from Anthropic, later followed by OpenAI, have succeeded in collapsing the internal representation of Claude 3 Sonnet and GPT-4 into monosemantic features. This discovery enables a deeper understanding of which areas of the model are associated with specific topics and allows for the adjustment of the relative importance of these topics. This technique holds promise for aligning LLMs with ethical values.

3. LLMs Memorize Unusual Data [3, 4]

Recent research has shown why LLMs tend to memorize specific samples that are outliers relative to the normal data distribution. Unique strings, such as names and personal information, are particularly prone to memorization and can be reproduced by the models, especially when generation drifts into similarly unusual regions of the data space. This finding explains instances where GPT-4 has shared personal information when prompted to repeat the same word indefinitely.

Finally, we will discuss the implications of these findings on privacy and security in the context of LLMs.

Time Split

Talk Structure

  • 0-5 mins: Introduction - Why LLMs are challenging to explain
  • 5-10 mins: Llamas “think” in English
  • 10-25 mins: Monosemantic features in Claude 3 and GPT-4
  • 25-35 mins: LLMs Memorize Unusual Data
  • 35-50 mins: Conclusion - Impact on security and privacy

Resources

  1. Wendler, Chris, et al. "Do Llamas Work in English? On the Latent Language of Multilingual Transformers." arXiv preprint arXiv:2402.10588 (2024).
  2. Templeton, Adly, et al. "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet." Anthropic, 2024.
  3. Jarmul, Katharine. Practical Data Privacy. O'Reilly Media, Inc., 2023.
  4. Jarmul, Katharine. "Your Model Probably Memorized the Training Data." PyData Berlin, 2024.
Where is it happening?

eDreams ODIGEO Tech Hub, Via Gustavo Fara, 26, Milano, Italy

Tickets

EUR 0.00

Host or Publisher: Papers We Love Milano
