Last Week in GAI Security Research - 10/07/24
Highlights from Last Week
* 🫥 The Perfect Blend: Redefining RLHF with Mixture of Judges
* 🔏 Confidential Prompting: Protecting User Prompts from Cloud LLM Providers
* 🦺 Overriding Safety protections of Open-source Models
* 🤖 Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
* 🧠 Undesirable Memorization in Large Language Models: A Survey
* 🚧 The