Last Week in GAI Security Research - 10/07/24
Highlights from Last Week
* The Perfect Blend: Redefining RLHF with Mixture of Judges
* Confidential Prompting: Protecting User Prompts from Cloud LLM Providers
* Overriding Safety protections of Open-source Models
* Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
* Undesirable Memorization in Large Language Models: A