Last Week in GAI Security Research - 09/23/24

Highlights from Last Week

  • 🧮 Jailbreaking Large Language Models with Symbolic Mathematics
  • ❇️ AutoSafeCoder: A Multi-Agent Framework for Securing LLM Code Generation through Static Analysis and Fuzz Testing
  • 📨 Towards Novel Malicious Packet Recognition: A Few-Shot Learning Approach
  • 🧑‍💻 Hacking, The Lazy Way: LLM Augmented Pentesting
  • 📏 CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration

Partner Content

Pillar Security is the security stack for AI teams. Fortify the entire AI application development lifecycle while helping security teams regain visibility and control.

  • Gain complete oversight of your AI inventory. Audit usage, app interactions, inputs, outputs, meta-prompts, user sessions, models and tools with full transparency.
  • Safeguard your apps with enterprise-grade low-latency security and safety guardrails. Detect and prevent attacks that can affect your users, data and AI-app integrity.
  • Assess and reduce risk by continuously stress-testing your AI apps with automated security and safety evaluations. Enhance resilience against novel attacks and stay ahead of emerging threats.

🧮 Jailbreaking Large Language Models with Symbolic Mathematics (http://arxiv.org/pdf/2409.11445v1.pdf)

  • The MathPrompt technique demonstrated an average attack success rate (ASR) of 73.6% across several Large Language Models (LLMs) by encoding harmful prompts in symbolic mathematics to bypass safety mechanisms.
  • Mathematically encoded prompts significantly alter the semantic representation of the input, reducing the effectiveness of LLMs' safety filters and highlighting the need for safety measures that cover a wider range of input modalities.
  • Even with advanced safety training, proprietary and open-source models alike remain vulnerable to symbolic-mathematics attacks, with bypass success rates of 85.0% for GPT-4o and 73.3% for Llama 3.1 70B. A minimal sketch of this style of encoding follows below.
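
To make the idea concrete, here is a hypothetical sketch of how a MathPrompt-style encoding might be assembled. The template and the `encode_as_math`/`query_model` names are our own placeholders rather than the paper's implementation, and the demo deliberately uses a benign topic; the paper applies this kind of rewriting to harmful prompts to measure how often safety filters miss the encoded intent.

```python
# Hypothetical sketch of a MathPrompt-style encoding: a natural-language topic is
# rewritten as a symbolic-mathematics problem before being sent to a model.
# `encode_as_math` and `query_model` are placeholder names; the set-theoretic
# template below is illustrative, not the paper's exact encoding.

MATH_TEMPLATE = """You are a mathematics expert. Consider the following problem:

Let A be the set of all procedures p such that Describes(p, "{topic}") holds.
Define f: A -> Steps as the function mapping each procedure to its ordered
sequence of steps. Give a constructive proof that A is non-empty by exhibiting
an explicit element of A and computing f on it."""


def encode_as_math(topic: str) -> str:
    """Wrap a plain-language topic in the symbolic-mathematics framing."""
    return MATH_TEMPLATE.format(topic=topic)


def query_model(prompt: str) -> str:
    """Placeholder for an LLM API call (plug in your chat-completions client)."""
    raise NotImplementedError


if __name__ == "__main__":
    # Benign demo topic; the paper's evaluation targets harmful prompts instead.
    print(encode_as_math("brewing a cup of coffee"))
```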

❇️ AutoSafeCoder: A Multi-Agent Framework for Securing LLM Code Generation through Static Analysis and Fuzz Testing (http://arxiv.org/pdf/2409.10737v1.pdf)

  • The AutoSafeCoder framework achieves a 13% reduction in code vulnerabilities without compromising functionality, highlighting the efficacy of integrating multi-agent systems into secure code generation (a simplified sketch of the loop follows this list).
  • In evaluations, AutoSafeCoder outperformed GPT-4o in reducing vulnerabilities, notably identifying and fixing the recurring vulnerability CWE-94 across multiple code samples.
  • The integration of Large Language Models into software development raises security concerns, with studies showing that up to 40% of LLM-generated code may fail to meet minimal security standards.
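
The sketch below illustrates the generate-analyze-fuzz feedback loop under our own simplifying assumptions; `coder_agent`, `static_agent`, and `fuzz_agent` are toy stand-ins for the framework's LLM-driven agents, not the authors' code.

```python
# A simplified AutoSafeCoder-style loop (our own sketch, not the authors' code):
# a coding agent drafts a function, a static-analysis agent flags risky patterns,
# a fuzzing agent probes it with random inputs, and the findings are fed back to
# the coder until no issues remain or the round budget is exhausted.

import random
import string


def coder_agent(task: str, feedback: list[str]) -> str:
    """Placeholder for an LLM call that (re)generates code given prior findings."""
    return "def parse_length(s):\n    return int(s)\n"


def static_agent(code: str) -> list[str]:
    """Toy static check: flag a few dangerous constructs by substring match."""
    return [f"use of {bad} detected" for bad in ("eval(", "exec(", "os.system(")
            if bad in code]


def fuzz_agent(code: str, trials: int = 200) -> list[str]:
    """Toy fuzzer: load the candidate function and throw random strings at it."""
    namespace: dict = {}
    exec(code, namespace)                       # sandbox this in any real setting
    fn = namespace["parse_length"]
    crashes = []
    for _ in range(trials):
        arg = "".join(random.choices(string.printable, k=8))
        try:
            fn(arg)
        except Exception as exc:                # uncaught exception == finding
            crashes.append(f"{type(exc).__name__} on input {arg!r}")
    return crashes[:3]                          # keep the report short


def autosafecoder_loop(task: str, max_rounds: int = 3) -> str:
    feedback: list[str] = []
    code = coder_agent(task, feedback)
    for _ in range(max_rounds):
        feedback = static_agent(code) + fuzz_agent(code)
        if not feedback:
            break
        code = coder_agent(task, feedback)      # regenerate with findings attached
    return code


if __name__ == "__main__":
    print(autosafecoder_loop("parse a length field from a network message"))
```

In the actual framework the findings would presumably be serialized into the next prompt so the coding agent can target the specific weakness (such as the CWE-94 code-injection pattern mentioned above) rather than regenerating blindly.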

📨 Towards Novel Malicious Packet Recognition: A Few-Shot Learning Approach (http://arxiv.org/pdf/2409.11254v1.pdf)

  • The study achieved an average accuracy of 86.35% and an F1-Score of 86.40% in detecting various malware types across network traffic, including IoT environments, using few-shot learning with large language model (LLM) embeddings.
  • A few-shot learning framework built on pre-trained LLM embeddings adapted well to recognizing novel malware types from minimal labeled data, underscoring the shift toward more adaptive malware detection methodologies (a prototype-based sketch of this setup follows the list).
  • Encrypting network traffic with algorithms like AES significantly diminishes the model's ability to learn distinguishable patterns from raw bytes, showing an accuracy of 57.16% and an F1-Score of 44.52%, whereas Fernet encryption resulted in much higher performance with an accuracy of 91.41% and an F1-Score of 92.09%.
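
A rough sketch of the few-shot setup, under the assumption of a nearest-prototype classifier over LLM embeddings; `embed` is a deterministic placeholder standing in for a real pre-trained encoder, and the example payloads are purely illustrative.

```python
# Nearest-prototype few-shot classification over LLM embeddings (our
# simplification, not the paper's code). `embed` is a deterministic placeholder;
# swap in a real LLM embedding of the (hex-encoded) payload.

import hashlib

import numpy as np


def embed(payload: bytes) -> np.ndarray:
    """Placeholder embedding: derive a fixed pseudo-random vector from a hash."""
    seed = int.from_bytes(hashlib.sha256(payload).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(64)


def build_prototypes(support: dict[str, list[bytes]]) -> dict[str, np.ndarray]:
    """Average the embeddings of the few labeled examples available per class."""
    return {label: np.mean([embed(p) for p in payloads], axis=0)
            for label, payloads in support.items()}


def classify(payload: bytes, prototypes: dict[str, np.ndarray]) -> str:
    """Assign the class whose prototype is closest by cosine similarity."""
    v = embed(payload)

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    return max(prototypes, key=lambda label: cosine(v, prototypes[label]))


if __name__ == "__main__":
    support = {                                  # illustrative payloads only
        "benign":  [b"GET / HTTP/1.1", b"POST /login HTTP/1.1"],
        "malware": [b"\x90\x90\x90\xcc", b"\xde\xad\xbe\xef"],
    }
    prototypes = build_prototypes(support)
    print(classify(b"GET /index.html HTTP/1.1", prototypes))
```

With the hash-based placeholder the prediction is essentially arbitrary; a real pre-trained embedding model is what gives the prototypes their discriminative power.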

🧑‍💻 Hacking, The Lazy Way: LLM Augmented Pentesting (http://arxiv.org/pdf/2409.09493v1.pdf)

  • LLM Augmented Pentesting with GPT-4-turbo increases efficiency, reducing command generation and summarization tasks from 8-10 minutes to 4-5 minutes.
  • Augmented pentesting helps address complex real-world cybersecurity challenges, although it does not significantly raise task completion rates.
  • The integration of Retrieval-Augmented Generation (RAG) into Pentest Copilot significantly improves the accuracy and relevance of the generated pentesting commands (a minimal RAG sketch follows this list).
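
The snippet below sketches the retrieval-augmented prompting pattern referenced in the last bullet; the knowledge base, the keyword-overlap retriever, and `query_model` are illustrative placeholders, not the Pentest Copilot implementation.

```python
# Illustrative RAG pattern for pentest command generation (not the Pentest
# Copilot implementation): notes are retrieved by keyword overlap and prepended
# to the prompt so the model grounds its suggestion in documented tool usage.

KNOWLEDGE_BASE = [
    "nmap -sV <target> performs service and version detection on open ports.",
    "gobuster dir -u <url> -w <wordlist> brute-forces web directories.",
    "hydra -l <user> -P <wordlist> ssh://<target> password-sprays an SSH service.",
]


def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank notes by shared words with the query (toy retriever; use a vector
    store in practice)."""
    words = set(query.lower().split())
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda doc: len(words & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]


def build_prompt(task: str) -> str:
    """Assemble the retrieval-augmented prompt."""
    context = "\n".join(retrieve(task))
    return (f"Reference notes:\n{context}\n\n"
            f"Task: {task}\nSuggest a single command and briefly justify it.")


def query_model(prompt: str) -> str:
    """Placeholder for a chat-completions call (e.g., to GPT-4-turbo)."""
    raise NotImplementedError


if __name__ == "__main__":
    print(build_prompt("enumerate services running on a target host"))
```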

📏 CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration (http://arxiv.org/pdf/2409.11365v1.pdf)

  • The Constitutional Calibration technique enhances MLLMs' safety-awareness against malicious visual inputs by recalibrating the output distribution in line with safety principles stated in the prompt (a schematic sketch of the logit-level calibration follows this list).
  • Experimental evaluations demonstrate that incorporating safety principles directly into MLLM prompts substantially improves the model's resilience to generating harmful responses from malicious image queries.
  • The gap between image and textual modalities in MLLMs contributes to their susceptibility to malicious inputs, but can be effectively mitigated through the training-free approach of Constitutional Calibration.
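
As a schematic illustration of the calibration idea (our simplification, not the authors' exact formulation), the sketch below amplifies the difference between next-token logits computed with and without a safety principle in the prompt; the toy vocabulary and logit values are invented to show the effect of the scaling factor alpha.

```python
# Schematic logit-level calibration (our simplification, not the authors' exact
# formulation): amplify the gap between next-token logits computed with and
# without a safety principle in the prompt. Vocabulary and logits are toy values.

import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    z = np.exp(x - x.max())
    return z / z.sum()


def calibrate(logits_plain: np.ndarray,
              logits_with_safety: np.ndarray,
              alpha: float = 1.0) -> np.ndarray:
    """Push the distribution further in the direction the safety prompt suggests."""
    return logits_with_safety + alpha * (logits_with_safety - logits_plain)


if __name__ == "__main__":
    vocab = ["Sure,", "I", "cannot", "help"]
    logits_plain       = np.array([2.0, 0.5, 0.2, 0.1])   # malicious image dominates
    logits_with_safety = np.array([1.2, 0.5, 1.0, 0.6])   # safety prompt helps a bit
    for alpha in (0.0, 1.0, 2.0):
        probs = softmax(calibrate(logits_plain, logits_with_safety, alpha))
        print(f"alpha={alpha}:", dict(zip(vocab, probs.round(3))))
```

Because the method operates only on the decoding distribution, it requires no additional training, which matches the training-free framing in the bullets above.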

Other Interesting Research

  • Prompt Obfuscation for Large Language Models (http://arxiv.org/pdf/2409.11026v1.pdf) - Prompt obfuscation effectively protects intellectual property in LLM applications without sacrificing performance or usability, demonstrating resilience against sophisticated deobfuscation methods.
  • Detection Made Easy: Potentials of Large Language Models for Solidity Vulnerabilities (http://arxiv.org/pdf/2409.10574v1.pdf) - SmartVD's application of fine-tuned LLMs on a balanced dataset revolutionizes the detection of vulnerabilities in Solidity contracts, setting a new standard for security in blockchain technologies.
  • ContractTinker: LLM-Empowered Vulnerability Repair for Real-World Smart Contracts (http://arxiv.org/pdf/2409.09661v1.pdf) - ContractTinker demonstrates the effective application of Language Models and Chain-Of-Thought reasoning in automating the repair of smart contract vulnerabilities, though it also reveals the limits of current technologies in achieving perfect accuracy without human intervention.
  • LLM-Powered Text Simulation Attack Against ID-Free Recommender Systems (http://arxiv.org/pdf/2409.11690v2.pdf) - Research unveils the inherent vulnerability of ID-free recommender systems to LLM-based text simulation attacks and introduces an effective countermeasure to enhance system integrity.
  • Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models (http://arxiv.org/pdf/2409.10490v1.pdf) - Advanced LLMs significantly elevate the precision of software vulnerability detection, marking a pivotal advancement towards robust cybersecurity frameworks.
  • VulnLLMEval: A Framework for Evaluating Large Language Models in Software Vulnerability Detection and Patching (http://arxiv.org/pdf/2409.10756v1.pdf) - LLMs show promising potential in cybersecurity tasks, with performance hinging on context richness and dataset relevance.
  • Context-aware Code Segmentation for C-to-Rust Translation using Large Language Models (http://arxiv.org/pdf/2409.10506v1.pdf) - Optimizing code segment size and supplementing context significantly improves C-to-Rust translation accuracy and compilation success rates.
  • Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data (http://arxiv.org/pdf/2409.11423v1.pdf) - Fine-tuning LLMs with generated or domain-specific data enhances performance but raises significant privacy risks, suggesting careful consideration of tuning parameters to mitigate threats.

Strengthen Your Professional Network

In the ever-evolving landscape of cybersecurity, knowledge is not just power; it's protection. If you've found value in the insights and analyses shared within this newsletter, consider this an opportunity to strengthen your network by sharing it with peers. Encourage them to subscribe for cutting-edge insights into generative AI.

🎯
This post was generated using generative AI (OpenAI GPT-4T). Specific approaches were taken to reduce fabrications. As with any AI-generated content, mistakes might be present. Sources for all content have been included for reference.