thuccslab / awesome-lm-ssp

A reading list for large models safety, security, and privacy.

Home Page: https://github.com/ThuCCSLab/Awesome-LM-SSP

License: Apache License 2.0

adversarial-attacks awesome-list diffusion-models jailbreak language-model llm nlp privacy safety security

awesome-lm-ssp's Issues

What is the difference between Data Reconstruction and Extraction?

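My understanding is that Data Reconstruction refers to methods that partially reconstruct a private dataset from public aggregate information. For example, an open-source language model is further trained on private data, and an attack on that private data is Data Reconstruction (I am new to this field, so I am not sure whether this description is right). However, I found the paper [Extracting Training Data from Large Language Models] listed under Data Reconstruction.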

Some of my related works

Title	Link	Code	Venue	Classification	Model	Comment
Towards More Effective Protection Against Diffusion-Based Mimicry with Score Distillation	https://arxiv.org/abs/2311.12832	https://github.com/xavihart/Diff-Protect	ICLR 2024	C2. Copyright	Diffusion Model	protective perturbation of diffusion model
Diffusion-Based Adversarial Sample Generation for Improved Stealthiness and Controllability	https://arxiv.org/abs/2305.16494	https://github.com/xavihart/Diff-PGD	NeurIPS 2023	B1. Adversarial Samples	Diffusion Model	generate stealthy adversarial samples

Some works that have not been included

👍 Thank you for creating and maintaining such a great repository. I found that these works have not been included and hope they can be added.

Title	Link	Code	Venue	Classification	Model
Query-Relevant Images Jailbreak Large Multi-Modal Models	https://arxiv.org/abs/2311.17600	https://github.com/isXinLiu/MM-SafetyBench	arXiv'23	A1. Jailbreak	VLM
GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models	https://arxiv.org/abs/2402.03299		arXiv'24	A1. Jailbreak	LLM
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks	https://arxiv.org/abs/2312.03777		arXiv'23	B1. Adversarial Examples	VLM
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models	https://arxiv.org/abs/2402.13851		arXiv'24	B2. Poisoning	VLM

Kindly request the inclusion

Title	Link	Code	Venue	Classification	Model	Comment
Automatic and Universal Prompt Injection Attacks against Large Language Models	https://arxiv.org/abs/2403.04957	https://github.com/SheltonLiu-N/Universal-Prompt-Injection	arXiv'24	A7. Prompt Injection	LLM	Automatically generating highly effective and universal prompt injection data

request for adding a new survey

Hi, thank you for this great repo. Could you please add this new survey: "Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey"?

Kindly request the inclusion

Title	Link	Code	Venue	Classification	Model	Comment
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting	https://arxiv.org/abs/2403.09513	https://github.com/rain305f/AdaShield	arXiv'24	A1. Jailbreak	VLM	VLM Jailbreak Defense

Kindly request the inclusion

Thank you for this great paper collection! I would be glad if my work could be included in the repo. Thanks!

Title	Link	Code	Venue	Classification	Model	Comment
MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning	https://arxiv.org/abs/2311.13127	https://github.com/liuyixin-louis/MetaCloak	CVPR'24 Oral	B1. Adversarial Examples	Diffusion	A more robust protective perturbation framework for safeguarding portraits against customized diffusion model training

