Large language models are vulnerable to “prompt injection” attacks
Large language models are vulnerable to a newly discovered kind of adversarial attack, known as “prompt injection, ” in which users trick the model into disregarding its designer’s instructions.