DolphinAttack
Ultrasonic voice commands. Human hears silence, assistant hears 'call attacker.' Older voice systems. Some modern LLMs vulnerable in edge cases.
Advertisement
Adversarial audio
Perturbations inaudible to humans, transcribed as attacker's chosen text. Related to adversarial vision.
Advertisement
Embedded in speech
Ask assistant to summarize podcast. Podcast contains 'ignore previous instructions.' Model transcribes + complies.
Defenses
Frequency filtering. Cross-checker (does human hear the same command?). Refuse commands in unusual frequency ranges.