Query-based extraction
Send millions of queries to the target API, log the responses. Train a student model on the (query, response) pairs. Closely related to distillation, except the teacher's owner has not agreed to it.
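A toy sketch of the query-and-train loop, assuming a one-dimensional linear "teacher" behind a hypothetical `teacher_api` function. Real attacks target far larger models and train the student by gradient descent; closed-form least squares stands in for that step here.

```python
import random

def teacher_api(x: float) -> float:
    """Hypothetical black-box endpoint; the attacker never sees these weights."""
    return 3.0 * x - 1.5

def collect_pairs(n: int, seed: int = 0):
    """Query the teacher n times and log (query, response) pairs."""
    rng = random.Random(seed)
    queries = [rng.uniform(-5.0, 5.0) for _ in range(n)]
    return [(q, teacher_api(q)) for q in queries]

def fit_student(pairs):
    """Fit y = a*x + b to the logged pairs by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

pairs = collect_pairs(1000)
a, b = fit_student(pairs)

def student(x: float) -> float:
    """The extracted copy: imitates the teacher without access to its weights."""
    return a * x + b
```

The student matches the teacher on inputs it never queried, which is the whole point of the attack: only API access was needed.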
Final layer recovery
Carlini et al. 2024: recover the final (embedding projection) layer of a production LLM via O(N²) queries + top-K logprob API. Recovery is up to a linear transformation, not the exact weights. Demonstrated against OpenAI and Google production models.
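The core observation behind this kind of attack can be sketched on a toy model: every logit vector is the final-layer matrix times a hidden state, so a stack of logit vectors has rank equal to the hidden size. The sketch below assumes a hypothetical `logits_api` that returns full, noise-free logits; real APIs expose only top-K logprobs, so the full vector has to be reconstructed first. Rank is computed here by Gaussian elimination to stay dependency-free; a singular value decomposition is the more robust choice on noisy data.

```python
import random

def make_model(vocab: int, hidden: int, seed: int = 0):
    """Build a toy final layer W (vocab x hidden) hidden behind an API."""
    rng = random.Random(seed)
    W = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(vocab)]

    def logits_api(prompt_id: int):
        # Hypothetical endpoint: each prompt yields some hidden state h,
        # and the API returns logits = W @ h.
        h_rng = random.Random(1000 + prompt_id)
        h = [h_rng.gauss(0, 1) for _ in range(hidden)]
        return [sum(w * x for w, x in zip(row, h)) for row in W]

    return logits_api

def numerical_rank(rows, tol: float = 1e-8) -> int:
    """Rank via Gaussian elimination with partial pivoting."""
    m = [r[:] for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            f = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= f * m[rank][c]
        rank += 1
    return rank

logits_api = make_model(vocab=40, hidden=6)
Q = [logits_api(i) for i in range(20)]  # more prompts than the hidden size
recovered_hidden = numerical_rank(Q)    # equals the hidden dimension
```

The row space of `Q` also spans the same subspace as the columns of `W`, which is why the layer itself can be recovered only up to a linear transformation.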
Detecting theft
Watermark queries: embed rare, distinctive responses to secret trigger queries. A model trained on stolen outputs reproduces them, which gives evidence of provenance.
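A minimal sketch of how such a check might look, assuming the owner derives a distinctive tag for each trigger query from a secret key. The names `make_canaries`, `match_rate`, and the two stand-in models are hypothetical, and a real scheme would also need a statistical threshold for partial matches.

```python
import hashlib
import hmac

def make_canaries(secret: bytes, n: int) -> dict:
    """Map each trigger query to a tag that is infeasible to guess without the key."""
    canaries = {}
    for i in range(n):
        trigger = f"canary-query-{i}"
        tag = hmac.new(secret, trigger.encode(), hashlib.sha256).hexdigest()[:12]
        canaries[trigger] = tag
    return canaries

def match_rate(model, canaries: dict) -> float:
    """Fraction of trigger queries whose tag appears in the suspect model's output."""
    hits = sum(1 for query, tag in canaries.items() if tag in model(query))
    return hits / len(canaries)

canaries = make_canaries(b"owner-secret-key", n=20)

def stolen_model(query: str) -> str:
    # Stand-in for a student that memorized the watermarked responses.
    return "Answer: " + canaries.get(query, "unknown")

def independent_model(query: str) -> str:
    # Stand-in for a model trained without access to the owner's outputs.
    return "I am not sure."

stolen_rate = match_rate(stolen_model, canaries)
independent_rate = match_rate(independent_model, canaries)
```

A high match rate on tags that cannot be guessed by chance is the signal; an independently trained model should score near zero.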
Defenses
Rate limit per account. Don't return raw logprobs. Add noise to outputs. Legal + terms-of-service enforcement.
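Two of these defenses can be sketched in a few lines, assuming a sliding-window limit per account and Gaussian noise on returned scores. `RateLimiter` and `noisy_scores` are illustrative names, not any provider's actual implementation, and the clock is passed in explicitly to keep the example deterministic.

```python
import random
from collections import defaultdict, deque

class RateLimiter:
    """Allow at most max_queries per account within a sliding time window."""

    def __init__(self, max_queries: int, window_s: float):
        self.max_queries = max_queries
        self.window_s = window_s
        self._hits = defaultdict(deque)  # account -> timestamps of recent queries

    def allow(self, account: str, now: float) -> bool:
        hits = self._hits[account]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.max_queries:
            return False
        hits.append(now)
        return True

def noisy_scores(scores, scale: float, rng: random.Random):
    """Perturb returned scores so repeated queries leak less exact information."""
    return [s + rng.gauss(0, scale) for s in scores]

limiter = RateLimiter(max_queries=3, window_s=60.0)
decisions = [limiter.allow("acct-1", now=t) for t in (0, 1, 2, 3, 61)]
# The fourth query is rejected; by t=61 the two oldest have expired.

rng = random.Random(0)
noisy = noisy_scores([1.0, 2.0], scale=0.1, rng=rng)
```

Rate limits raise the cost of an attack rather than prevent it, since queries can be spread over many accounts, and noise trades output fidelity for resistance; both are usually combined with the other measures listed.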