#ai
I am curious how people handle (if any) adversarial attacks in [[Transformer]]s such as [[GPT3]] that generate and execute code.
OK that's vague?
Say I deploy an app that generate, idk, start-up ideas, somehow people pay for that.
Bob the hacker engineer a nice prompt that generate code on the app server and execute it, it generated a code that return the environment variables on the server (which contains somehow a bank account credentials), great!
Maybe it's more clear now, my concern is how people will check & prevent such hacks, just like hacks in autonomous cars where people show some weird noisy picture and the car goes in a wall [INSERT PAPER].