AudioGen is a model trained for text-to-sound generation. Given a text prompt, it generates a 5-second audio clip that matches the description. Example prompts:
Whistling with wind blowing
Sirens and a humming engine approach and pass
A duck quacking as birds chirp and a pigeon cooing
Railroad crossing signal followed by a train passing and blowing horn
Typing on a typewriter
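
As a rough illustration of the prompt-to-audio flow described above, here is a minimal sketch that feeds the listed prompts to the model and writes one file per prompt. It assumes the open-source `audiocraft` package and the `facebook/audiogen-medium` checkpoint, neither of which is named in the text above; `slugify` and `generate_clips` are hypothetical helper names introduced only for this example.

```python
import re

# The five example prompts listed above.
PROMPTS = [
    "Whistling with wind blowing",
    "Sirens and a humming engine approach and pass",
    "A duck quacking as birds chirp and a pigeon cooing",
    "Railroad crossing signal followed by a train passing and blowing horn",
    "Typing on a typewriter",
]

DURATION_S = 5.0  # clip length stated in the description above


def slugify(prompt: str) -> str:
    """Turn a prompt into a filesystem-friendly file stem (hypothetical helper)."""
    return re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")


def generate_clips(prompts, duration_s=DURATION_S,
                   checkpoint="facebook/audiogen-medium"):
    """Generate one clip per prompt and write each to disk.

    Assumption: the audiocraft package and this checkpoint name are used;
    the original text does not specify either.
    """
    # Imported lazily so this module loads even without audiocraft installed.
    from audiocraft.models import AudioGen
    from audiocraft.data.audio import audio_write

    model = AudioGen.get_pretrained(checkpoint)
    model.set_generation_params(duration=duration_s)

    # One waveform per prompt, returned as a batch tensor.
    wavs = model.generate(list(prompts))

    stems = []
    for prompt, wav in zip(prompts, wavs):
        stem = slugify(prompt)
        # audio_write appends the file extension itself.
        audio_write(stem, wav.cpu(), model.sample_rate, strategy="loudness")
        stems.append(stem)
    return stems
```

Calling `generate_clips(PROMPTS)` would download the checkpoint on first use and write files such as `typing_on_a_typewriter.wav` to the working directory. Generation is sampled, so repeated runs with the same prompt will generally produce different audio.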