
Generative AI now has a much bigger role to play in the world of music, thanks to a new open-source tool that Meta released today. The technology has already proved its mettle in producing human-style conversation and text, and now it can be used to create music and audio, once again from user text prompts.

This comes courtesy of Meta’s AudioCraft, its open-source AI code. The social media giant introduced the tool on Wednesday, giving users the ability to create “high-quality” and “realistic” audio and music from short text-based prompts. AudioCraft currently consists of three AI models – AudioGen, EnCodec, and MusicGen – each dedicated to a specific area of sound and music generation and compression. This makes Meta the latest company to bring generative AI and audio closer together – earlier this year, Google parent Alphabet had introduced its own experimental audio-generating AI tool in the form of MusicLM.

In its blog post, the social media company revealed that MusicGen was trained on “20,000 hours of music owned by Meta or licensed specifically for this purpose.” EnCodec, the neural audio codec the models build on, has been improved and now lets users generate sound with fewer artifacts. AudioGen, for its part, was trained on public sound effects and generates audio from text-based user inputs. Meta provided some examples in the form of sample audio made with AudioCraft – the samples included whistling, sirens, music, and humming, all generated from simple text prompts.
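Because AudioCraft is open source, the text-to-music workflow described above can be tried directly in Python. The sketch below follows the usage documented in the project's README; the model name (`facebook/musicgen-small`), the prompt strings, and the eight-second duration are illustrative assumptions, not values from the article.

```python
# A minimal sketch of text-to-music generation with Meta's open-source
# audiocraft library (pip install audiocraft). Model name, prompts, and
# duration here are assumptions for illustration.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Download a pretrained MusicGen checkpoint (the small variant is the
# lightest; larger ones trade speed for quality).
model = MusicGen.get_pretrained("facebook/musicgen-small")

# Ask for roughly eight seconds of audio per prompt.
model.set_generation_params(duration=8)

# Each text prompt yields one waveform tensor of shape [channels, samples].
prompts = ["upbeat acoustic folk with whistling", "distant sirens over ambient pads"]
waveforms = model.generate(prompts)

# Write each result to disk as a .wav file, loudness-normalized.
for idx, wav in enumerate(waveforms):
    audio_write(f"sample_{idx}", wav.cpu(), model.sample_rate, strategy="loudness")
```

Generation runs noticeably faster on a GPU; on CPU, even the small model can take minutes per clip.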

“In recent years, generative AI models including language models have made huge strides and shown exceptional abilities: from the generation of a wide-variety of images and video from text descriptions exhibiting spatial understanding to text and speech models that perform machine translation or even text or speech dialogue agents. Yet while we’ve seen a lot of excitement around generative AI for images, video, and text, audio has always seemed to lag a bit behind. There’s some work out there, but it’s highly complicated and not very open, so people aren’t able to readily play with it,” the company wrote in its post.

The use of generative AI in music creation has far-reaching implications for the music industry, artistic expression, and society as a whole, opening up exciting possibilities for creativity and innovation. The most visible effect is that musicians can create new compositions without playing any instruments at all – with AudioCraft, they can seamlessly explore new musical landscapes and experiment with unique sounds, melodies, and harmonies.

Of course, there are concerns as well. The rapid pace at which the AI sector is advancing has alarmed numerous industry experts, who are rightfully worried about the dangers of generative AI. The use of generative AI to create music and sound also raises questions about copyright and ownership – who owns the rights to AI-generated music: the AI developer, the musician using the tool, or the AI model itself? (And the ethical considerations are a different can of worms altogether.)