Closer to the Catalan version of Siri and Alexa
The Government of Catalonia will invest 3 million euros in AINA, an artificial intelligence initiative, to launch the first speech corpus in Catalan in 2022. Ultimately, this will enable the creation of the large-scale data needed to make voice assistants in Catalan possible.
Beyond the investment, what is needed now is to boost the collection of voices for the Common Voice corpus, which already exceeds 1,000 hours recorded in Catalan. In addition to the voice data, there is also the 1.77-billion-word text corpus that the initiative presented in 2020.
The aim of the AINA project is to generate a massive volume of voice data, covering the dialects, registers and subjects of the Catalan language.