they are opaque and closed. BLOOM is the great open source project that wants to change everything


DALL-E, GPT-3, Imagen… are some of the best-known names in the field of artificial intelligence. They all have one thing in common: they are not open models. These AIs can generate amazing images and conversations, but it is not clear to everyone how they got there. The process is extremely complex and, for many researchers, also opaque.

BLOOM is the major open source project that wants to change this situation: an open multilingual model with 176 billion parameters, trained on 1.5 terabytes of text. If the existing models are comparable in relevance to the Google of its day, BLOOM may be the equivalent of Wikipedia.

After a year of work, the community now has its great open AI

The number of parameters is no accident. BLOOM (‘BigScience Language Open-science Open-access Multilingual’) is just slightly larger than GPT-3 (175 billion). What makes it so relevant, however, is not its power but the process by which it was built. Companies like Meta and OpenAI have also released some open AI, but all those initiatives have a commercial interest behind them.

This is where the community and BLOOM come in. BigScience is the organization responsible for this model: a group of more than 1,000 artificial intelligence researchers, united through Hugging Face, the leading platform and community around AI. But they have not been alone. In total, more than 250 institutions have collaborated on this project, which began in early 2021.

As described in Nature, BLOOM was trained in France on the Jean Zay supercomputer, financed with public funds amounting to 7 million dollars. The result was published in the middle of last June.

How BLOOM is used will depend on the researchers, but some uses are already envisaged, such as extracting information from historical texts and performing classifications in biology. Because it is an open project, Hugging Face will launch a web application and will allow any user to download BLOOM and run it.

One of BLOOM's distinguishing features is the data used. An AI's results are closely tied to the datasets on which it is based. In this case, the team of researchers hand-selected almost 70% of the 341 billion words on which it was trained.

One of the initiative's goals was also to feed the AI with a diverse dataset, sufficiently representative of different languages and cultures.

“Values such as openness, inclusion, diversity, responsibility and reproducibility are the DNA of this project. BigScience and BLOOM embody the most remarkable and honest attempt to break down the barriers that Big Tech has erected around AI during these years,” points out Alberto Romero, analyst at CambrianAI.

We will have to wait to see the results, but the fact that the open source community has already presented an open alternative to these AI models is great news, especially considering the enormous work and high technical requirements behind creating them.

More information | BigScience
