This article is part of our coverage of the latest in AI research.

What makes OPT-175B unique is Meta's commitment to openness, as the model's name implies.

Meta has made the model available to the public (with some caveats).

It has also released a ton of details about the training and development process.

However, the competition over large language models has reached a point where the technology can no longer be democratized.

Meta's release of OPT-175B has some key features.

It includes both pretrained models and the code needed to train and use the LLM.

It will also help reduce the massive carbon footprint caused by the computational resources needed to train large neural networks.

At the time of this writing, all models up to OPT-30B are accessible for download.
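
To get a feel for what that release looks like in practice, here is a minimal sketch of loading one of the smaller checkpoints for text generation. It assumes the Hugging Face transformers library and the facebook/opt-1.3b checkpoint name from the Hugging Face hub; Meta's own training and inference code is distributed separately through its metaseq repository, so treat this as one convenient path rather than the official one:

```python
# Minimal sketch: load a smaller OPT checkpoint and generate text.
# Assumes the Hugging Face mirror of the weights (facebook/opt-1.3b);
# Meta's official release is distributed through its metaseq codebase.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```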

The full 175-billion-parameter model will be made available to select researchers and institutions that fill out a request form.

Meta has also shared its development logbook, whereas published papers usually only include information about the final model.

Among the reasons OpenAI gave for not making GPT-3 public was the need to control misuse and the development of harmful applications.

However, it is worth noting that transparency and openness are not the equivalent of democratizing large language models.

The company says that the model's carbon footprint has been reduced to a seventh of GPT-3's.

Experts I had previously spoken to estimated GPT-3's training costs to be up to $27.6 million.

This means that OPT-175B will still cost several million dollars to train.
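
As a rough sanity check, here is the back-of-envelope arithmetic behind that figure. It is only a sketch, assuming training cost scales in proportion to the reported carbon footprint, which is an approximation rather than anything Meta has confirmed:

```python
# Back-of-envelope estimate of OPT-175B's training cost.
# Assumption: cost scales roughly with the reported carbon footprint,
# i.e. about a seventh of GPT-3's estimated training cost.
gpt3_estimated_cost = 27_600_000   # USD, upper-bound expert estimate cited above
footprint_ratio = 1 / 7            # OPT-175B footprint relative to GPT-3 (Meta's claim)

opt175b_estimate = gpt3_estimated_cost * footprint_ratio
print(f"Rough OPT-175B training cost: ${opt175b_estimate / 1e6:.1f} million")
# -> Rough OPT-175B training cost: $3.9 million
```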

Meta AI's logbook further confirms that training large language models is a very complicated task.

The researchers also had to restart the training process several times, tweak hyperparameters, and change loss functions.

All of these incur extra costs that small labs can't afford.

Language models such as OPT and GPT are based on the transformer architecture.
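
In practice, these are decoder-only transformers trained to predict the next token in a sequence. The toy sketch below, built from standard PyTorch primitives with purely illustrative dimensions (nothing here reflects OPT-175B's actual configuration), shows the basic shape of such a model:

```python
# Toy decoder-only transformer language model, the architecture family behind
# GPT and OPT. All sizes are illustrative, not OPT's real configuration.
# A decoder-only LM is a stack of self-attention blocks with a causal mask,
# so PyTorch's encoder layers plus an explicit mask are enough for a sketch.
import torch
import torch.nn as nn

class TinyDecoderLM(nn.Module):
    def __init__(self, vocab_size=50_000, d_model=256, n_heads=4, n_layers=2, max_len=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        block = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(block, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids):
        seq_len = token_ids.size(1)
        positions = torch.arange(seq_len, device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(positions)
        # Causal mask: each position may only attend to itself and earlier tokens.
        causal_mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=token_ids.device),
            diagonal=1,
        )
        x = self.blocks(x, mask=causal_mask)
        return self.lm_head(x)  # logits over the vocabulary for the next token

model = TinyDecoderLM()
logits = model(torch.randint(0, 50_000, (1, 16)))
print(logits.shape)  # torch.Size([1, 16, 50000])
```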

Some researchers believe that reaching higher levels of intelligence is only a scale problem.

Last year, Microsoft and Nvidia created a 530-billion-parameter language model called Megatron-Turing NLG (MT-NLG).

Last month, Google introduced the Pathways Language Model (PaLM), an LLM with 540 billion parameters.

And there are rumors that OpenAI will release GPT-4 in the next few months.

However, larger neural networks also require larger financial and technical resources.

On the commercial side, big tech companies will have an even greater advantage.

Running large language models is very expensive and challenging.

For smaller companies, the overhead of running their own version of an LLM like GPT-3 is prohibitive.
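
To get a sense of the scale, here is an illustrative back-of-envelope calculation. The half-precision format and the 80 GB of memory per accelerator are assumptions that roughly match current high-end data-center GPUs, not figures from Meta or OpenAI:

```python
# Rough memory needed just to hold the weights of a 175B-parameter model at
# inference time, ignoring activations and other runtime overhead.
params = 175e9
bytes_per_param = 2                      # half precision (fp16), an assumption
weights_gb = params * bytes_per_param / 1e9
gpus_needed = weights_gb / 80            # assuming 80 GB of memory per accelerator
print(f"{weights_gb:.0f} GB of weights -> at least {gpus_needed:.1f} accelerators just to store them")
# -> 350 GB of weights -> at least 4.4 accelerators just to store them
```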

This in turn will further centralize AI in the hands of big tech companies.

More AI research labs will have to enter partnerships with big tech to fund their research.

This can come at the cost of areas of research that do not have a short-term return on investment.

You can read the original article here.
