Microsoft reportedly developing MAI-1 AI model with 500B parameters


Microsoft Corp. is developing a large language model with about 500 billion parameters, The Information reported today.

The LLM, which is said to be known as MAI-1 internally, is expected to make its debut as early as this month.

When OpenAI introduced GPT-3 in mid-2020, it detailed that the initial version of the model had 175 billion parameters. The company disclosed that GPT-4 is larger but hasn’t yet shared specific numbers. Some reports suggest that OpenAI’s flagship LLM includes 1.76 trillion parameters while Google LLC’s Gemini Ultra, which has comparable performance to GPT-4, reportedly features 1.6 trillion.

That Microsoft’s MAI-1 reportedly comprises 500 billion parameters suggests it could be positioned as a kind of midrange option between GPT-3 and GPT-4. Such a configuration would allow the model to provide high response accuracy while using significantly less power than OpenAI’s flagship LLM. That would translate into lower inference costs for Microsoft.

According to The Information, the development of MAI-1 is being overseen by Mustafa Suleyman, a co-founder of LLM developer Inflection AI Inc. Suleyman joined Microsoft in March along with most of the startup’s employees through a deal reportedly worth $625 million. The executive earlier co-founded Google LLC’s DeepMind AI research group.

Microsoft may reportedly use training data and certain other assets from Inflection AI to power MAI-1. The model’s training dataset is said to include other types of information as well, including text generated by GPT-4 and web content. Microsoft is reportedly carrying out the development process using a “large cluster of servers” equipped with Nvidia Corp. graphics cards.

The Information’s sources indicated that the company hasn’t yet determined how it will use MAI-1. If the model indeed features 500 billion parameters, it’s too complex to run on consumer devices. That means Microsoft will most likely deploy MAI-1 in its data centers, where the LLM could be integrated into services such as Bing and Azure.
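The arithmetic behind that conclusion can be sketched roughly as follows. This is a back-of-the-envelope estimate, not a description of how MAI-1 is actually stored or served: the 500 billion figure is the reported parameter count, while the function name `weight_memory_gb` and the bytes-per-parameter values for each numeric format are assumptions chosen for illustration. The estimate covers the model weights only and ignores activations, caches and other runtime overhead.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Reported parameter count; not confirmed by Microsoft.
PARAMS_MAI_1 = 500e9

# Bytes per parameter for a few common numeric formats (illustrative assumption).
PRECISIONS = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

for name, nbytes in PRECISIONS.items():
    print(f"{name}: {weight_memory_gb(PARAMS_MAI_1, nbytes):,.0f} GB")
```

Under these assumptions, the weights alone would occupy about 1,000 gigabytes at 16-bit precision and about 250 gigabytes even with aggressive 4-bit compression. Consumer laptops and phones typically ship with a few tens of gigabytes of memory at most.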

It’s believed the company could debut MAI-1 during its Build developer conference, which will kick off on May 21, if the model shows sufficient promise by then. That suggests the company expects to have a working prototype of the model within a few weeks, if it doesn’t have one already.

The news that Microsoft is developing MAI-1 comes less than two weeks after it open-sourced a language model dubbed Phi-3 Mini. According to the company, the latter model features 3.8 billion parameters and can outperform LLMs more than 10 times its size. Phi-3 Mini is part of an AI series that also includes two other, larger neural networks with slightly better performance.

Photo: Microsoft
