GPT-4o: Difference between revisions

Generative Pre-trained Transformer 4 Omni (GPT-4o)
Developer(s)	OpenAI
Initial release	May 13, 2024
Predecessor	GPT-4 Turbo
Type	Multimodal; Large language model; Generative pre-trained transformer; Foundation model
License	Proprietary
Website	openai.com/index/hello-gpt-4o

Revision as of 15:20, 15 May 2024

GPT-4o (GPT-4 omni) is a multilingual, multimodal generative pre-trained transformer designed by OpenAI. It was announced by OpenAI's CTO Mira Murati during a live-streamed demo on 13 May 2024 and released the same day.^[1] GPT-4o is free to use, with a usage limit that is five times higher for ChatGPT Plus subscribers.^[2] Its API is twice as fast as, and half the price of, its predecessor, GPT-4 Turbo.^[1]

Background

GPT-4o was originally shadow-launched on LMSYS as three different models, called gpt2-chatbot, im-a-good-gpt2-chatbot, and im-also-a-good-gpt2-chatbot. On 7 May 2024, Sam Altman revealed that OpenAI was responsible for these mysterious new models.^[3]

Capabilities

GPT-4o achieves state-of-the-art results in voice, multilingual, and vision benchmarks, setting new records in audio speech recognition and translation.^[4] GPT-4o scores 88.7 on the Massive Multitask Language Understanding (MMLU) benchmark, compared to 86.5 for GPT-4.^[4] GPT-4o also natively supports voice-to-voice interaction, making responses near-instant and seamless. By contrast, the voice modes of GPT-3.5 and GPT-4 convert the voice to text, pass the text to the model, and then convert the model's text output back to voice using another model.^[4]
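The difference between the two voice architectures described above can be sketched structurally. The following is only an illustration under stated assumptions: every function name is hypothetical, the stand-in functions do not call any real model or OpenAI API, and byte strings stand in for audio.

```python
from typing import Callable

Audio = bytes  # raw audio payload; the concrete format is left unspecified


def cascaded_voice_reply(
    audio_in: Audio,
    transcribe: Callable[[Audio], str],
    generate_text: Callable[[str], str],
    synthesize: Callable[[str], Audio],
) -> Audio:
    """Three-stage pipeline: speech-to-text, text model, text-to-speech."""
    text_in = transcribe(audio_in)      # stage 1: voice -> text
    text_out = generate_text(text_in)   # stage 2: text -> text
    return synthesize(text_out)         # stage 3: text -> voice


def native_voice_reply(audio_in: Audio, audio_model: Callable[[Audio], Audio]) -> Audio:
    """Single-stage path: one model maps input audio directly to output audio."""
    return audio_model(audio_in)


# Toy stand-ins (hypothetical) so the sketch runs end to end.
def fake_transcribe(audio: Audio) -> str:
    return audio.decode("utf-8")


def fake_generate(text: str) -> str:
    return text.upper()


def fake_synthesize(text: str) -> Audio:
    return text.encode("utf-8")


def fake_audio_model(audio: Audio) -> Audio:
    return audio.upper()


cascaded = cascaded_voice_reply(b"hello", fake_transcribe, fake_generate, fake_synthesize)
native = native_voice_reply(b"hello", fake_audio_model)
```

In the cascaded version each stage must finish before the next begins, and only text crosses the stage boundaries; the native version involves a single call.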

The model supports over 50 languages,^[1] covering over 97% of speakers. Mira Murati demonstrated the model's multilingual capability by speaking Italian to the model and having it translate between English and Italian during the live-streamed OpenAI demo event on 13 May 2024. In addition, the new tokenizer uses fewer tokens for certain languages, especially non-English ones; Gujarati, for example, requires 4.4x fewer tokens, which makes the model cheaper to use in those languages.^[4]
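The link between token count and cost can be shown with simple arithmetic, assuming usage is billed per token at a fixed price. In the sketch below only the 4.4x reduction factor comes from the text above; the token count and the per-million-token price are hypothetical placeholders, not actual OpenAI pricing.

```python
def cost_usd(num_tokens: int, price_per_million_usd: float) -> float:
    """Cost of processing num_tokens at a fixed per-million-token price."""
    return num_tokens / 1_000_000 * price_per_million_usd


def tokens_after_reduction(old_tokens: int, reduction_factor: float) -> int:
    """Token count for the same text once the tokenizer needs fewer tokens."""
    return round(old_tokens / reduction_factor)


OLD_TOKENS = 4_400        # hypothetical token count under the older tokenizer
REDUCTION_FACTOR = 4.4    # reduction reported for Gujarati
PRICE_PER_MILLION = 5.0   # hypothetical placeholder price in USD

new_tokens = tokens_after_reduction(OLD_TOKENS, REDUCTION_FACTOR)
old_cost = cost_usd(OLD_TOKENS, PRICE_PER_MILLION)
new_cost = cost_usd(new_tokens, PRICE_PER_MILLION)

print(f"tokens: {OLD_TOKENS} -> {new_tokens}")
print(f"cost ratio (old / new): {old_cost / new_cost:.1f}")
```

At a fixed per-token price, the cost for the same text falls by the same factor as the token count.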

As of May 2024, it is the leading model in the Large Model Systems Organization (LMSYS) Elo Arena Benchmarks by the University of California, Berkeley.^[5]
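For context on what an Elo-based ranking expresses, the standard Elo formulas can be sketched as follows. The ratings and the K-factor below are hypothetical, and the exact procedure used by the LMSYS arena may differ from this textbook form.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score (win probability) of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """Standard Elo update; actual is 1 for a win, 0.5 for a tie, 0 for a loss."""
    return rating + k * (actual - expected)


# Hypothetical ratings for two models compared head to head.
model_a, model_b = 1600.0, 1500.0
p_a = expected_score(model_a, model_b)

# If A wins the comparison, its rating rises by k * (1 - p_a).
new_a = updated_rating(model_a, p_a, actual=1.0)
print(f"expected score for A: {p_a:.2f}")
```

A 100-point rating gap corresponds to an expected score of roughly 0.64 for the higher-rated model, and equal ratings give 0.5.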

References

[1] Wiggers, Kyle (2024-05-13). "OpenAI debuts GPT-4o 'omni' model now powering ChatGPT". TechCrunch. Retrieved 2024-05-13.
[2] Field, Hayden (2024-05-13). "OpenAI launches new AI model GPT-4o and desktop version of ChatGPT". CNBC. Retrieved 2024-05-14.
[3] Altman, Sam. "https://twitter.com/sama/status/1787222050589028528". Twitter, X. Retrieved 2024-05-14.
[4] "Hello GPT-4o". OpenAI.
[5] Fedus, William. "GPT-4o is our new state-of-the-art frontier model".

@@ Line 24: / Line 24: @@
 == Capabilities ==
-GPT-4o achieves state-of-the-art results in voice, multilingual, and vision benchmarks, setting new records in audio speech recognition and translation.<ref name="Hello GPT-4o">{{Cite web |title=Hello GPT-4o |url=https://openai.com/index/hello-gpt-4o/ |website=OpenAI}}</ref> GPT-4o scores 88.7 on the Massive Multitask Language Understanding ([[MMLU]]) benchmark compared to 86.5 by GPT-4.<ref name="Hello GPT-4o" /> for Voice-to-Voice, unlike GPT-3.5 and GPT-4 which convert the voice to text, give the text to the model then convert the text back to voice using another model, GPT-4o natively supports Voice-to-Voice making the response near instant and seamless<ref name="Hello GPT-4o" />
+GPT-4o achieves state-of-the-art results in voice, multilingual, and vision benchmarks, setting new records in audio speech recognition and translation.<ref name="Hello GPT-4o">{{Cite web |title=Hello GPT-4o |url=https://openai.com/index/hello-gpt-4o/ |website=OpenAI}}</ref> GPT-4o scores 88.7 on the Massive Multitask Language Understanding ([[MMLU]]) benchmark compared to 86.5 by GPT-4.<ref name="Hello GPT-4o" /> for Voice-to-Voice, unlike GPT-3.5 and GPT-4 which convert the voice to text, give the text to the model then convert the text back to voice using another model, GPT-4o natively supports Voice-to-Voice making the response near instant and seamless.<ref name="Hello GPT-4o" />
-The model supports over 50 languages,<ref name="TechCrunch" /> covering over 97% of speakers. Mira Murati demonstrated the model's multilingual capability by speaking Italian to the model and having it translate between English and Italian during the live-streamed OpenAI demo event on 13 May 2024. In addition to this, the new tokenizer uses fewer tokens for certain (especially non-English) languages (4.4x fewer tokens for Gujarati) making it cheaper for those languages<ref name="Hello GPT-4o" />
+The model supports over 50 languages,<ref name="TechCrunch" /> covering over 97% of speakers. Mira Murati demonstrated the model's multilingual capability by speaking Italian to the model and having it translate between English and Italian during the live-streamed OpenAI demo event on 13 May 2024. In addition to this, the new tokenizer uses fewer tokens for certain languages, especially non-English languages such as Gujarati which uses 4.4x fewer tokens, making it cheaper for those languages.<ref name="Hello GPT-4o" />
 It is currently the leading model in the Large Model Systems Organization (LMSYS) [[Elo rating system|Elo]] Arena Benchmarks by the [[University of California, Berkeley]].<ref>{{Cite web |last=Fedus |first=William |title=GPT-4o is our new state-of-the-art frontier model. |url=https://twitter.com/LiamFedus/status/1790064963966370209}}</ref>
