Sora (text-to-video model)
Latest revision as of 22:45, 24 December 2024
| Developer(s) | OpenAI |
|---|---|
| Initial release | December 9, 2024 |
| Platform | OpenAI |
| Type | Text-to-video model |
| Website | sora.com |
Sora is a text-to-video model developed by OpenAI. The model generates short video clips based on user prompts, and can also extend existing short videos. Sora was released publicly for ChatGPT Plus and ChatGPT Pro users in December 2024.[1][2]
History
Several other text-to-video generating models had been created prior to Sora, including Meta's Make-A-Video, Runway's Gen-2, and Google's Lumiere, the last of which was, as of February 2024, still in its research phase.[3] OpenAI, the company behind Sora, had released DALL·E 3, the third of its DALL-E text-to-image models, in September 2023.[4]
The team that developed Sora named it after the Japanese word for sky to signify its "limitless creative potential".[5] On February 15, 2024, OpenAI first previewed Sora by releasing multiple clips of high-definition videos that it created, including an SUV driving down a mountain road, an animation of a "short fluffy monster" next to a candle, two people walking through Tokyo in the snow, and fake historical footage of the California gold rush, and stated that it was able to generate videos up to one minute long.[3] The company then shared a technical report, which highlighted the methods used to train the model.[6][7] OpenAI CEO Sam Altman also posted a series of tweets, responding to Twitter users' prompts with Sora-generated videos of the prompts.
In November 2024, an API key for Sora access was leaked by a group of testers on Hugging Face, who posted a manifesto protesting what they described as OpenAI's use of Sora for "art washing". OpenAI revoked access three hours after the leak was made public and stated that "hundreds of artists" had shaped the model's development and that "participation is voluntary."[8]
On December 9, 2024, OpenAI made Sora publicly available to ChatGPT Pro and ChatGPT Plus users. Prior to this, the company had provided limited access to a small "red team", including experts in misinformation and bias, to perform adversarial testing on the model.[4] The company also shared Sora with a small group of creative professionals, including video makers and artists, to seek feedback on its usefulness in creative fields.[9]
Capabilities and limitations
The technology behind Sora is an adaptation of the technology behind DALL-E 3. According to OpenAI, Sora is a diffusion transformer[10] – a denoising latent diffusion model with a Transformer as the denoiser. A video is generated in latent space by denoising 3D "patches", then transformed to standard space by a video decompressor. Re-captioning is used to augment training data, by using a video-to-text model to create detailed captions on videos.[7]
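The latent-diffusion loop described above can be illustrated with a toy sketch. Everything here is an illustrative assumption – the function names, patch shapes, and the stand-in denoiser are invented for exposition and bear no relation to OpenAI's actual implementation; the real model uses a Transformer over spacetime patches and a learned video decompressor.

```python
import random

# Toy sketch of latent video diffusion (all names and shapes are
# illustrative, not OpenAI's code). A "video" in latent space is a list
# of 3D patches, each reduced here to a short feature vector.

def make_noisy_patches(num_patches=8, dim=4, seed=0):
    """Sampling starts from pure Gaussian noise in latent space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_patches)]

def toy_denoiser(patches, t, prompt_embedding):
    """Stand-in for the Transformer denoiser: predicts the noise in each
    patch at timestep t, conditioned on a (fake) prompt embedding."""
    return [[0.5 * x - 0.1 * p for x, p in zip(patch, prompt_embedding)]
            for patch in patches]

def denoise(patches, steps=10, prompt_embedding=(1.0, 0.0, -1.0, 0.5)):
    """Iteratively subtract the predicted noise, stepping t from 1 to 0."""
    for step in range(steps, 0, -1):
        t = step / steps
        predicted = toy_denoiser(patches, t, prompt_embedding)
        patches = [[x - t * n for x, n in zip(patch, pred)]
                   for patch, pred in zip(patches, predicted)]
    return patches

def decode(patches):
    """Stand-in for the video decompressor mapping latents back to
    standard (pixel) space; here it just rounds the values."""
    return [[round(x, 3) for x in patch] for patch in patches]

latents = make_noisy_patches()
video = decode(denoise(latents))
print(len(video), len(video[0]))  # 8 patches, 4 features each
```

The design point the paragraph makes is that diffusion happens entirely in the compressed latent space; only the final decode step produces frames, which keeps the denoiser's input small regardless of output resolution.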
OpenAI trained the model using publicly available videos as well as copyrighted videos licensed for the purpose, but did not reveal the number or the exact source of the videos.[5] Upon its release, OpenAI acknowledged some of Sora's shortcomings, including its difficulty simulating complex physics, understanding causality, and distinguishing left from right.[11] One example shows a group of wolf pups seemingly multiplying and converging, creating a hard-to-follow scenario.[12] OpenAI also stated that, in adherence to the company's existing safety practices, Sora will restrict text prompts for sexual, violent, hateful, or celebrity imagery, as well as content featuring pre-existing intellectual property.[4]
Tim Brooks, a researcher on Sora, stated that the model figured out how to create 3D graphics from its dataset alone, while Bill Peebles, also a Sora researcher, said that the model automatically created different video angles without being prompted.[3] According to OpenAI, Sora-generated videos are tagged with C2PA metadata to indicate that they were AI-generated.[5]
Reception
Will Douglas Heaven of the MIT Technology Review called the demonstration videos "impressive", but noted that they must have been cherry-picked and may not be representative of Sora's typical output.[9] American academic Oren Etzioni expressed concerns over the technology's ability to create online disinformation for political campaigns.[5] For Wired, Steven Levy similarly wrote that it had the potential to become "a misinformation train wreck" and opined that its preview clips were "impressive" but "not perfect" and that it "show[ed] an emergent grasp of cinematic grammar" due to its unprompted shot changes. Levy added, "[i]t will be a very long time, if ever, before text-to-video threatens actual filmmaking."[3] Lisa Lacy of CNET called its example videos "remarkably realistic – except perhaps when a human face appears close up or when sea creatures are swimming".[4]
Filmmaker Tyler Perry announced he would be putting a planned $800 million expansion of his Atlanta studio on hold, expressing concern about Sora's potential impact on the film industry.[13][14]
See also
- VideoPoet – Text-to-video model by Google
- Dream Machine (text-to-video model)
References
[edit]- ^ "Sora | OpenAI". openai.com. Retrieved December 9, 2024.
- ^ Wang, Gerui. "How Sora And AI Videos Transform Media: Strengths And Challenges". Forbes. Retrieved December 24, 2024.
- ^ a b c d Levy, Steven (February 15, 2024). "OpenAI's Sora Turns AI Prompts Into Photorealistic Videos". Wired. Archived from the original on February 15, 2024. Retrieved February 16, 2024.
- ^ a b c d Lacy, Lisa (February 15, 2024). "Meet Sora, OpenAI's Text-to-Video Generator". CNET. Archived from the original on February 16, 2024. Retrieved February 16, 2024.
- ^ a b c d Metz, Cade (February 15, 2024). "OpenAI Unveils A.I. That Instantly Generates Eye-Popping Videos". The New York Times. Archived from the original on February 15, 2024. Retrieved February 15, 2024.
- ^ Brooks, Tim; Peebles, Bill; Holmes, Connor; DePue, Will; Guo, Yufei; Jing, Li; Schnurr, David; Taylor, Joe; Luhman, Troy; Luhman, Eric; Ng, Clarence Wing Yin; Wang, Ricky; Ramesh, Aditya (February 15, 2024). "Video generation models as world simulators". OpenAI. Archived from the original on February 16, 2024. Retrieved February 16, 2024.
- ^ a b Edwards, Benj (February 16, 2024). "OpenAI collapses media reality with Sora, a photorealistic AI video generator". Ars Technica. Archived from the original on February 17, 2024. Retrieved February 17, 2024.
- ^ Spangler, Todd (November 27, 2024). "OpenAI Shuts Down Sora Access After Artists Released Video-Generation Tool in Protest: 'We Are Not Your PR Puppets'". Variety. Retrieved December 2, 2024.
- ^ a b Heaven, Will Douglas (February 15, 2024). "OpenAI teases an amazing new generative video model called Sora". MIT Technology Review. Archived from the original on February 15, 2024. Retrieved February 15, 2024.
- ^ Peebles, William; Xie, Saining (2023). "Scalable Diffusion Models with Transformers". 2023 IEEE/CVF International Conference on Computer Vision (ICCV). pp. 4172–4182. arXiv:2212.09748. doi:10.1109/ICCV51070.2023.00387. ISBN 979-8-3503-0718-4. ISSN 2380-7504. S2CID 254854389. Archived from the original on February 17, 2024. Retrieved February 17, 2024.
- ^ Pequeño IV, Antonio (February 15, 2024). "OpenAI Reveals 'Sora': AI Video Model Capable Of Realistic Text-To-Video Prompts". Forbes. Archived from the original on February 15, 2024. Retrieved February 15, 2024.
- ^ "Sora-generated video of wolves playing with some video issues". ABC News Australia. Retrieved May 16, 2024.
- ^ Kilkenny, Katie (February 23, 2024). "Tyler Perry Puts $800M Studio Expansion on Hold After Seeing OpenAI's Sora: "Jobs Are Going to Be Lost"". The Hollywood Reporter. Archived from the original on February 26, 2024. Retrieved February 26, 2024.
- ^ Edwards, Benj (February 23, 2024). "Tyler Perry puts $800 million studio expansion on hold because of OpenAI's Sora". Ars Technica. Archived from the original on February 26, 2024. Retrieved February 26, 2024.
External links
- Official website
- Hailuo AI: Chinese AI video generator