
Apache Oozie: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
No edit summary
Citation bot (talk | contribs)
Add: website. | Use this bot. Report bugs. | Suggested by Whoop whoop pull up | #UCB_webform 90/258
 
(52 intermediate revisions by 34 users not shown)
Line 1: Line 1:
{{Short description|Workflow scheduler for Apache Hadoop}}
{{multiple issues|
{{advert|date=January 2013}}
{{notability|date=January 2013}}
{{primary sources|date=January 2013}}
{{primary sources|date=January 2013}}
{{Infobox software
{{underlinked|date=January 2013}}
| name = Apache Oozie
| logo = Apache Oozie logo.svg
| logo caption =
| screenshot = <!-- Image name is enough -->
| caption =
| screenshot alt =
| collapsible =
| author =
| developer = [[Apache Software Foundation]]
| released = <!-- {{Start date and age|YYYY|MM|DD|df=yes/no}} -->
| discontinued =
| latest release version = 5.2.1
| latest release date = {{Start date and age|2021|02|26|df=yes}}<ref>{{cite web|url=https://lists.apache.org/thread/jy4ny7lqfj31xl4djlmkyldsohyh7k65|access-date=27 September 2022|title=[ANNOUNCE] Apache Oozie 5.2.1 released}}</ref>
| latest preview version =
| latest preview date = <!-- {{Start date and age|YYYY|MM|DD|df=yes/no}} -->
| repo = {{URL|https://gitbox.apache.org/repos/asf/oozie.git|Oozie Repository}}
| programming language = [[Java (programming language)|Java]],<ref>{{cite web|url=https://github.com/apache/oozie/tree/master/core/src/main/java/org/apache/oozie|title=apache/oozie - core/src/main/java/org/apache/oozie|website=[[GitHub]] |access-date=28 May 2020}}</ref> [[JavaScript]]
| operating system = [[Cross-platform]]
| platform = [[Java virtual machine]]
| size =
| language =
| language count = <!-- Number only -->
| language footnote =
| genre =
| license = [[Apache License 2.0]]
| alexa =
| website = {{URL|//oozie.apache.org/}}
| standard =
| AsOf =
}}
}}
'''Oozie''' is a workflow scheduler system to manage '''[[Apache Hadoop|Hadoop]]''' jobs. It is a server based Workflow Engine specialized in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs. Oozie is a Java Web-Application that runs in a Java servlet-container.
'''Apache Oozie''' is a server-based [[workflow]] [[Scheduling (computing)|scheduling]] system to manage [[Apache Hadoop|Hadoop]] jobs.
For the purposes of Oozie, a workflow is a collection of actions (i.e. Hadoop Map/Reduce jobs, Pig jobs) arranged in a control dependency DAG (Direct Acyclic Graph). "control dependency" from one action to another means that the second action can't run until the first action has completed.
The workflow actions start jobs in remote systems (i.e. Hadoop, Pig). Upon action completion, the remote systems callback Oozie to notify the action completion, at this point Oozie proceeds to the next action in the workflow. Oozie workflows contain control flow nodes and action nodes.
Control flow nodes define the beginning and the end of a workflow ( start , end and fail nodes) and provide a mechanism to control the workflow execution path ( decision , fork and join nodes).Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions: Hadoop map-reduce, Hadoop file system, Pig, SSH, HTTP, eMail and Oozie sub-workflow. Oozie can be extended to support additional type of actions.


Workflows in Oozie are defined as a collection of control flow and action [[Vertex (graph theory)|nodes]] in a [[directed acyclic graph]]. Control flow nodes define the beginning and the end of a workflow (start, end, and failure nodes) as well as a mechanism to control the workflow execution path (decision, fork, and join nodes). Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions including Hadoop [[MapReduce]], Hadoop distributed file system operations, [[Pig (programming tool)|Pig]], [[Secure Shell|SSH]], and [[email]]. Oozie can also be extended to support additional types of actions.
Oozie workflows can be parameterized (using variables like ${inputDir} within the workflow definition). When submitting a workflow job values for the parameters must be provided. If properly parameterized (i.e. using different output directories) several identical workflow jobs can concurrently.


Oozie workflows can be parameterized using variables such as <code>${inputDir}</code> within the workflow definition. When submitting a workflow job, values for the parameters must be provided. If properly parameterized (using different output directories), several identical workflow jobs can run concurrently.
'''Oozie''' is distributed under Apache License 2.0

Oozie is implemented as a Java [[web application]] that runs in a [[Java servlet]] container and is distributed under the [[Apache License]] 2.0.


==References==
==References==
{{Reflist}}
* {{Official website|http://oozie.apache.org/}}

==External links==
* {{Official website|//oozie.apache.org/}}

{{Apache Software Foundation}}


[[Category:Apache Software Foundation projects|Oozie]]
[[Category:Hadoop]]
[[Category:Hadoop]]
[[Category:Workflow applications]]

Latest revision as of 20:30, 27 March 2023

Apache Oozie
Developer(s): Apache Software Foundation
Stable release: 5.2.1 / 26 February 2021[1]
Repository: Oozie Repository
Written in: Java,[2] JavaScript
Operating system: Cross-platform
Platform: Java virtual machine
License: Apache License 2.0
Website: oozie.apache.org

Apache Oozie is a server-based workflow scheduling system to manage Hadoop jobs.

Workflows in Oozie are defined as a collection of control flow and action nodes in a directed acyclic graph. Control flow nodes define the beginning and the end of a workflow (start, end, and failure nodes) as well as a mechanism to control the workflow execution path (decision, fork, and join nodes). Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions including Hadoop MapReduce, Hadoop distributed file system operations, Pig, SSH, and email. Oozie can also be extended to support additional types of actions.
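
The graph structure described above can be illustrated with a small sketch. It is not Oozie's workflow definition format or API: the node names and the `execution_order` helper are hypothetical, and Python's standard `graphlib` module stands in for whatever ordering logic a real scheduler uses. The sketch shows only that a fork/join workflow has a valid run order when, and only when, its graph is acyclic.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each node maps to the set of nodes that must finish first.
# "start", "fork", "join" and "end" play the role of control flow nodes;
# "mapreduce-step" and "pig-step" play the role of action nodes.
WORKFLOW = {
    "start": set(),
    "fork": {"start"},
    "mapreduce-step": {"fork"},
    "pig-step": {"fork"},
    "join": {"mapreduce-step", "pig-step"},
    "end": {"join"},
}


def execution_order(workflow):
    """Return one valid run order, or raise ValueError if the graph has a cycle."""
    try:
        return list(TopologicalSorter(workflow).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"workflow is not acyclic: {exc.args[1]}") from exc


order = execution_order(WORKFLOW)
print(order)
```

The two action nodes between the fork and the join have no dependency on each other, so either may appear first in the printed order; the join always comes after both.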

Oozie workflows can be parameterized using variables such as ${inputDir} within the workflow definition. When submitting a workflow job, values for the parameters must be provided. If properly parameterized (using different output directories), several identical workflow jobs can run concurrently.
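
As a rough illustration of this kind of parameterization, the sketch below fills `${...}` placeholders using Python's `string.Template`, which happens to share the `${name}` syntax. It does not reproduce how Oozie itself resolves parameters: the definition string, the `outputDir` parameter, the paths, and the `resolve` helper are all assumptions made for the example.

```python
from string import Template

# Hypothetical fragment of a workflow definition; only the ${...} placeholders matter.
DEFINITION = Template("input=${inputDir} output=${outputDir}")


def resolve(definition, params):
    """Fill every placeholder; a missing parameter raises ValueError."""
    try:
        return definition.substitute(params)
    except KeyError as exc:
        raise ValueError(f"missing workflow parameter: {exc.args[0]}") from exc


# Two submissions of the same definition, kept apart by distinct output directories.
run_a = resolve(DEFINITION, {"inputDir": "/data/in", "outputDir": "/data/out/run-a"})
run_b = resolve(DEFINITION, {"inputDir": "/data/in", "outputDir": "/data/out/run-b"})
print(run_a)
print(run_b)
```

Rejecting a submission with a missing value mirrors the requirement that parameters be provided at submission time, and giving each run its own output directory is what lets otherwise identical jobs run side by side without overwriting each other.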

Oozie is implemented as a Java web application that runs in a Java servlet container and is distributed under the Apache License 2.0.

References

  1. ^ "[ANNOUNCE] Apache Oozie 5.2.1 released". Retrieved 27 September 2022.
  2. ^ "apache/oozie - core/src/main/java/org/apache/oozie". GitHub. Retrieved 28 May 2020.