Dask (software): Difference between revisions

From Wikipedia, the free encyclopedia
m Perplextase moved page Dask library to Dask (software): library is a bit confusing (unless you are a programmer), so we should use "software", like the Pandas library does

Revision as of 18:53, 31 March 2020

Dask
Original author(s): Matthew Rocklin
Developer(s): Dask
Initial release: October 28, 2018
Stable release: 2.13.0 / March 25, 2020
Repository: Dask Repository
Written in: Python[1]
Operating system: Linux, Microsoft Windows, macOS
Available in: Python
Type: Data analytics
License: New BSD
Website: dask.org

Dask is an open-source library for parallel computing written in Python.[2][3] Originally developed by Matthew Rocklin, it is now a community project maintained and sponsored by developers and organizations.

Overview

Dask is a library composed of two parts. First, it includes a task scheduling component that builds dependency graphs and schedules the tasks in them. Second, it includes distributed data structures with APIs similar to pandas DataFrames and NumPy arrays. Dask has a variety of use cases; it can run on a single node and scale to clusters of a thousand nodes.[4]
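
The dependency-graph idea can be illustrated without Dask itself. The following is a minimal sketch in plain Python, assuming the encoding described in the cited paper: a graph is a dictionary whose values are either plain data or tuples of the form (callable, arguments...), where an argument naming another key is a dependency. The `get` function and the keys "x", "y", "sum" and "scaled" are hypothetical names chosen for this illustration; the evaluator is a toy that runs sequentially and is not Dask's scheduler.

```python
from operator import add, mul


def get(graph, key):
    """Evaluate `key` in a dictionary-based task graph.

    Toy sequential evaluator for illustration only; not Dask's scheduler.
    """
    cache = {}  # results computed so far, so each task runs at most once

    def resolve(arg):
        # A string argument that names a graph key is treated as a dependency.
        if isinstance(arg, str) and arg in graph:
            return compute(arg)
        return arg

    def compute(k):
        if k not in cache:
            task = graph[k]
            if isinstance(task, tuple) and task and callable(task[0]):
                # A task is a tuple: (callable, arg1, arg2, ...)
                func, *args = task
                cache[k] = func(*(resolve(a) for a in args))
            else:
                # Plain data, or an alias for another key
                cache[k] = resolve(task)
        return cache[k]

    return compute(key)


# Hypothetical example graph: (x + y) * 10
graph = {
    "x": 1,
    "y": 2,
    "sum": (add, "x", "y"),
    "scaled": (mul, "sum", 10),
}

print(get(graph, "scaled"))  # prints 30
```

Because each task names its inputs explicitly, a scheduler can tell which tasks are independent of one another and could run in parallel; this sketch simply evaluates them one at a time.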

References

  1. ^ "Dask: Parallel Computation with Blocked algorithms and Task Scheduling" (PDF). This paper introduces dask, a specification to encode parallel algorithms, using primitive Python dictionaries, tuples, and callables.
  2. ^ Daniel, Jesse C. (2019). Data Science at Scale with Python and Dask. Manning Publications. ISBN 9781617295607.
  3. ^ Rocklin, Matthew (2015). "Dask: Parallel Computation with Blocked algorithms and Task Scheduling". Proceedings of the 14th Python in Science Conference: 126–132. doi:10.25080/Majora-7b98e3ed-013.
  4. ^ https://docs.dask.org/en/latest/