ArangoDB

Developer(s)	ArangoDB GmbH
Initial release	2011
Stable release	3.9.3 / September 2, 2022
Repository	github.com/arangodb/arangodb
Written in	C++, JavaScript
Type	Multi-model database, Graph database, Document-oriented database, Key/Value database, Full-text Search Engine
License	Apache License 2.0
Website	arangodb.com

ArangoDB is a free and open-source native graph database system developed by ArangoDB Inc. It is a multi-model database system: it supports three data models (graphs, JSON documents, and key/value)^[1] with one database core and a unified query language, AQL (ArangoDB Query Language). AQL is mainly a declarative language^[2] and allows different data access patterns to be combined in a single query.^[3] ArangoDB is a NoSQL database system,^[4] but AQL is similar in many ways to SQL.^[5]

History

ArangoDB Inc. was founded in 2015 by Claudius Weinberger and Frank Celler.^[6] They originally called the database system "A Versatile Object Container", or AVOC for short, which led them to name the database AvocadoDB.^[7]^[8]^[9] Later, they changed the name to ArangoDB.^[10] The word "arango" refers to a little-known avocado variety grown in Cuba.^[11]

In January 2017, ArangoDB raised a seed round of 4.2 million euros led by Target Partners. In March 2019, ArangoDB raised 10 million dollars in Series A funding^[12] led by Bow Capital. In October 2021, ArangoDB raised 27.8 million dollars in Series B funding led by Iris Capital.^[13]

Release history


Release	First Release	Latest Minor Version	Latest Release	Feature Notes	Reference
3.9	2022-02-15	3.9.2	2022-06-07	Collections replicated on all cluster nodes can be combined with graphs sharded by document attributes to enable more local execution of graph queries ("Hybrid SmartGraphs", "Hybrid Disjoint SmartGraphs"). Language-agnostic tokenization of text ("Segmentation Analyzer").	Release Notes
3.8	2021-07-29	3.8.6	2022-02-23	Graph traversal algorithms to enumerate all paths between two vertices ("k Paths") and to emit paths in order of increasing edge weights ("Weighted Traversals"). Support for sliding window queries to aggregate adjacent documents, value ranges and time intervals. Geo-spatial queries can be combined with full-text search. Flexible data field pre-processing with custom queries ("AQL Analyzer") and the ability to chain built-in and custom analyzers ("Pipeline Analyzer"). Hardware-accelerated on-disk encryption.	Release Notes
3.7	2020-09-16	3.7.17	2022-02-01	Graphs replicated on all cluster nodes to execute graph traversals locally ("SatelliteGraphs"). Document validation using JSON Schema. Wildcard and fuzzy search support for full-text search. Key rotation for superuser JWT tokens, TLS certificates, and on-disk encryption keys.	Release Notes
3.6	2020-01-08	3.6.16	2021-09-06	Option to store all collections of a database on a single cluster node, to combine the performance of a single server and ACID semantics with a fault-tolerant cluster setup ("OneShard"). Parallel execution of queries on several cluster nodes. Late document materialization to only fetch the relevant documents from SORT/LIMIT queries and early pruning of non-matching documents in full collection scans. Inlining of certain subqueries to improve execution time.	Release Notes
3.5	2019-08-21	3.5.7	2020-12-30	Multi-document transactions with individual begin and commit / abort commands ("Stream Transactions"). Time-based removal of expired documents ("Time-to-live Index"). Stop condition support for graph traversals ("Pruning in Traversals"). Graph traversal algorithm to get multiple shortest paths ("k Shortest Paths"). Co-located joins in a cluster using identically sharded collections ("SmartJoins"). Consistent snapshot backup in cluster mode. Custom text pre-processors for full-text search ("Configurable Analyzers"). Data masking capabilities for attributes containing sensitive data / PII when creating backups.	Release Notes
3.4	2018-12-06	3.4.11	2020-09-09	Integrated full-text search and information retrieval engine ("ArangoSearch"). Improved geo-spatial index with GeoJSON support. Insert operations can automatically be turned into a replace if the target document already exists ("Repsert"). Round-robin load-balancer support for cloud environments. Query profiling to show detailed runtime information. Cluster-distributed aggregation queries. Native implementations in C++ of all built-in query functions. Multi-threaded dump and restore operations.	Release Notes
3.3	2017-12-22	3.3.25	2020-02-28	Datacenter to Datacenter Replication for disaster recovery ("DC2DC"). Encrypted backups. Deployment mode for single servers with automatic failover.	Release Notes
3.2	2017-07-20	3.2.18	2019-02-02	Distributed iterative graph processing with Pregel in single server and cluster. Collections replicated on all cluster nodes to execute joins with sharded data locally ("SatelliteCollections"). Fault-tolerant microservices. Support for composable, distance-based geo-queries. Export utility for multiple formats. Encryption of on-disk data. LDAP authentication.	Release Notes
3.1	2016-11-03	3.1.29	2018-06-23	Value-based sharding of large graph datasets for better data locality when traversing graphs ("SmartGraphs"). Support for vertex-centric indexes for more efficient graph traversals with filter conditions. New viewer for large graphs, supporting WebGL. Binary wire format ("VelocyStream"). Low-latency request handling using a boost-ASIO server infrastructure. Improved query editor and query explain output. Audit logging.	Release Notes
3.0	2016-07-23	3.0.12	2016-11-23	Cluster support with synchronous replication and automatic failover. Binary storage format ("VelocyPack"). Persistent indexes that are stored on disk for faster restarts.	Release Notes

Main features

JSON: ArangoDB uses JSON as its default storage format,^[14] but internally it uses VelocyPack, a fast and compact binary format for serialization and storage.^[15] ArangoDB can natively store a nested JSON object as a data entry inside a collection, so there is no need to disassemble JSON objects before storing them; the stored data simply keeps the tree structure of the JSON data.
Predictable performance: ArangoDB is written mainly in C++^[16] and manages its own memory to avoid unpredictable performance arising from garbage collection.
Scaling: ArangoDB provides scaling through clustering.^[17]
Reliability: ArangoDB provides datacenter-to-datacenter replication.^[18]
Kubernetes: ArangoDB runs on Kubernetes, including cloud-based Kubernetes services Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), and Microsoft Azure Kubernetes Service (AKS).^[19]
Microservices: ArangoDB provides integration with native JavaScript microservices directly on top of the DBMS using the Foxx framework.^[20]
Multiple query languages: The database has its own query language, AQL (ArangoDB Query Language), and also provides GraphQL to write flexible native web services directly on top of the DBMS.^[21]
Search: ArangoDB’s search engine combines boolean retrieval capabilities with generalized ranking components allowing for data retrieval based on a precise vector space model.^[22]
Pregel algorithm: Pregel is a system for large scale graph processing.^[23] Pregel is implemented in ArangoDB and can be used with predefined algorithms, e.g. PageRank, Single-Source Shortest Path and Connected components.^[24]
Transactions: ArangoDB supports user-definable transactions. Transactions in ArangoDB are atomic, consistent, isolated, and durable (ACID), but only if data is not sharded.^[25]
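
The vertex-centric model behind the Pregel feature listed above can be illustrated with a small PageRank sketch. This is plain Python running in a single process, not ArangoDB's implementation; the function name pagerank_supersteps, the damping factor of 0.85, and the sample edge list are assumptions chosen for illustration.

```python
from collections import defaultdict


def pagerank_supersteps(edges, damping=0.85, supersteps=20):
    """Vertex-centric PageRank sketch (hypothetical helper, not an ArangoDB API).

    In every superstep each vertex sends rank / out_degree along its
    outgoing edges, then recomputes its own rank from the messages it
    received once all vertices have finished sending.
    """
    vertices = sorted({v for edge in edges for v in edge})
    out_edges = defaultdict(list)
    for src, dst in edges:
        out_edges[src].append(dst)

    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(supersteps):
        inbox = defaultdict(float)  # messages delivered at the barrier
        dangling = 0.0              # rank held by vertices without out-edges

        # Send phase: every vertex distributes its current rank.
        for v in vertices:
            targets = out_edges[v]
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share
            else:
                dangling += rank[v]

        # Update phase: applied only after all messages were sent.
        rank = {
            v: (1 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in vertices
        }
    return rank


# Assumed sample graph: a cycle a -> b -> c -> a, plus d -> a.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
ranks = pagerank_supersteps(edges)
```

Because vertex d has no incoming edges, it only ever receives the constant (1 - damping) / n term, while the ranks of all vertices continue to sum to 1 after every superstep.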

Query language

AQL (ArangoDB Query Language) is the SQL-like query language^[26] used in ArangoDB. It supports CRUD operations for both documents (nodes) and edges, but it is not a data definition language (DDL). AQL also supports geospatial queries.

AQL is JSON-oriented as illustrated by the following queries:

// Return every document in a collection
FOR doc IN collection
  RETURN doc

// Count the number of documents in a collection
FOR doc IN collection
  COLLECT WITH COUNT INTO length
  RETURN length

// Add a new document into our collection
INSERT { _key: "john", name: "John", age: 45 } INTO collection

// Update document with key of "john" to have age 46.
UPDATE { _key: "john", age: 46 } IN collection

// Add an attribute numberOfLogins for all users with status active:
FOR u IN users
  FILTER u.active == true
  UPDATE u WITH { numberOfLogins: 0 } IN users

Parameterized query

The following parameterized query traverses the graph named @graph from a start vertex (@start) along outbound edges, to a depth of between 1 and @max, and returns the path to each vertex reached:

FOR vert, edge, path IN 1..@max OUTBOUND @start GRAPH @graph
  RETURN path
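
Bind parameters are supplied separately from the query text. As a minimal sketch, the Python below builds the JSON body that ArangoDB's HTTP cursor endpoint (POST /_api/cursor) accepts, with the query under "query" and the values under "bindVars". The helper name build_cursor_request, the start vertex "persons/alice", and the graph name "knows_graph" are assumptions for illustration. The local name check is a convenience, not part of any driver, and it does not handle collection bind parameters (@@name).

```python
import json
import re

QUERY = """
FOR vert, edge, path IN 1..@max OUTBOUND @start GRAPH @graph
  RETURN path
"""


def build_cursor_request(query, bind_vars):
    """Return the JSON body for POST /_api/cursor (hypothetical helper).

    Raises ValueError if the bind parameter names used in the query and
    the names supplied in bind_vars do not match exactly.
    """
    # Names as written in the query, without the leading "@".
    # The lookbehind skips "@@name" collection bind parameters.
    wanted = set(re.findall(r"(?<![@\w])@(\w+)", query))
    supplied = set(bind_vars)

    missing = wanted - supplied
    unused = supplied - wanted
    if missing or unused:
        raise ValueError(
            f"missing bind parameters: {sorted(missing)}, "
            f"unused bind parameters: {sorted(unused)}"
        )
    return json.dumps({"query": query, "bindVars": bind_vars})


# Assumed example values: a start vertex ID, a depth limit, a graph name.
payload = build_cursor_request(
    QUERY,
    {"start": "persons/alice", "max": 3, "graph": "knows_graph"},
)
```

Keeping values out of the query string means the same query text can be reused with different start vertices or depth limits, and user input is never concatenated into AQL.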

Editions

Open Source: ArangoDB Community Edition is a free graph database with native multi-model database capabilities written mainly in C++ and available under an open-source license (Apache 2).^[27]
Commercial self-managed: ArangoDB Enterprise is a paid subscription that includes graph-aware sharding (called "SmartGraphs")^[28] and collection replication (called "SatelliteCollections") to reduce query times,^[29] as well as additional security features.^[30]
Cloud: ArangoDB is offered as a cloud service called Oasis, providing ArangoDB databases as a service (DBaaS). ArangoDB Oasis provides the functionality of an ArangoDB cluster deployment while minimizing the amount of administrative effort required.^[31] ArangoDB Oasis runs on multiple cloud service providers, including AWS, Azure, and Google Cloud.^[32]

ArangoDB publishes a comparison of the different editions on its website.^[33]

References

[1] "Advantages of native multi-model in ArangoDB". ArangoDB. Retrieved 2022-07-26.

[2] "ArangoDB Query Language (AQL) Introduction | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-07-26.

[3] "AQL Query Patterns & Examples | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-07-26.

[4] Celler, Frank (2012-03-07). "ArangoDB's design objectives". ArangoDB. Retrieved 2022-07-26.

[5] "ArangoDB Query Language (AQL) Introduction | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-07-26.

[6] "Variety Database". www.avocadosource.com. Retrieved 2022-07-27.

[7] Ortell, Bill (2021-03-08), AvocadoDB, retrieved 2022-07-27

[8] AvocadoDB explained, retrieved 2022-07-27

[9] AvocadoDB Query Language Jan Steemann in english, retrieved 2022-07-27

[10] "'AvocadoDB' becomes 'ArangoDB'". ArangoDB. 2012-05-09. Retrieved 2022-07-27.

[11] "Variety Database". www.avocadosource.com. Retrieved 2022-08-05.

[12] Weinberger, Claudius (2019-03-14). "ArangoDB receives Series A Funding led by Bow Capital". ArangoDB. Retrieved 2022-07-27.

[13] "ArangoDB Announces $27.8 Million Series B Investment to Accelerate Development of Next-Generation Graph ML, Providing Advanced Analytics and AI Capabilities at Enterprise Scale". ArangoDB. Retrieved 2022-07-27.

[14] AvocadoDB explained, retrieved 2022-08-05

[15] AvocadoDB Query Language Jan Steemann in english, retrieved 2022-08-05

[16] ArangoDB, ArangoDB, 2022-08-05, retrieved 2022-08-05

[17] "Cluster | ArangoDB Deployment Modes | Architecture | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.

[18] "DC2DC Replication | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.

[19] "Kubernetes | Tutorials | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.

[20] "Foxx Microservices | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.

[21] ArangoDB, ArangoDB, 2022-08-05, retrieved 2022-08-05

[22] "ArangoSearch - Full-text search engine including similarity ranking capabilities". ArangoDB. Retrieved 2022-08-05.

[23] "Stanford University Pregel White paper" (PDF).

[24] "Pregel | Data Science | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.

[25] "Transactions | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-05.

[26] "Cluster | ArangoDB Deployment Modes | Architecture | Manual | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-11.

[27] ArangoDB, ArangoDB, 2022-08-11, retrieved 2022-08-11

[28] "ArangoDB SmartGraphs | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-11.

[29] "ArangoDB SatelliteCollections | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-11.

[30] "ArangoDB Enterprise Features". ArangoDB. Retrieved 2022-08-11.

[31] "Getting Started with ArangoDB Oasis | ArangoDB Documentation". www.arangodb.com. Retrieved 2022-08-11.

[32] "ArangoDB Oasis". ArangoDB Oasis. Retrieved 2022-08-11.

[33] "Subscriptions". ArangoDB. Retrieved 2022-08-11.