The AWS Neptune graph database is designed to store a large collection of complex relationships as a scalable service. It supports a number of different and evolving standards for representing knowledge and complex networks as graphs, and it has recently added hooks for a Graph Store Protocol, openCypher, Neptune ML, and TinkerPop Gremlin to its wide range of supported APIs.
Running on the AWS Cloud, it is an important new entrant in the increasingly competitive field of graph databases. In particular, Amazon is focusing on integrating the machine learning routines of the company’s artificial intelligence service, SageMaker, into AWS Neptune. The goal is a hybrid tool that both stores and analyzes data.
Graph databases store large collections of relationships between objects, people, ideas, or any other entity that can be represented in a database. While relational databases work well for storing data fields and one-to-many connections, graph databases are optimized for tracking many-to-many relationships, such as social networks (who knows whom) and concept networks (which ideas are connected to others).
Some of the natural use cases for graph databases like Neptune are:
- Fraud detection – Criminal behavior often falls into a predictable pattern, and graph databases are useful for finding patterns based on connections between events. A series of bad events using the same physical or IP address, for example, could lead to future events with the same addresses being flagged for scrutiny.
- Recommendation engines – If the graph can link similar items, a simple algorithm can help users find new friends or potential purchases by following these links.
- Knowledge graphs – One of the most sophisticated options is to create a network of relationships between abstract ideas, thoughts and concepts. This can act as the basis for more sophisticated search algorithms, language translation, or other forms of artificial intelligence.
- Money laundering monitors – Some regulations require financial institutions to track the flow of currency to help prevent crime. Graph databases are natural choices for modeling transactions and detecting net flows.
- Contact tracing – Epidemiologists often work to control the spread of disease by tracking how and when people meet and interact. Graph databases often have algorithms to trace the flow through multiple hops.
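The multi-hop tracing mentioned in the last item can be illustrated with a small breadth-first search. This is a minimal sketch in plain Python, not Neptune code; the contact records and the function names `build_graph` and `contacts_within` are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical contact records: each pair means two people met.
CONTACTS = [
    ("ana", "ben"),
    ("ben", "cho"),
    ("cho", "dev"),
    ("dev", "eli"),
    ("fay", "gus"),
]


def build_graph(pairs):
    """Build an undirected adjacency map from (person, person) pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def contacts_within(graph, start, max_hops):
    """Return {person: hops} for everyone reachable within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if seen[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(person, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[person] + 1
                queue.append(neighbor)
    del seen[start]  # the starting person is not their own contact
    return seen


graph = build_graph(CONTACTS)
exposed = contacts_within(graph, "ana", 2)
print(sorted(exposed.items()))  # [('ben', 1), ('cho', 2)]
```

A graph database performs this kind of traversal natively over stored edges, which is why many-to-many questions tend to be simpler to express there than as repeated relational joins.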
Neptune supports the two main conceptual models for graph data processing (Property Graph and RDF) and the different query languages for each. Users can choose a particular model when creating the database tables, but these are not easily interchangeable after creation.
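To make the difference between the two models concrete, here is a minimal sketch in plain Python that holds the same small fact set first as a property graph and then as RDF-style triples. The class names, the `http://example.org/` namespace, and the `to_triples` helper are assumptions made for illustration; they are not part of any Neptune API.

```python
from dataclasses import dataclass, field

EX = "http://example.org/"  # hypothetical namespace for the example
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass
class Node:
    id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


# Property graph: nodes and edges both carry key/value properties.
nodes = [
    Node("p1", "person", {"name": "Ana"}),
    Node("p2", "person", {"name": "Ben"}),
]
edges = [Edge("p1", "p2", "knows", {"since": 2019})]


def to_triples(nodes, edges):
    """Flatten a property graph into (subject, predicate, object) triples.

    Edge properties are dropped here: a plain triple has no slot for them,
    so RDF needs an extra modeling step to carry that information.
    """
    triples = []
    for node in nodes:
        subject = EX + node.id
        triples.append((subject, RDF_TYPE, EX + node.label))
        for key, value in node.properties.items():
            triples.append((subject, EX + key, value))
    for edge in edges:
        triples.append((EX + edge.source, EX + edge.label, EX + edge.target))
    return triples


triples = to_triples(nodes, edges)
for triple in triples:
    print(triple)
```

The `since` property on the edge has no direct counterpart in the triple list, which is one reason data modeled one way does not convert trivially to the other.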
Developers have several options for working with Neptune. The data can be inserted or queried with any of these protocols:
- Gremlin, to access the property graph data, from the Apache TinkerPop project
- openCypher, another option to query property graph data, from Neo4J
- SPARQL, to search RDF data, from the W3C
- Bolt, a binary version of the openCypher protocol, from Neo4J
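As a rough illustration of how the three query languages differ, the sketch below asks the same question ("whom does Ana know?") in each. The schema is hypothetical: `person` vertices with a `name` property, `knows` edges, and an `ex:` namespace for the RDF version. The script only holds and prints the query text; sending it to a database is left out.

```python
# Hypothetical schema: "person" vertices with a "name" property, "knows" edges.
QUERIES = {
    # Gremlin walks the property graph step by step.
    "gremlin": "g.V().has('person', 'name', 'Ana').out('knows').values('name')",
    # openCypher describes the property graph pattern to match.
    "opencypher": (
        "MATCH (a:person {name: 'Ana'})-[:knows]->(b:person) RETURN b.name"
    ),
    # SPARQL matches triple patterns in RDF data.
    "sparql": (
        "PREFIX ex: <http://example.org/> "
        "SELECT ?name WHERE { ?a ex:name 'Ana' . ?a ex:knows ?b . ?b ex:name ?name }"
    ),
}

for language, text in QUERIES.items():
    print(f"{language}: {text}")
```

Gremlin and openCypher both target property graph data, while SPARQL targets RDF data, so the choice of language follows from the data model chosen when the database was set up.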
Like other Amazon databases, AWS Neptune is designed to hide much of the complexity of installing the software and scaling it effectively. The service replicates the data to create read replicas across data centers and Availability Zones. Backups to S3 can be triggered automatically. If any node fails, other replicas can take over automatically.
Neptune pricing depends heavily on usage. The bill adds up the computing power ($0.098 per virtual machine hour and up), the amount of storage ($0.10 per GB per month), and the number of queries ($0.20 per 1 million requests). Backups can be cheaper ($0.02 per GB per month in the US East region). There is a free allowance of data transfer, but after the first terabyte it starts at $0.09 per GB and decreases with volume.
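The way these charges combine can be shown with simple arithmetic. The sketch below uses only the rates quoted above and ignores data transfer; the workload figures (one instance running 730 hours, 100 GB stored, 50 million requests, 100 GB of backups) are made up for illustration, and actual AWS bills depend on region and instance type.

```python
# Rates as quoted in the article (US dollars).
INSTANCE_PER_HOUR = 0.098      # smallest quoted virtual machine rate
STORAGE_PER_GB_MONTH = 0.10
REQUESTS_PER_MILLION = 0.20
BACKUP_PER_GB_MONTH = 0.02


def estimate_monthly_cost(instance_hours, storage_gb, requests, backup_gb=0.0):
    """Estimate a monthly bill from the quoted rates, excluding data transfer."""
    compute = instance_hours * INSTANCE_PER_HOUR
    storage = storage_gb * STORAGE_PER_GB_MONTH
    queries = requests / 1_000_000 * REQUESTS_PER_MILLION
    backup = backup_gb * BACKUP_PER_GB_MONTH
    return round(compute + storage + queries + backup, 2)


# Hypothetical workload: one instance for a 730-hour month.
cost = estimate_monthly_cost(
    instance_hours=730, storage_gb=100, requests=50_000_000, backup_gb=100
)
print(f"Estimated monthly cost: ${cost:.2f}")  # Estimated monthly cost: $93.54
```

In this made-up workload the instance hours account for about three quarters of the total, so instance size tends to matter more than storage or request volume for small deployments.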
Integration with Amazon’s SageMaker lets the machine learning tool classify the nodes and edges of the graph based on their attributes and the attributes of the nodes or edges connected to them. It can also predict the most likely connections based on a data set, making it possible to suggest links that are not yet in the data.
Some applications of this machine learning option include tasks from the physical world, such as finding paths or routes through geographic data that has been converted into a graph model. Other, more abstract tasks, such as knowledge synthesis, rely on graph models constructed from text or concept networks.
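The idea of predicting likely connections can be illustrated with a much simpler technique than the one SageMaker applies. The sketch below scores unconnected pairs by their number of shared neighbors, a classic link-prediction heuristic. It is a stand-in to show the concept, not Neptune ML's method, and the edge list and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical friendship edges.
EDGES = [
    ("ana", "ben"),
    ("ana", "cho"),
    ("ben", "cho"),
    ("ben", "dev"),
    ("cho", "dev"),
    ("dev", "eli"),
]


def neighbors(edges):
    """Build an undirected adjacency map from an edge list."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def rank_missing_links(edges):
    """Rank unconnected pairs by how many neighbors they share."""
    adjacency = neighbors(edges)
    scored = []
    for a, b in combinations(sorted(adjacency), 2):
        if b in adjacency[a]:
            continue  # already connected, nothing to predict
        shared = len(adjacency[a] & adjacency[b])
        if shared:
            scored.append((a, b, shared))
    # Highest score first, then alphabetical for a stable order.
    scored.sort(key=lambda item: (-item[2], item[0], item[1]))
    return scored


predictions = rank_missing_links(EDGES)
print(predictions)  # [('ana', 'dev', 2), ('ben', 'eli', 1), ('cho', 'eli', 1)]
```

A learned model can also weigh node and edge attributes, which is the part the article attributes to the SageMaker integration; the heuristic here uses graph structure alone.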
How do established companies compete?
Established database vendors are adding graph capabilities to their existing databases as another type of table. Oracle’s solution, for instance, can model either property graphs or RDF data under the umbrella of its main database. The company added graph search capabilities to its query language and created a collection of tools, such as Graph Studio, that make it easy to extend existing data sets to use graph capabilities.
Microsoft added property graph modeling capabilities to the Azure Cosmos DB service. Queries can be written in Gremlin to find the nodes, which are replicated automatically. The company has also added node and edge tables to SQL Server, allowing graph information to be stored alongside other relational data.
IBM added the Apache TinkerPop analytics framework to Db2 so that queries written in Gremlin can work alongside more standard SQL requests.
How do upstarts compete?
Founded in 2007, Neo4J is one of the leading graph database companies and is responsible for developing some of the standards that Neptune is emulating. The company develops the Neo4J database, one of the first successful graph databases. It has grown steadily and recently raised a funding round that valued it at more than $2 billion, making it far from a startup, but not in the same range as the largest companies in the space.
In interviews, Neo4J’s leadership team cites the company’s moderate size as an advantage because it can focus on building the best graph database ecosystem rather than branching out into every technology. The tool is also easy to download, allowing businesses to run it both in the cloud and on premises. The software can run locally, on a preconfigured image in the major clouds, or on Neo4J’s own Aura cloud.
Some other graph databases continue to grow. ArangoDB offers an enterprise version that can run on your own machines or as a preconfigured instance in the major clouds. A community version, without some of the features that support large multi-machine clusters, is also available for those who want access to the source code. ArangoDB advertises itself as “multi-model” because nodes can act as NoSQL key/value stores, parts of a graph, or both.
TigerGraph is also designed to address large data sets and can be used on local hardware or by subscribing to a service in the TigerGraph Cloud. It is designed to handle larger data sets using Apache Hadoop or Spark. The queries are written in GSQL.
Dgraph is a distributed graph database available under the Apache license or with a proprietary set of enterprise-grade layers for creating larger multi-machine clusters. The main query language is GraphQL, created by Facebook.
JanusGraph is a Linux Foundation project supported by several companies, including Target. The database is designed to work with some of the big NoSQL databases, such as Apache HBase, Google’s Bigtable, and Oracle’s Berkeley DB. Data analysis can be done through distributed MapReduce frameworks or Apache Spark.
Is there anything AWS Neptune can’t do?
Support for Property Graph and RDF gives Neptune great appeal for many projects, including those that will use both architectures. But the support is not complete and Neptune does not offer all the functions in the different standards. For example, inference queries for RDF data are reportedly not available yet because they slowed down performance.
Available solely as a cloud service, AWS Neptune also differs from AWS offerings like Aurora in that the core software is not available as an open source distribution, so developers cannot run local versions or move off AWS hardware.