Introduction to Neo4j Database

Neo4j is a graph database developed by Neo4j, Inc. What is a graph database, you ask?
A graph database stores data as nodes and stores the relationships between those data items as edges connecting the nodes.

A node represents an object in the graph. As shown in Fig. 1, nodes are simply data records shown in pictorial form. Fig. 1 shows empty nodes with no data stored in them yet.

Fig.1 Nodes in a graph database

How would a node store data in it?

Data is stored as properties, which are simple name/value pairs. In Fig. 2 we can see a node with its properties displayed as name/value pairs. The node has 2 properties: name and from.

Fig. 2 Property of a node


Nodes of the same category can be grouped together by applying a label to each member. For example, if we have data about employees working in an organization and their hometowns, then the employee information could be placed under one label called Employee or Person, and the hometown data could be grouped under another label called City. A node can have zero or more labels. Labels do not have any properties.

Fig 3. Nodes of the same category grouped together in a label

In Fig 3 we can see that Jane Doe and Emil belong to the same category and are therefore assigned the same label (Person), while Australia and Sweden are grouped together under the same label (Country). The Neo4j Browser visualization gives nodes with the same label the same color, which makes them easy to identify in query results.

Since Neo4j is schema-free, nodes of the same label can have common or unique properties.

Relationships in a graph database:

The real strength of Neo4j is the way it handles connected data. In a graph, a relationship is the edge connecting 2 nodes. In the same way, a relationship in Neo4j describes how the data record stored in one node is related to another node. Every relationship has exactly one direction, from a start node to an end node. A two-way connection is modeled either with two relationships, one in each direction, or by ignoring the direction when querying. Every relationship has a type and may or may not have properties.

Fig 4. Unidirectional [lives_in] relationship between Person and Country

Why use a graph database?

Neo4j provides a flexible, schema-free database system that was built to leverage not only the data but also the relationships between data. Unlike relational databases, which typically compute relationships at query time through joins, Neo4j stores the connections in the database itself, so they do not have to be recomputed for each query. Because Neo4j is designed around this simple yet powerful optimization, queries over highly connected data can run much faster, and reach deeper into the graph, than equivalent queries in databases that have to join tables.

You can find the steps to install Neo4j at


Cypher is a declarative graph query language that allows users to query, update and administer the graph database. Cypher is designed to be simple yet powerful: even highly complicated database queries can be written compactly. Cypher borrows its structure from SQL, so queries are built up from clauses.

Let’s start from the basics.

The simplest graph database has just a single node in it, so let’s start from there and build our way up to a highly connected graph database.

Cypher query to create a node without any label or property:

The CREATE keyword is used to create a node or a relationship in the Cypher query language. Parentheses () are used to denote a node.
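The query for this step is not reproduced in the text, so the following is a minimal sketch of what it most likely looks like: the CREATE keyword followed by an empty pair of parentheses, with no variable, label or property inside.

```cypher
// Creates one node with no label and no properties.
CREATE ()
```

Running the statement a second time creates a second, separate empty node, since CREATE never checks for existing data.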

Keywords in Cypher are case insensitive, so create, CREATE and Create all give the same result. Labels, property names and relationship types, however, are case sensitive, which means “Person” is not the same as “person”.

Fig 5. Created an empty node

Create a node with label:

Parentheses () denote a node. The variable ‘n’ refers to the node being created, and the label “Person” follows it after a colon. Any letter or variable name can be used instead of n.

Generalized syntax is: CREATE (variable : Label)

CREATE (n:Person)
Fig 6. Created a node with label Person

Create a node with Label and Property:

Create (n:Person{first_name:"Jane", last_name:"Doe"})
Fig 7. Create a node with Label Person and 2 properties: first_name and last_name

A node with the label Person is created with 2 properties: first_name and last_name. Properties are defined inside curly braces {} as key/value pairs. Property values can be strings, numbers or booleans, among other types.
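As a small illustration of the value types mentioned above, the sketch below creates a Person node with one string, one numeric and one boolean property. The property names age and is_employee and their values are hypothetical, chosen only for this example.

```cypher
// first_name is a string, age is an integer, is_employee is a boolean.
// age and is_employee are made-up properties used only for illustration.
CREATE (n:Person {first_name: "Jane", age: 30, is_employee: true})
```

Because Neo4j is schema-free, this node can sit alongside the earlier Person node even though the two do not share the same set of properties.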

Create a relationship between 2 nodes:

Relationships are defined in square brackets []. A relationship needs a start node and an end node. The generalized syntax for creating a relationship between 2 nodes is:

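Create (n:node1)-[r:relationship_name]->(c:node2)

n, r and c are temporary variables used in the query. node1 and node2 stand for the labels of the two nodes, and relationship_name stands for the relationship type.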

Create (p:Person{name:"Jane Doe"})-[r:lives_in]->(c:Country{name:"Australia"})
Fig 8. Created a new relationship of type [lives_in] between a Person node and a Country node, each with a name property
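Relationships can carry properties too, written in curly braces inside the square brackets. The sketch below is an illustration only: it reuses the Emil and Sweden names from Fig 3, and the since property and its value are hypothetical.

```cypher
// The relationship r has type lives_in and one property, since.
// "since" and 2015 are invented for this example.
CREATE (p:Person {name: "Emil"})-[r:lives_in {since: 2015}]->(c:Country {name: "Sweden"})
```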

The generalized syntax can also be written with the arrow pointing the other way:

Create (c:node2)<-[r:relationship_name]-(n:node1)

It produces the same result: the relationship still runs from node1 to node2.

Create (c:Country{name:"Australia"})<-[r:lives_in]-(p:Person{name:"Jane Doe"})
Fig 9. Another way of creating relationship between 2 nodes
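One thing to keep in mind is that CREATE always makes new nodes, so running both of the queries above leaves two Jane Doe nodes and two Australia nodes in the database. A possible way to connect nodes that already exist is to look them up first with the MATCH clause, which has not been covered in this tutorial yet. The following is a minimal sketch that assumes exactly one matching Person node and one matching Country node are already stored.

```cypher
// Find the existing nodes instead of creating new ones.
MATCH (p:Person {name: "Jane Doe"}), (c:Country {name: "Australia"})
// Create only the relationship between the matched nodes.
CREATE (p)-[r:lives_in]->(c)
```

If several nodes match either pattern, a relationship is created for every matching pair, so this form works best when the names are unique.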

Relationships are always directional, so a direction has to be given when one is created. Therefore you cannot do this:

Create (c:Country{name:"Australia"})-[r:lives_in]-(p:Person{name:"Jane Doe"})
Fig 10. If the direction of the relationship is not specified then it will result in an error.
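When a connection really does run both ways, one option is to create two relationships, one in each direction. The sketch below shows this in a single CREATE statement. The knows relationship type is hypothetical and is not part of the examples above.

```cypher
// The first pattern creates both nodes and the relationship from a to b.
// The second pattern reuses the variables a and b to add the reverse relationship.
CREATE (a:Person {name: "Jane Doe"})-[:knows]->(b:Person {name: "Emil"}),
       (b)-[:knows]->(a)
```

Each arrow still has a single direction, so the statement is accepted, and the graph ends up with two separate knows relationships between the same pair of nodes.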

So far we’ve only touched on the basics of creating nodes, labels, properties and relationships. The real test of any database management system is how efficiently it can access the data that it stores, so stay tuned for the next tutorial on Cypher queries to access the graph database.
