487

July 22nd, 2022 × #Databases #Graph Databases #Neo4j

Supper Club × Adam Cowley and Neo4j Database

Adam Cowley from Neo4j explains graph databases, how they work, use cases, and how to query data with Cypher. He discusses how Neo4j can be used in web development.

Topic 0 00:00

Transcript

Guest 1

Hey, everybody. Welcome to the Syntax Supper Club. We've got Adam Cowley. Is that how I say it, Adam? Yeah. That's right. Yeah. Awesome. We've got Adam Cowley from Neo4j on here to finally answer the question, what is the graph and what does it mean? Supper Club is where we bring on industry experts and talk to them about whatever it is they are working on. We are sponsored by Gatsby, the fastest front end for the headless web, and LightStep Incident Response, the all-in-one incident response platform for DevOps and SREs.

Guest 1

With me as always is Mister Scott Tolinski. How are you doing today, Scott? Hey. I'm doing good. Ready to go. Ready to learn about some of the database stuff. Yeah. Cool. Well, Adam, thank you so much for coming on. Really appreciate it. Yeah. No problem. Excited to be here. Thanks for having me.

Guest 1

So, I forget how we got connected, but basically, I have been in the web development database space for many, many years. And every couple of weeks, someone says, hey, what about Neo4j? And then, yeah, the opposite weeks, people say, what is the graph? So people kept tagging you and said, hey, you gotta have him on to sort of explain what these things are, and we'll dive into it. So, do you wanna give us a quick rundown of who you are, where you're from, and who you work for? Yeah. Sure. So my name is Adam. I'm a member of the DevRel team at Neo4j.

Guest 2

So I'm a developer advocate, and I look after the GraphAcademy platform, which is our free, online, self-paced learning platform.

Guest 2

So over the past year or so, I've been rebuilding the platform to make it more hands-on and to put a tighter integration there with other Neo4j projects.

Guest 2

Before Neo4j, I spent over 10 years as a freelance web developer and ran my own web development company.

Topic 1 02:33

Adam Cowley has worked with Neo4j for 5 years

Guest 2

And I've been basically building websites for about 20 years, which is scary.

Guest 1

Scott and I had that realization as well. We're like, man, we're getting into this, you know, we're starting to hear people that have not ever done IE 10 and sometimes even IE 11. And just like, wow. Like,

Guest 2

Here we are. Like, what was the first browser you used? So I guess it must have been Internet Explorer. But I tried to be a little bit different in school and college, and used Netscape as well at the same time. Oh,

Topic 2 02:57

Adam Cowley used Netscape browser in school to be different

Guest 1

that's impressive. Of course.

Guest 2

You know that they've got these animated charts where it shows the percentage of things and how they change over time? So I was watching one the other day, and I was captivated by it. It was just browser usage over the last 30 or so years. And I just remember watching Netscape go from, like, 35% all the way down to nothing. It just disappeared in, like, 2006.

Guest 2

But I've got huge respect for the people that, in 2005, 2006, were still using Netscape. Yeah. They're still going hard on it. Yeah. It's funny to see.

Guest 1

We've had some major browser changeovers over the years. Like, maybe 10 years ago, we had Opera switch over to using Chrome. And who else? Well, Chrome even existing. Yeah. Chrome forked WebKit into Blink. That was a big one for us as well. And, yeah, it's kinda interesting. I wonder what the next big step is in the browser game. I guess they're now allowing third-party browsers on the iPhone, which is gonna be interesting.

Guest 1

I'm both very excited about it and a little bit cautious about it because the developer in me is like, absolutely. We need more browsers. When a feature comes out, I want to use it immediately on Chrome on iOS and not have to wait 6 years for Safari to implement it. And then the other part of me is like, oh, well, now I have to test, like, what, 3 or 4 browsers on the iPhone?

Guest 3

But yeah. You know, Wes, for me, using Chrome on Android, Chrome on Android always was pretty dang solid. So if we can get the Android Chrome on iOS, I don't know, man. That'd be pretty sweet for me. So okay. So Neo4j is a graph database. Right? Is that what you describe it as? Yeah. That's correct. Okay. So what is a graph database?

Guest 2

Good question. Graph databases are a bit of a paradigm shift, I guess, to people coming from backgrounds using other databases, but, really, it's nothing to be scared of.

Topic 3 05:05

Graph databases use nodes and relationships instead of tables and rows

Guest 2

So when I talk about graphs, I don't mean like Highcharts, Chart.js, Nivo charts or something like that. I'm not talking about these things.

Guest 2

I'm talking about graph structures, so edges and vertices, or nodes and relationships as we refer to them

Guest 3

in Neo4j. Okay. So you have edges and vertices, and what does that all mean in terms of, like, let's say you're talking to somebody whose only experience is MySQL or even, like, NoSQL.

Guest 3

Like, what is truly the difference there? What is the difference in how you access data, how you store data?

Guest 2

So, typically, we use nodes to represent things in the database, while we use relationships either to connect them together or to represent the verbs of the use case. So if we look at it from a relational database point of view where, you know, we've got tables, columns, and rows, and then we've got foreign keys and things like that, when we translate that over into a graph database, typically the rows inside a table would become nodes in the database, and you would then maybe convert the table name to be a label. So nodes can be identified by one or more labels. So we could say if you had a table of people, you'd have, sorry.

Guest 2

I'm going down a rabbit hole. Yeah. No. Keep going. I'm interested.

Guest 2

Yeah. So the rows of the table would typically become nodes, with the name of the table being applied as a label.

Guest 2

And then the foreign keys in those tables would become relationships. Oh, okay.
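The mapping described here (rows become nodes, the table name becomes a label, foreign keys become relationships) can be sketched in a few lines of Python. This is only an illustration under assumptions: the table and column names (`people`, `courses`, `purchases`, `person_id`, `course_id`) and the `PURCHASED` relationship type are invented for the example, and plain dictionaries stand in for a real graph store.

```python
# Hypothetical relational rows; table and column names are illustrative only.
people = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
courses = [{"id": 10, "title": "Intro to Graphs"}]
purchases = [{"person_id": 1, "course_id": 10}, {"person_id": 2, "course_id": 10}]


def rows_to_nodes(label, rows):
    """Each row becomes a node; the table name is applied as the label."""
    return {
        (label, row["id"]): {"labels": [label], "properties": dict(row)}
        for row in rows
    }


def foreign_keys_to_relationships(rows, rel_type):
    """Each pair of foreign keys becomes a relationship between two nodes."""
    return [
        {
            "from": ("Person", row["person_id"]),
            "type": rel_type,
            "to": ("Course", row["course_id"]),
        }
        for row in rows
    ]


nodes = {**rows_to_nodes("Person", people), **rows_to_nodes("Course", courses)}
relationships = foreign_keys_to_relationships(purchases, "PURCHASED")

# Print each relationship in a Cypher-like notation.
for rel in relationships:
    buyer = nodes[rel["from"]]["properties"]["name"]
    course = nodes[rel["to"]]["properties"]["title"]
    print(f"({buyer})-[:{rel['type']}]->({course})")
```

The join table disappears in this sketch: each of its rows turns into a single relationship instead of a record that has to be joined at query time.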

Guest 1

So, like, let's take my course platform for an example, because that's what I'm familiar with. So we've got purchases. When somebody bought something, we've got users. A user can have purchases.

Guest 1

You've got courses, and a purchase can relate to a course. Then each course can have videos, and then progress can be related to that. So with a graph database, literally all those things are just, I'm just looking at a visual example here, those are just like documents or, what do you call it, a node thrown into the ether.

Guest 1

And then you can throw a label on those in order to identify

Topic 4 07:41

Nodes are like documents that can contain any data

Guest 2

what type of data is inside of it. Is that right? Yes. So if we take the statement, a person has purchased the course. So we've got Mhmm. In there, we've got 2 nodes, essentially: one representing the person or the customer, and then one representing the course. And then we create a relationship between those to say that this person has

Guest 3

purchased the course. Okay. Is there a specific type of data that works best in a graph database compared to your typical relational database or NoSQL database? As the saying goes,

Guest 2

if the connections between your data are as important as the data itself, then you have a graph problem, and you should be using a graph database.

Guest 2

Or the golden rule really is, if you're querying a database and you've got 3 or more joins in that query, then you'll need a graph database. Oh, interesting.

Guest 1

And do you need to know about the data ahead of time when you create it? I'm assuming you do. Like, when you want to create a person, ahead of time do you say a person will have a name and an age and an email address and maybe a password, and they'll have all the different types? Is that part of creating the database still similar? Or is it like, I remember years ago when NoSQL came out, people were just literally making anything that they wanted and relating it to each other, or is it like that? So the,

Topic 5 09:04

Schema is optional but key entities need to be defined upfront

Guest 2

you need to know the model itself upfront, the key entities, but the actual details of those can change on the fly. So when we look at Neo4j, it's a schema-free database or a schema-optional database, meaning that you can apply a schema to it, but it's not necessarily required.
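To make "schema optional" concrete, here is a minimal Python sketch. It is not Neo4j's constraint mechanism; the `missing_required` helper and the rule that a `Person` needs `name` and `email` are assumptions made up for illustration. The point is only that nodes with the same label can carry different properties, and a rule can be layered on afterwards if you want one.

```python
# Two nodes share the label "Person" but carry different properties.
people = [
    {"labels": ["Person"], "properties": {"name": "Ada", "dog": "Rex"}},
    {"labels": ["Person"], "properties": {"name": "Grace", "car": "hatchback"}},
]


def missing_required(node, required):
    """Return the required property names that the node lacks (hypothetical helper)."""
    return sorted(set(required) - set(node["properties"]))


# With no schema applied, nothing is required, so every node is accepted as-is.
no_schema = [missing_required(person, []) for person in people]

# With an optional rule applied (assumed here: a Person needs name and email),
# the same nodes can be checked against it.
with_rule = [missing_required(person, ["name", "email"]) for person in people]

print(no_schema)
print(with_rule)
```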

Guest 1

Okay. So the names of the nodes are required ahead of time, but you can put any data you want inside of those. So do people use that for, like, sometimes I find myself doing data scraping, or I see this is big in data science as well, where you've got a lot of just random data, pieces of metadata about a person. Like, you have a person, and one person has a dog and the dog's name. You wanna throw that in there.

Guest 1

Whereas maybe another person will have a car and the type of car that they have. And it's not possible to know that way ahead of time. And I even think about the advertising industry. The advertising industry stores literally every single type of data. Like, if I so much as smell a cinnamon bun on the way to the store, bam, I'm getting an ad for a cinnamon bun. I'm like, there's somewhere in a database where they have a table called recently smelled bakery products, and then cinnamon buns in an array there. There's no possible way that they could have scaffolded that out. Is that the kind of stuff that people are using this for? Yeah. Definitely. Like, real-time recommendations is a huge

Topic 6 10:32

Graph databases are good for real-time recommendations

Guest 2

graph use case. So I'm trying to find the name of the guy that wrote the book at the moment, but there's a study out there that says that you can know more about a person by the connections they have and their friendship groups than what they can say about themselves.

Guest 2

So if you're, let's say, a smoker, for example, or you like cinnamon buns, and you have friends who also like cinnamon buns, then, you know, there's a connection there.

Guest 2

So I think the study said that you could predict whether somebody was a smoker based on whether their immediate network were also smokers.

Guest 2

So you you can tell a lot about a,

Guest 3

person by their network and their connections. Yeah. And that holds true even, I mean, in social media, your follows kind of become your new network of humans. I listened to something where it's talking about how Instagram or Meta kind of knows more about you than many of your close friends do because it has those connections and everything about what you're following, what you're liking, whatever. It's built out this total picture of who you are as a person, and that's how the advertising is so targeted. This episode is sponsored today by LightStep Incident Response, the all-in-one incident response platform for DevOps and SREs.

Guest 3

LightStep Incident Response is built on the ServiceNow platform, which is used by over 6,000 companies.

Guest 3

You pay for the services you actually use, not the number of people on your team, which is important because you can scale your incident response team without adding to your bill. You get intelligent on-call scheduling and escalations.

Guest 3

You can get the full context on your service health, active alerts, and who's on call. LightStep Incident Response immediately pinpoints issues and groups alerts with machine learning, reducing your time to respond. And this means less noise and more personalization for precise notifications, where you can take control of declaring incidents and automatically notifying the right teams. And you can get unified incident response by seamlessly orchestrating alerts and incident triage with on-call scheduling across Slack, Teams, Zoom, desktop, and mobile. So what you'll want to do is head on over to lightstep.com/syntax, and you'll get a free 30-day trial. Listeners of Syntax will also receive a free LightStep Incident Response t-shirt after firing an alert or incident.

Guest 3

So check it out. Are you in need of incident response? Check out lightstep.com/syntax.

Guest 3

Let's bring it back. I see on your website a lot about AuraDB.

Topic 7 13:18

Neo4j is the underlying technology, AuraDB is the database service

Guest 3

So there'll be things saying, get started for free with AuraDB. Can you explain the difference really quick between AuraDB and Neo4j itself? Is AuraDB a part of Neo4j? How is it related there? So,

Guest 2

Neo4j itself is the underlying technology, whereas Neo4j Aura is the service, so database as a service.

Guest 2

Okay.

Guest 2

So you can log on to Neo4j Aura, and you can create a free instance.

Guest 2

So 50,000 nodes, 175,000 relationships, which is free in perpetuity, or you can choose to scale up and, basically, swipe a credit card, and off you go. Cool. So since most of our

Guest 3

listeners are web developers, right, this is something that is not just for data science and people doing applications. I mean, you have a full-on Node.js

Guest 2

driver and everything for this. Correct? Yep. Yeah. Correct. So there are five official drivers, so that's Java, JavaScript, Python, Go, and

Topic 8 14:22

There are official drivers for Java, JavaScript, Python, Go and .NET

Guest 3

.NET? Yeah. Yep.

Guest 2

And then we have community drivers there as well for things like PHP, Ruby, R.

Guest 2

But yeah. So, essentially, regardless of the driver you use, you connect to Neo4j over a TCP protocol called Bolt, and then you run what are called Cypher queries against the database. So Cypher is a pattern matching language which allows you to find patterns in your data and then return those results back through the driver, and then they can be consumed by any sort of application.
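As a rough sketch of that flow (connect over Bolt, run a Cypher query, consume the results), here is what it might look like with the official Python driver. The data model (`Person`, `Movie`, `ACTED_IN`, a `title` property) and the function name `fetch_movie_titles` are assumptions for illustration, not something taken from the conversation, and the connection details you pass in are your own.

```python
# Cypher query with a $name parameter; labels and properties are assumed.
QUERY = """
MATCH (p:Person {name: $name})-[:ACTED_IN]->(m:Movie)
RETURN m.title AS title
"""


def fetch_movie_titles(uri, user, password, name):
    """Connect over Bolt, run one Cypher query, and return the titles found."""
    # Imported inside the function so the sketch loads without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            # Keyword arguments are passed to the query as Cypher parameters.
            result = session.run(QUERY, name=name)
            return [record["title"] for record in result]
    finally:
        driver.close()
```

A call such as `fetch_movie_titles("bolt://localhost:7687", "neo4j", "<password>", "<actor name>")` would need a running Neo4j instance and the `neo4j` package installed; the URI and credentials shown are placeholders.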

Guest 1

Oh, this is cool. So that was my next question: how do you actually get data out of it? So it seems that you learn this new language, much like if you are running MySQL, you'd learn SQL to select it.

Guest 1

It seems like the Cypher language is much more terse because you don't have to do the joins. Right? Like, that's the whole pain in the butt part of running MySQL, that if you are selecting data that crosses multiple tables, you have to join those first, and that can be a really big pain in the butt. So it seems like Neo4j's Cypher is much easier.

Guest 1

Do you find that to be a little bit of a hurdle for people, having to learn Cypher first?

Topic 9 15:39

Learning Cypher querying is a paradigm shift but intuitive

Guest 2

So it's another paradigm shift, but I think once people get it, it starts to make sense. So I think from the data storage all the way to how you query it. So we talk about this whiteboard-friendly data model where the data model or the diagram you draw on a whiteboard is exactly the same all the way down to the database storage layer. So Neo4j stores data in such a way that makes traversing graphs simple to do, and Cypher is part of that. So if you think of a graph diagram which is drawn on a piece of paper, you've got circles which are connected together by arrows, and maybe you draw some boxes around the relationships to mark relationship types.

Guest 2

This is exactly the same way that you query the data. So for me, sort of 8, 9 years ago when I started looking at different types of databases, this was the thing that really drew me to Neo4j: when you query using Cypher, you basically draw those patterns out using ASCII art. So say we've got those circles on a piece of paper. We use parentheses to draw a circle around a node, and then we have some information inside there. We use dashes, square brackets, and a greater than or less than to represent the line that goes between the two circles.

Guest 2

And then we basically draw out the pattern of, you know, person, acted in movie, in genre, and then we go from there.
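The ASCII-art idea can be shown by assembling such a pattern piece by piece. This Python sketch only builds a query string; the helper names `node` and `rel`, the labels, and the relationship types `ACTED_IN` and `IN_GENRE` are illustrative assumptions based on the person, movie, genre example above.

```python
def node(variable, label):
    """A node is drawn as a circle: parentheses around a variable and a label."""
    return f"({variable}:{label})"


def rel(rel_type, direction="out"):
    """A relationship is drawn as an arrow: dashes, square brackets, and > or <."""
    if direction == "out":
        return f"-[:{rel_type}]->"
    return f"<-[:{rel_type}]-"


# person, acted in movie, in genre
pattern = (
    node("p", "Person")
    + rel("ACTED_IN")
    + node("m", "Movie")
    + rel("IN_GENRE")
    + node("g", "Genre")
)
query = f"MATCH {pattern} RETURN p.name, m.title, g.name"
print(query)
# MATCH (p:Person)-[:ACTED_IN]->(m:Movie)-[:IN_GENRE]->(g:Genre) RETURN p.name, m.title, g.name
```

Read left to right, the pattern mirrors the whiteboard drawing: two hops across three nodes, with no join clauses.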

Guest 1

Man, it's so true. Like, sometimes you think about just a simple, you got a person, and the person has courses and stuff like that. But then you start thinking about this really, really complicated: show me people who visited Lisbon and touched upon an airport code in the last 3 years, but haven't eaten a cinnamon bun.

Guest 1

It's unreal that you can express such a complicated query in what seems to be a little bit of code because, like, so I'm coming from a MongoDB background, and whenever we get complicated in MongoDB, you have to switch from regular-ass queries into aggregations.

Guest 1

Which are very Yeah. Because and then aggregations are such a pain to write, and I'm just, like, always googling things and whatnot. And I'm glad that I have the option. So if I wanna run, like, how much money did I make per year in the last 5 years minus this one course, I can get that data if I want to, but it's like

Guest 3

a, oh, okay. Here we go. You know? I feel the same way about aggregations, Wes. So whenever I have to, I go into Google, I type in MongoDB aggregations, and my first thought is, here we go. Yeah. There we go again. Great. I'm so excited.

Guest 1

Yep. I've been there. Yeah. Yeah. Oh, that's great. I also think that it's really interesting to look at different querying languages because I had the same thing when I hit, Sanity has this, well, they've open sourced it. They have a standard called GROQ, G-R-O-Q, and it's a different way to query data. And at first, I was like, well, we have GraphQL. What are you doing making another query language? And then the answer to that is, well, not really.

Guest 1

This is a lot more powerful in that if you want to select things that are on sale, but more than $40, and/or people that have seen it, you know.

Topic 10 19:00

GraphQL and Neo4j work well together with custom GraphQL library

Guest 1

So it's kind of cool to see languages that are going past that, and that it's necessary to create a new

Guest 2

query language like Cypher in this case. Yeah. The really nice thing about Cypher as well is it's really powerful. You could do a lot. So, I mean, it's powerful, but it's maybe a little bit dangerous. You can do too many things with it, but you can really do whatever you want. So you can match a pattern in the graph, and then you could do some update operations.

Guest 2

Then maybe you can traverse off the nodes that you're on into a new subgraph, and then you can query that as well.

Guest 2

Do more read/write operations. You can get really creative with what you do. Yeah. So okay. I'm glad Wes mentioned

Guest 3

GraphQL.

Guest 3

So one of the big confusion points, I think, for some people is you hear graph, you hear GraphQL, and you say, how are they related? I know, you know, our API is a GraphQL API. I know, Wes, you've worked quite a bit with GraphQL.

Guest 3

It seems like graph databases and GraphQL are separate but related ideas. Do they play well together? Is there a special pairing there? Do they work well together? How are they related?

Guest 2

They do. And I mean, take this with a pinch of salt because I'm paid to say this, but it makes complete sense to use Neo4j and GraphQL together, so we've got our own GraphQL library for that.

Guest 2

So what the Neo4j GraphQL library does is it translates the GraphQL query into a Cypher query and then runs that against your database.
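To give a feel for what "translating a GraphQL query into a Cypher query" means, here is a deliberately tiny Python toy. It is not the Neo4j GraphQL library and does not reproduce its output; the `to_cypher` function, the dictionary shape, and the `Movie` fields are all assumptions made up for this sketch.

```python
def to_cypher(selection):
    """Turn a simplified GraphQL-style selection into a Cypher read query (toy)."""
    label = selection["type"]
    returns = ", ".join(f"n.{field} AS {field}" for field in selection["fields"])
    return f"MATCH (n:{label}) RETURN {returns}"


# Stands in for a GraphQL query such as: { movies { title released } }
selection = {"type": "Movie", "fields": ["title", "released"]}
print(to_cypher(selection))
# MATCH (n:Movie) RETURN n.title AS title, n.released AS released
```

The useful property of the real approach is the same as in the toy: one GraphQL request becomes one database query, rather than a chain of resolver calls.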

Guest 2

So I'm a little bit on the fence. So, again, like I said, I've been doing this for, like, 20 years. So there's part of me that's worried that I'm too old to get new things now, but there's something about GraphQL which I'm a little bit skeptical about. On the surface, it looks like a really neat thing to do. So you can create this contract with the front end, and then the front end has got these definitions that pull through to the back end, and then you create your resolvers to pull data from wherever that may be. That can be from one database. That can be from many databases.

Guest 2

On the surface of it, that sounds great.

Guest 2

I worked on a few projects where we've used, for example, MongoDB for the main data store and then used Neo4j as a graph data store on the side to do the traversals and the, you know, people who bought this also bought this type queries.

Guest 2

What happens is you start to query one database, and then maybe you get a set of IDs back.

Guest 2

You've got latency over the wire as you communicate with that first database.

Guest 2

You then get that information back, and then you go into the second database with more added latency to get that information back, and things start to get slow and start to get messy.

Guest 2

Whereas if you store everything in a graph database, then everything is all there. It's in one place.
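The "people who bought this also bought this" query mentioned above is a two-hop traversal, and when all the data sits in one place it can be answered in a single pass. The Python sketch below runs over an in-memory list rather than a database; the purchase data and the `also_bought` function are invented for illustration.

```python
from collections import Counter

# Hypothetical purchase relationships: (person, product).
purchases = [
    ("ada", "keyboard"), ("ada", "mouse"),
    ("grace", "keyboard"), ("grace", "monitor"),
    ("linus", "keyboard"), ("linus", "mouse"),
]


def also_bought(anchor, purchases):
    """Start at the anchor product, hop to its buyers, then to their other products."""
    buyers = {person for person, product in purchases if product == anchor}
    counts = Counter(
        product
        for person, product in purchases
        if person in buyers and product != anchor
    )
    return counts.most_common()


print(also_bought("keyboard", purchases))
# [('mouse', 2), ('monitor', 1)]
```

In the split setup described earlier, the first hop would come from one store and the product details from another, adding a network round trip between the two steps.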

Guest 2

You can hit, for example, an anchor node, and then you traverse out through the graph to find the information that you're looking for. Cool. So okay. So GraphQL,

Guest 3

You have a GraphQL library.

Guest 3

You can query your data. And it's actually really neat, some of the stuff here with these directives. You can use a relationship directive, and the Cypher directive that I'm seeing here is pretty neat. But to be clear, you don't need GraphQL to connect to a Neo4j graph database. You can just use your own Node libraries or whatever drivers. You can use Cypher directly? Yeah. So it seems like you could use GraphQL if you want to. You don't have to use GraphQL if you don't want to, and it's all good with Neo4j, but GraphQL could be a fit. It's a direct way to query the graph.

Topic 11 22:58

Neo4j GraphQL library translates GraphQL to Cypher queries

Guest 3

Looks super neat here.

Guest 3

I also saw something about, like, syncing with MongoDB.

Guest 3

Is that a thing? Like, I know you mentioned it just now with latency. Is that the same thing as what I'm seeing with being able to sync a

Guest 2

Neo4j graph with a MongoDB instance? Yeah. So you can do it in one of two ways. Either you use Neo4j as your main data store and then you sync the data out to MongoDB. So you can do that with a transaction handler. So it's basically writing a piece of Java that, when a transaction commits, pushes the data out. Or vice versa: you can have the data pulling into Neo4j as a secondary database. I've seen people use things, you know, like, you maybe write the data to MongoDB.

Topic 12 23:46

Data can be synced between MongoDB and Neo4j

Guest 2

That data gets pushed up to a Kafka queue, and then that gets distributed and consumed into Neo4j. There's also, I think, a community connector, but I'm not sure what state that's in at the moment. Wow. Do you see people migrating to

Guest 3

Neo4j from other databases, or is it typically new applications written from scratch? Like, is it possible to migrate a MySQL or even a MongoDB database to a Neo4j graph database? Is that something that people are doing, or is it just totally too tough? Yes. So there are inbuilt tools that you can

Guest 2

migrate data with.

Guest 2

Yeah. I don't know the tactful way of answering this question, to be honest. So, yeah, anything is possible. Right? Like, if you're deciding, you know what? No graph at all. I see you're wearing a JavaScript shirt, right? Yeah. That's right. You're a web developer.

Guest 1

Like, could you write Neo4j queries in the client? Or, like, what does that look like? Do people typically write an API on the server? Do you even need to write an API? Or is Cypher the API? Like, what does that look like if I wanted to pull in some data and put it in a div? What does the process of getting there look like? So there are 2 ways of doing this, really. So you can either put an API in between your

Guest 2

front end and Neo4j. And that means that you get to obscure the credentials and, you know, you keep Neo4j safe. You don't expose any password or anything like that. Or you can connect directly from a JavaScript application, so that could be, for example, a React application, straight into Neo4j using the Neo4j JavaScript driver. So if you start to play around with Neo4j, the first thing you'll see is the Neo4j Browser. Yep. So it's kind of like a workbench or toolkit. That's where you write Cypher queries, then you get graph visualizations and tables back. And that uses the Neo4j JavaScript driver to connect directly to Neo4j, run a Cypher query, get the info back, and then display that. And that's built in React.

Guest 2

So we've got a set of tools that do that. Or if you wanna be more safe about it, you can put the API in between the two.

Guest 1

Okay. And what about like, I'm just thinking about one of our sponsors, which is Hasura.

Guest 1

And, like, Hasura takes a Postgres database, and then they put a whole set of code in front of a Postgres database that allows you to, like, their whole service is, they give you a GraphQL API. They do authentication. They do events. They do webhooks. They do pretty much everything. And you can just connect straight to your GraphQL database. Does Neo4j cover any of those, like, services as well? Or is it strictly just a database? Like, if I wanted to, say, add authentication where users can only access purchases that are their own, is that something that is part of Neo4j, or is that something that's part of custom implementation code that you would have to write? So that'd be custom implementation for that. So there are some features in, like, community-built libraries and things like the Neo4j GraphQL Library where

Guest 2

it supports things like authentication and roles and things like that. And you can store your users and roles inside Neo4j, and you return as part of the user information a set of roles. So that could just be an array of strings, and then in the GraphQL library, you can set a directive to say that only users of a certain role can perform this mutation.
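The role check described here boils down to comparing a user's array of role strings against the roles a mutation allows. The Python sketch below shows that logic only; it is not the GraphQL library's directive syntax, and the mutation names, role names, and `can_perform` function are hypothetical.

```python
# Hypothetical mapping from mutation name to the roles allowed to run it.
REQUIRED_ROLES = {
    "deleteCourse": {"admin"},
    "updateProgress": {"admin", "student"},
}


def can_perform(user, mutation):
    """Allow the mutation only if the user holds at least one permitted role."""
    allowed = REQUIRED_ROLES.get(mutation, set())
    return bool(allowed & set(user["roles"]))


# The user record carries its roles as a plain array of strings.
student = {"name": "Ada", "roles": ["student"]}
admin = {"name": "Grace", "roles": ["admin"]}

print(can_perform(student, "updateProgress"))  # True
print(can_perform(student, "deleteCourse"))    # False
print(can_perform(admin, "deleteCourse"))      # True
```

Mutations that are not listed are denied in this sketch, which is a design choice of the example rather than a statement about how the library behaves.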

Guest 1

Okay. Okay.

Guest 1

What about events or webhooks? Is that something as well? Often in a database, you would want to say, when somebody purchases something, fire off these events. Again, is that part of the database? Does the database fire events that you could listen to, or is that again more implementation code?

Guest 2

So, I mean, for me, I would put that in the application layer, but it is possible to write. If you know Java, then it's possible to write kind of triggers to distribute data out to different sources and call webhooks and things like that. Oh, okay. So you could

Guest 1

make, like, a first-party

Guest 2

extension or plug-in or whatever you call it and add that directly to your database, which is written in Java? Yep. That's right. Yeah. So the j in Neo4j stands for Java. So I joke, when I joined the company 5 years or so ago,