I want to write a series of posts describing a walk-through of deploying and running Neo4j with Certificates.
This isn't to say that this is the only way, or even the best way, to achieve this - but hopefully this series can serve as an introduction and signpost to getting your own deployment going the way you want.
I also want to keep this fairly version-neutral, talking about the concepts that we need to cover, rather than the syntax we need to write.
In fact, I'm going to try to keep Neo4j as far out of this as possible (at least until we configure it later) and talk more conceptually about what we are doing with securing any application with certificates.
If you've ever worked with certificates for websites
When troubleshooting SSL configurations with customers, we often see that the bigger picture isn't fully understood.
This can lead to blindly following an online guide, knowing what to type, but not why.
I hope this series of guides can help bridge this knowledge gap to some extent.
Some Quick Terms
There are better, and more correct, descriptions of these terms online, but I need to introduce some of the major characters in our story.
Hopefully, these quick and cheap descriptions can help us use these terms without blindly repeating the jargon.
Certificate
This is sometimes called a public certificate, or public key.
This is a file which contains the public side of a public-private keypair (the basis of our SSL/TLS story).
Your certificate is public and can be distributed as far and wide as you like, without compromising security - in fact certificates being sent on the wire is how the authenticity is checked between clients and servers.
They're usually signed by a Certificate Authority, an entity we'll mention below.
You can have self-signed certificates, but they're not much use beyond single-instance test setups.
Private Key
Or often simply called the "key". This is the other half of the public-private keypair and is responsible for the secret stuff.
Again, we usually use the term to mean the file rather than the contents within it, but it's somewhat interchangeable.
We need to protect access to these keys and should never provide them to "third-parties", be that sending server keys to clients, random strangers, or even your friendly Technical Support Engineers.
Certificate Authority
Generally shortened to "CA". This is an entity that signs, or produces, your certificates.
This could be something you set up, something managed by your IT Security Team, or an external or online entity. More on this point in a moment!
Root & Intermediate Certificate Authority
At the top of a certificate trust chain is a Root Certificate Authority that is authorised to sign its own certificate.
It has to be this way, because if its certificate were signed by a different authority, it wouldn't be at the root.
By the way, any CA that sits between the Root CA and your server certificate is referred to as an Intermediate CA.
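As a rough illustration of the chain idea, here is a toy model in Python. It is only a sketch: each certificate is reduced to a subject and an issuer, the CA names are made up for the example, and no real signatures are checked. The root is simply the entry whose issuer is itself.

```python
# Toy trust chain: subject -> issuer. All names here are hypothetical.
CERTS = {
    "server1.neo4j.myorg.test": "MyOrg Intermediate CA",
    "MyOrg Intermediate CA": "MyOrg Root CA",
    "MyOrg Root CA": "MyOrg Root CA",  # self-signed: subject == issuer
}


def chain_to_root(subject, certs):
    """Return the subjects from the given certificate up to the self-signed root."""
    if subject not in certs:
        raise ValueError(f"unknown certificate: {subject}")
    chain = [subject]
    while certs[subject] != subject:
        subject = certs[subject]
        if subject in chain:
            raise ValueError("issuer loop without a self-signed root")
        if subject not in certs:
            raise ValueError(f"unknown issuer: {subject}")
        chain.append(subject)
    return chain


def classify(chain):
    """Label each link: server certificate, intermediate CA(s), root CA."""
    labels = []
    for position, subject in enumerate(chain):
        if position == len(chain) - 1:
            role = "root CA"
        elif position == 0:
            role = "server certificate"
        else:
            role = "intermediate CA"
        labels.append((subject, role))
    return labels


for subject, role in classify(chain_to_root("server1.neo4j.myorg.test", CERTS)):
    print(f"{role}: {subject}")
```

A real client does the same walk, but verifies each signature along the way and stops at a root it already trusts.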
So, the first step is to decide where we're sourcing certificates from. What Certificate Authority are we going to use?
We could generate our own CA for our own PKI (certificate hierarchy), which is ok for testing internal projects, but we generally want something more robust for bigger deployments or production environments.
Who the CA is doesn't really matter to the strength of the TLS connection, but more in terms of the level of trust we can put on the certificate it signs.
In another respect, your locally-built CA is fully customisable: you can set it up however you want.
At the other end of this scale, a commercial CA that you'd purchase certificates from online might be the most restrictive in terms of configuration.
Your IT Security Team's CA will hopefully be a good blend: well managed and supported, yet more receptive to your needs and standards.
The next thing I'd advise that we consider is how many certificates (and associated keys) we want to use.
All of your server instances could share a private key and certificate, or they could each have unique certificates.
There's not really any major difference in security, but some organisations might have rules for/against using a single certificate across multiple servers.
I'd go for individual instance certificates every time, unless I find myself spinning up new nodes in an ad-hoc (containerised?) fashion.
There's a little bit more work in making individual certificates for each server instance, but it's not that hard to do and of course automation will help here a lot.
In most environments, your database servers will not disappear and be recreated at random, so including a quick certificate creation step as part of the set up work should be quick, and wouldn't be performed very often.
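To give a feel for how small that automation step can be, here is a sketch that builds one key-and-CSR command per instance. It only assembles the command lines and does not run them. The host list and file names are assumptions for the example, and the `-addext` option needs OpenSSL 1.1.1 or later.

```python
def csr_command(fqdn, extra_sans=()):
    """Build (but do not run) an openssl command for one instance's key and CSR.

    File names are derived from the FQDN; that naming is just an assumption
    made for this sketch.
    """
    # Always list the primary FQDN first in the SAN extension.
    sans = ",".join(f"DNS:{name}" for name in (fqdn, *extra_sans))
    return [
        "openssl", "req", "-new",
        "-newkey", "rsa:2048",
        "-nodes",  # leaves the private key unencrypted on disk
        "-keyout", f"{fqdn}.key",
        "-out", f"{fqdn}.csr",
        "-subj", f"/CN={fqdn}",
        "-addext", f"subjectAltName={sans}",
    ]


# Hypothetical instance names, one certificate request each.
hosts = ["server1.neo4j.myorg.test", "server2.neo4j.myorg.test"]
commands = {host: csr_command(host) for host in hosts}

for host, command in commands.items():
    print(" ".join(command))
```

Each resulting CSR would then go to whichever CA you settled on for signing. The private key stays on the server it was generated for.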
A major step (or mis-step) is getting the names of the server instances set up in a manner that supports your deployment.
From the outset, I'd suggest that you come up with a canonical hostname for your server instances, if they don't have one already, and you use this name exclusively.
You generally can't get an SSL certificate for an IP address. You can have an IP address listed as a Subject Alternative Name (SAN) on a certificate, but if you're setting these up you probably don't need this guide!
For simplicity, let's say we'll always use fully qualified domain names (server1.neo4j.myorg.test), and never the short names (server1) or the IP addresses these map to.
Your IP addresses might change, but well thought out fully qualified domain names can live forever!
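If you want to check a list of names against this rule, a few lines of Python will do. This is a minimal sketch, and the "contains a dot" test for a fully qualified name is a deliberate simplification rather than a full DNS check.

```python
import ipaddress


def classify_name(name):
    """Classify a host reference as 'ip', 'short' or 'fqdn' (simplified)."""
    try:
        ipaddress.ip_address(name)
        return "ip"
    except ValueError:
        pass
    # Simplification: any dotted name counts as fully qualified here.
    return "fqdn" if "." in name.strip(".") else "short"


for candidate in ["server1.neo4j.myorg.test", "server1", "10.0.0.5"]:
    print(candidate, "->", classify_name(candidate))
```

Anything that comes back as "short" or "ip" is a candidate for replacing with the canonical name.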
If you're going to be using wildcard certificates to cover multiple server instances, you can only put the wildcard at the leaf-end of the fully qualified domain name (FQDN).
For example, you can have a wildcard for *.neo4j.myorg.test, but not neo4j.*.myorg.test.
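The wildcard rule can be sketched as a small matching function. This is an illustration only, not how any particular TLS library implements it. It handles a whole-label `*` in the left-most position, and a wildcard stands in for exactly one label.

```python
def wildcard_matches(pattern, hostname):
    """Return True if hostname is covered by pattern (simplified sketch)."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    # A wildcard anywhere other than the left-most label is not allowed.
    if "*" in "".join(p_labels[1:]):
        raise ValueError("wildcard is only allowed as the left-most label")
    # The wildcard replaces exactly one label, so label counts must agree.
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


print(wildcard_matches("*.neo4j.myorg.test", "server1.neo4j.myorg.test"))
print(wildcard_matches("*.neo4j.myorg.test", "server1.lab.neo4j.myorg.test"))
```

Note the second case: a name with an extra level underneath the wildcard is not covered, and neither is the bare neo4j.myorg.test.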
This fully qualified domain name needs to be used everywhere: in your browser's address bar, in your application connection strings, and in every relevant entry in the configuration file, such as the advertised address and initial discovery addresses.
If you know how, and have the ability to add them, feel free to use SANs to add any short names or aliases, and even the server's IP addresses, if you must!
Don't forget to include the primary FQDN in the SAN list if you're using SANs, because the subject CN may not be checked if a SAN list is present.
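Here is a simplified model of why that matters. It assumes a client that consults only the SAN list when one is present and falls back to the subject CN otherwise. Real clients differ in the details, so treat this as a sketch of the pitfall rather than a description of any specific library.

```python
def names_to_check(subject_cn, san_dns_names):
    """Names a client would compare against, under the assumption above."""
    if san_dns_names:
        # A SAN list is present, so the CN is not consulted at all.
        return list(san_dns_names)
    return [subject_cn]


def hostname_ok(hostname, subject_cn, san_dns_names):
    """Exact, case-insensitive match only; wildcards are left out of this sketch."""
    allowed = [name.lower() for name in names_to_check(subject_cn, san_dns_names)]
    return hostname.lower() in allowed


fqdn = "server1.neo4j.myorg.test"

# SAN list holds only a short alias: the FQDN in the CN is not enough.
print(hostname_ok(fqdn, subject_cn=fqdn, san_dns_names=["server1"]))

# Primary FQDN repeated in the SAN list: the connection name is covered.
print(hostname_ok(fqdn, subject_cn=fqdn, san_dns_names=[fqdn, "server1"]))
```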
So, at this point, we should have covered some initial considerations for getting your SSL certificate deployment off to a good start.
- Work out what CA we're going to be working with. Are we building the PKI from scratch in a greenfield deployment, or leveraging an existing internal or external commercial CA?
- Decide on one certificate for each instance, or a master certificate that needs to be copied to all the instances.
- Produce a logical server naming scheme that matches up with your DNS records, and is neat in its hierarchical structure.
So, what would I do?
In my lab environment I have:
- an internal, private CA that's like one you might use in an organisation
- structured hostnames within a dedicated DNS zone
- unique certificates on each server that match each server's FQDN, along with any service-related domain name included in the SAN list. There's some with an IP address or two in there, just in case!
But server A and server B don't appear on each other's certificates. It's not a multi-server SAN certificate.
In the next part, we can look at generating these certificates with a local test PKI tool such as OpenSSL or EasyRSA, and getting them configured in Neo4j Server.