Should you use SQL or NoSQL for a new project? In this article I want to simple comparison in terms of features, scalability, ease of use between these two technologies. By the end of it, you’ll have a firm understanding on whether or not SQL or NoSQL are right for your new project, and also some technologies that may be a good starting point.
We’re going to cover a bunch of areas here including:
- Types and Schemas
- ACID Properties
- Technology Options
Prefer Videos? Check out my YouTube video on this topic below!
Types and Schemas
First and foremost, SQL databases store data in tables with rows and columns and a strict format. This means you need to specify the type for every single column, and that trying to insert any data that isn’t of the matching type will fail.
NoSQL databases on the other hand store their data in a variety of formats such as Documents, Key value pairs, and graphs. So depending on the problem you’re trying to solve, one of these technologies may be a better fit for what you’re trying to do.
For example, if you’re building an application where you need to perform very fast lookups by individual IDs, a document model would be a perfect fit. Whereas if your data is naturally comprised of nodes and links between them, such as in a social network or organization hierarchy, a graph based nosql database may be a better option.
NoSQL databases are very flexible in their schema and don’t have strict validation, so it takes practically no effort to add new fields to the objects you’re storing.
In terms of scale, SQL databases have a variety of ways to deal with an increasing workload. They can scale up, which means they can upgrade their hardware, or they can scale out, which means adding replicas of themselves to distribute the load across different machines. There are also some advanced strategies such as data sharding that allow you to scale even further, especially with write heavy applications.
NoSQL databases typically rely on this sharding system by default. More specifically, your data is distributed across multiple physical or virtual nodes and a hashing function is used to quickly figure out which node the data record is stored on, allowing for some very fast lookup speeds. This allows your workload to be spread evenly across different machines that are each responsible for a subset of the data.
Personally speaking, I find NoSQL scaling to be a lot easier, mostly because modern systems these days handle this all for you. Also keep in mind that SQL based systems tend to struggles with write-heavy applications, and in order to deal with this, you need to use some advanced strategies, whereas NoSQL systems deal with this by default.
Relationships in databases allow us to link our data across different tables. This is one of SQLs greatest strengths – building relationships across tables and allowing you to query based on those relationships.
This type of functionality is not a strong point of NoSQL. In NoSQL, we tend to embed data relationships into objects by storing a copy of the object. This can cause some data redundancy, but it speeds up our query time and simplifies our access pattern.
This is kind of engine specific though – since some NoSQL databases such as Graph Databases definitely have a form of relationships, but their implementation is typically hidden behind the engine itself.
Another big point SQL fanboys love to make is that SQL supports ACID, or atomicity, consistenty, idempotency and durability. Basically, a fancy acronym that implies when we perform database transactions, they do so in a reliable and consistent way. SQL enforces ACID by default.
This is important for mission critical systems that favor consistency over availability, such as a credit card or banking system.
A few years ago, many NoSQL databases did not support ACID. But recently, this has begun to change. Popular nosql engines these days are now supporting ACID, so I would say this is no longer a concern and you can’t go wrong between SQL or NoSQL if you require ACID compliance.
Management & Administration
One key point that many folks overlook is managing your database. SQL solutions tend to require hands-on management such as query plan optimization, data volume upgrades, and other routine maintenance tasks. Although currently, most folks use cloud based managed solutions where a lot of this gets managed for you by the cloud services provider.
NoSQL databases are usually very much hands off. The product handles routine tasks behind the scenes which allow you to spend more time to focus on your application and less time on trivial tasks.
Technology Options for SQL and NoSQL
So you may have an intuition in terms of what option is best for your application. But what are some examples of database engines that you could use today?
For SQL, there are quite a few but the most popular are probably:
- Microsoft SQL Server
Amongst these I’m more a fan of Postgres since its open source and very powerful, MySQL is owned by Oracle (they’re still alive?), and Microsoft SQL Server requires expensive licenses to operate and should be avoided.
For NoSQL, the most popular are
Mongodb uses a document model that’s usually hosted on a managed infrastructure.
Redis is a extremely optimized key value lookup store, but in recent years offers a whole bunch more features.
And Elasticsearch is a full-text-search engine used to power search bars and autocomplete functionality.
One thing I really want you to remember is that in the world of NoSQL, there are a LOT of different options. However many of the options are tailored to a specific type of problem. So its important to pick the right product for the problem you’re trying to solve, and not get married to a particular technology