In Data Engineer’s Lunch #17: NoSQL Part 3: Data Store Types, we discussed the four different types of data stores that underlie NoSQL databases. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!
NoSQL databases differ from one another in a number of ways, beyond how they differ from SQL databases. To name a few: scalability, or how smoothly they handle being scaled up and down; whether their code is open or closed source; how much SQL processing capability they have retained; and which query language additions they offer, covering query possibilities that don’t exist in SQL, such as querying via MapReduce. Today, however, we will be discussing the different types of underlying data stores and how they affect the use cases for different types of NoSQL databases.
Key-value stores hold data as pairs of a key and a value. The two are associated via a hash function, which typically maps each key to the location of its value. Some key-value databases add caching so that recently accessed data stays readily available. The values can be stored as strings, JSON, or BLOBs. Examples of this type of database include Redis, as well as DynamoDB and CosmosDB, which are multi-model databases and so can act as key-value stores alongside some other types. Use cases for this type of database include user profiles and shopping carts.
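To make the key-to-value association concrete, here is a minimal in-memory sketch in Python. The class `TinyKeyValueStore`, its bucket count, and the `cart:42` key are hypothetical names invented for illustration; this is not the API of Redis, DynamoDB, or CosmosDB, and real systems add persistence, caching, and distribution on top of the basic idea.

```python
import json


class TinyKeyValueStore:
    """Hypothetical in-memory key-value store, for illustration only."""

    def __init__(self, num_buckets=8):
        # Each bucket is a list of (key, serialized_value) pairs.
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket_for(self, key):
        # The hash function decides which bucket holds a given key.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        # Values are stored as JSON strings, one of the formats mentioned above.
        serialized = json.dumps(value)
        bucket = self._bucket_for(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, serialized)  # overwrite an existing key
                return
        bucket.append((key, serialized))

    def get(self, key, default=None):
        for existing_key, serialized in self._bucket_for(key):
            if existing_key == key:
                return json.loads(serialized)
        return default


# A shopping cart keyed by a cart id; the second put replaces the first.
store = TinyKeyValueStore()
store.put("cart:42", {"items": ["sku-1", "sku-2"]})
store.put("cart:42", {"items": ["sku-1", "sku-2", "sku-3"]})
cart = store.get("cart:42")
```

The store never looks inside the value: everything is read and written by key, which is why this model suits lookups such as user profiles and shopping carts.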
Document stores hold data in documents, which can be represented in XML, JSON, or BSON. Inside a document, the data ultimately follows a similar key-value model. Examples of this type of database include MongoDB and CouchDB. The multi-model databases DynamoDB and CosmosDB can also be used as document stores. The main advantage of this type of database is schema flexibility: the inside of a document can be arranged in any way that is useful. Documents can also be shaped to fit a particular application, making things like building webpages out of a database easier. Since you only need to access one document to build a page, it can be easier and faster than querying data from multiple tables. Use cases include e-commerce, trading platforms, and mobile app development.
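The schema flexibility and the "one document per page" idea can be sketched with plain Python dictionaries standing in for JSON documents. The `products` collection, its fields, and `render_product_page` are assumptions made up for this example; they do not reflect how MongoDB or CouchDB expose documents.

```python
# Two documents in the same hypothetical collection with different shapes:
# the second one simply omits the fields it does not need.
products = {
    "p1": {
        "name": "Laptop",
        "price": 999.0,
        "specs": {"ram_gb": 16},
        "reviews": [{"user": "ana", "stars": 5}],
    },
    "p2": {"name": "Gift card", "price": 25.0},
}


def render_product_page(doc_id):
    """Build a text 'page' from a single document lookup, with no joins."""
    doc = products[doc_id]
    lines = [f"{doc['name']} - ${doc['price']:.2f}"]
    # Optional nested sections are read only if the document has them.
    for spec, value in doc.get("specs", {}).items():
        lines.append(f"{spec}: {value}")
    for review in doc.get("reviews", []):
        lines.append(f"{review['user']} rated {review['stars']}/5")
    return "\n".join(lines)


laptop_page = render_product_page("p1")
gift_card_page = render_product_page("p2")
```

In a relational design the specs and reviews would likely live in separate tables; here everything needed for the page travels inside one document.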
Column stores use column families to store data. A column family is a collection of similar columns, and each column inside it is a tuple of a name and a value. Examples of this type of database include Google’s BigTable, along with Cassandra and HBase, whose designs draw on the original BigTable paper. The advantages of column stores are fast reads of a subset of the columns in a row and fast aggregations over particular columns. The main use case for this type of database is analytics, making it useful for data warehouses, business intelligence tools, and CRMs.
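As a rough illustration of why column-oriented layouts help with reading a few columns and with aggregations, the sketch below keeps each column's values together. `TinyColumnFamily` and the `sales` data are hypothetical, and the layout is a deliberate simplification: BigTable, Cassandra, and HBase each organize storage in their own, more involved ways.

```python
from collections import defaultdict


class TinyColumnFamily:
    """Hypothetical column family: column name -> {row key -> value}."""

    def __init__(self):
        self.columns = defaultdict(dict)

    def insert(self, row_key, **named_values):
        # Each column is a (name, value) pair; rows may have different columns.
        for name, value in named_values.items():
            self.columns[name][row_key] = value

    def read(self, row_key, names):
        # Fetch only the requested columns of a row, skipping absent ones.
        return {
            name: self.columns[name][row_key]
            for name in names
            if row_key in self.columns.get(name, {})
        }

    def sum_column(self, name):
        # An aggregation touches one column's values and nothing else.
        return sum(self.columns.get(name, {}).values())


sales = TinyColumnFamily()
sales.insert("order-1", region="east", amount=120)
sales.insert("order-2", region="west", amount=80, coupon="SPRING")

total_amount = sales.sum_column("amount")
order_1_subset = sales.read("order-1", ["amount", "coupon"])
```

Summing `amount` never reads the `region` or `coupon` values, which is the property analytics workloads benefit from.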
Graph stores use a directed graph to store data. The nodes and edges contain properties that hold data in a manner similar to key-value stores. Examples include InfoGrid, InfiniteGraph, and Neo4j. The advantage of this type of database is that directed graphs are good for modeling complex relationships and networks. Use cases include social networks and knowledge graphs.
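A directed property graph can be sketched with two plain Python structures, one for nodes and one for edges, each carrying key-value properties. The people, the `FOLLOWS` edge type, and the helper functions are invented for this example; databases such as Neo4j provide their own query languages for traversals like this.

```python
# Nodes and edges both carry key-value properties.
nodes = {
    "alice": {"city": "Austin"},
    "bob": {"city": "Boston"},
    "carol": {"city": "Chicago"},
    "dave": {"city": "Denver"},
}

# Each edge is (source, destination, properties); direction matters.
edges = [
    ("alice", "bob", {"type": "FOLLOWS", "since": 2019}),
    ("bob", "carol", {"type": "FOLLOWS", "since": 2020}),
    ("bob", "alice", {"type": "FOLLOWS", "since": 2021}),
    ("carol", "dave", {"type": "FOLLOWS", "since": 2022}),
]


def neighbors(node, edge_type):
    """Nodes reachable from `node` over one outgoing edge of the given type."""
    return [dst for src, dst, props in edges
            if src == node and props["type"] == edge_type]


def two_hops(node, edge_type):
    """Nodes exactly two outgoing edges away, excluding the start node."""
    found = set()
    for middle in neighbors(node, edge_type):
        for far in neighbors(middle, edge_type):
            if far != node:
                found.add(far)
    return found


suggestions_for_alice = two_hops("alice", "FOLLOWS")
```

A "people you may want to follow" suggestion falls out of a short traversal here, whereas a relational schema would need repeated self-joins on a follows table.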
As mentioned above, the live recording of Data Engineer’s Lunch #17 is embedded below if you want a more in-depth discussion about NoSQL data store types. If you missed last week’s Data Engineer’s Lunch #16: Introduction to awk for Data Engineering, be sure to check it out! Also, check out our YouTube page for more videos and the Data Engineer’s Lunch playlist here! Don’t forget to subscribe while you are there!
Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.
We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!