Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.
Real-time data processing is the current state of the art in business platform data engineering. Long gone are the days of batch processing and monolithic ETL engines that are turned on at midnight. Today’s demands come from the sheer number of mobile users and things on the internet. Mobile phones and tablets have steadily grown in importance to the customer experience as companies create mobile-only or mobile-first interfaces to interact with their commercial systems or business processes. Similarly, there are other “things” in the “Internet of Things” (IoT), such as rental bikes and scooters, keyless home locks, and smart home thermostats.
This resource for monitoring DataStax, Cassandra, Spark, and Solr performance is just the first iteration of a longer initiative to create the best knowledge base on real-time data platform technologies such as DataStax Enterprise (Cassandra, Spark, and Solr), as well as Kafka, Docker, and Kubernetes. Our firm, Anant, has been working with Solr/Lucene for several years. Over time we picked up Spark and Cassandra, and then made the logical move to become experts in, and partners with, DataStax.
DataStax OpsCenter is good, but we’re wise enough to say that it is just the beginning of the toolset needed to really understand what is happening under the hood in the component technologies that make up the DataStax Enterprise platform. When you monitor complex systems such as business platforms in order to scale them, you need to review all signals, not just those that come from the database.
Spark, Mesos, Akka, Cassandra, Kafka, Kubernetes? If you don’t already know what these mean and you have no goal or objective to make software that works at a global level, then you don’t need to be reading this article at all. Seriously, it’ll be a waste of your time. These technologies, now open source, originated in high-end research laboratories at the University of California, Berkeley and in the halls of high-tech companies such as Google, Twitter, LinkedIn, and Facebook. Their creators built them for different purposes, but now that they are available to the public, they have been flourishing on their own in the wild ether of the Internet. Why would any CIO, CTO, CMO, or CEO consider these technologies?
In our opinion, Cassandra is one of the best NoSQL database technologies we’ve used for high-availability, large-scale, high-speed business platforms. More specifically, we work with DataStax Enterprise, the enterprise distribution of Cassandra, where the clients are above a certain size and need enterprise-grade support 24 hours a day, 365 days a year, with expertise around the world. There are many topics I could have written about for my first “Cassandra” post on our blog, but I decided to write about what I call the three stooges of Cassandra data modeling: Larry (Tombstones), Curly (Data Skew), and Moe (Wide Partitions).