Business Platform Team

Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.

Tag Archives: spark


Machine Learning with Spark and Cassandra: Model Deployment

Introduction

What is model deployment?

Model deployment is the process of putting our trained models to work. It involves moving a model somewhere with the resources to do serious processing and the ability to receive or retrieve the data to be processed. We place the trained model within an architecture that delivers data to it, then retrieves the results and delivers or stores them so that users can see or use them. Choices also need to be made about whether the model gets retrained, updated, or replaced during operation.

Continue reading

Cassandra Lunch #18: Connecting Cassandra to Kafka

In this blog, we recap Cassandra Lunch #18, where guest speaker Ryan Quey discussed and demoed a personal project that uses multiple technologies, including Cassandra and Kafka, to build an app that grabs podcast data related to topics he inputs, stores it, processes it, and displays it on a searchable front-end. In case you missed it, the video of Cassandra Lunch is also embedded in the blog!

Continue reading

Open Source Notebooks and Cassandra: Doing SQL on Cassandra Tables

In this blog post, we will introduce a few open-source notebooks that we can use to do SQL on Cassandra. At the bottom of the blog, we have an accompanying webinar that you can watch to see a live demo using two of the notebooks we discuss here. This is Part 3 of our series on “Doing SQL and Reporting on Apache Cassandra with Open Source Tools”, and Parts 1 and 2 are also linked below. Also, be on the lookout for Part 4 coming soon!

Continue reading

Spark and Cassandra For Machine Learning: Model Selection Tests

Model selection tests are used to determine which of two trained machine learning models performs better. The point of these tests is to predict which model will generalize better to unseen data, so comparing single test results is not enough. Today we will run through a number of different model selection tests and discuss how they work and how to interpret their results.

Continue reading

Spark and Cassandra: Doing SQL and Joins on Cassandra Tables

In this blog post, we will introduce Spark, a unified analytics engine for large-scale data processing, and discuss how to use it to do SQL on a NoSQL database like Cassandra. We will also give you a quick demo to show how you can test it out yourself. This is Part 2 of our series on “Doing SQL and Reporting on Apache Cassandra with Open Source Tools”, and Part 1 is linked below. Also, be on the lookout for Parts 3 and 4 coming soon!

Continue reading