Business Platform Team

Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.

Tag Archives: spark


Data Engineer’s Lunch #45: Apache Livy

In Data Engineer’s Lunch #45: Apache Livy, we discussed Apache Livy, a REST API for interacting with Spark clusters. Livy also helps with submitting jobs and managing Spark contexts and cached data. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!
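
To give a feel for what submitting a job through Livy's REST API can look like, here is a minimal sketch in Python. It only builds a POST request for Livy's /batches endpoint and does not send it. The server address (Livy's default port 8998 on localhost), the jar path, the class name, and the argument are all made-up placeholders for illustration, not values from the talk.

```python
import json
import urllib.request


def build_batch_request(livy_url, app_file, class_name=None, args=None):
    """Build (but do not send) a POST /batches request for a Livy server."""
    payload = {"file": app_file}  # path to the application the cluster should run
    if class_name:
        payload["className"] = class_name  # main class, for Java/Scala jobs
    if args:
        payload["args"] = list(args)  # command-line arguments for the job
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: adjust the host, file, and class to your own cluster.
request = build_batch_request(
    "http://localhost:8998",
    "local:///opt/jobs/example.jar",
    class_name="com.example.ExampleJob",
    args=["2021-01-01"],
)

# To actually submit the job, send the request to a running Livy server:
# urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and JSON body before pointing the code at a real cluster.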

Apache Cassandra Lunch #72: Databricks and Cassandra

In Apache Cassandra Lunch #72: Databricks and Cassandra, we discussed how we can connect Databricks and Cassandra. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!

Data Engineer’s Lunch #42: Introduction to Databricks

In Data Engineer’s Lunch #42: Introduction to Databricks, we introduced Databricks and discussed how we can use it for data engineering. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!

Apache Cassandra Lunch #65: Spark Cassandra Connector Pushdown

In Apache Cassandra Lunch #65: Spark Cassandra Connector Pushdown, we discussed Spark predicate pushdown in the context of the Spark Cassandra connector. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!
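
For readers new to the idea, predicate pushdown means the filter is evaluated at the data source instead of in the engine after the rows have already been fetched. The toy Python model below illustrates only that general idea. It does not use Spark or the Spark Cassandra connector, and the `scan` function, the `sensor_id` column, and the data are invented for illustration. With the real connector, which predicates can be pushed down depends on the table's schema.

```python
# Toy data standing in for a table: 1000 rows spread over 4 sensor ids.
rows = [{"sensor_id": i % 4, "reading": i} for i in range(1000)]


def scan(source, predicate=None):
    """Simulate the data source; a pushed-down predicate is applied here."""
    return [r for r in source if predicate is None or predicate(r)]


def wanted(row):
    """The filter the query asks for."""
    return row["sensor_id"] == 3


# Without pushdown: the source returns every row and the engine filters afterwards.
transferred_all = scan(rows)
result_without_pushdown = [r for r in transferred_all if wanted(r)]

# With pushdown: the source applies the filter, so fewer rows are transferred.
transferred_filtered = scan(rows, predicate=wanted)
result_with_pushdown = transferred_filtered

print(len(transferred_all), len(transferred_filtered))
```

Both approaches produce the same result; the difference is how many rows leave the source.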

An Overview and Comparison of DataStax Dependencies for Cassandra, Spark and Graph

If you like wrestling with dependencies and version incompatibility issues as much as I do, then this post is for you! It grew out of a project that required executing Gremlin traversals through multiple query APIs, from both a DSE Graph Analytics cluster and an external Spark cluster at the same time. We found that keeping all the available DSE libraries straight wasn’t always easy. The goal of this blog post is to summarize what we learned from the project so that others know which library to use and why, particularly when using Spark and Graph with Cassandra. We will look at seven libraries: DSE Java Driver, OSS Unified Java Driver, dse-java-driver-graph, OSS Spark Cassandra Connector, DSE GraphFrames, BYOS, and dse-spark-dependencies.
