Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.
In this blog post, we discuss several ways of managing dependencies when running Spark scripts. This post is not part of any of our ongoing series. We often discuss Spark during our Data Engineer’s Lunch events. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now! We last discussed Spark at a recent Cassandra Lunch, where the topic was ETL in Cassandra with Airflow and Spark. Our most recent discussion of Spark can be found here.
In Data Engineer’s Lunch #26, we will discuss how to use Akka Actors for concurrent data processing operations. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!
In Apache Cassandra Lunch #46: Apache Spark Jobs in Scala for Cassandra Data Operations, we discuss how to write Apache Spark jobs in Scala for Cassandra data operations. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!
In Apache Cassandra Lunch #45: Alpakka Cassandra and Twitter, we discuss how you can stream tweets using Twitter4S (Scala Twitter client) and save them to Cassandra using Alpakka Cassandra. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!
Alpakka is an open-source project, built on top of Akka Streams, for implementing stream-aware and reactive integration pipelines in Java and Scala. This post covers using Alpakka Cassandra and Akka Streams together with Twitter4S (a Twitter client written in Scala) to pull new tweets from Twitter for a given hashtag (or set of hashtags) using Twitter API v1.1 and write them into a local Cassandra database.