Business Platform Team

Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.

Tag Archives: S3

Airflow and Spark: Running Spark Jobs in Airflow (Docker-based Solution)

Airflow and Spark: Running Spark Jobs on Airflow (Docker-based Solution)

In this blog post, we set up Apache Spark and Apache Airflow using a Docker container, and in the end, we ran and scheduled Spark jobs using Airflow which is deployed on a Docker container. This is very important because, with Docker images, we are able to solve problems we encountered in development. For example, problems that relate to a different environment, dependencies issues e.t.c, thereby leading to fast development, and deployment to production.

Continue reading

Join Anant's Newsletter

Subscribe to our monthly newsletter below and never miss the latest Cassandra and data engineering news!