Business Platform Team

Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.

Monthly Archives: May 2021


Data Catalog Overview: Amundsen

Open Source Data Catalog Overview: Amundsen

In this blog post, the second in a series about open source data catalogs, we look at Amundsen, an open source data discovery and metadata engine. We cover the main idea behind Amundsen, the technologies that make it up, and the available methods of installation and development, then walk through installing Amundsen with Docker, including a few obstacles we ran into along the way. We also discuss the main microservices that make up Amundsen, their configuration options, and how to add authentication. Finally, we close with our thoughts on Amundsen from the perspective of a short dive into it.

Airflow and Spark

Data Engineer’s Lunch #25: Airflow and Spark

In Data Engineer’s Lunch #25: Airflow and Spark, we discuss how we can use Airflow to manage Spark jobs. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!


Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra

In Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra, we will discuss how you can use Apache Spark and Apache Cassandra to perform basic Machine Learning tasks. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!


Data Engineer’s Lunch #24: Pandas for Data Engineering

In Data Engineer’s Lunch #24: Pandas for Data Engineering, we discussed using Pandas for performing Data Engineering tasks in Python. This topic is part of our ongoing series on Python ETL tools. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!

Thanos and Cortex

Data Engineer’s Lunch #23: Thanos and Cortex

In Data Engineer’s Lunch #23: Thanos and Cortex, Rahul Singh covers Thanos and Cortex and recaps last week’s discussion of Prometheus. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!
