Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.
In Apache Cassandra Lunch #61, we will discuss different ways of indexing and working with Elassandra, and showcase a project I built using Kafka with Elassandra. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!
In Data Engineer’s Lunch #33: Spark Cassandra and Elasticsearch for Data Engineering, we will discuss how you can use Spark and Spark jobs to load data from a CSV file, then save and load that data into Cassandra and Elasticsearch. The live recording of the Data Engineer’s Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend a Data Engineer’s Lunch live, it is hosted every Monday at noon EST. Register here now!
In this blog post, the second in a series about Open Source Data Catalogs, we will be talking about Amundsen, the Open Source Data Discovery and Metadata Engine. We will go over the main idea behind Amundsen, the technologies that make it up, and the available methods of installation and development. We then walk through installing Amundsen using the Docker method, along with a few obstacles we ran into while doing so. We will also discuss the main microservices that make up Amundsen, their configuration options, and how to add authentication to Amundsen. Finally, we close with some thoughts on Amundsen from the perspective of a short dive into it.
In this blog, we recap Cassandra Lunch #18, where guest speaker Ryan Quey discussed and demoed a personal project that uses multiple technologies, including Cassandra and Kafka, to build an app that grabs podcast data related to topics he inputs, stores it, processes it, and displays it on a searchable front end. In case you missed it, the video of Cassandra Lunch is also embedded in the blog!
The first part of any machine learning project is to gather data. This sounds easy. You may think that having data in relational databases already puts you in the perfect position to work with it. In some circumstances that may be correct. However, most of the ways that we store data in databases for business platforms are sub-optimal for machine learning. They require more work to gain the insights we want out of our data.