Business Platform Team

Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.

Tag Archives: data lake

Apache Spark Companion Technologies: Data Lakes

Data lakes are a tool for long-term data storage. They can be implemented on-premises for use cases that require high security, or in the cloud for more accessible solutions. The Databricks runtime includes code specifically for easing the connection between Spark and data lake technologies, as well as its own companion technology, Delta Lake. Delta Lake makes interacting with data in data lakes easier and more consistent, but it is possible to work with data lakes without it, as we will see today.
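As a rough illustration of what working with a data lake without a table layer such as Delta Lake can look like, here is a minimal sketch in plain Python. It is not taken from the post: the function names, the `events` dataset, and the sample records are all hypothetical, and a temporary local directory stands in for real lake storage. Records are written as JSON lines into `date=<value>` folders, a partitioned directory layout that Spark can also read, and then read back with an optional partition filter.

```python
import json
import tempfile
from pathlib import Path


def write_records(lake_root, dataset, records):
    """Append records as JSON lines under <lake_root>/<dataset>/date=<value>/."""
    for record in records:
        partition = Path(lake_root) / dataset / f"date={record['date']}"
        partition.mkdir(parents=True, exist_ok=True)
        # One file per partition keeps the sketch small; real lakes hold many files.
        with open(partition / "part-0000.jsonl", "a", encoding="utf-8") as handle:
            handle.write(json.dumps(record) + "\n")


def read_records(lake_root, dataset, date=None):
    """Read records back, optionally restricted to a single date partition."""
    pattern = f"date={date}" if date else "date=*"
    rows = []
    for part in sorted((Path(lake_root) / dataset).glob(f"{pattern}/*.jsonl")):
        with open(part, encoding="utf-8") as handle:
            rows.extend(json.loads(line) for line in handle if line.strip())
    return rows


def demo():
    """Write three hypothetical events, then read all of them and one partition."""
    events = [
        {"date": "2021-01-01", "user": "a", "action": "login"},
        {"date": "2021-01-01", "user": "b", "action": "view"},
        {"date": "2021-01-02", "user": "c", "action": "view"},
    ]
    # A temporary directory stands in for on-premises or cloud lake storage.
    with tempfile.TemporaryDirectory() as lake_root:
        write_records(lake_root, "events", events)
        everything = read_records(lake_root, "events")
        one_day = read_records(lake_root, "events", date="2021-01-02")
    return everything, one_day


if __name__ == "__main__":
    everything, one_day = demo()
    print(len(everything), "records in total;", len(one_day), "in the filtered partition")
```

Nothing in this sketch enforces a schema or guards against a half-finished write, which is the sort of consistency gap a layer like Delta Lake is meant to close.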

Continue reading

Data Engineer’s Lunch #5: What is a Data Lake?

In Data Engineer’s Lunch #5: What is a Data Lake?, we discuss what data lakes are, why we need them, how we get data in and out, and different implementations of data lakes. The recording of the live session, which includes a more in-depth discussion, is embedded below in case you were not able to attend. If you would like to attend Data Engineer’s Lunch in person, it is hosted every Monday at 12 PM EST. Register here now!

Continue reading