Business Platform Team

Anant Corporation Blog: Our research, knowledge, thoughts, and recommendations about building and managing online business platforms.

Tag Archives: data processing


What are the Benefits of Data Mining?

Data mining is the practice of gathering and examining large databases to produce new information and predict future outcomes. Modern businesses rely on massive amounts of data, so much so that an IDG survey of 70 IT and business leaders recently found that 92% of respondents want to deploy advanced analytics more broadly across their organizations. This post will discuss some of the benefits of data mining that these leaders saw within their organizations.


Real Time Business Platforms

Scaling Business Platform Performance with Spark, Mesos, Akka, Cassandra, Kafka, Kubernetes – Part 1/6

Spark, Mesos, Akka, Cassandra, Kafka, Kubernetes? If you don’t already know what these mean and you have no goal of building software that works at a global scale, then you don’t need to read this article at all. Seriously, it’ll be a waste of your time. These technologies, now open source, originated in research laboratories at the University of California, Berkeley and in the halls of high-tech companies such as Google, Twitter, LinkedIn, and Facebook. Their creators built them for different purposes, but now that they are available to the public, they have been flourishing on their own in the wild ether of the Internet. Why would any CIO, CTO, CMO, or CEO consider these technologies?


ETL w/ Node?

Gain the Upper Hand on ETL using Node.js

Node.js is a JavaScript runtime that is fast, easy to learn, and backed by an enormous package library. It is built on Chrome’s V8 engine and uses an asynchronous, event-driven model suited to building scalable web applications. So, why use it for ETL operations?

What Makes a Good ETL Project?

Bad

  1. Bad ETL (extract, transform, load) projects are ones that have no strategy for handling different types of information, or that lack the knowledge management needed to add or remove data sources, processors and translators, and information sinks.
  2. An ETL project doesn’t necessarily have to be on any particular platform; it just needs structure, in the same way any software should have an architecture.
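
To make the point about structure concrete, here is a minimal sketch in Python of a pipeline where sources, processors, and sinks can each be added or removed by name. The `Pipeline` class, its method names, and the sample records are all hypothetical and exist only for illustration; they are not taken from any particular ETL tool.

```python
from typing import Callable, Dict, Iterable

Record = Dict[str, object]


class Pipeline:
    """Hypothetical ETL pipeline with named, pluggable components."""

    KINDS = ("sources", "processors", "sinks")

    def __init__(self) -> None:
        # Each registry maps a name to a callable, so any component
        # can be added or removed without touching the others.
        self.sources: Dict[str, Callable[[], Iterable[Record]]] = {}
        self.processors: Dict[str, Callable[[Record], Record]] = {}
        self.sinks: Dict[str, Callable[[Record], None]] = {}

    def _registry(self, kind: str) -> dict:
        if kind not in self.KINDS:
            raise ValueError(f"unknown component kind: {kind!r}")
        return getattr(self, kind)

    def add(self, kind: str, name: str, component: Callable) -> None:
        self._registry(kind)[name] = component

    def remove(self, kind: str, name: str) -> None:
        self._registry(kind).pop(name, None)

    def run(self) -> int:
        """Extract from every source, transform, load into every sink."""
        count = 0
        for source in self.sources.values():
            for record in source():
                # Processors run in the order they were registered.
                for process in self.processors.values():
                    record = process(record)
                for sink in self.sinks.values():
                    sink(record)
                count += 1
        return count


def sample_source() -> Iterable[Record]:
    # Made-up records standing in for a real extract step.
    yield {"name": " Ada ", "visits": "3"}
    yield {"name": "Grace", "visits": "5"}


def strip_name(record: Record) -> Record:
    return {**record, "name": str(record["name"]).strip()}


def visits_to_int(record: Record) -> Record:
    return {**record, "visits": int(str(record["visits"]))}


loaded: list = []  # in-memory sink standing in for a database or file
pipeline = Pipeline()
pipeline.add("sources", "sample", sample_source)
pipeline.add("processors", "strip_name", strip_name)
pipeline.add("processors", "visits_to_int", visits_to_int)
pipeline.add("sinks", "memory", loaded.append)
processed = pipeline.run()
```

Because every component is registered by name, swapping a data source or dropping a translator is a one-line change rather than a rewrite, which is the kind of structure the list above is asking for.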


DC Data Wranglers: It’s a Balloon! A Blimp! No, a Dirigible! Apache Zeppelin: Query Solr via Spark

I had the pleasure this past Wednesday of introducing Eric Pugh (@dep4b) to the Data Wranglers DC Meetup group. He spoke about using Solr and Zeppelin in data processing and, specifically, the ways big data can easily be processed and displayed as visualizations in Zeppelin. He also touched on Docker, an application Anant uses, and its role in setting up environments for data processing and analysis. Unfortunately, no actual blimps or zeppelins were seen during the talk, but the application of data analysis to events they usually fly over was covered last month in a discussion about Spark, Kafka, and the English Premier League.

Instead of trying to completely rehash Eric’s presentation, I encourage you to check out his material for yourself (available below). In short, he showed how multiple open-source tools can be used to process, import, manipulate, visualize, and share your information. More specifically, Spark is a fast data processing engine that you can use to prepare your data for presentation and analysis, whereas Zeppelin is a mature, enterprise-ready application, as shown by its recent graduation from Apache’s Incubator Program, and a great tool for manipulating and visualizing processed data.

Please don’t hesitate to reach out with any questions or if you are interested in participating in or speaking at a future Data Wranglers DC event. Each event is recorded, livestreamed on the Data Community DC Facebook page, and attended by 50 or more individuals interested in data wrangling, data processing, and the possible outcomes of these efforts. After the monthly event, many members continue their discussions at a local restaurant or bar.

I hope to see you at an event in the near future!