Gain the Upper Hand on ETL using Node.js

Node.js is a JavaScript runtime that is fast, easy to learn, and backed by an enormous package library.  It is built on Chrome’s V8 JavaScript engine and uses an asynchronous, event-driven model that is well suited to building scalable web applications.  So, why use it for ETL operations?

It’s Asynchronous

One of the biggest reasons to use Node.js for ETL (Extract, Transform, Load) is its asynchronous nature.  If you have hundreds of rows of data that need to be transformed and transmitted, Node.js can start the work for each one with non-blocking calls instead of waiting for one row to finish before beginning the next.  This way your scripts spend more time processing and less time waiting around.
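
As a rough illustration of the non-blocking pattern described above, here is a minimal sketch.  The names loadRow, transformRow, and runEtl are hypothetical, and the load step is simulated with a timer rather than a real API or database call.

```javascript
// Hypothetical stand-in for a non-blocking load step (for example, an
// HTTP POST or a database insert). Simulated here with a short timer.
function loadRow(row) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ ...row, status: 'loaded' }), 10);
  });
}

// Hypothetical transform step: tidy up the name field.
function transformRow(row) {
  return { id: row.id, name: row.name.trim().toUpperCase() };
}

// Start the load for every row right away, then wait for all of them.
// No row waits for the previous row's load to finish.
async function runEtl(rows) {
  const pending = rows.map((row) => loadRow(transformRow(row)));
  return Promise.all(pending);
}

// Example usage with made-up rows.
runEtl([
  { id: 1, name: ' ada ' },
  { id: 2, name: 'grace' },
]).then((results) => console.log(results));
```

With a real data source, very large batches would likely need some limit on how many calls are in flight at once; this sketch leaves that out for brevity.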

It Works Natively with JSON

JSON is a common data-interchange format used by many APIs.  It is supported by products such as Apache Solr, PostgreSQL, and WordPress.  In products like these, data may be returned in the JSON format, stored as a JSON object, or accepted inside the body of an HTTP POST request.  This matters for two reasons.  First, there is no shortage of services that use JSON in one or more ways.  Second, any Node.js script or application can parse JSON with nothing but the built-in JSON.parse function.  Thus, not only can you find any number of JSON data sources to load into a Node.js script, but as soon as you parse one, it is already a plain JavaScript object, ready to be read and manipulated as needed.
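
To make this concrete, here is a small sketch of a JSON round trip using only the built-in JSON object.  The payload and the transformDocs helper are made up for illustration; a real response from Solr, PostgreSQL, or WordPress would be shaped differently.

```javascript
// Hypothetical payload, shaped like a body an API might return.
const responseBody =
  '{"docs":[{"id":1,"title":" hello "},{"id":2,"title":"world"}]}';

// Extract and transform: parse the text into plain JavaScript objects,
// then trim each title. No extra library is needed.
function transformDocs(jsonText) {
  const parsed = JSON.parse(jsonText);
  return parsed.docs.map((doc) => ({ id: doc.id, title: doc.title.trim() }));
}

// Load: serialize the result again, ready for the body of an HTTP POST.
const payload = JSON.stringify(transformDocs(responseBody));
console.log(payload);
// prints [{"id":1,"title":"hello"},{"id":2,"title":"world"}]
```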

It’s Backed by an Enormous Package Library

The npm registry is free, publicly available, and easy to use!  It contains over 465,000 modules used by more than 7 million developers, with downloads reaching several billion every month.  Public packages are open source, but there are also Enterprise plans with various hosting options for private packages.  This is useful for corporations that want to use npm as a library repository.  The point is that if you’re trying to get something done, someone else has probably already gotten most of the way there.  Once you determine the pieces you’ll be working with, you can search for related packages on the npm site.  You don’t need to reinvent the wheel when it comes to Node.js, since there are packages for connecting to various APIs, validating data, converting file formats, and much, much more.

Want to learn more about the technologies and techniques we’ve vetted?  Curious about how we get the job done?  Read more on our homepage and our blog, or feel free to contact us.

