by Thor Olavsrud

How the 9 Leading Commercial Hadoop Distributions Stack Up

News
Mar 27, 2014 | 6 mins
Big Data, Open Source

All of the leading commercial Hadoop distributions are compatible with Apache Hadoop, so what sets them apart? Here's how the leading commercial distributions identified by Forrester Research compare.

Big data and Hadoop are transforming enterprise data management architectures. It’s a gold-rush market in which pure-plays, enterprise software vendors and cloud vendors are all competing to stake a claim. The open source Apache Hadoop project includes the core modules, namely Hadoop Common, the Hadoop Distributed File System (HDFS), Hadoop YARN and Hadoop MapReduce, but comes without the support or packaged solutions of a commercial vendor. All of the leading commercial distributions are compatible with Apache Hadoop, so what sets them apart? Here’s how the 9 leading commercial Hadoop distributions identified by Forrester Research stack up.

Pivotal Software Leverages Its Greenplum Engineers

Spun out of EMC and VMware, with former VMware CEO Paul Maritz at the helm, Pivotal Software has EMC technical consultants and data scientists behind it. In addition to the columnar Greenplum Database technology it brought from EMC, Pivotal’s Hadoop distribution includes HAWQ, a massively parallel processing (MPP) SQL engine that delivers MPP-like SQL performance on Hadoop.

“Pivotal was the first EDW vendor to provide a full-featured enterprise-grade Hadoop appliance; it was also the first to roll out an appliance family that integrated its Hadoop, EDW and data management layers in a single rack,” Forrester analyst Mike Gualtieri writes. “Pivotal’s road map will make its Hadoop solution significantly more competitive; its innovations focus on improving the HAWQ SQL engine and integration with other Pivotal products.”