Hadoop

StarCluster, Cloudera Manager, EC2 Part 2

This is a continuation of my previous post on this topic. First, a disclaimer – I have focused primarily on CentOS. I will try to update this to accommodate more than one Linux distro (and may even consider writing for Solaris-based implementations in the future). Here’s the skeleton of the Cloudera Manager setup plugin: import posixpath …


Cloudera Hadoop, StarCluster and Amazon EC2

I ran into an incredible tool known as StarCluster, an open-source project from MIT (http://star.mit.edu/cluster/). StarCluster is built using Sun Microsystems’ N1 Grid Engine software (Sun used it for deployment in HPC environments). The folks at MIT developed on a fork of that (SGE – Sun Grid Engine) and StarCluster was …
