administration

Cloudy with the chance of “On Prem”

Lots of Data, consumed fast and in many different ways. Over the years, the IT industry has had tectonic shifts in the way we do business. From mainframes we progressed to microcomputers and then to distributed computing environments to address the massive volume, velocity, and dimensionality of data in the modern age. One trend that …

Network throughput testing – Distributed iperf3

The tool iperf (or iperf3) is popular as a synthetic network load generator that can be used to measure the network throughput between two nodes on a TCP/IP network. This is fine for point-to-point measurements. However, when it comes to distributed computing clusters involving multiple racks of nodes, it becomes challenging to find a tool …

Tactical vs Strategic Thinking – Part II

This is a continuation of my previous post regarding Tactical vs Strategic thinking. I received some feedback about that post and thought I’d articulate the gist of it and address the concern further. The feedback: the previous post seemed too generic for some readers. The suggestion was to provide examples …

Tactical vs Strategic thinking – Systems Engineer style

Tactical vs Strategic thinking: Let me attempt to delineate the two in the context of IT, and specifically Infrastructure (because without that, these are just buzzwords being tossed around). Tactical – This calls for a quick-thinking, quick-fixing mentality, e.g., when we need to fix broken things quickly, or how …

Tuning the TCP stack and establishing throughput requirements matters, when data traverses over a WAN

In a past life, when I worked for a wireless service provider, they used a vended application to evaluate how much data bandwidth customers were consuming, and that data was sent to the billing system to form the customers’ monthly bills. The app was poorly written (imho) and woefully single-threaded, incapable …

StarCluster, Cloudera Manager, EC2 Part 2

This is a continuation of my previous post on this topic. First, a disclaimer – I have focused primarily on CentOS. I will try to update this to accommodate more than one Linux distro (and even consider writing for Solaris-based implementations in the future). Here’s the skeleton of the Cloudera Manager setup plugin: import posixpath …

Cloudera Hadoop, StarCluster and Amazon EC2

I ran into an incredible tool known as StarCluster, an open-source project from MIT (http://star.mit.edu/cluster/). StarCluster is built using Sun Microsystems’ N1 Grid Engine software (Sun used it to do deployment for HPC environments). The folks at MIT developed on a fork of that (SGE – Sun Grid Engine) and StarCluster was …
