Ryan Greenhall

Thoughts on Software Development

Archive for the ‘devops’ Category

Monitoring Hadoop Clusters using Ganglia.

I spent a couple of days this week working with my Forward colleague Abs, configuring Ganglia to monitor our Hadoop cluster and automating its installation on our production servers. The goal of this article is to provide an overview of the Ganglia architecture, combined with our experience of getting it to play nicely with Hadoop.

Ganglia Overview

Ganglia comprises three components:
  1. Ganglia Monitoring Daemon (gmond) – gmond needs to be installed on each machine that you want to monitor. In our case this included our slave and master Hadoop nodes. The gmond service collects server metrics and exposes them over TCP.
  2. Ganglia Meta Daemon (gmetad) – gmetad polls all of the available gmond data sources (over TCP) and makes the data available to the web interface. We decided to use a dedicated server for the collection and presentation of the gathered metrics.
  3. Ganglia Web Application – A PHP-based web app that presents visualisations of server performance over various time periods.

[Figure: ganglia-hadoop-configuration]

Installing gmond on your Hadoop servers.

We found the following installation guide, Installing ganglia-3.1.1 on Ubuntu 8.04 Hardy Heron, helpful when installing gmond on our Hadoop servers.

We placed the gmond configuration in the default location, /etc/ganglia/gmond.conf, and made the following changes to the defaults.
cluster {
    name = "hadoop"
    owner = "your company"
    latlong = "unspecified"
    url = "unspecified"
}

/* Specifies the port that gmond will receive data on */
udp_recv_channel {
    port = 8649
}

/* Specifies the port and host that this gmond service will send data to. Our gmond services post to themselves rather than gmond services on other machines */
udp_send_channel {
    host = your.hadoop.host.name
    port = 8649
    ttl = 1
}

/* Specifies the port that metrics can be retrieved from */
tcp_accept_channel {
    port = 8650
}
Start gmond with sudo gmond. To check that gmond is collecting stats correctly, run telnet localhost 8650, which should output a stream of XML containing the collected stats.
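The telnet check can also be scripted, which is handy when automating the installation. The following is a minimal sketch rather than anything we ran at the time: it assumes gmond is listening on the tcp_accept_channel port configured above (8650) and that its XML output nests METRIC elements inside HOST elements, each with a NAME attribute. The function names are hypothetical.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host="localhost", port=8650, timeout=5.0):
    """Read the XML dump that gmond writes before it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:  # gmond closes the socket once the dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def summarise_metrics(xml_bytes):
    """Return {host name: sorted metric names} from a gmond XML dump."""
    root = ET.fromstring(xml_bytes)
    return {
        host.get("NAME"): sorted(m.get("NAME") for m in host.iter("METRIC"))
        for host in root.iter("HOST")
    }


# Example use on a monitored machine:
#   for host, metrics in summarise_metrics(read_gmond_xml()).items():
#       print(host, len(metrics), "metrics")
```

An empty result suggests that gmond is running but not yet receiving anything on its udp_recv_channel.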

Configuring Hadoop to send metrics to gmond

Fortunately for us, Hadoop provides Ganglia integration through org.apache.hadoop.metrics.ganglia.GangliaContext31, which is configured in hadoop-metrics.properties. A restart of the tasktracker is required for Hadoop-specific metrics to appear in the Ganglia web app.
/etc/init.d/hadoop-tasktracker restart
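For anyone automating this step, here is a rough sketch of how the relevant entries in hadoop-metrics.properties could be generated. It is illustrative only: the choice of contexts (dfs, mapred, jvm), the 10 second period and the helper name are assumptions, and the port is assumed to match the udp_recv_channel in gmond.conf (8649 above).

```python
GANGLIA_CONTEXT = "org.apache.hadoop.metrics.ganglia.GangliaContext31"


def render_hadoop_metrics(gmond_host, gmond_port=8649,
                          contexts=("dfs", "mapred", "jvm"), period=10):
    """Build hadoop-metrics.properties entries that send each context to gmond."""
    lines = []
    for context in contexts:
        lines.append(f"{context}.class={GANGLIA_CONTEXT}")
        lines.append(f"{context}.period={period}")  # reporting interval in seconds
        lines.append(f"{context}.servers={gmond_host}:{gmond_port}")
    return "\n".join(lines) + "\n"


# Example: print the entries for a node whose gmond listens locally.
print(render_hadoop_metrics("localhost"))
```

The output would be written to hadoop-metrics.properties in the Hadoop configuration directory on each node before the restart.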

Ganglia Monitoring Server

We decided to install gmetad and the Ganglia web app on a standalone machine. Once again we found Installing ganglia-3.1.1 on Ubuntu 8.04 Hardy Heron very helpful when installing these two components. Once gmetad has been installed it needs to know which data sources to poll for metrics. To do this we added the following entries to /etc/ganglia/gmetad.conf:
data_source "master" master.hadoop:8650
data_source "slave1" slave1.hadoop:8650
data_source "slave2" slave2.hadoop:8650
data_source "slave3" slave3.hadoop:8650
data_source "slave4" slave4.hadoop:8650
data_source "slave5" slave5.hadoop:8650
Finally, start gmetad to see server metrics in the Ganglia web app (http://your.ganglia.host/ganglia).
sudo gmetad
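As the cluster grows, the data_source entries in gmetad.conf could be generated from a host list rather than maintained by hand. A small sketch, assuming each data source is named after the first label of its hostname and that every gmond uses the same TCP port (8650 in our setup); the function name is hypothetical.

```python
def render_data_sources(hosts, port=8650):
    """Build one gmetad data_source line per gmond host."""
    lines = []
    for host in hosts:
        name = host.split(".")[0]  # "slave1.hadoop" becomes "slave1"
        lines.append(f'data_source "{name}" {host}:{port}')
    return "\n".join(lines) + "\n"


hosts = ["master.hadoop"] + [f"slave{n}.hadoop" for n in range(1, 6)]
print(render_data_sources(hosts), end="")
```

This reproduces the six entries shown above.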

Written by Ryan Greenhall

October 22nd, 2010 at 2:04 pm

Posted in devops

Exposed Application Configuration

Problem:

As a developer or operations person
I want easy access to the current configuration of a web application
So that I can diagnose configuration problems more effectively

Solution:

Expose application properties as a simple HTML page. Using a URI such as /internal/status allows the page to be hidden from end users through appropriate configuration of your web server. For example:

[Figure: status page example]

In this example status page, each configurable property is listed alongside the configured value.  The page even provides the location of the properties file should modifications need to be made.
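To make the idea concrete, here is a minimal sketch of rendering such a page. It is not the implementation behind the example above: the function name, the markup and the sample property names are all assumptions, and it only builds the HTML string, leaving the wiring to /internal/status up to your web framework.

```python
from html import escape


def render_status_page(properties, source_path):
    """Render configured properties and their source file as a simple HTML page."""
    rows = "\n".join(
        f"<tr><td>{escape(name)}</td><td>{escape(str(value))}</td></tr>"
        for name, value in sorted(properties.items())
    )
    return (
        "<html><body>\n"
        "<h1>Application status</h1>\n"
        f"<p>Properties loaded from: {escape(source_path)}</p>\n"
        "<table>\n<tr><th>Property</th><th>Value</th></tr>\n"
        f"{rows}\n</table>\n"
        "</body></html>\n"
    )


# Hypothetical properties, for illustration only.
page = render_status_page(
    {"feed.url": "http://feeds.example.com/latest", "upload.dir": "/var/app/uploads"},
    "/etc/app/app.properties",
)
```

Values are escaped before being written into the page. A real page would probably also want to mask passwords and other secrets.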

Teams can go one step further and expose “health checks” through such a page.  In this example the application has three
dependencies that need to be satisfied for correct operation:

1) Need to be able to access an HTTP endpoint;
2) Need a directory to exist (and have read/write permissions);
3) Need to be able to connect to a database.

For each of these properties we can check whether the dependency is satisfied. For example, does the directory exist?
Can we read from the directory?  Any failure can then be exposed visually, providing early warning signs immediately
after a deployment that the application is not healthy and requires further investigation.
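The three checks could be sketched along the following lines. This is a hedged illustration using only the Python standard library, not the code behind the example page: the function names are hypothetical, and the database check is approximated by opening a TCP connection to the database host and port, whereas a real check would more likely run a trivial query through the application's own driver.

```python
import os
import socket
import urllib.request


def check_http(url, timeout=5.0):
    """Healthy if the endpoint answers with a 2xx or 3xx status."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return 200 <= response.status < 400


def check_directory(path):
    """Healthy if the directory exists and is readable and writable."""
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK)


def check_tcp(host, port, timeout=5.0):
    """Healthy if a TCP connection can be opened (stand-in for a database check)."""
    with socket.create_connection((host, port), timeout=timeout):
        return True


def run_checks(checks):
    """Run each named check; an exception counts as a failure."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "OK" if check() else "FAILED"
        except Exception as error:
            results[name] = f"FAILED ({error})"
    return results
```

A status page would call run_checks with something like {"upload directory": lambda: check_directory("/var/app/uploads")} and render each result next to the property that configures it, so a failure is visible at a glance after a deployment.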

For more information on this topic, and many other techniques for smoothing the path from dev to production, I highly recommend Sam Newman’s QCon 2010 presentation: From Development to Production.

Written by Ryan Greenhall

June 3rd, 2010 at 2:41 pm

Posted in devops