April 2008 Archives

Installing Memcached


Memcached is a great caching application that is widely used by many Web 2.0 sites, such as LiveJournal, Facebook, and SourceForge. I'm also very interested in this software, so as a first step, let's try to install and use it. For more information about Memcached, please refer to http://danga.com/memcached/

Here, I'd like to show you how to install Memcached on RedHat Advanced Server Update 6.

Dependencies


Installation:

  • libevent

./configure --prefix=/usr
make && make install

  • memcached

There are many configuration options available when you configure memcached; all the supported options are listed by:
./configure --help

./configure --prefix=/home/memcached --with-libevent=/usr
make && make install

Starting memcached
/home/memcached/bin/memcached -d -m 100 -l 192.168.1.136 -p 11211 -u nobody

-d: run as a daemon
-m: the amount of memory, in MB, to allocate to memcached
-l: the address memcached listens on
-p: the port memcached listens on
-u: the user the memcached daemon runs as

After starting the memcached daemon, we can use 'netstat' to verify that port 11211 is open.
#netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State
tcp        0      0 192.168.1.136:11211         0.0.0.0:*                   LISTEN
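
netstat only shows the view from the server itself. To confirm that the port is also reachable from another box, a small check like the following can help. This is only a sketch in Python using the standard socket module; the function name port_is_open and the 2 second timeout are my own choices, not part of memcached.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake; any failure
        # (refused, unreachable, timed out) is raised as an OSError
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the server started as above, calling port_is_open("192.168.1.136", 11211) from the client should return True if nothing between the two boxes blocks the port.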

Some simple Perl scripts to test memcached

We have installed and started memcached on 192.168.1.136. As the next step, I'll run some tests on another box, 192.168.1.137, to push data to and pull data from 192.168.1.136.

Before running the Perl test scripts, we need to install the 'Cache::Memcached' module, which can be found on CPAN. Then let's run the following two scripts on the client 192.168.1.137.

1. Pushing the data to memcached
Name: push_mem.pl

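#!/usr/bin/perl
use strict;
use warnings;
use Cache::Memcached;

# Connect to the memcached server started above
my $memd = Cache::Memcached->new({
      'servers' => [ "192.168.1.136:11211" ],
    });

# Set a value; set() returns true when the value was stored
$memd->set("my_key", "123") or die "Failed to store my_key\n";

$memd->disconnect_all();
exit;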

2. Pulling the data from memcached
Name: print_mem.pl

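#!/usr/bin/perl
use strict;
use warnings;
use Cache::Memcached;

# Connect to the same memcached server
my $memd = Cache::Memcached->new({
      'servers' => [ "192.168.1.136:11211" ],
    });

# get() returns undef on a miss, so test with defined() to allow values like "0"
my $val = $memd->get( "my_key" );
if ( defined $val )
{
     print "Value is '$val'\n";
}

$memd->disconnect_all();
exit;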

Result: Value is '123'

OK, we can see that the value '123' was stored in memcached when I ran 'push_mem.pl', and that we got the same value back from memcached when I ran 'print_mem.pl'.

Here I have only shown the simplest examples of how to use memcached; in a production environment, things will be more complex.

Tips:
We can telnet to memcached on port 11211 to check its status, which is helpful for troubleshooting.

[root@web1 scripts]# telnet 192.168.1.136 11211
Trying 192.168.1.136...
Connected to web1.isoracle.com (192.168.1.136).
Escape character is '^]'.
input command: stats
STAT pid 23810
STAT uptime 622
STAT time 1209240944
STAT version 1.2.5
STAT pointer_size 32
STAT rusage_user 0.000999
STAT rusage_system 0.061990
STAT curr_items 1
STAT total_items 1
STAT bytes 58
STAT curr_connections 2
STAT total_connections 5
STAT connection_structures 3
STAT cmd_get 1
STAT cmd_set 1
STAT get_hits 1
STAT get_misses 0
STAT evictions 0
STAT bytes_read 43
STAT bytes_written 36
STAT limit_maxbytes 104857600
STAT threads 1
END

We can also calculate the hit rate: hit rate = get_hits / cmd_get
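
To avoid doing this by hand, the stats reply can be parsed by a few lines of code. The following is a minimal sketch in Python that works on the text returned by the stats command; the function names parse_stats and hit_rate are made up for this example, and fetching the text from the server is left out.

```python
def parse_stats(text):
    """Parse 'STAT name value' lines from a memcached stats reply into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Only lines of the form "STAT <name> <value>" carry data; "END" is skipped
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_rate(stats):
    """Return get_hits / cmd_get, or None when no get has been issued yet."""
    gets = int(stats["cmd_get"])
    if gets == 0:
        return None
    return int(stats["get_hits"]) / gets
```

For the output above (cmd_get 1, get_hits 1), the hit rate is 1.0, which is expected because the only get so far was for a key that had just been set.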

In this article, I won't go into the details of how to build a mail system step by step; you can find plenty of configuration documents for popular MTAs such as Sendmail, Postfix, and qmail through Google.

I was once a mail system engineer maintaining a commercial mail system that sent out more than 100 million mails per day. That number covers only the site mail; it does not include campaign email, which is usually even larger in volume. So here, I'd like to share some of my experience on how to build a mail system with high scalability, manageability, and performance.

1. Split
Most MTAs have their own internal policy of retrying, for a few days, the mails that get a soft bounce (4xx). The soft bounce may be due to network latency or other reasons. As a result, the mail queue on a single box may grow larger and larger, which delays the 'good mail' (the mails that can be delivered successfully on the first attempt). Site mail is critical to the business and needs to be delivered to end users in a timely manner.

How do we resolve this issue? The answer is to split.

Here I'd like to introduce the concept of 'fallback'. It means that if a mail on the primary server is not delivered successfully on the first attempt, it is transferred to the fallback mail server for delivery. The benefit is that the mail queue on the primary server does not grow too large, so the 'good mail' reaches end users in time, and the load on the primary mail server is reduced as well.

Currently, most MTAs support this feature; check your MTA's official documentation for details. From my point of view, it is not difficult to implement the fallback feature on an existing mail system, and you don't need to change a lot.
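
To make the idea concrete, here is a small Python sketch of the routing decision only. It is not how any particular MTA implements fallback. The names route_first_attempt and deliver are hypothetical, and deliver stands in for a real SMTP delivery attempt that returns the reply code.

```python
def is_soft_bounce(smtp_code):
    """4xx replies are temporary failures (soft bounces)."""
    return 400 <= smtp_code < 500


def route_first_attempt(messages, deliver, fallback_queue):
    """Try each message once on the primary; hand soft bounces to the fallback.

    deliver(message) is a hypothetical callable returning the SMTP reply code.
    Returns the list of messages delivered on the first attempt.
    """
    delivered = []
    for message in messages:
        code = deliver(message)
        if 200 <= code < 300:
            delivered.append(message)
        elif is_soft_bounce(code):
            # The primary does not keep retrying; the fallback server does
            fallback_queue.append(message)
        # Anything else (e.g. a 5xx hard bounce) is permanent and not retried here
    return delivered
```

The point of the design is that the primary queue only ever holds mails waiting for their first attempt, so a slow or unreachable destination cannot hold up the 'good mail' behind it.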
 
2. Load balance
For a commercial mail system, one or two servers can hardly handle a huge number of mails effectively, so we should usually consider load balancing to spread the mails across many mail servers.

For example, assuming I have 50 powerful servers that act as the primary mail servers and 40 common servers for the fallback pool, we can set up two VIP domain names, mx.vip.isoracle.com and fallback.vip.isoracle.com, for the 'primary' and 'fallback' pools. Then we can configure a load balancer (e.g. F5) to distribute the mails across the servers in the mx.vip.isoracle.com or fallback.vip.isoracle.com pool for delivery.

This way, without any downtime or impact on end users, we can easily add more servers to the 'primary' or 'fallback' pool, or remove a server from its pool when a single node has an issue. The only thing we need to do is add or remove the entries in the load balancer or DNS server, so the overall scalability of the system is greatly improved.

If you don't have the budget for a hardware load balancer such as F5 or NetScaler, Nginx or round-robin DNS is another choice.
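
The behaviour of such a pool can be sketched in a few lines of Python. This is only an illustration of round-robin selection with servers being added and removed at runtime, not the algorithm of F5, NetScaler, Nginx, or any DNS server; the class name RoundRobinPool and the server names in the usage note are made up.

```python
class RoundRobinPool:
    """Hand out servers in rotation; servers can be added or removed at any time."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0

    def add(self, server):
        self._servers.append(server)

    def remove(self, server):
        index = self._servers.index(server)
        del self._servers[index]
        # Keep the cursor pointing at the same upcoming server after the removal
        if index < self._next:
            self._next -= 1

    def pick(self):
        if not self._servers:
            raise RuntimeError("pool is empty")
        self._next %= len(self._servers)
        server = self._servers[self._next]
        self._next += 1
        return server
```

For example, a pool created with ["mx1", "mx2", "mx3"] hands out mx1, mx2, mx3, mx1 in that order. If mx2 is then removed because of a node issue, the rotation simply continues with mx3, and callers of pick() never notice the change.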
 
3. OS and Storage
This is a common topic, so I just want to emphasize that we had better use storage with high I/O performance for the mail queue. SCSI hard disks are preferred, as they use less CPU.

BTW, if we use SSDs (Solid-State Disks) to store the mail queue, will the I/O performance be greatly improved? ^^

For the operating system, I think Linux is good enough. I have also heard that the recently released FreeBSD 7.0 performs very well, but more tests are needed to confirm this.
 
4. Monitoring
For any mail system, I think monitoring is very important.

Basically, we need to keep a close eye on the following items:
  • Mail queue size on each host
  • CPU, memory, and load, especially load and CPU
  • Storage usage, especially for the volume that stores the mail queue
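
The items above boil down to comparing a few numbers per host against limits. Here is a minimal Python sketch of that check. The metric names, the limits in the usage note, and the function names are all my own assumptions; counting the mail queue depends on the MTA and is left out, while load and disk usage come from the standard os and shutil modules.

```python
import os
import shutil


def local_metrics(queue_dir):
    """Collect the 1-minute load average and the disk usage of the queue volume."""
    usage = shutil.disk_usage(queue_dir)
    return {
        "load_1min": os.getloadavg()[0],
        "queue_disk_used_pct": 100 * usage.used / usage.total,
    }


def check_host(metrics, limits):
    """Return one alert string for every metric that is above its limit."""
    alerts = []
    for name, limit in limits.items():
        value = metrics.get(name)
        # Metrics that were not collected are skipped rather than reported
        if value is not None and value > limit:
            alerts.append(f"{name}={value} exceeds limit {limit}")
    return alerts
```

With hypothetical limits of {"queue_size": 5000, "load_1min": 8.0, "queue_disk_used_pct": 85}, a host reporting a queue of 7200 mails and a queue volume that is 91% full would produce two alerts, one for the queue and one for the disk.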

 

5. Troubleshooting and Reporting
As mail system engineers, we often face many kinds of mail system issues, and usually we need to check the mail log to see what happened. How do we get the useful information out of the huge mail system log quickly enough for troubleshooting?

When I worked on the mail system, I developed a troubleshooting and reporting system. It pulls all the raw mail logs from each mail server to a central mail log server, where some Perl/shell scripts analyze and process the huge mail log hourly. Finally, the useful data, including the sender, receiver, queue ID, send time, relay IP, receiver IP, DSN, and detailed error log, is stored in an Oracle database for analysis. I also wrote a CGI web page: you only need to enter the time range of the issue and the receiver's email address, and you can easily get all the detailed error logs. To be honest, this tool helped me greatly when troubleshooting mail issues.
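
The core of such a system is turning raw log lines into fields. My original scripts were Perl/shell and are not reproduced here; the following is only a Python sketch of the parsing step, and it assumes a Postfix-style delivery line (shown in SAMPLE, with made-up host names and addresses). Other MTAs log in different formats and would need a different pattern.

```python
import re

# Assumed Postfix-style delivery line; the host names and addresses are made up
SAMPLE = (
    "Apr 26 20:15:44 mx1 postfix/smtp[1234]: 3F2A1B0C5D: "
    "to=<user@example.com>, relay=mail.example.com[192.0.2.10]:25, "
    "delay=0.5, delays=0.1/0/0.2/0.2, dsn=4.4.1, status=deferred "
    "(connect to mail.example.com[192.0.2.10]:25: Connection timed out)"
)

LINE_RE = re.compile(
    r"^(?P<time>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) \S+: "
    r"(?P<queue_id>[0-9A-F]+): to=<(?P<receiver>[^>]*)>, "
    r"relay=(?P<relay>[^,]+), .*?dsn=(?P<dsn>\d\.\d+\.\d+), "
    r"status=(?P<status>\w+) \((?P<detail>.*)\)$"
)


def parse_delivery_line(line):
    """Return the fields of one delivery line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None
```

In this assumed format the sender appears on a separate line with the same queue ID, so a real pipeline would join the two records on the queue ID before loading them into the database.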
 
Since the mail log information is stored in the database, we can also easily write tools that generate charts of key metrics, such as the delivery rate and the number of soft bounces and hard bounces. RRDTool is a good choice for storing the data and generating the charts; I like it very much.
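
As an illustration of how these metrics can be derived, here is a short Python sketch that counts deliveries and bounces from a list of DSN status codes, using the class digit of the code (2 for success, 4 for a temporary failure, 5 for a permanent failure). The function names are hypothetical, and feeding the numbers into RRDTool is not shown.

```python
from collections import Counter


def classify(dsn):
    """Map a DSN status code such as '4.4.1' to a category by its class digit."""
    categories = {"2": "delivered", "4": "soft_bounce", "5": "hard_bounce"}
    return categories.get(dsn[:1], "unknown")


def summarize(dsn_codes):
    """Count each category and compute the delivery rate over all codes."""
    counts = Counter(classify(code) for code in dsn_codes)
    total = sum(counts.values())
    return {
        "total": total,
        "delivered": counts["delivered"],
        "soft_bounce": counts["soft_bounce"],
        "hard_bounce": counts["hard_bounce"],
        # Delivery rate is delivered / total; an empty period reports 0.0
        "delivery_rate": counts["delivered"] / total if total else 0.0,
    }
```

For four codes 2.0.0, 2.0.0, 4.4.1, and 5.1.1, this gives 2 delivered, 1 soft bounce, 1 hard bounce, and a delivery rate of 0.5. Running it once per hour over the new rows would produce one data point per metric for the charts.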
 
A mail system is very complex, so this article cannot cover every area of it. Please share your comments or feedback if you have any.