As I mentioned yesterday, the FreeBSD 7 Release doesn't support Broadcom 5722 network adapter which is combined in Dell R300, today, I installed 7.0 Stable 200807 amd64 on Dell R300, and the last stable version can identify the 5722 network adapter.

Tomorrow, this box as the load balance which is running Nginx will be put on rack for production use. Crossing my fingers.
I planed to install FreeBSD 7.0 Release on Dell R300, but unfortunately, 7.0 Release doesn't support the network adapter Broadcom 5722 which is combined in Dell R300.

Some of my friends said that 7.0 stable in snapshot can work with 5722, but I'm not sure if it's stable enough to support the production environment, so finally, I'll have to replace it with Dell 1950 which combines Broadcom 5708C :(

Web cache is very important to a high traffic site, by holding those pictures, css and js files those are not changed frequently, it can reduce pressure to the backend servers, it's very helpful to improve the site speed.

As you may know, Squid is widely used for web cache service, but we have another choice, that is Varnish. Here is the introduction about Varnish from it's offical site:

"Varnish is a state-of-the-art, high-performance HTTP accelerator. Varnish is targeted primarily at the FreeBSD 6/7 and Linux 2.6 platforms, and takes full advantage of the virtual memory system and advanced I/O features offered by these operating systems."

Some of my friends have already implemented Varnish as their web cache server in the production platform, and most of them replaced Squid with Varnish, they told me that Varnish has better performance than Squid in the high traffic site.

So I wanna install a web cache server for my site, although my site traffic is pretty low :)

Firstly, I'd like show you the basic workflow about how web cache works with web server.


 
varnish_workflow.jpg


Ok, let's start to install Varnish, the last stable version of Varnish is 1.1.2, you may download the source code from here. My current web server is running on FreeBSD, Apache, PHP and MySQL, and Varnish will be installed on RedHat Linux.

Installation
1. Createing group and user 'www' for Varnish
# groupadd www
# useradd www -g www

2. Compiling and installing Varnish
# cd varnish-1.1.2
# ./configure --prefix=/home/varnish
# make && make install

3. Creating cache file and log file folder
# mkdir -p /data02/cachefile
# chown -R www.www /data02/cachefile
# mkdir -p /data02/cachelog
# chown -R www.www /data02/cachelog

4. Varnish configuration file
Example from my site:

backend mywebserver {
       set backend.host = "YOUR_WEB_SERVER_IP";
       set backend.port = "80";
}

acl purge {
       "localhost";
       "127.0.0.1";
       "192.168.1.0"/24;
}

sub vcl_recv {
       if (req.request == "PURGE") {
               if (!client.ip ~ purge) {
                       error 405 "Not allowed.";
               }
               lookup;
       }

       if (req.http.host ~ "^www.isoracle.com") {
               set req.backend = mywebserver;
               if (req.request != "GET" && req.request != "HEAD") {
                       pipe;
               }
                 elseif(req.url ~ "\.(php|cgi|pl)($|\?)") {
                       pass;
                 }
               else {
                       lookup;
               }
       }
       else {
               error 404 "isoracle.com Cache Server";
               lookup;
       }
}

sub vcl_hit {
       if (req.request == "PURGE") {
               set obj.ttl = 0s;
               error 200 "Purged.";
       }
}

sub vcl_miss {
       if (req.request == "PURGE") {
               error 404 "Not in cache.";
       }
}

sub vcl_fetch {
       if (req.request == "GET" && req.url ~ "\.(txt|js)$") {
               set obj.ttl = 3600s;
       }
       else {
               set obj.ttl = 10d;
       }
}


For more detailed description about the Varnish configuration file, please refer to the offical document.

5. Starting varnishd
# /home/varnish/sbin/varnishd -n /data02/cachefile -f /home/varnish/vcl.conf -a 0.0.0.0:80 -s file,/data02/cachefile/cache.dat,500M -g www -u www -w 50,200,10 -T 127.0.0.1:1800 -p client_http11=on

For more options about starting up varnishd, please refer to "varnishd --help".

6. Starting varnishncsa
# /home/varnish/bin/varnishncsa -n /data02/cachefile -w /data02/cachelog/varnish.log &

7. Checking Varnish status
# /home/varnish/bin/varnishstat
This command will show you the current Varnish status, such as cache hints, cache misses etc, they're useful for health check.

Verification
We can use curl to check the HTTP header information:

# curl --head http://www.isoracle.com/
HTTP/1.1 200 OK
Server: isoracle/Stable 1 (Unix) PHP/5.2.0
Last-Modified: Sat, 12 Jul 2008 21:54:39 GMT
ETag: "96347-cc05-4879281f"
Content-Type: text/html
Content-Length: 52229
Date: Sun, 13 Jul 2008 01:36:12 GMT
X-Varnish: 1652882410 1652882077
Age: 8877
Via: 1.1 varnish
Connection: keep-alive

We can see that Varnish works now.

Finally, you need change your DNS record to point the web server IP to cache server IP to let the requests hint the cache server.


To be honest, as an Oracle DBA, I'm not very familar with some of the MySQL SQL syntax, there have some difference between Oracle and MySQL.

Here, I wanna create a table in MySQL, the table name like "Test Table", and one of the column name like "First Name", when I try to create this table, it returns the following error:

mysql> create table "Test Table2" (id int, "First Name" varchar(200));

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '"Test Table2" (id int, "First Name" varchar(200))' at line 1

I created this MySQL database by following the normal procedure.

How to fix the problem?

1. You can use ` (backquote) to include the talbe name and column name, like:
create table `Test Table2` (id int, `First Name` varchar(200));

2. Set global 'sql_mode=ANSI_QUOTES'
mysql> set global sql_mode=ANSI_QUOTES;

Comments: This is more similar with 'alter system set ...' in Oracle, once you set this parameter, it will take effect immediately, it will be valid until you restart your MySQL.

3. Modify the my.cnf configuration file, add the following entry in [mysqld] section, and then restart MySQL to take effect.
sql_mode = ANSI_QUOTES

4. Add the option when you start MySQL
mysqld_safe --user=mysql --sql-mode=ANSI_QUOTES &


I prefer the third solution.

Installing Memcached

| | Comments (0) | TrackBacks (0)

Memcached is a great caching application which is widely used in many web 2.0 site, such as LiveJournal, Facebook, Sourceforge etc. I'm also very interest in this software, so first step, let's try to install and use it. For more information about Memcached, please refer to http://danga.com/memcached/

Here, I'd like show you how to install Memcached on RedHat Advanced Server Update 6.

Dependencies


Installation:

  • libevent

./configure --prefix=/usr
make && make install

  • memcached

There have many configuration options to use when you configure memcached, all the support options are available from:
./configure --help

./configure --prefix=/home/memcached --with-libevent=/usr
make && make install

Starting memcached
/home/memcached/bin/memcached -d -m 100 -l 192.168.1.136 -p 11211 -u nobody

-d: run as daemon
-m: how many memory in size of MB to allocate to memcached
-l: the listen on address which running the memcached
-p: the port allocate for memcached
-u: the user which running memcached daemon

After starting memcached daemon, we can use 'netstat' to verify the port 11211 is opened.
#netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State
tcp        0      0 192.168.1.136:11211         0.0.0.0:*                   LISTEN

Some simple Perl scripts to test memcached

We have installed and run the memcached on 192.168.1.136, next step, I'll do some tests on another box 192.168.1.137 to push/pull the data from 192.168.1.136.

Before run the Perl testing scripts, we need install 'Cache::Memcached' module, find it from cpan. Then let's run the following two scripts on the client 192.168.1.137.

1. Pushing the data to memcached
Name: push_mem.pl

#!/usr/bin/perl
use Cache::Memcached;
my $memd = new Cache::Memcached {
      'servers' => [ "192.168.1.136:11211"],
    };

# Set a value
$memd->set("my_key", "123");

$memd->disconnect_all();
exit;

2. Pulling the data from memcached
Name: print_mem.pl

#!/usr/bin/perl
use Cache::Memcached;
my $memd = new Cache::Memcached {
      'servers' => [ "192.168.1.136:11211"],
    };
my $val = $memd->get( "my_key" );
if ( $val )
{
     print "Value is '$val'\n";
}

$memd->disconnect_all();
exit;

Result: Value is '123'

Ok, we can see that the value '123' was stored in memcached when I run the 'push_mem.pl' for the first time, and then we get the same value from memcached when I run 'print_mem.pl'.

Here, I only show you the the simplest examples on how to use memcached, but in a production environment, that will be more complex.

Tips:
We can telnet memcached with port 11211 to check the memcached status, it will be helpful for troubleshooting.

[root@web1 scripts]# telnet 192.168.1.136 11211
Trying 192.168.1.136...
Connected to web1.isoracle.com (192.168.1.136).
Escape character is '^]'.
input command: stats
STAT pid 23810
STAT uptime 622
STAT time 1209240944
STAT version 1.2.5
STAT pointer_size 32
STAT rusage_user 0.000999
STAT rusage_system 0.061990
STAT curr_items 1
STAT total_items 1
STAT bytes 58
STAT curr_connections 2
STAT total_connections 5
STAT connection_structures 3
STAT cmd_get 1
STAT cmd_set 1
STAT get_hits 1
STAT get_misses 0
STAT evictions 0
STAT bytes_read 43
STAT bytes_written 36
STAT limit_maxbytes 104857600
STAT threads 1
END

And we can calculate the hint rate, the hint rate= get_hits/ cmd_get

In this article, I don't wanna talk more details about how to build a mail system step by step, you may get lots of configuration documents about the popular MTA such as Sendmail, Postfix and qmail etc from Google.

I was once a mail system engineer to maintain a commercial mail system which sent out >100 million mails/day, here the number is just the site mail, not including the campaign email, campaign mail will be even more usually. So here, I'd like share with you some of my experiences about how to build a mail system with high scalability, manageability and performance.

1. Split
Most of the MTA has it's own internal policy to keep sending the mails which get the soft bounce(4xx) for a few days, the soft bounce error may due to the network latency or the other reasons, so the mail queue in a single box may become larger and larger, hence, this will cause the delay for sending the 'good mail' (the mails which can be delivered successfully in one time). You know, the site mail is critical and important for the business, they need to be delivered to end users timely.

How to resolve this issue? The answer is split.
 
Here I'd like introduce the concept of 'fallback', what does it mean? It means if the mail in the primary server is not delivered successfully for the first time, then it will be transferred to the fallback mail server for delivering, the benefit is the mail queue on the primary server will not get too high, so the 'good mail' can be delivered to end users timely, also this will reduce the primary mail server load.
 
Currently, most of the MTA supports this feature, you can check the MTA offical document to get more details. From my point of view, it's not difficult to implement the fallback feature on the current mail system, you don't need change a lot.
 
2. Load balance
For a commercial mail system, one or two servers are hard to handle the huge number of mails effectively, so usually, we should consider to use load balance to separate them into each single mail server.
 
For example, assumeing I have 50 powerful servers which act as the primary mail servers,  and 40 common servers for the fallback pool, we can setup two VIP domain names: mx.vip.isoracle.com and fallback.vip.isoracle.com for 'primary' and 'fallback' pools, then we can configure load balance(i.e. F5) to distribute the mails to mx.vip.isoracle.com or fallback.vip.isoracle.com pool for mail delivering.
 
By this way, without any downtime and impcat to the end users, we can easily add more and more servers into the current 'primary' or 'fallback' pool or remove them out from current pool when there have single node issue, the only one thing we need to do is add/remove the entries in the load balance or DNS server, so overall, the system scalability can be greatly improved.
 
If you don't have budget to buy hardware load balance such as F5 or NetScaler, Nginx or DNS lookup round robin is another choice.
 
3. OS and Storage
This is a common topic, I just want to emphasis that we'd better use high I/O performance storage to store the mail queue. SCSI hard disk is preferred, it occupies less CPU resource.
 
BTW, if we use SSD (Solid-stat Disk) to store mail queue, will the I/O performance be greatly improved? ^^
 
For the operating system, I think Linux is good enough. Also, I heared about that the last released FreeBSD 7.0 performance is perfect, but still need more tests to confirm. 
 
3. Monitoring
For any mail system, I think monitoring is very very important.
 
Basically, we need keep close eyes on the following items:
  • Mail queue per single host
  • CPU, Memory and Load, especially the load and CPU
  • Storage usage info, especially for the volume which stores the mail queue

 

4. Troubleshooting and Reporting
You know, as a mail system engineer, we often face many kinds of mail system issue, and usually, we need check the mail log to see what happened, how to get the useful infomation  from the huge mail system log timely for troubleshooting?

I developed a troubleshooting and reporting system when I worked on the mail system, it will pull all the raw mai log from each mail servers to a center mail log server, and some Perl/Shell scripts will analyze and process the huge mail log hourly, at last, the useful data including the sender, receiver, queue id, send time, relay IP, receiver IP, DSN and detailed error log etc will all be stored into Oracle database for analysis. And I also wrote another cgi web page, you just only need input the issue time range and the receiver email address, then you can easily get all the detailed error log from this web page, to be honest, this tool greatly help me during the mail issue troubleshooting.
 
Since we have the mail log information stored in the database, then we can also easliy write tool to generate the charts to show many key metrics, such as delivery rate, soft bounce#, hard bounce# etc. RRDTool is a good choice to store and generate the charts, I like it very much.
 
You know, mail system is very complex, so this topic cannot touch each area of a mail system, please provide your comments/feedback if there is any.
 

I have a DB which stores the archived logs in NFS file system, it's running Oracle 9i and mainly used for Logminer auditing.

Due to the business requirement, I upgraded it to 10g, after that, when I run the Logminer to process the archived log, I got the following errors:

ERROR at line 1:
ORA-01284: file /path/to/arc/100.arc cannot be opened
ORA-00308: cannot open archived log '/path/to/arc/100.arc'
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
Additional information: 2
ORA-06512: at "SYS.DBMS_LOGMNR", line 68
ORA-06512: at line 1

Explanation about ORA-27054:

27054, 00000, "NFS file system where the file is created or resides is not mounted with correct options"
// *Cause:The file was on an NFS partition and either reading the mount tab
// file failed or the partition wass not mounted with the correct
// mount option.
// *Action: Make sure mount tab file has read access for Oracle user and
// the NFS partition where the file resides is mounted correctly.
// For the list of mount options to use refer to your platform
// specific documentation


Solutions from Metalink:

1. Umount the NFS file system and remount it with the following options:
rw,hard,bg,proto=tcp,suid,rsize=32768,wsize=32768,noac

2. Apply patch 5146667.

In my this case, I tried the option 1 to fix the problem.

I just hear about this news from Yahoo news, MySQL acts as an important role in LAMP, it's widely used in web2.0 site, let's see what's the next step of Sun-MySQL.

BTW, Oracle also buy BEA (products: weblogic and Tuxedo)
Dell 2950 server combines PERC5 Raid controller which needs the last Linux kernel to identify the raid controller driver. At the first time, I try to install RedHat Advanced Server 4 Update2 x86_64 on this server, but this version cannot identify the raid driver of PERC5.

Finally, from Dell tech support, I get to know that only RHEL 4 Update 5 or RHEL 5 can be installed on Dell 2950. So, if you also need install RedHat RHEL on Dell 2950, please try RHEL 4 Update 5 or RHEL 5 version.