[Yanel-dev] Cluster performance testing

Wed Feb 2 16:03:23 CET 2011

Hello everybody out there

I performed some simple performance testing with a couple of old
machines to find out how well Yanel scales in a cluster. I also did some
tests with a data repository on a remote storage system using NFS.

I think the results are pretty good. Throughput increased by up to 36%
(second run of configuration A compared with configuration C) when
adding roughly 50% more computing power (the second machine has only a
single core instead of two and half as much RAM). Testing with NFS
showed a variable slowdown when data has to be read from a remote
host. However, the NAS I used wasn't really made for such a scenario.
I would expect the performance with a dedicated SAN to be much better.
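
The 36% figure can be checked against the throughput values for test (1)
in the Results section. The following is only a minimal sketch of that
arithmetic; the function and variable names are made up for illustration
and are not part of any tool used in the tests.

```python
def relative_change(baseline, measured):
    """Return the relative change from baseline to measured, in percent."""
    return (measured - baseline) / baseline * 100.0

# Throughput in reqs/sec for test (1), copied from the Results section.
node1_only = 58.6          # (C) two Tomcats on node1 only
cluster_first_run = 73.3   # (A) four Tomcats on both nodes, first run
cluster_second_run = 79.4  # (A) four Tomcats on both nodes, second run
nfs_second_run = 68.9      # (B) like (A), data repository on NFS, second run

print(f"A run 1 vs. C:       {relative_change(node1_only, cluster_first_run):+.1f}%")
print(f"A run 2 vs. C:       {relative_change(node1_only, cluster_second_run):+.1f}%")
print(f"B run 2 vs. A run 2: {relative_change(cluster_second_run, nfs_second_run):+.1f}%")
```

This gives roughly +25% for the first run of (A), +35.5% for the second
run, and about -13% for NFS compared with local storage once both are
warmed up.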

When interpreting the data, please note:

* The machines I used for testing are not equal (the second machine
is slower than the first one, so don't expect the throughput to double).

* The about page includes two XInclude directives, hence the slowdown
in test (2). I briefly tried commenting out the XInclude directives,
which immediately led to a speed increase. Perhaps the XInclude
implementation could be optimized somehow?

Cheers
Cedric
-------------- next part --------------
Hardware/Software
-----------------

* Host yanel-cluster-node1: 
    AMD Athlon 64 X2 Dual Core CPU 3600+, 2G RAM
    Ubuntu 10.04 LTS, Jakarta Tomcat 5.0.30
    Sun Java 6 SDK Version 6.22-0ubuntu1~10.04
    Yanel revision 56412 from Subversion

* Host yanel-cluster-node2: 
    AMD Athlon 64 CPU 2800+, 1G RAM
    Ubuntu 10.04 LTS, Jakarta Tomcat 5.0.30
    Sun Java 6 SDK Version 6.20dlj-1ubuntu3
    Yanel revision 56412 from Subversion

* Testing system:
    Intel Core 2 Duo P8400, 2G RAM
    Hardened Gentoo Linux (2011/02/01)
    Sun Java 6 SDK Version 6.23 (vanilla)
    JMeter Version 2.0.1-r4

* Storage/NAS:
    LaCie/Intel NAS with 4 x 3.5" IDE drives in RAID5 mode

Latency between all nodes (both cluster nodes, the testing system, 
and the NAS mounted over NFS) on the network was consistently below 1 ms.

Configurations
--------------

JMeter:
(1) Total of 10 threads, 1 sec ramp-up, 1000 requests per thread.
    Request Yanel Website homepage (path /yanel/yanel-website/).

(2) Total of 10 threads, 1 sec ramp-up, 1000 requests per thread.
    Request Yanel Website about page (path /yanel/yanel-website/en/about.html).
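
To put the error rates in the Results section into perspective, here is
a small sketch of what the two test plans above amount to in absolute
numbers. It assumes that every thread completed all of its requests and
that the reported error rate is failed requests divided by total
requests; the helper name is hypothetical.

```python
threads = 10
requests_per_thread = 1000
total_requests = threads * requests_per_thread

def failed_requests(error_rate_percent, total=total_requests):
    """Convert an error rate given in percent into a number of requests."""
    return round(error_rate_percent / 100.0 * total)

print(total_requests)         # 10000
print(failed_requests(0.01))  # 1
print(failed_requests(0.02))  # 2
```

Under these assumptions an error rate of 0.01% corresponds to a single
failed request per run.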

Cluster:
(A) One Apache HTTP balancer, four Tomcats with Yanel (local storage).
    Balancer with two Tomcats on node1 + two Tomcats on node2.

(B) One Apache HTTP balancer, four Tomcats with Yanel (remote storage).
    Balancer with two Tomcats on node1 + two Tomcats on node2.
    Yanel website realm's data-repository is mounted on NFS.

(C) One Apache HTTP balancer, two Tomcats with Yanel (local storage).
    Balancer with two Tomcats on node1.

Results
-------

1)  A. FIRST RUN 
        Throughput: 73.3 reqs/sec
        Latency: 18/133/2514/103 ms (min/avg/max/dev)
        Error rate: 0.01%

       SECOND RUN
        Throughput: 79.4 reqs/sec
        Latency: 17/122/735/93 ms (min/avg/max/dev)
        Error rate: 0.00%

    B. FIRST RUN 
        Throughput: 53.5 reqs/sec
        Latency: 44/183/4017/243 ms (min/avg/max/dev)
        Error rate: 0.00%

       SECOND RUN
        Throughput: 68.9 reqs/sec
        Latency: 41/142/535/52 ms (min/avg/max/dev)
        Error rate: 0.00%

    C. FIRST RUN
        Throughput: 58.6 reqs/sec
        Latency: 17/167/2151/110 ms (min/avg/max/dev)
        Error rate: 0.00%

2)  B. FIRST RUN
        Throughput: 15.1 reqs/sec
        Latency: 6/646/2326/605 ms (min/avg/max/dev)
        Error rate: 0.00%

       SECOND RUN
        Throughput: 15.4 reqs/sec
        Latency: 91/637/13059/654 ms (min/avg/max/dev)
        Error rate: 0.02%
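
As a plausibility check, throughput and average latency can be compared
using Little's law: with a fixed number of threads that each send the
next request as soon as the previous one returns, throughput is roughly
threads / average latency. The sketch below assumes that the 10 JMeter
threads ran without think time between requests, which is not stated in
the configuration; the names are chosen for illustration only.

```python
CONCURRENT_THREADS = 10

# (label, measured throughput in reqs/sec, average latency in ms),
# copied from the results above.
runs = [
    ("1) A first run", 73.3, 133),
    ("1) A second run", 79.4, 122),
    ("1) B first run", 53.5, 183),
    ("1) B second run", 68.9, 142),
    ("1) C first run", 58.6, 167),
    ("2) B first run", 15.1, 646),
    ("2) B second run", 15.4, 637),
]

def estimated_throughput(threads, avg_latency_ms):
    """Little's law for a closed loop: requests/sec = threads / latency in sec."""
    return threads / (avg_latency_ms / 1000.0)

for label, measured, avg_ms in runs:
    estimate = estimated_throughput(CONCURRENT_THREADS, avg_ms)
    print(f"{label:<16} measured {measured:5.1f}  estimated {estimate:5.1f} reqs/sec")
```

For all runs the measured throughput is within a few percent of the
estimate, so the throughput and latency figures are consistent with
each other.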

Notes
-----

About remote storage: After moving the realm's data repository to the NFS
share and reconfiguring the realm, throughput on the first run started out at
about 3-4 reqs/sec, compared with about 30-40 reqs/sec initial throughput for
local storage. Throughput then increased drastically over time, probably
because NFS started caching the files locally.

As soon as a different page is accessed, though, latency increases again and
throughput drops back down at first. It then usually takes a couple of
seconds to get back up to acceptable values.
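
One way to observe this warm-up effect outside of Yanel would be to time
repeated reads of a single file on the NFS mount. The following is only a
sketch under that assumption: it measures plain file reads, not Yanel's
repository access, and the function name is hypothetical.

```python
import time

def time_reads(path, repeats=5):
    """Read the file `repeats` times and return each read's duration in ms."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, "rb") as f:
            f.read()  # read the whole file; the content itself is discarded
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations
```

Calling time_reads() on a file below the NFS mount point that has not
been accessed recently should show a slow first read followed by faster
ones if caching is indeed the cause.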