Evaluation of a group communication middleware for clustered J2EE application servers
Clusters have become the de facto platform to scale J2EE application
servers. Each tier of the server uses group communication to maintain
consistency between replicated nodes. JGroups is the most commonly used
Java middleware for group communications in J2EE open source
implementations. The scalability of this middleware, and its impact on
application server scalability, has not been evaluated so far. We
present an evaluation of JGroups performance
and scalability in the context of clustered J2EE application servers.
We evaluate the JGroups configuration used by popular software such as
the Tomcat JSP server or JBoss J2EE server. We benchmark JGroups with
different network technologies, protocol stacks and cluster sizes. We
show that, with the default protocol stack, group communication
performance using UDP/IP depends on the switch's ability to handle
multicast packets: Fast Ethernet can give better results than Gigabit
Ethernet. We experiment with another configuration using
TCP/IP and show that current J2EE application server clusters of up to 16
nodes (the largest configuration we tested) can scale much better with
this configuration. We attribute the superiority of TCP/IP based group
communications over UDP/IP multicast to better flow control
management and better use of the network switches available in
cluster environments. Finally, we discuss architectural improvements
for better modularity and resource usage of JGroups channels.
This paper has been submitted for publication. Download the current
version of the paper here.
JGroups configurations
You can get the benchmark we used for the evaluation here.
You can also have a look at the results.
JGroups TCP protocol stack
TCP(bind_addr=bindAddress;loopback=true):
TCPPING(initial_hosts=firstServer[serverPort]):
MERGE2:
FD(timeout=5000):VERIFY_SUSPECT(timeout=1500):
FRAG:GMS(join_timeout=3000;join_retry_timeout=2000)
The parameter "bindAddress" is the address of the cluster
node. The parameter "firstServer" is the address of the first machine
started in the cluster, listening on port "serverPort".
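As an illustration of how the placeholders in the TCP stack might be filled in, the following minimal Java sketch assembles the protocol string shown above for one node. The class name TcpStackBuilder, the sample addresses and the port 7800 are hypothetical and are not taken from the evaluation; the sketch also assumes that "serverPort" is the TCP port of the first started machine. It only builds the string and does not create a JGroups channel.

```java
public class TcpStackBuilder {

    /**
     * Builds the TCP protocol stack string for one node by substituting
     * the bindAddress, firstServer and serverPort placeholders.
     * Layers are joined with ':' in the same order as in the listing above.
     */
    public static String build(String bindAddress, String firstServer, int serverPort) {
        return String.join(":",
            "TCP(bind_addr=" + bindAddress + ";loopback=true)",
            "TCPPING(initial_hosts=" + firstServer + "[" + serverPort + "])",
            "MERGE2",
            "FD(timeout=5000)",
            "VERIFY_SUSPECT(timeout=1500)",
            "FRAG",
            "GMS(join_timeout=3000;join_retry_timeout=2000)");
    }

    public static void main(String[] args) {
        // Hypothetical values: this node is 192.168.0.11, the first started
        // machine is 192.168.0.10 and is assumed to listen on port 7800.
        System.out.println(build("192.168.0.11", "192.168.0.10", 7800));
    }
}
```

Every node would use its own bindAddress, while firstServer and serverPort stay the same across the cluster so that new members can locate the first started machine.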
JGroups default stack used in Tomcat clustered server
UDP(bind_addr=bindAddress;
mcast_addr=228.1.2.3;mcast_port=45566;ip_ttl=32):
PING(timeout=3000;num_initial_members=6):
MERGE2:
FD(timeout=5000):
VERIFY_SUSPECT(timeout=1500):
pbcast.NAKACK(gc_lag=10;retransmit_timeout=3000):
pbcast.STABLE(desired_avg_gossip=10000):
UNICAST(timeout=5000):
FRAG:
pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;
print_local_addr=false)
JGroups stack using the FC flow control layer
UDP(bind_addr=bindAddress;
mcast_addr=228.1.2.3;mcast_port=45566;ip_ttl=32):
PING(timeout=3000;num_initial_members=6):
MERGE2:
FD(timeout=5000):
VERIFY_SUSPECT(timeout=1500):
pbcast.NAKACK(gc_lag=10;retransmit_timeout=3000):
pbcast.STABLE(desired_avg_gossip=10000):
UNICAST(timeout=5000):
pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;print_local_addr=false):
FC(max_credits=200000;min_credits=5200;down_thread=false):
FRAG
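To make the layer order of these stacks easier to compare, the following Java sketch splits a protocol string into its layers and reads the properties of one layer. It is a simplified reader written for this page, not the parser used by JGroups: the class StackParser and its methods are hypothetical, and the sketch assumes that layers are separated by ':' outside parentheses, that properties are separated by ';', and that whitespace inside a stack is not significant.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StackParser {

    /** Splits a stack string into layers at ':' characters outside parentheses. */
    public static List<String> layers(String stack) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int depth = 0;
        for (char c : stack.toCharArray()) {
            if (Character.isWhitespace(c)) continue; // stacks are wrapped across lines
            if (c == '(') depth++;
            if (c == ')') depth--;
            if (c == ':' && depth == 0) {
                out.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        if (cur.length() > 0) out.add(cur.toString());
        return out;
    }

    /** Returns the protocol name of a layer, e.g. "FC" for "FC(max_credits=200000)". */
    public static String name(String layer) {
        int open = layer.indexOf('(');
        return open < 0 ? layer : layer.substring(0, open);
    }

    /** Returns the key=value properties of a layer, in declaration order. */
    public static Map<String, String> properties(String layer) {
        Map<String, String> props = new LinkedHashMap<>();
        int open = layer.indexOf('(');
        int close = layer.lastIndexOf(')');
        if (open < 0 || close < open) return props; // layer without properties
        for (String kv : layer.substring(open + 1, close).split(";")) {
            int eq = kv.indexOf('=');
            if (eq > 0) props.put(kv.substring(0, eq), kv.substring(eq + 1));
        }
        return props;
    }

    public static void main(String[] args) {
        // Tail of the FC stack listed above.
        String tail = "UNICAST(timeout=5000):\n"
            + "FC(max_credits=200000;min_credits=5200;down_thread=false):\n"
            + "FRAG";
        for (String layer : layers(tail)) {
            System.out.println(name(layer) + " " + properties(layer));
        }
    }
}
```

Listing the layer names of two stacks side by side shows, for example, that the FC stack differs from the default Tomcat stack by the FC layer and by the position of FRAG relative to pbcast.GMS.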
JGroups stack using the FLOW_CONTROL layer
UDP(bind_addr=bindAddress;
mcast_addr=228.1.2.3;mcast_port=45566;ip_ttl=32):
PING(timeout=3000;num_initial_members=6):
MERGE2:
FD(timeout=5000):
VERIFY_SUSPECT(timeout=1500):
pbcast.NAKACK(gc_lag=10;retransmit_timeout=3000):
pbcast.STABLE(desired_avg_gossip=10000):
UNICAST(timeout=5000):
FLOW_CONTROL:
pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;print_local_addr=false):
FRAG