Scalable HA Clustering with JBoss AS 7 / EAP 6

Overview

In a recent blog post, Clustering in JBoss AS7/EAP 6, we showed how to use basic clustering in the new EAP 6 and JBoss AS 7. EAP 6 is basically AS 7 with official Red Hat support. The cluster we described in that post was small and simple. This post covers much more complex cluster structures, how to build them, and how to use the new domain-mode for our clusters. There are multiple ways to build and manage bigger JBoss cluster environments. We will describe two of them: one uses separation techniques that are also applicable to older JBoss versions, and the other uses an Infinispan feature called distribution.

Scalability vs. Availability

The main challenge when building a cluster is to make it both highly available and scalable.

Availability for a cluster means: if one node fails, all the sessions on that node will be seamlessly served by another node. This can be achieved through session-replication, which is preconfigured and enabled in the ha profile in the domain.xml. Flat replication means that all sessions are copied to all other nodes. If you have four nodes with 1 GB of memory each, your cluster can only use 1 GB of memory, because all nodes basically store copies of each other's data. In other words, your cluster will not have 4*1GB=4GB of memory. If you add more nodes to this cluster you will not get more memory; you will even lose some memory to replication overhead. What you will get is more availability and, more importantly, more network traffic, because all changes need to be redistributed to all other nodes. Let us call this cluster topology full-replication.

Scalability means that if you add more nodes to your cluster, you get more computing power from it. By computing power we mean both CPU power and memory. Consider a cluster with a bunch of nodes which are identical but do not know about each other. A load-balancer ensures that every node gets work to do. That concept scales very well, but if a node crashes all its data is lost: bad luck for the user who just filled a big shopping-cart. This cluster concept has another advantage. You can drain all sessions from one node, update the application, JBoss or the operating system, then put the node back up and continue with the next one. This would not work with the ha-cluster because of serialVersionUIDs and a lot of other possible incompatibilities. Admittedly, more complex live-updates that include database-schema changes or other nasty things are not that easy. But updates tend to be easier on this cluster topology than on the first one. Let us call this cluster topology no-replication.

As the names full-replication and no-replication already suggest, both cluster topologies are extremes, but together they show a simple fact: increasing availability will not increase the computing power of a cluster, at least not in terms of memory. And just increasing the computing power will not increase the availability of a cluster. In this sense the dimensions scalability (computing power) and availability are orthogonal. Increasing both aspects at the same time is more complex and will be covered in the next sections.
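The memory side of this trade-off can be put into numbers. The following is a minimal Python sketch and not part of the JBoss setup; the function name usable_memory_gb is made up for illustration, and the estimate deliberately ignores replication overhead.

```python
def usable_memory_gb(nodes: int, mem_per_node_gb: float, full_replication: bool) -> float:
    """Rough estimate of the session memory a cluster can actually use.

    Assumption: replication overhead is ignored, so the real value for
    full-replication is somewhat lower than what is returned here.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    if full_replication:
        # Every node stores a copy of every session, so the cluster holds
        # no more distinct data than a single node does.
        return mem_per_node_gb
    # No replication: every node stores only its own sessions.
    return nodes * mem_per_node_gb


if __name__ == "__main__":
    for nodes in (2, 4, 8):
        full = usable_memory_gb(nodes, 1.0, full_replication=True)
        none = usable_memory_gb(nodes, 1.0, full_replication=False)
        print(f"{nodes} nodes: full-replication {full} GB, no-replication {none} GB")
```

With full-replication the usable memory stays flat no matter how many nodes are added, while with no-replication it grows linearly but offers no fail-over.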

Scalable HA Clusters

As already mentioned, we will present two ways to build a cluster that is both highly available and scalable. The first way uses a concept we call sub-clusters and the second way uses a feature of the Infinispan cache called distribution.

Using Sub-Clusters

Topology Concept

This cluster will be a scalable cluster which is built of multiple sub-clusters. These sub-clusters will be highly-available, i. e. the nodes of one sub-cluster will replicate each other. But the nodes of different sub-clusters will not replicate each other. The complete cluster will scale up by adding additional sub-clusters.

How many sub-clusters you use and how many nodes each sub-cluster contains depends on the application you will be running. The first thing to observe is that the size of the sub-clusters is bounded from below by your availability requirements, and the number of sub-clusters is bounded from below by your need for computational power. The different sub-clusters can be distributed widely over the internet. Maybe you have a few sub-clusters locally at your company and some more at multiple cloud-providers or other server-farms. A single sub-cluster spanning bigger infrastructure borders is a bad idea, because replication across those borders will perform poorly.
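The two lower bounds can be turned into a small back-of-the-envelope calculation. This is only a sketch under simplifying assumptions: the function name plan_subclusters and its parameters are hypothetical, only memory is considered as computing power, and replication overhead is ignored.

```python
import math


def plan_subclusters(required_memory_gb: float, mem_per_node_gb: float,
                     tolerated_node_failures: int) -> dict:
    """Estimate a sub-cluster layout from memory and availability needs."""
    if required_memory_gb <= 0 or mem_per_node_gb <= 0:
        raise ValueError("memory values must be positive")
    if tolerated_node_failures < 0:
        raise ValueError("tolerated_node_failures must not be negative")

    # Availability bound: a sub-cluster whose nodes replicate each other
    # survives k failed nodes only if it has at least k + 1 nodes.
    nodes_per_subcluster = tolerated_node_failures + 1

    # Scalability bound: inside a sub-cluster all nodes hold the same data,
    # so each sub-cluster contributes the memory of just one node.
    subclusters = math.ceil(required_memory_gb / mem_per_node_gb)

    return {
        "nodes_per_subcluster": nodes_per_subcluster,
        "subclusters": subclusters,
        "total_nodes": nodes_per_subcluster * subclusters,
    }


if __name__ == "__main__":
    print(plan_subclusters(required_memory_gb=3.0, mem_per_node_gb=1.0,
                           tolerated_node_failures=1))
```

In practice CPU load and network limits matter as well, so treat the result as a starting point rather than a sizing rule.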

We recommend using the domain-mode. The sub-clusters will then be represented by server-groups.

Setting up an example cluster

For this example we will use two sub-clusters with two nodes each. Let us start by setting up a domain with five servers: the first as domain-controller and the other four as normal hosts. You can read the last post of this series on how to do this. Each sub-cluster will be represented by a server-group, so let us build two server-groups: subcluster1 and subcluster2.

Now edit the host.xml on your nodes and add the servers to their server-groups. If you start your servers you will observe a bad thing: all four servers replicate each other. Your cluster is not scalable, but more available than we intended.

Preventing uncontrolled replication

By default, clustered JBoss servers within the same network will find each other and replicate all sessions of all applications they have in common. If we want to form multiple sub-clusters we need to prevent that behaviour, because we only want specific servers to replicate each other. JBoss servers stick together because they use the same multicasts, so we only need to change these. This is the standard ha-sockets socket-binding-group from the domain.xml:

<socket-binding-group name="ha-sockets" default-interface="public">
    <!-- Needed for server groups using the 'ha' profile -->
    <socket-binding name="ajp" port="8009"/>
    <socket-binding name="http" port="8080"/>
    <socket-binding name="https" port="8443"/>
    <socket-binding name="jgroups-diagnostics" port="0" multicast-address="224.0.75.75" multicast-port="7500"/>
    <socket-binding name="jgroups-mping" port="0" multicast-address="${jboss.default.multicast.address:230.0.0.4}" multicast-port="45700"/>
    <socket-binding name="jgroups-tcp" port="7600"/>
    <socket-binding name="jgroups-tcp-fd" port="57600"/>
    <socket-binding name="jgroups-udp" port="55200" multicast-address="${jboss.default.multicast.address:230.0.0.4}" multicast-port="45688"/>
    <socket-binding name="jgroups-udp-fd" port="54200"/>
    <socket-binding name="modcluster" port="0" multicast-address="224.0.1.105" multicast-port="23364"/>
    <socket-binding name="osgi-http" interface="management" port="8090"/>
    <socket-binding name="remoting" port="4447"/>
    <socket-binding name="txn-recovery-environment" port="4712"/>
    <socket-binding name="txn-status-manager" port="4713"/>
    <outbound-socket-binding name="mail-smtp">
        <remote-destination host="localhost" port="25"/>
    </outbound-socket-binding>
</socket-binding-group>

As you can see, there are four multicasts configured: jgroups-diagnostics, jgroups-mping, jgroups-udp and modcluster. jgroups-diagnostics and modcluster should keep their values for all sub-clusters. jgroups-mping and jgroups-udp, on the other hand, need to be different for each sub-cluster. Their multicast-address is set to the value of the property jboss.default.multicast.address. We will set a different value for that property in each server-group (i. e. sub-cluster) later.

Mod-Cluster

When you leave the modcluster multicast on its default value, all nodes of all sub-clusters will be seen by the Apache side of mod_cluster. So mod_cluster will do load-balancing, and it sees many nodes which all have the same application deployed. It assumes that it can do fail-over between any nodes of any sub-clusters, which is wrong because sessions are only replicated within a sub-cluster. What we need to do is configure the modcluster subsystem of the JBoss servers to put each sub-cluster into a different load-balancing-group. We can do that in the domain.xml in the relevant profile as follows:

<subsystem xmlns="urn:jboss:domain:modcluster:1.1">
    <mod-cluster-config advertise-socket="modcluster" connector="ajp" load-balancing-group="${mycluster.modcluster.lbgroup:StdLBGroup}">
       <!-- some more stuff-->
    </mod-cluster-config>
</subsystem>

Setting the properties

We use a property as the value for the load-balancing-group, just as we did for the multicasts. This lets us reuse the same profile and socket-binding-group with different values, i. e. we do not need to create a new profile/socket-binding-group for each sub-cluster, which would be annoying. Setting properties for a server-group is really easy and self-explanatory:

<server-groups>
    <server-group name="subcluster1" profile="ha">
        <system-properties>
            <property name="jboss.default.multicast.address" value="230.0.1.1"/>
            <property name="mycluster.modcluster.lbgroup" value="LBGroup1"/>
        </system-properties>
        <socket-binding-group ref="ha-sockets"/>
    </server-group>
    <server-group name="subcluster2" profile="ha">
        <system-properties>
            <property name="jboss.default.multicast.address" value="230.0.1.2"/>
            <property name="mycluster.modcluster.lbgroup" value="LBGroup2"/>
        </system-properties>
        <socket-binding-group ref="ha-sockets"/>
    </server-group>
</server-groups>

Note: You can set properties in various places of the configuration. Look here for more information.

How to scale the cluster

First how to scale up: If you have not done so in advance you need to add a new server-group on your domain-controller. In the last blog-post we showed how to deploy an application through the command-line-interface (cli). Now we will use the cli to create a new server-group on a running domain-controller. So connect the cli to your domain-controller and execute the following three commands:

/server-group=server-group3:add(profile=ha,socket-binding-group=ha-sockets)
/server-group=server-group3/system-property=jboss.default.multicast.address:add(value=230.0.1.3)
/server-group=server-group3/system-property=mycluster.modcluster.lbgroup:add(value=LBGroup3)

This will create a new server-group named server-group3 which is configured just like subcluster1 and subcluster2 from above.
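If you add sub-clusters regularly, the three commands can be generated instead of typed. The following Python sketch only builds the command strings shown above; the helper name subcluster_cli_commands is hypothetical, and the numbering scheme (multicast address 230.0.1.N, load-balancing-group LBGroupN as in the server-groups configuration above) is an assumption carried over from this example.

```python
def subcluster_cli_commands(index: int, profile: str = "ha",
                            socket_binding_group: str = "ha-sockets") -> list:
    """Build the CLI commands that create server-group number `index`."""
    # The index becomes the last octet of the multicast address.
    if not 1 <= index <= 254:
        raise ValueError("index must be between 1 and 254")

    group = f"server-group{index}"
    multicast_address = f"230.0.1.{index}"  # assumed addressing scheme
    lb_group = f"LBGroup{index}"            # assumed naming scheme

    return [
        f"/server-group={group}:add(profile={profile},"
        f"socket-binding-group={socket_binding_group})",
        f"/server-group={group}/system-property="
        f"jboss.default.multicast.address:add(value={multicast_address})",
        f"/server-group={group}/system-property="
        f"mycluster.modcluster.lbgroup:add(value={lb_group})",
    ]


if __name__ == "__main__":
    # Print the commands so they can be pasted into a connected cli session.
    for command in subcluster_cli_commands(3):
        print(command)
```

The script does not talk to the domain-controller itself; it just prints text, so you stay in control of what is executed.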

Now there are only two things to do:

  1. Create and configure our sub-cluster members. Do not forget to configure servers into our new server-group.
  2. Add the required deployments to the new server-group.

That’s it, you just scaled your cluster up by adding a new sub-cluster.

When scaling the cluster down you need to take care of existing sessions, which is why you cannot just kill all the servers of a sub-cluster. The first step is to ensure that the sub-cluster(s) we want to take down will not get any fresh sessions. When using mod_cluster this is pretty simple: there are various disable-buttons, and the big marked “Disable Nodes”-button of the screenshot below is the right one. If you click that button, mod_cluster will not direct new sessions to our sub-cluster anymore, but existing sessions will still be processed there. Now you need to wait until all these sessions have expired. Then you can stop the servers via cli or web-console. The remaining unused server-group does no harm and can be reused when you scale the cluster up later, but you can also delete it.

A different approach: Infinispan-distribution

Infinispan is a distributed cache and plays the major role in clustering. As explained in the first post of this series the standard ha-profile or standalone-ha.xml uses Infinispan for four purposes:

  • distributing and caching web-sessions over the cluster (container: web)
  • distributing and caching stateful-session-beans over the cluster (container: ejb)
  • 2nd level cache for hibernate (container: hibernate)
  • distribution of some general objects over the cluster (container: cluster)

Infinispan has multiple modes, two of which are relevant for this post. Replication is the standard mode and means that if an object in a cache-container changes, that object is redistributed to all cluster-nodes. Every cluster-node has the same data, so that mode does not scale up in the dimension of memory. The second mode is called distribution. This mode does scale up: you define how many copies of an object the cluster will hold. That number is a constant and thus does not increase with the number of cluster-nodes, so your cluster will scale up in the dimension of memory.

Infinispan is a key-value-based cache, and in distributed mode it uses consistent-hashing to determine which cluster-nodes hold the constant number of copies, num_copies. If an object O is put into the distributed cache on node N, it will not necessarily be stored in the local cache of N. Consistent-hashing in general does not ensure that an object is stored locally; it only ensures that it is stored in num_copies caches. However, you can activate a 1st-level-cache for the distributed cache, which caches remote objects for a configurable amount of time (l1-lifespan).
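To make the idea of consistent-hashing more tangible, here is a small, generic hash ring in Python. It is an illustration of the general technique only and not Infinispan's actual implementation; the class HashRing, the use of MD5 and the virtual_nodes parameter are all assumptions made for this sketch.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Map a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Generic consistent-hashing ring that picks num_copies owners per key."""

    def __init__(self, nodes, num_copies=2, virtual_nodes=64):
        distinct_nodes = set(nodes)
        if num_copies < 1 or num_copies > len(distinct_nodes):
            raise ValueError("need at least num_copies distinct nodes")
        self.num_copies = num_copies
        # Each node is placed on the ring several times to even out the load.
        self._ring = sorted(
            (_position(f"{node}#{i}"), node)
            for node in distinct_nodes
            for i in range(virtual_nodes)
        )
        self._positions = [pos for pos, _ in self._ring]

    def owners(self, key: str) -> list:
        """Return the num_copies distinct nodes that store the given key."""
        start = bisect.bisect_right(self._positions, _position(key))
        owners = []
        # Walk clockwise from the key's position, skipping repeated nodes.
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == self.num_copies:
                    break
        return owners


if __name__ == "__main__":
    ring = HashRing(["node1", "node2", "node3", "node4"], num_copies=2)
    # The node that receives the put is not necessarily one of the owners.
    print(ring.owners("session-42"))
```

The owners depend only on the key and the set of nodes, which is why every node can compute where an object lives without asking the others.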

Topology Concept

This cluster will not have a very complex topology. You should set up two independent domains to retain the possibility of live-updates. Each domain will have one productive server-group containing all nodes. The nodes will have Infinispan-distribution enabled on the cache-containers cluster, web and ejb. That’s it. The cluster is scaled up by simply adding more nodes to that server-group.

There are two important tuning parameters for this cluster: num_copies and the number of cluster-nodes. num_copies is bounded from below by your availability requirements, because the bigger num_copies is, the more nodes can die without data-loss. The number of cluster-nodes, in turn, is bounded from below by num_copies, because the cluster cannot hold num_copies copies of an object without at least num_copies cluster-nodes. The number of cluster-nodes is also bounded from below by your requirements on computational power.
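These bounds can be combined into a rough minimum cluster size. The sketch below is an assumption-laden estimate: the function name min_cluster_nodes is hypothetical, only memory is treated as computing power, data is assumed to spread evenly, and "tolerated failures" means nodes lost at the same time, before the cluster has re-created the missing copies.

```python
import math


def min_cluster_nodes(tolerated_node_failures: int, required_memory_gb: float,
                      mem_per_node_gb: float) -> tuple:
    """Return (num_copies, minimum number of cluster-nodes)."""
    if tolerated_node_failures < 0:
        raise ValueError("tolerated_node_failures must not be negative")
    if required_memory_gb <= 0 or mem_per_node_gb <= 0:
        raise ValueError("memory values must be positive")

    # Availability bound: with num_copies copies, num_copies - 1 nodes
    # can fail at once without losing an object.
    num_copies = tolerated_node_failures + 1

    # Memory bound: every object is stored num_copies times, so the raw
    # memory needed is the required memory multiplied by num_copies.
    nodes_for_memory = math.ceil(required_memory_gb * num_copies / mem_per_node_gb)

    # The cluster can never be smaller than num_copies nodes.
    return num_copies, max(num_copies, nodes_for_memory)


if __name__ == "__main__":
    copies, nodes = min_cluster_nodes(tolerated_node_failures=1,
                                      required_memory_gb=3.0,
                                      mem_per_node_gb=1.0)
    print(f"num_copies={copies}, at least {nodes} cluster-nodes")
```

Unlike full-replication, adding a node beyond this minimum increases the usable memory, because the number of copies per object stays constant.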

How to configure distribution

This approach is rather easy to configure in theory. Just open your domain.xml, locate the Infinispan-subsystem in the relevant profile and make the following changes:

  • For the container cluster convert the replicated-cache to a distributed-cache.
  • For the container web change the default-cache to dist
  • For the container ejb change the default-cache to dist

Note that you can control the number of copies of a cached object with the attribute owners of the distributed-cache tag. That attribute’s default-value is "2".

With AS 7 this currently does not work because of a bug we found: AS7-4881. While we were writing this post, the bug was fixed on the trunk. You can download the source from GitHub and compile AS 7, or just download a nightly build from here to try it out yourself.

Summary

Clustering a bigger environment requires more detailed configuration. We covered different approaches to building a more complex cluster. One thing to keep in mind is that simply adding nodes to a cluster does not always make the cluster more powerful. Another important thing to keep in mind are third-party systems like databases. If you use just one database-server, then at some point your cluster will not scale up any more because the database-server cannot handle more requests. Watch out for these third-party dependencies; even a mail-server could become a problem at some point.

We still have not covered two important things: messaging and ejb calls from a java-client (i. e. not from a web-application through modcluster). The next post will cover load-balancing and fail-over of standalone remote EJB clients.

Any questions or feedback? If so feel free to comment on this post or contact us via email:

  • heinz.wilming (at) akquinet.de
  • immanuel.sims (at) akquinet.de

29 thoughts on “Scalable HA Clustering with JBoss AS 7 / EAP 6”

  1. I just wanted to say these have been EXCELLENT posts. They’re extremely informative. With JBoss AS 7 being so radically different from previous versions, it’s good to see such detailed tutorials on setting up clusters. I look forward to more posts in this series.

      1. Absolutely! Any lessons you can give on clustering JMS or Hibernate Search (ESPECIALLY Hibernate Search) would be great!

      2. immanuel
        When I’m in domain cluster mode and I have two or more slave nodes, who will manage them if the domain node goes down?
        Do the slave nodes reconnect if I start a new domain controller?

      3. If the domain controller spawns up on the same address as the old it will manage the slave-nodes of the old controller.

        But you should be cautious if the configurations of the old and the new domain-controller differ.

  2. Very well written tutorial. It’s true that at some point the DB may become a bottleneck. To avoid this problem, third party integration can be used. Most of the issues discussed here can be overcome by using a third party distributed cache provider. NCache can be another option here as it is an enterprise level distributed cache for .NET and Java and also provides a fast and reliable storage for ASP.NET and JSP Sessions.

  3. Very interesting post. I have a question related to these caching modes. What is the result of the default mode (repl. async) when the EAR has no distributable element in its deployment descriptor? Does it mean that the HTTP sessions are not replicated?
    More generally, when we just want sticky sessions without HTTP session replication to achieve horizontal scaling, which mode is best in your opinion: replicated, or distributed with nb-node=1?

    Another note: you mentioned a bug in distribution mode in 7.1.2, but this seems to be related to remote EJB calls, which seems different from this cluster demo. Is that the case?
    thanks for all these valuable posts.
    yannick

    1. If you just want a “computing-power-cluster” both replication and distribution will be bad. Replication will prevent memory scalability. Distribution will allow scalability with performance loss.

  4. And how can I handle logs from host servers? Is there any way to put it all into some file on domain master server?

    1. I have not found any official way to do that yet (I believe there is none). But I think you could use a file-logger on a unix-pipe.

      1. Thanks for reply.
        I use jms queues to send logs from slaves to host and it works just fine for me :)

  5. Very nice articles here. Keep up the good work. Anyways, do you have ideas on how to make Domain Controller highly available? I mean that if the node that has the domain controller process dies, what can we do to bring the DC back up again? Surely manually, but how about an automatic way to accomplish this? It seems that DC resides inside the host controller process, so I was wondering if we could change the DC address runtime, perhaps even without a reboot? I know that there are operations like “write-local-domain-controller”, “write-remote-domain-controller”, “remove-local-domain-controller”, “remove-remote-domain-controller”, and “reload”, but I am asking if you have any actual experience with these operations? For example, fixing the domain master host to virtual IP address and then writing master=true/false and domain-controller=… automatically by an automated process like heartbeat script?

    1. I think it is not critical if the domain controller goes down. The host controllers are independent from the domain controller at runtime. The domain controller is only needed for management and configuration tasks.

      However it is possible to start the HC with a backup configuration, if the DC is down and you need to restart a host controller:

      1. Start the HC with backup option (DC must be available)
      ./domain.sh … --backup

      2. Start the HC with cached backup configuration
      ./domain.sh … --cached-dc
      HC uses the backup configuration and the HC does not try to connect to the remote DC

  6. Thank you very much for the articles. They are very informative and helpful. Can one follow these instructions even when using TCP instead of UDP? Does anyone have an example of 5 servers as in this article but using TCP?

    1. That’s a point where it will get more tricky as far as I see. You could try to use a property to set different TCPPING hosts. I’m not sure about this, but I think JGroups does not evaluate property expressions. So maybe this way will not work. If it does not work, you have two other options which are rather nasty: 1. Set up another profile with different TCPPING hosts for each server-group. 2. Divide the cluster with firewall rules or tcpwrapper or something like that while using the same TCPPING initial hosts.
      Especially for the second option: Don’t forget to set the number of initial members to the correct value.

  7. Thanks for the reply. I created this scenario with 2 server-groups, each having 2 servers, both with the full-ha profile. On my stack I created a protocol for TCPPING, and the results were interesting: all 4 servers appeared in the log file as members. I’m not sure whether I can then use system properties as you have done in your example to prevent uncontrolled replication. I will try that and see what happens.

  8. Can you tell me the configuration that I need in my Apache (modcluster.conf) to run with the above JBoss setup?

  9. In “How to scale up the cluster”, you say: “create a new server-group named server-group3 which looks just like server-group2 or server-group1 from above…”.

    I wonder: will server-group3 add computation power to server-group1 (or server-group2)? If not, how can I do that?

    Thanks.

    1. No it will not add computation power to server-group one. But the computation power of the overall cluster will be increased. The idea of the sub-cluster concept is to not add computation power to server-group1 to avoid unnecessary replication-overhead. As explained above it has some advantages over the usage of infinispan distribution. Infinispan distribution would maybe work more like you want.
