Are you using JBoss Messaging in a firewalled environment? Do your long running JMS consumers fail to receive JMS notifications after some time of inactivity? If you answer yes to any of these questions you might be interested in our field report where we describe how we diagnosed and fixed such issues with JBoss Messaging and long running JMS consumers within a Java Swing Client.
Our customer's application is an EJB-based server application that runs on a Java EE application server. The clients are Swing-based and use RMI to interact with the server. Due to the nature of the application, it is common that the data displayed in one client is concurrently modified by another client. Therefore the application uses JMS messages to notify all interested parties about data changes. The clients are then able to reload the changed data and update their view.
This approach worked well until we migrated the application from another Java EE application server to the JBoss Enterprise Application Platform. Users kept reporting that after some time of inactivity the update feature in the clients did not work anymore. So we began to investigate why this was happening.
We suspected JBoss Messaging to be guilty. We could confirm that by examining the message count of the specific queue.
Examining the Message Count of a JMS queue or topic
Point your browser to the URL of the jmx-console of your JBoss installation (e.g. http://localhost:8080/jmx-console). Then find the MBean that represents your JMS queue or topic. If you followed the examples provided by JBoss, it is jboss.messaging.destination:name=MyName,service=Queue. Look at the MessageCount attribute of that MBean. The message count of a given JMS queue should always go down to zero, especially if the system is idle. A non-zero message count might indicate that the messages are no longer consumed by the JMS clients.
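The same check can also be done programmatically through JMX. The following is only a minimal sketch: it reads the MessageCount attribute via the standard javax.management API, but to stay self-contained it registers a hypothetical stand-in MBean (QueueView and StubQueue are invented for this illustration) in the local platform MBean server. Against a real installation you would pass an MBeanServerConnection to the JBoss server instead; how to obtain one depends on the JBoss version and is not shown here.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class MessageCountCheck {

    /** Hypothetical management interface; it only mimics the MessageCount attribute. */
    public interface QueueView {
        int getMessageCount();
    }

    /** Hypothetical stand-in for the real queue MBean. */
    public static class StubQueue implements QueueView {
        private final int count;

        public StubQueue(int count) {
            this.count = count;
        }

        @Override
        public int getMessageCount() {
            return count;
        }
    }

    /** Reads the MessageCount attribute of the MBean with the given object name. */
    public static int messageCount(MBeanServerConnection connection, String objectName)
            throws Exception {
        Object value = connection.getAttribute(new ObjectName(objectName), "MessageCount");
        return ((Number) value).intValue();
    }

    public static void main(String[] args) throws Exception {
        String name = "jboss.messaging.destination:name=MyName,service=Queue";

        // Stand-in setup so the sketch runs without a JBoss server.
        MBeanServer local = ManagementFactory.getPlatformMBeanServer();
        local.registerMBean(new StandardMBean(new StubQueue(3), QueueView.class),
                new ObjectName(name));

        int count = messageCount(local, name);
        System.out.println("MessageCount = " + count);
        if (count != 0) {
            System.out.println("Messages are pending; consumers may have stopped receiving.");
        }
    }
}
```

Polling this value while the system is idle gives the same information as the jmx-console page: a count that stays above zero suggests that consumers no longer receive their messages.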
As soon as the client was broken it seemed that it was unable to receive JMS notifications anymore. We also added some logging statements to the client side code that is executed after a JMS message has been received to confirm that. Besides that the client still worked fine and was able to access the server. A manual refresh of the view also displayed the correct data.
With this initial observation we tried to reproduce the issue in our test and development environment. Unfortunately we were not able to reproduce it in our development environment. Luckily the test environment exhibited exactly the same behavior. Looking at our network diagram we soon had an idea what could cause the difference: A firewall that is located “in front” of the test server is the major difference between our development and the test environment as you can see comparing image 1 and 2. So we kept investigating in that direction.
To further isolate the problem we followed the divide and conquer approach. We put our JBoss stack aside for a while and wrote a simple Java based client/server application to see if we could find any network related issues. On startup the client connects to the server and sends a text to the server. The server simply returns the text to the client. After successfully receiving the reply the client waits some time and then sends the next message. The connection that was established initially is kept open the whole time. We increased the wait interval between two client/server interactions exponentially. This was done to find out if the firewall closes any of our connections after some time of inactivity.
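A probe of this kind could look roughly like the sketch below. It is not the code we used, only an illustration of the idea: class and method names are made up, the idle interval simply doubles after each successful round trip, and the 30 second read timeout is an assumption added so that the client reports a dead connection instead of blocking forever.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class IdleEchoProbe {

    public static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    public static PrintWriter writer(Socket s) throws IOException {
        // autoflush, so every println() is sent immediately
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    /** Accepts one client and echoes every received line until the connection ends. */
    public static void serveOnce(ServerSocket server) {
        try (Socket s = server.accept();
             BufferedReader in = reader(s);
             PrintWriter out = writer(s)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);
            }
        } catch (IOException e) {
            System.err.println("server: connection ended: " + e);
        }
    }

    /** Runs serveOnce() on a daemon thread; port 0 picks any free port. */
    public static ServerSocket startEchoServer(int port) throws IOException {
        ServerSocket server = new ServerSocket(port);
        Thread t = new Thread(() -> serveOnce(server), "echo-server");
        t.setDaemon(true);
        t.start();
        return server;
    }

    /** Sends one line over the already open connection and returns the echoed reply. */
    public static String roundTrip(BufferedReader in, PrintWriter out, String text) throws IOException {
        out.println(text);
        if (out.checkError()) {
            throw new IOException("write failed");
        }
        String reply = in.readLine();
        if (reply == null) {
            throw new IOException("connection closed by peer");
        }
        return reply;
    }

    /** Usage: "server <port>" on the server host, "<host> <port>" on the client host. */
    public static void main(String[] args) throws Exception {
        if (args.length == 2 && args[0].equals("server")) {
            try (ServerSocket server = new ServerSocket(Integer.parseInt(args[1]))) {
                serveOnce(server);
            }
            return;
        }
        try (Socket s = new Socket(args[0], Integer.parseInt(args[1]))) {
            s.setSoTimeout(30_000); // assumption: fail a read after 30 s instead of hanging
            BufferedReader in = reader(s);
            PrintWriter out = writer(s);
            roundTrip(in, out, "initial message");
            // Keep the one connection open and double the idle time after every success.
            for (long idleSeconds = 1; ; idleSeconds *= 2) {
                Thread.sleep(idleSeconds * 1000);
                try {
                    roundTrip(in, out, "echo after " + idleSeconds + "s idle");
                    System.out.println("still alive after " + idleSeconds + "s idle");
                } catch (IOException e) {
                    System.out.println("connection unusable after " + idleSeconds + "s idle: " + e);
                    return;
                }
            }
        }
    }
}
```

Running the server part behind the firewall and the client part in front of it shows at which idle interval the connection stops working.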
As suspected we could see that after approximately one hour the echo client/server did not work anymore. Obviously the firewall dropped the connection. Interestingly both client and server were not aware that the connection was gone. Could that happen to JMS too?
A look at the usual resources (Google, JBoss wiki, JBoss forum) did not offer obvious solutions, so we kept on digging. Next we used Wireshark to have a look at what was going on at the network level.
Wireshark is a free packet analyzer computer application. It is used for network troubleshooting, analysis, software and communications protocol development, and education. Originally named Ethereal, in May 2006 the project was renamed Wireshark due to trademark issues. (cited from Wikipedia).
Wireshark turned out to be a really helpful tool. Initially we needed some time to figure out how to use it properly. In the long run that time was well invested, because Wireshark gave us a much better understanding of what was going on at the network level.
We could see that all connections were opened by the client. This makes sense in a firewall scenario as firewalls usually forbid opening connections from the server to the outside world. However it was unclear to us what would happen if a client lost its connection to the server. If in that case the server wanted to push some JMS messages to the client it needed a way to open a new connection.
The JBoss Messaging documentation says “JBoss Messaging uses JBoss Remoting for all client to server communication. For full details of what JBoss Remoting is capable of and how it is configured please consult the JBoss Remoting documentation.”. Therefore we had a look at the JBoss Remoting documentation.
In this documentation you can find detailed information about the transports that are used within JBoss Messaging.
JBoss Messaging uses the so called bisocket transport:
The bisocket transport, like the multiplex transport, is a bidirectional transport that can function in the presence of restrictions that would prevent a unidirectional transport like socket or http from creating a server to client push callback connection. … For example, security restrictions could prevent the application from opening a ServerSocket on the client, or firewall restrictions could prevent the server from contacting a ServerSocket even if it were possible to create one.
The bisocket transport uses a control connection that is established between client and server. As soon as the server needs to push some JMS messages to the client, it sends a request to open a new data connection via the control connection. The client that receives this request creates a new connection to the server. The server side waits for this new incoming client connection and associates it with the JMS session. This ensures that the server can establish new data connections to the client without actually acting as a TCP client (See here for a detailed description and visualization of the mechanism).
Now it was obvious to us that dropping this control connection could render the JMS system useless. Luckily the JBoss Remoting documentation described a feature that seemed quite useful to us. The JBoss Remoting bisocket transport can be configured using the properties PING_FREQUENCY and PING_WINDOW_FACTOR.
Setting ping frequency to a milliseconds value causes the server side to send a ping packet once every interval. If the client side does not receive a ping packet within PING_FREQUENCY * PING_WINDOW_FACTOR it assumes the control connection to be broken and opens a new control connection. These properties looked really helpful so we decided to experiment with them. To change the values you need to edit the file JBOSS_HOME/server/default/deploy/jboss-messaging.sar/remoting-bisocket-service.xml and search for the MBean jboss.messaging:service=Connector,transport=bisocket.
We were puzzled to see the configuration JBoss Messaging uses:
<attribute name="pingFrequency" isParam="true">214748364</attribute>
<attribute name="pingWindowFactor" isParam="true">10</attribute>
These settings effectively render the ping feature useless. Ignoring the warning text in the file, we set the parameters to more appropriate values:
...
<!-- There should be no reason to change these parameters - warning!
     Changing them may stop JBoss Messaging working correctly -->
...
<attribute name="pingFrequency" isParam="true">1200000</attribute>
<attribute name="pingWindowFactor" isParam="true">2</attribute>
As our main focus was to create some traffic on the connection, we chose these rather large values (a ping every 20 minutes). If the control connection were lost, it would take about 40 minutes for the client to notice and re-establish it. As we suspected the firewall to have an inactivity timeout of about 60 minutes, this seemed fine to us.
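To make the numbers concrete, the detection window (PING_FREQUENCY * PING_WINDOW_FACTOR) can be worked out for both configurations. The small class below is only an illustration of that arithmetic; its names are made up and it is not part of JBoss Remoting.

```java
import java.time.Duration;

public class PingWindow {

    /** Time after which the client assumes the control connection to be broken. */
    public static Duration window(long pingFrequencyMillis, int pingWindowFactor) {
        return Duration.ofMillis(Math.multiplyExact(pingFrequencyMillis, (long) pingWindowFactor));
    }

    public static void main(String[] args) {
        Duration shipped = window(214_748_364L, 10); // values shipped with JBoss Messaging
        Duration ours = window(1_200_000L, 2);       // values we configured

        System.out.println("shipped: " + shipped.toMillis() + " ms, about " + shipped.toDays() + " days");
        System.out.println("ours:    " + ours.toMillis() + " ms, " + ours.toMinutes() + " minutes");
        // prints:
        // shipped: 2147483640 ms, about 24 days
        // ours:    2400000 ms, 40 minutes
    }
}
```

With the shipped values the window is 2147483640 ms, just below the largest 32-bit signed integer (2147483647). That may well be why these exact values were chosen, although this is only our guess. In any case a window of more than three weeks is of no practical use for detecting a connection that a firewall drops after an hour.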
Restarting the server with the changed configuration exposed a new issue. The JMS subsystem didn’t start up properly. Instead we could see the following message in the log file and on the console:
ERROR [ConnectionFactory] Parameter pingFrequency has a different value ( 1200000) to the default shipped with this version of JBM
(214748364). There is rarely a valid reason to change this parameter value. If you are using ServiceBindingManager to supply the remoting
configuration you should check that the parameter value specified there exactly matches the value in the configuration supplied with JBM.
This connection factory will now not deploy. To override these checks set 'disableRemotingChecks' to true on the connection factory.
Only do this if you are absolutely sure you know the consequences.
So we also had to set the attribute DisableRemotingChecks to true on the connection factory in the file JBOSS_HOME/server/default/deploy/jboss-messaging.sar/connection-factories-service.xml.
With this last change we were able to start the server again. Using Wireshark we could see the ping packets going from server to client. We left our client open overnight and were very happy to see that the connection was still up the next day and, better still, that JMS was still working. You can imagine our disappointment when JMS became dysfunctional again about one hour later.
Some more investigation revealed that we had solved only half of the puzzle. The control connection no longer caused any trouble, but the data connection suffered from the same timeout issue. Pushing JMS messages from the server to the client caused the creation of a data connection. If the system was then left idle for more than one hour, the firewall dropped the data connection as well. Evidently the ping we had enabled was only effective for the control connection, not for the data connection. So we dug further into the JBoss Remoting source code and documentation. We found another property, timeout, which can be used to set the socket timeout (SO_TIMEOUT) on the client side. With it, blocking methods like read() time out after the specified interval and throw a java.net.SocketTimeoutException.
The socket remains valid after that particular exception. Consequently, the JBoss Remoting code simply ignores the exception and continues to read from the socket. Luckily for us, you can change that behavior by setting the property continueAfterTimeout=false, which causes the connection to be closed when such a timeout occurs.
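The underlying socket behavior can be reproduced with plain java.net sockets. The sketch below is not JBoss Remoting code; the helper readWithTimeout and its continueAfterTimeout parameter are hypothetical names that merely mirror the two options described above: after a read timeout the socket is either kept open and read again, or closed.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {

    /** Marker return value for a read that timed out. */
    public static final int TIMED_OUT = -2;

    /**
     * Reads one byte with the given SO_TIMEOUT. Returns the byte, -1 at end of stream,
     * or TIMED_OUT. After a timeout the socket is closed unless continueAfterTimeout is true.
     */
    public static int readWithTimeout(Socket socket, int timeoutMillis, boolean continueAfterTimeout)
            throws IOException {
        socket.setSoTimeout(timeoutMillis);
        try {
            return socket.getInputStream().read();
        } catch (SocketTimeoutException e) {
            // The socket itself is still valid at this point.
            if (!continueAfterTimeout) {
                socket.close();
            }
            return TIMED_OUT;
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
             Socket peer = server.accept()) {

            // Nothing has been sent yet, so the read times out; the socket stays usable.
            System.out.println(readWithTimeout(client, 200, true) == TIMED_OUT); // true
            System.out.println(client.isClosed());                               // false

            // Data sent afterwards is still received on the same socket.
            peer.getOutputStream().write(42);
            System.out.println(readWithTimeout(client, 200, true));              // 42

            // With continueAfterTimeout == false the idle connection is closed instead.
            readWithTimeout(client, 200, false);
            System.out.println(client.isClosed());                               // true
        }
    }
}
```

Closing an idle connection on purpose sounds counterproductive, but it means the connection is torn down by the application in a controlled way before the firewall can drop it silently.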
As we had confirmed earlier that the control connection was stable, we were confident that this change might be the missing piece. So we added the following configuration to the bisocket configuration (JBOSS_HOME/server/default/deploy/jboss-messaging.sar/remoting-bisocket-service.xml):
<attribute name="timeout" isParam="true">1800000</attribute>
<attribute name="continueAfterTimeout" isParam="true">false</attribute>
Again we used Wireshark to verify that the data connections were closed after the timeout (30 minutes) and that JMS remained functional beyond the timeout. In this case JMS was able to open another data connection via the control connection mechanism explained before.
As everything looked good we were set for another long running test. Finally we were able to use our client, leave it idle overnight and again use it the next day.
The most important point is that we fixed the issue. The fix is currently under testing and will soon go into production. We think that the remoting configuration that JBoss Messaging uses by default is quite firewall-unfriendly. The fact that JBoss Messaging actively enforces these default settings (see disableRemotingChecks above) discourages any experiments with them. This is a pity, as JBoss Remoting turns out to be a powerful library that contains all the options necessary to fix our problem.
Looking at the JBoss forum, it seems that other people have had similar issues. However, there seems to be no official JBoss-approved solution. We would like to see more appropriate configuration defaults and/or some official information on how to tweak the settings for firewalled environments. Until then we hope that this field report may help other users who face the same problem.
Wireshark was a really helpful tool during the process. Java EE usually relieves us from the burden to deal with all the low level details. Sometimes you need to get your feet wet. It is good to have a powerful tool like Wireshark that protects you from getting thoroughly soaked.
In this report we explained how we investigated a problem where long running JMS consumers in a Java Swing client lost their connection to the JMS server. The connection loss was caused by a firewall that silently closed connections after one hour of inactivity. We found some JBoss Messaging/JBoss Remoting configuration changes that allowed us to avoid the problem. We made the following changes to our installation.

In remoting-bisocket-service.xml we replaced

<attribute name="pingFrequency" isParam="true">214748364</attribute>
<attribute name="pingWindowFactor" isParam="true">10</attribute>

with

<attribute name="timeout" isParam="true">1800000</attribute>
<attribute name="continueAfterTimeout" isParam="true">false</attribute>
<attribute name="pingFrequency" isParam="true">1200000</attribute>
<attribute name="pingWindowFactor" isParam="true">2</attribute>

In connection-factories-service.xml we changed

<mbean code="org.jboss.jms.server.connectionfactory.ConnectionFactory"
       name="jboss.messaging.connectionfactory:service=ConnectionFactory"
       xmbean-dd="xmdesc/ConnectionFactory-xmbean.xml">
   <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
   <depends optional-attribute-name="Connector">jboss.messaging:service=Connector,transport=bisocket</depends>
   ...
</mbean>

to

<mbean code="org.jboss.jms.server.connectionfactory.ConnectionFactory"
       name="jboss.messaging.connectionfactory:service=ConnectionFactory"
       xmbean-dd="xmdesc/ConnectionFactory-xmbean.xml">
   <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
   <depends optional-attribute-name="Connector">jboss.messaging:service=Connector,transport=bisocket</depends>
   ...
   <attribute name="DisableRemotingChecks">true</attribute>
</mbean>