Tpae 7.5 Communication Problem Error Message


Why is it happening and resolutions to some of its causes
Background
Starting in Tpae 7.5 the framework switched to using the browser's XMLHttpRequest object to send requests to
the server for such things as data validation and navigation within an application. That change made it
much easier for the client-side framework to know when there was a problem with a request. Starting with
7.5.0.1 any request sent from the browser that was not received by the Tpae server framework would generate
the following message:

Unfortunately, this message was used for more than its intended purpose and became the generic message for
all communication issues with the server. There are a number of different reasons this message can be
displayed, and in some cases the user may not have lost any data. (The only data that can be lost
are changes made in the browser that did not make it to the server.)

Why the reload button?


When this message is displayed, the client (browser) has changes that the server does not. To make sure the
state of the client and the server are the same, the framework forces the user to reload. After the reload the user
should still be in the same application with all the changes they made up until the error happened.
What can cause this error?
The problem with the "Communication error" message is that it can be displayed for multiple reasons, which can be
perceived as one recurring problem.
Here is a list of known reasons why this message is displayed:

The user cannot connect to the server.


o User lost internet/network connection
o The server is down
o A network problem
o Server refuses connection

An unexpected exception occurred on the server.


Basically the client request made it to the Application server but some error prevented it from being
handled properly by the Tpae application. Typically these are runtime errors that cannot be
anticipated by the application. An example of this is the following exception in the server logs:
com.ibm.ws.webcontainer.srt.SRTServletRequest parseParameters SRVE0133E: An error occurred
while parsing parameters
In case of this exception there was an error trying to retrieve information from the request that
prevented Tpae from handling it. Note: There is more information on this exception and how to
prevent it later in the document.

A User has multiple instances open sharing the same state.


Tpae has a UI session object that maintains the current state of the user: what app they are in, what
page they are on, what record(s) they are viewing, any changes they made, etc. The framework keeps
track of which state object the client is using by the uisessionid parameter that is sent along with each
request Tpae makes. Since all browser windows and tabs share the same server session, each
instance of Tpae opened in a browser window/tab needs a different UI session so the state of each
window can be maintained independently. Without that there is potential for data corruption or
application errors. For example, a user is in application A, which has a table. The user opens up a new
Tpae instance in another browser window and navigates to application B. If those two windows
shared the same state, the first window's state would no longer be valid because the user went to
a different application in the second window. So if the user tried to click the New Row button in
application A, an error would be returned since that table no longer exists on the server.
Unfortunately there are a few ways users can open new instances of Tpae that share the same UI
session, and they are:
o The user has a bookmarked Tpae link with the uisessionid parameter
Here is an example of such a url:
http://localhost/maximo/ui/?event=loadapp&value=meter&uisessionid=3333
For legacy and supporting product reasons Tpae will allow the identifier for the state object to be
defined in the URL. So if a user opens multiple instances of Tpae with this bookmark, the
instances will share the same state.
o The user opened multiple instances of Tpae with File > New Window or File > Duplicate Tab in
IE, or copied the URL from an existing browser tab's address bar (with the uisessionid parameter)
and pasted it into another window or tab.
One thing to understand is that you can use Tpae in multiple browser windows and/or tabs; it's how you
open those other instances that matters.

Tpae Application has timed out waiting for a client request


This is the original purpose of the communication error message. First, a little background: starting
in Tpae 7.5, events (requests) sent to the server can be asynchronous, so the user does not have to wait
for the response to continue on. An example of this is data validation. The user is in field A, they
type in a value and tab out to field B. Prior to 7.5 the user would have to wait for the server
validation of field A before modifying field B. Now the user doesn't have to wait and can enter
a value in B while A is being validated. Given this asynchronous nature, more than one request
can be sent to the server at roughly the same time. Given the nature of TCP, there is no
guarantee that the request to validate field A makes it to the server before the request to validate B.
To ensure that the requests are handled by Tpae in the same order they are sent from the browser,
each request has a sequence number that the framework uses to maintain proper order in handling
requests. Let's say the update request for field A has a sequence number of 1 and field B's request
has a sequence number of 2. Let's also say that for some unlikely (but possible) reason, B's request makes it to
the server first. The Tpae framework will be expecting a request with a sequence number of 1, so it will
park B's request and wait for A's request. Once A's request reaches the server, it is handled and
B's request is queued to be handled next. Now what happens if A's request never makes it to the
server? That's where the communication error comes in. The framework can't wait forever for
earlier sequenced events and will eventually time out and send the communication error. When this
happens, the changes made to fields A and B will be lost along with any other updates made after
modifying field A.
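The parking behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not Tpae's actual implementation: the class name RequestSequencer, the tick-based clock, and the wait_limit parameter are all hypothetical.

```python
class RequestSequencer:
    """Hypothetical model of in-order request handling (not Tpae's real code)."""

    def __init__(self, wait_limit=3):
        self.next_seq = 1             # sequence number the server expects next
        self.parked = {}              # seq -> payload for requests that arrived early
        self.wait_limit = wait_limit  # assumed: ticks to wait for a missing request
        self.waited = 0
        self.handled = []             # payloads in the order they were handled

    def receive(self, seq, payload):
        """Accept a request, then handle everything that is now in order."""
        self.parked[seq] = payload
        while self.next_seq in self.parked:
            self.handled.append(self.parked.pop(self.next_seq))
            self.next_seq += 1
            self.waited = 0

    def tick(self):
        """Advance the clock; return an error string once the wait limit passes."""
        if not self.parked:
            return None
        self.waited += 1
        if self.waited > self.wait_limit:
            dropped = sorted(self.parked)
            self.parked.clear()
            return (f"communication error: gave up waiting for seq "
                    f"{self.next_seq}, dropped {dropped}")
        return None
```

If B's request (sequence 2) arrives first it is parked, and it is only handled after A's request (sequence 1) arrives. If A's request never arrives, repeated calls to tick() eventually report the error and the parked changes are discarded.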

Http Server resending a long running request


If the application server takes too long to handle a request, the Http Server will resend the request to
the application server. The problem is that Tpae rejects the "duplicate" request
(because it already handled it), and it is the error response to the second request that is returned to the
browser. Here is an example: a user in the Service Request application updates a few fields and
saves the record. For some reason that save takes 100 seconds to complete. After waiting 60
seconds for the application server to respond, the Http Server times out and resends the save request
to the application server. Tpae returns an error (because it already handled that request) and the comm error is
displayed to the user.

Why is this error occurring in 7.5 and not in earlier versions?


As mentioned above, Tpae changed the way it communicates with the server and is able to detect problems that
earlier versions could not. In past versions, when a number of the above problems occurred, Tpae would
just hang (or freeze up). Eventually the user would give up, refresh the browser and continue on their way. I'm
assuming most users would think either the network or the server was being slow and ignore it. If it
happened more frequently, they most likely opened a "Tpae/Maximo occasionally hangs" PMR that could not be
reproduced. In the case of a user having multiple instances open sharing the same state, Tpae would hang
as well, but the user could also be committing data to the wrong record. In 7.5 this is prevented.
How to determine the cause of your Communication errors
Unfortunately there isn't an easy way in existing releases to determine the cause with 100% certainty, but there are a few logged
messages that can help. If you see the following message in the logs (SystemErr specifically):
Ignoring out of order request. UISession 23 reqPageSeqNum: 7 OutOfOrderPageSeqNum: -1
Ignoring out of order request. UISession 23 reqPageSeqNum: 7 OutOfOrderPageSeqNum: -1

The important number in that message is OutOfOrderPageSeqNum: -1. It most likely was caused by the user
having multiple tabs open sharing the same state.

Another message to look for is:


Passed the wait time for handling the request. UISession: 23 nextRequestSeqNum: 7 seqnum:6
You'll notice that the nextRequestSeqNum (the sequence number the server is expecting) is greater than the
seqnum (the sequence number sent on the request). In this case what most likely happened was that the request was
sent again by the Http Server.
Now if you see that same message but the seqnum is greater than the nextRequestSeqNum:
Passed the wait time for handling the request. UISession: 23 nextRequestSeqNum: 3 seqnum:4
this means a request sent from the browser never made it to the Tpae application on the server. This can
occur for any of the reasons specified above: connection and/or network issues, unexpected errors, etc.
Unfortunately, there isn't a lot of information in these messages and the user still doesn't know
why they received the error. That's where the changes for APAR IV31642 come in.
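As a rough aid for scanning SystemErr, the three cases above can be told apart with a small script. This is a sketch that assumes the log lines look exactly like the samples in this section; the function name classify and the returned labels are made up for illustration.

```python
import re

# Patterns based only on the sample messages shown in this document.
OUT_OF_ORDER = re.compile(
    r"Ignoring out of order request\..*OutOfOrderPageSeqNum:\s*(-?\d+)"
)
WAIT_PASSED = re.compile(
    r"Passed the wait time for handling the request\..*"
    r"nextRequestSeqNum:\s*(\d+)\s+seqnum:\s*(\d+)"
)

def classify(line):
    """Return a hypothetical label for the most likely cause of a log line."""
    m = OUT_OF_ORDER.search(line)
    if m:
        # -1 most likely means multiple tabs/windows sharing the same state.
        return "shared-state" if int(m.group(1)) == -1 else "unknown"
    m = WAIT_PASSED.search(line)
    if m:
        expected, received = int(m.group(1)), int(m.group(2))
        if expected > received:
            return "resent-by-http-server"   # server already moved past this seqnum
        if received > expected:
            return "request-never-arrived"   # an earlier request went missing
    return "unknown"
```

For example, classify applied to the sample line with nextRequestSeqNum: 7 and seqnum:6 returns "resent-by-http-server", while the sample with nextRequestSeqNum: 3 and seqnum:4 returns "request-never-arrived".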
APAR IV31642
The description for this APAR states:
Tpae/maximo 7.5 communication error occurring when having multiple instances open.
The fix for this APAR does more than just address that issue; it gives the user a better understanding of why
they are getting the Communication Problem message by showing a different message for each cause. For
instance if the user is having connection issues they will receive this message:

You'll notice the error code listed under the message. This is the error code returned by the browser and
can differ between browsers. For instance, on Firefox a code of 0 means it could not connect to the
network/internet. For IE the error code will be 12029. For IE any error code in the 12000s typically means
there was some issue with the connection to the server and/or network. A list of IE status codes can be found
here: http://msdn.microsoft.com/en-us/library/aa383770%28VS.85%29.aspx
If the browser can connect to the internet/network but for some reason the server rejected the request, the error
code will be an Http status code in the 400s. For example, an error code of 404 is a page not found error.
Seeing an error code in the 400s should be rare in Tpae, but it is possible. For more information on Http status
codes see: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
If an unexpected error occurred on the server, the user will see this message:

In most cases the error code should be 500 which means there was an internal server error. If users see this
message they should note the time the error occurred and what they were doing at the time. There should be an
error in the logs that needs to be investigated.
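The error-code ranges discussed above can be summarised as a small lookup. This is only a sketch: the function name describe_error_code and the wording of the returned strings are hypothetical, and treating the whole 500 to 599 range as a server error is an assumption based on general HTTP conventions rather than on anything Tpae-specific.

```python
def describe_error_code(code):
    """Map a browser-reported error code to the rough cause described in the text."""
    if code == 0:
        return "no connection (Firefox reports 0)"
    if 12000 <= code <= 12999:
        return "connection problem (IE 12000-range code)"
    if 400 <= code <= 499:
        return "server rejected the request"
    if 500 <= code <= 599:
        # Assumption: any 5xx is treated like the 500 case from the text.
        return "unexpected server error"
    return "unrecognized code"
```

A help desk script could call describe_error_code(12029) or describe_error_code(404) to decide whether to look at the network first or at the server logs.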
In the above cases there is little Tpae can do to prevent the message from being displayed, but in the case of a
user having multiple tabs open sharing the same state, there are a couple of changes made to mitigate the
problems caused by this.
For the scenario where a user has a bookmarked URL with the uisessionid parameter, the new property
mxe.webclient.allowURLDefinedUISessionID was added. If this property is disabled (set to 0), Tpae
will ignore the uisessionid parameter when creating a new session state object. By default this property is
enabled (even if the property doesn't exist, it's considered enabled). Typically disabling this property should
not have any negative impact, except in the case where you have some automated process that connects to Tpae,
for example automated testing. If the test scripts are written to use a URL with a specific uisessionid, disabling the
property will break them, but in a production environment you shouldn't have such problems.
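Independently of the property, a bookmarked URL can be cleaned so that it no longer carries a session identifier. The helper below is a hypothetical client-side sketch using only the Python standard library; it is not part of Tpae and the name strip_uisessionid is made up.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_uisessionid(url):
    """Return the URL with any uisessionid query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "uisessionid"   # drop only the session identifier
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Applied to the example bookmark from earlier in this document, the helper keeps the event and value parameters and drops uisessionid=3333, so each window opened from the bookmark gets its own state.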
In the case where the user opens another tab or window with File > New Window or File >
Duplicate Tab, or copies the URL to another tab/window, there isn't anything Tpae can do to prevent it, but it will
now notify the user that this occurred and will prevent the state of both windows/tabs from being corrupted.
Once the second window/tab is opened it has taken over the state from the first window. If the user goes back
to the first window, instead of getting the comm error when they use the application, they will be redirected to
the Tpae exit page and presented with this message:

The user is not logged out, and since this window's state has been taken over by the other window/tab they will not
lose any changes (except the ones they tried to make when they went back to the first window). Clicking the
Return button will send the user back into Tpae (they will not need to log in) on the Start Center, and the user
will now have two windows open with different states.
One important note about the three new messages: The messages were added to interim/hotfix patches for
7.5.0.1, 7.5.0.2, and 7.5.0.3 but in English only. Typically messages are not included in interim/hotfix patches
because they are not translated but given that the Comm Error can be shown for multiple reasons, it was
determined it was best to deliver the new messages to help diagnose the cause.

What to do about Http Server resending a long running request?


APAR IV31643 was opened for Tpae to handle the resending of the request more gracefully, and the fix does
that. Now when the repeated request is sent to Tpae, the framework will recognize it, and once the handling
of the original request is completed, its response will be sent back on the repeated request (since that's what the
Http Server will send back to the browser). Tpae will also log that this happened, which can aid in the
investigation of the long running action. Here is an example of one of those log entries:
Long running request resent. App: metergrp, Page: mainrec, User: WILSON, Total Duration: 60109. Events Handled:
Event: click, Target id: toolactions_button_1-toolbarbutton_image, value: , Duration: 0
Event: SAVE, Target id: toolactions_button_1-toolbarbutton_image, value: , Duration: 60047

In this case the log shows that the act of saving in the metergrp application took just over 60 seconds to complete. If
this is seen repeatedly, then the save action in the metergrp app should be investigated to find out why it's taking so
long.
The user will not know that this has happened. To them the action just takes a long time.
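When many of these entries accumulate, it can help to pull out the slowest event automatically. The following is a sketch that assumes the entry format matches the sample above; the function name slowest_event is hypothetical, and the durations are assumed to be in milliseconds because the text equates 60109 with "just over 60 seconds".

```python
import re

# Patterns derived only from the sample log entry in this document.
HEADER = re.compile(
    r"App:\s*(\w+),\s*Page:\s*(\w+),\s*User:\s*(\w+),\s*Total Duration:\s*(\d+)"
)
EVENT = re.compile(r"Event:\s*(\w+),\s*Target id:\s*([^,]*),.*Duration:\s*(\d+)")

def slowest_event(entry):
    """Return the app and the longest-running event of one log entry, or None."""
    header = HEADER.search(entry)
    events = [(name, int(ms)) for name, _target, ms in EVENT.findall(entry)]
    if header is None or not events:
        return None
    name, ms = max(events, key=lambda event: event[1])
    return {"app": header.group(1), "event": name, "duration_ms": ms}
```

For the sample entry above this reports the SAVE event in the metergrp application with a duration of 60047.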

There is an Http Server configuration change that can also prevent this from happening: setting the Http
Server's ServerIOTimeout to a larger number than the default. This setting is the number of seconds
the Http Server will wait for a response from the application server before resending the request. For more
information on this setting and how to change it go to:
http://www-01.ibm.com/support/docview.wss?uid=swg21318463.
Note: There is still the possibility that the fix for APAR IV31643 will not prevent an error. If the original
request takes so long to complete that the Http Server also times out the repeated request (since it is waiting
for the original request to finish), then the Http Server will error and return a 500 Http status to the
browser. In this case the user will see the unexpected error message mentioned earlier. One way to
prevent this is to increase the ServerIOTimeout plugin property. Increasing it to 120 or 180 should
help until the investigation into why the action/event is taking so long is completed.
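The timing in this note can be worked through with a little arithmetic. The sketch below is an interpretation of the text, not a description of the plug-in's internals: it assumes the fix for APAR IV31643 is installed, that the Http Server resends exactly once, and that the repeated request gets the same timeout as the original. The function name resend_outcome is hypothetical, and the default of 60 seconds is taken from the example earlier in this document.

```python
def resend_outcome(duration_s, server_io_timeout_s=60):
    """Estimate what the user sees for a request that takes duration_s seconds."""
    if duration_s <= server_io_timeout_s:
        # Finished before the Http Server gave up on the original request.
        return "normal response"
    if duration_s <= 2 * server_io_timeout_s:
        # Original timed out, but it completed while the repeat was waiting.
        return "resent once; response returned on the repeated request"
    # Assumed: the repeated request also timed out waiting for the original.
    return "repeated request also timed out; HTTP 500"
```

Under these assumptions, the 100 second save from the earlier example is resent once with a 60 second timeout, while raising ServerIOTimeout to 120 lets it complete without any resend.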

What is the SRTServletRequest parseParameters SRVE0133E exception and how to prevent it?
As mentioned above one of the causes of the communication problem is the unexpected exception:
com.ibm.ws.webcontainer.srt.SRTServletRequest parseParameters SRVE0133E: An error occurred while parsing parameters. {0}
java.net.SocketTimeoutException: Async operation timed out
at com.ibm.ws.tcp.channel.impl.AioTCPReadRequestContextImpl.processSyncReadRequest(AioTCPReadRequestContextImpl.java:189)
at com.ibm.ws.tcp.channel.impl.TCPReadRequestContextImpl.read(TCPReadRequestContextImpl.java:111)
at ........
Caused by: com.ibm.io.async.AsyncTimeoutException(Async operation timed out, [Timeout, rc=0])
at com.ibm.io.async.AbstractAsyncFuture.waitForCompletion(AbstractAsyncFuture.java:359)
at com.ibm.io.async.AsyncFuture.getByteCount(AsyncFuture.java:218)
at com.ibm.ws.tcp.channel.impl.AioSocketIOChannel.readAIOSync(AioSocketIOChannel.java:215)
at com.ibm.ws.tcp.channel.impl.AioTCPReadRequestContextImpl.processSyncReadRequest(AioTCPReadRequestContextImpl.java:182)

in the Tpae logs. It is believed this exception occurs as a result of an IE bug when it tries to resend a request
after an error occurs. Here's what's happening:
Browsers can have multiple connections open with a server, and servers typically tell the
browser to keep the connection open for reuse. This is done to save the overhead of opening connections. These
connections have an inactivity/idle timeout on both the server and the browser. I believe IE's is 60 seconds
while the Http Server's is 5 to 20 seconds (this can be changed via configuration). As a user is
using Tpae, IE can have multiple open connections to the server, and it is possible for one (or more) of those
connections to sit idle for approximately the server's idle timeout (remember, by default it is less than IE's).
When the connection "times out" on the server
and before the server can send the close connection acknowledgement to the browser, the browser uses that
connection to send a request. Instead of the expected acknowledgement from the server of receiving the
request, the server tells IE that the connection is closed, causing an error. The way IE handles such an error is
to try to send the request again on another connection. This is where the IE bug comes into play. When IE
resends the request, only the header information is sent and not the posted body (basically, half the request is
sent), so the server errors trying to read the parameters from the post body. (One thing to note: it's the parameters
in the posted body that tell Tpae what event is being sent from the browser, so without that info Tpae wouldn't
know what to do.) Others have seen similar things with IE, as evidenced by this Microsoft KB article:
http://support.microsoft.com/kb/895954
Unfortunately, enabling the IE hotfix mentioned in the article does not fix the problem, so we are left with
coming up with workarounds until an IE solution becomes available. As of now there isn't a Tpae
solution to this problem, but it is something that Tpae development is currently investigating. In the meantime
there are a couple of things clients can do to prevent this error. The first is to use Firefox instead of IE.
So far there has not been any evidence that this error occurs with Firefox. The second involves a configuration
change to the Http Server.
As mentioned above, the server can tell the browser to keep connections open for reuse. This is done with
the Connection: keep-alive response header. It can be disabled by setting the KeepAlive directive to Off
in the Http Server's httpd.conf file. For more information on the KeepAlive directive, here are a couple of links:
http://www-01.ibm.com/software/webservers/httpservers/doc/v2047/manual/ibm/en_US/9atperf.htm#keepa
http://www-01.ibm.com/software/webservers/httpservers/doc/v2047/manual/mod/core.html#keepalive
IMPORTANT: It is not recommended that every client disable the KeepAlive directive. This can impact
performance for clients with high latency. If you see the SRVE0133E exception on a rare occasion, it may not
be worth the potential performance impact to prevent the error.
