Does HTTPSession / Stateful Session Bean Replication and Clustering Actually Work in Practice?
Stateful web/EJB components are convenient to use and maintain - they look and feel almost like real objects (see the perfect anti-facade). You don't have to synchronize the state between layers - after the transaction everything is flushed transparently to the database. This happens without any expensive copying or data / DTO transformation between layers.
In a clustered environment your stateful HTTPSession/EJB is attached to a single cluster node and is not available on the others. Every request has to return to the node that holds your data, which can be achieved with session affinity. If that node fails, your conversational state is lost.
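Session affinity can be illustrated with a small router that always sends the same session ID to the same node. This is only a sketch: the class name StickyRouter, the node names and the hash-based mapping are assumptions made for illustration. Real load balancers typically pin a session with a cookie or a route marker rather than a plain hash.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical router: pins every session ID to exactly one cluster node.
class StickyRouter {

    private final List<String> nodes;

    StickyRouter(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    /** Returns the node that owns the given session; stable as long as the node list does not change. */
    String route(String sessionId) {
        // floorMod keeps the index non-negative even for negative hash codes
        int index = Math.floorMod(sessionId.hashCode(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        List<String> cluster = new ArrayList<>();
        cluster.add("node-a");
        cluster.add("node-b");
        StickyRouter router = new StickyRouter(cluster);
        // Both requests of the same session land on the same node
        System.out.println(router.route("session-1").equals(router.route("session-1")));
    }
}
```

The sketch also shows the weakness: if the chosen node disappears, the router can pick another one, but the state that lived on the failed node is gone.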
State replication and clustering seem to be the solution to the problem, but they actually are not. The problem: most application servers use "in-memory replication", which is asynchronous. The state is replicated after the transaction and not within the transaction. It means: if your server fails just after the successful transaction and before the replication, your session state gets lost. Replication increases availability, but the remaining window of loss is not acceptable for many applications.
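The window between the transaction and the replication can be made visible with a toy model of two nodes. This is a minimal sketch, not the behavior of any particular application server: AsyncReplicationSketch, its methods and the queue of pending messages are hypothetical names, and the "transaction" is reduced to a map update.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of a primary and a backup node with asynchronous session replication.
class AsyncReplicationSketch {

    private final Map<String, String> primary = new HashMap<>();
    private final Map<String, String> backup = new HashMap<>();
    // Replication messages that were scheduled but not yet sent to the backup
    private final Queue<String[]> pending = new ArrayDeque<>();

    /** The "transaction": state changes on the primary, replication is only scheduled. */
    void commit(String sessionId, String state) {
        primary.put(sessionId, state);
        pending.add(new String[] {sessionId, state});
    }

    /** Runs some time later, outside the transaction. */
    void replicate() {
        String[] message;
        while ((message = pending.poll()) != null) {
            backup.put(message[0], message[1]);
        }
    }

    /** Primary crashes: its memory and unsent messages are gone, the backup answers. */
    String failOverAndRead(String sessionId) {
        primary.clear();
        pending.clear();
        return backup.get(sessionId);
    }

    public static void main(String[] args) {
        AsyncReplicationSketch cluster = new AsyncReplicationSketch();
        cluster.commit("s1", "cart=[book]");
        cluster.replicate();
        cluster.commit("s1", "cart=[book, pen]"); // committed, but not yet replicated
        System.out.println(cluster.failOverAndRead("s1")); // prints "cart=[book]"
    }
}
```

The second commit succeeded from the client's point of view, yet after the failover only the older state is visible.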
You could also replicate the state synchronously, or write it synchronously to a database, to overcome this problem. However, this not only won't scale, it would also significantly increase the probability of deadlocks.
On the other hand: the state of an HTTPSession or a Stateful Session Bean is neither transactional nor persistent by definition, so you should not rely on its high availability anyway. It should only be considered a "conversational" cache which gets persisted during the next transaction. It means: in case of a server failure you could lose the contents of your shopping cart, but not your order. This should be acceptable. If not, you should store the state in the database, and not in the session.
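The difference between the shopping cart and the order can be sketched as follows. All names here (ShopSketch, addToCart, placeOrder, crash) are hypothetical, and two in-memory collections stand in for the session and for the database; a real application would write the order within a database transaction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shop: the cart is conversational state, the order is durable state.
class ShopSketch {

    // Stand-in for HttpSession / stateful bean state: lost when the node fails
    private final Map<String, List<String>> sessions = new HashMap<>();
    // Stand-in for a transactional database table: survives a node failure
    private final List<List<String>> orderTable = new ArrayList<>();

    void addToCart(String sessionId, String item) {
        sessions.computeIfAbsent(sessionId, id -> new ArrayList<>()).add(item);
    }

    /** Checkout: moves the cart content into the durable store and returns the order number. */
    int placeOrder(String sessionId) {
        List<String> cart = sessions.remove(sessionId);
        if (cart == null) {
            throw new IllegalStateException("no cart for session " + sessionId);
        }
        orderTable.add(cart);
        return orderTable.size();
    }

    /** Node failure: all conversational state is gone, persisted orders remain. */
    void crash() {
        sessions.clear();
    }

    List<String> cart(String sessionId) {
        return sessions.getOrDefault(sessionId, Collections.<String>emptyList());
    }

    int orderCount() {
        return orderTable.size();
    }
}
```

After a crash the customer has to fill the cart again, but every order that was placed before the failure is still there.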
[See Chapter 1 (the basics), page 26 in "Real World Java EE Patterns Rethinking Best Practices" for state / transactions discussion]
"[..] or write it synchronously to a database to overcome this problem, but this not only won't scale"
You can think about using some kind of in-memory-datagrids for this, too.
Don't want to do any advertising here .. but there are quite some cool products out there (e.g. Coherence)
Posted by Markus on September 29, 2009 at 12:00 PM CEST #
you are right - terracotta.org would work as well. Such grids, however, are not part of Java EE and have to be installed separately.
Posted by Adam Bien on September 29, 2009 at 12:35 PM CEST #
From my point of view, for most applications it is not worth the effort to replicate session state. For which kind of apps is it really necessary to be able to handle it ? (o.k. maybe for really big companies like Amazon...)
Although some people argue to use big, fat sessions (e.g. Extended Persistence Context in Stateful Session Beans) I like the simple and (almost) stateless server architecture.
An interesting approach I use is to hold (again) (almost) the whole session state on the client (e.g. with a GWT client). Of course this type of application is not built within a few hours (when you have a real backend you have to communicate with).
Posted by Martin on September 29, 2009 at 04:21 PM CEST #
The Sun Java System Application Server uses a special kind of database (the Clustra DB) to store session information, and it reaches 99.99% availability.
It's also highly scalable, as session information is *not* replicated through the wire, but goes directly to a highly available clustra database.
I'll miss Sun software design. Let's hope Oracle keeps the good stuff in there!
Posted by Antonio on September 29, 2009 at 06:39 PM CEST #
I guess you mean the HADB. It is scalable but not very fast :-).
But you agree, that session state replication shouldn't be necessary, right?
Posted by Adam Bien on September 29, 2009 at 08:14 PM CEST #
At least for HttpSession WebSphere has a tuning that can synchronize the session before the end of the service method. Of course this is a performance hit but that is the trade-off.
Posted by Stuart Smith on September 30, 2009 at 01:50 AM CEST #
you are right - this is a good compromise. It is, however, still not comparable with transactional storage, because the replication could take some time and meanwhile other clients could access the other nodes as well (unless session affinity is activated).
Actually the server should lock all nodes, then replicate and eventually release the lock...
thanks for your feedback!
Posted by Adam Bien on September 30, 2009 at 11:25 AM CEST #
Posted by David Mann on November 16, 2013 at 01:06 AM CET #