Clustered Session Load Balancing for the Triad


ferg
12-09-2010, 05:17 PM
As part of the Caucho quarterly meeting, the question of triad load balancing came up in the context of reviewing the distributed session model, and the explanation is an interesting one.

In Resin 4.0, all session data is replicated to all three triad servers, for both reliability and load balancing.

For reliability, the Resin triad has 3 servers instead of 2 because of server maintenance. When you take down a server for scheduled maintenance, it's important to keep 2 servers available to back each other up during that maintenance. If one of the 2 remaining servers crashes or freezes, the 3rd server can temporarily take the entire load until the cluster is restored. This avoids ever having a single point of failure, even during maintenance.

For session/cache load balancing, each session is owned by a specific triad server, chosen by a hash of the session id. Since one third of the sessions are owned by server A, one third by server B, and one third by server C, the load is evenly distributed across the triad. This triad load balancing lets the cluster pod scale up to its full size of 64 servers. (Sites with more servers use multiple cluster pods.)
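
To make the ownership rule above concrete, here is a minimal sketch of picking a triad server from a hash of the session id. The class name, the CRC32 hash, and the modulo step are assumptions for illustration only; the thread does not say which hash function Resin actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative only: CRC32 stands in for whatever hash Resin really uses.
final class TriadOwnerSketch {
    static final String[] TRIAD = {"A", "B", "C"};

    // Map a session id to an index 0..2, i.e. one of the three triad servers.
    static int ownerIndex(String sessionId) {
        CRC32 crc = new CRC32();
        crc.update(sessionId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % TRIAD.length);
    }

    public static void main(String[] args) {
        // Count how many made-up session ids land on each triad server.
        int[] counts = new int[TRIAD.length];
        for (int i = 0; i < 30_000; i++) {
            counts[ownerIndex("session-" + i)]++;
        }
        for (int i = 0; i < TRIAD.length; i++) {
            System.out.println(TRIAD[i] + ": " + counts[i]);
        }
    }
}
```

Because every server computes the same function on the same id, no lookup table or coordination is needed to agree on the owner.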

rein
12-13-2010, 03:24 PM
Based on the documentation at http://caucho.com/resin-4.0/admin/clustering-overview.xtp#Clustered Sessions and the information you mentioned above, I have the following doubts.


Consider we have 6 instances (nodes) of Resin, and nodes 1, 3, 5 are part of the triad. We also have a hardware load balancer (no sticky sessions).

# When a request comes to node 4 how does it retrieve a session?

# Where does the session id hashing come into the picture? At an individual node level or at the triad?

# Do the nodes that are not part of the triad have any local store of their own?

# Where does the triad store the session data? Is it configurable?

# What happens when one of the nodes in the triad goes down? How do the other nodes which are not in the triad know about this?

Thanks

ferg
12-13-2010, 04:15 PM
The triad is always the first three servers: 1, 2, 3.

When a request comes to server 4 and server 4 does not have the current session information, it will ask the owning triad for the session data.

With sticky sessions, server 4 will generally have a cached version of the session, and be more efficient since it wouldn't need to contact the triad.

The session id hashing is used in both places. Sticky sessions are still an advantage for caching efficiency. The hash of the session id is used to determine the owning triad server, which is primarily important for locking but also for cache efficiency.

The non-triad nodes have a local cache. The true backing is the triad, but for efficiency the non-triad node will have a copy of the session data so it can avoid a network request if the data is in the cache.

The session data is in resin-data/<server>. The resin-data location can be configured, but not the structure underneath it.

The heartbeat service continually checks which servers are up with a dedicated TCP connection. If a triad server goes down, all other servers will see the TCP connection close within a second for normal exits. If the machine is unplugged, the heartbeat 60s later will detect the crashed server.
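
As a rough sketch of the two detection paths described above (an immediate close for normal exits, a timeout for an unplugged machine), each server could track its peers along these lines. The class and method names are hypothetical, and the single 60-second constant is just the figure quoted in this post; Resin's actual heartbeat implementation is not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical peer tracker, not Resin's heartbeat code.
final class HeartbeatSketch {
    static final long TIMEOUT_MILLIS = 60_000; // the 60s figure from the post

    // Last time a heartbeat was received from each peer server.
    private final Map<String, Long> lastSeen = new HashMap<>();

    // Called whenever a heartbeat arrives on the peer's dedicated connection.
    void onHeartbeat(String server, long nowMillis) {
        lastSeen.put(server, nowMillis);
    }

    // Called when the TCP connection closes (normal exit): mark the peer down at once.
    void onConnectionClosed(String server) {
        lastSeen.remove(server);
    }

    // A peer counts as up only if it has been heard from within the timeout.
    boolean isUp(String server, long nowMillis) {
        Long seen = lastSeen.get(server);
        return seen != null && nowMillis - seen < TIMEOUT_MILLIS;
    }
}
```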

rein
12-13-2010, 04:42 PM
thanks for the clarifications....

Have some more doubts

* In the case of a new session, how does it get distributed across the triad?
** For example, when a request comes to node 4 (non-triad) and a session does not exist.

* If sticky sessions are disabled, will the local cache of non-triad nodes ever be used, or does it always get the session information from its triad owner?

* In the case of a triad server going down: are the sessions that the dead triad server owned equally distributed between the two remaining triad servers? If so, how?



Thanks

ferg
12-13-2010, 11:19 PM
A new session is assigned randomly to one of the triad servers, using a hash of the session id. So each triad server gets 1/3 of the session load.

With non-sticky sessions, the session will be loaded from the triad on each request. (Unless you happen to get lucky and end up on the same server.) It's always better to enable sticky sessions when possible. Even IP-based sticky sessions are better than nothing.

All three triad servers get copies of all the sessions. So even if two triad servers go down, the sessions will still be backed up.

rein
12-14-2010, 11:23 AM
After going through the doc http://www.caucho.com/articles/resin-cloud.pdf, in the "Implementing Resin 4.0's distributed cache" section, it's mentioned:

"When the mnode owner server responds to the requesting server, it includes both the m-node and the associated data" (in the context of the application trying to obtain a value from the cache)

Why is the actual data sent back along with the m-node information? I don't think it would be necessary, because there could be a case where the local store of the requesting non-triad server already has the right data and no update is required.
Skipping the data in that case would reduce network traffic.

ferg
12-14-2010, 04:23 PM
The data is only sent back on a cache miss (when the client's cache is out of date).

On a cache hit, the triad sends back the up-to-date mnode, but doesn't send the data because the client already has the data.

rein
12-14-2010, 05:35 PM
Getting a bit confused here ....

How does the triad owner determine whether the data on the non-triad server is invalid?
According to the document, "The server contacts the owner triad server to request the m-node" during a cache read. Does it send a hash of the data it has in its local store when making this request, so then the triad owner will check if the hash differs from its own copy and decide whether to send back the new data or not?

Also, it's mentioned that the m-nodes are replicated on the triad servers only. If this is the case, when the first request comes to a server for an existing session already created by another server, how does this server know about the triad owner for the session? I'm assuming this server will not have the appropriate mnode and hence no key to figure out the triad owner.

ferg
12-14-2010, 06:10 PM
If the non-triad server (server 4) has a local cache, that means it has an mnode, which may or may not still be valid. The mnode is essentially <key,value> where both key and value are 64-byte hashes, so they are fairly small.

The non-triad server sends a "get" call with its current mnode. If the mnode value is still correct, the triad will return an "ok" message.

If the mnode value is no longer correct (the value hash does not match the triad's value hash), the triad will return the new mnode and the actual value.

So yes, it does: "Does it send a hash of the data it has in its local store when making this request, so then the triad owner will check if the hash differs from its own copy and decide whether to send back the new data or not?"
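
The "get" exchange described above can be sketched from the triad owner's side roughly as follows. This is an illustration under assumptions, not Resin's code: the class, the `Reply` type, the plain string key, and SHA-256 as the value hash are all made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical triad-owner side of a "get"; names and hash choice are assumptions.
final class MnodeGetSketch {
    // Reply to a "get": ok means the caller's value hash was still current,
    // so no data is sent back.
    static final class Reply {
        final boolean ok;
        final byte[] valueHash;
        final byte[] data;

        Reply(boolean ok, byte[] valueHash, byte[] data) {
            this.ok = ok;
            this.valueHash = valueHash;
            this.data = data;
        }
    }

    // Value hash of the stored data (SHA-256 here is only a stand-in).
    static byte[] hash(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    private final Map<String, byte[]> store = new HashMap<>();

    void put(String key, byte[] data) {
        store.put(key, data);
    }

    // callerValueHash is the value hash from the caller's cached mnode (null = no data).
    Reply get(String key, byte[] callerValueHash) {
        byte[] data = store.get(key);
        byte[] current = (data == null) ? null : hash(data);
        if (Arrays.equals(current, callerValueHash)) {
            return new Reply(true, current, null);   // cache hit: "ok", no data sent
        }
        return new Reply(false, current, data);      // cache miss: new mnode plus data
    }

    public static void main(String[] args) {
        MnodeGetSketch triad = new MnodeGetSketch();
        triad.put("session-1", "cart=3".getBytes(StandardCharsets.UTF_8));
        Reply first = triad.get("session-1", null);              // server 4 has nothing cached
        Reply second = triad.get("session-1", first.valueHash);  // server 4 is now up to date
        System.out.println("first ok=" + first.ok + ", second ok=" + second.ok);
    }
}
```

Only the small hash crosses the network on a hit, which is why the data itself is sent only when the caller's copy is out of date.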

The triad owner is based on a hash of the session id. So it's always the same owner even if there's no data on the triad yet. It's the owner based on the hash. Since all servers will always calculate that hash the same way, everyone will agree on which triad server owns the session.

rein
12-14-2010, 06:46 PM
I still don't get the part where a request comes to a server that does not have an mnode. If the triad owner can be figured out just by a hash of the session id, then why does the mnode have a key as one of its fields? Also, in the case of a triad failure where the server becomes unavailable, does this hashing technique change so that the non-triad servers, when calculating the triad owner, contact the backup triad server instead?

ferg
12-14-2010, 07:52 PM
In a sense, a server always has an mnode, even if the mnode has a null value.

When a session request comes to a server without a cached mnode, the server will use the mnode <key,0> where "0" means no data. Using that technique avoids creating some special cases, and therefore simplifies the implementation. It's like using a NULL object as the tail of a linked list instead of a true null. In some cases, like this one, it's better for every request to carry a true object rather than a special-case null.
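
A small sketch of that "no data" mnode idea, assuming a local cache keyed by session id. The names are hypothetical, and a `long` stands in for the 64-byte value hash just to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical local cache on a non-triad server; not Resin's code.
final class NullMnodeSketch {
    static final long NO_DATA = 0L; // the "0" in <key,0>

    static final class Mnode {
        final String key;
        final long valueHash;

        Mnode(String key, long valueHash) {
            this.key = key;
            this.valueHash = valueHash;
        }
    }

    private final Map<String, Mnode> localCache = new HashMap<>();

    // Store the mnode returned by the triad owner.
    void remember(Mnode mnode) {
        localCache.put(mnode.key, mnode);
    }

    // Always returns a real Mnode, so the request path needs no null check.
    Mnode mnodeFor(String key) {
        Mnode cached = localCache.get(key);
        return cached != null ? cached : new Mnode(key, NO_DATA);
    }
}
```

The caller sends whatever `mnodeFor` returns in its "get" request; a first-time request and a stale-cache request then go through exactly the same code.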

The hash calculates the primary, secondary, and tertiary servers. So it actually selects one of:

ABC, ACB, BAC, BCA, CAB, CBA

So a sessionid "baaXnMp" might hash to ACB, meaning the mnode is owned by triad server A and the backup is triad server C.

The primary and secondary servers are all evenly distributed among the triad. That's important because on a triad server crash, we want the load evenly distributed among the remaining two servers.
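
Here is a minimal sketch of how a hash could select one of the six orderings and fall through to the backup when the primary is down. `String.hashCode` is only a stand-in for Resin's real hash, and the class and method names are made up, so the ordering this produces for any given session id is not what Resin would pick.

```java
// Illustrative only: the hash and all names here are assumptions.
final class TriadOrderSketch {
    // The six possible primary/secondary/tertiary orderings.
    static final String[] ORDERS = {"ABC", "ACB", "BAC", "BCA", "CAB", "CBA"};

    // Every server computes the same ordering for the same session id.
    static String orderFor(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), ORDERS.length);
        return ORDERS[index];
    }

    // The acting owner is the first server in the ordering that is currently up.
    // upServers is a string of live triad server letters, e.g. "AC".
    static char activeOwner(String sessionId, String upServers) {
        for (char server : orderFor(sessionId).toCharArray()) {
            if (upServers.indexOf(server) >= 0) {
                return server;
            }
        }
        throw new IllegalStateException("no triad server available");
    }

    public static void main(String[] args) {
        String id = "baaXnMp";
        System.out.println("order: " + orderFor(id));
        System.out.println("owner with all up: " + activeOwner(id, "ABC"));
        System.out.println("owner with A down: " + activeOwner(id, "BC"));
    }
}
```

Since half of the orderings starting with a given server list each of the other two as secondary, a crashed server's sessions split evenly between the two survivors.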