Discussion:
[ofa-general] some questions on stale connection handling at the IB CM
Or Gerlitz
2007-12-17 15:08:20 UTC
Hi Sean,

Basically, I am trying to understand when the "stale connection" handling
defined in section 12.4.1 is carried out by the CM, and what the cases are
(if any) where it must be handled at the app level.

Looking at the code, I see that the CM sends a reject message with the reason
IB_CM_REJ_STALE_CONN when it gets a REQ or REP whose <QPN, CA GUID> pair is already
present in the remote-qpn rb tree (and in another case which I don't fully understand).

On the other side, when the CM receives a reject message with that reason, the local
handle (id) is moved to the timewait state; my understanding is that it will sit there
for a while, then a reject/stale-connection callback will be delivered to the user and
the id will be removed.
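
For reference, here is roughly how I expect the consumer sees that (a minimal
sketch of hypothetical consumer code; the event and reason names are the real
ones from <rdma/ib_cm.h>):

/* hypothetical consumer callback: the stale-connection reject
 * shows up as a normal IB_CM_REJ_RECEIVED event
 */
static int my_cm_handler(struct ib_cm_id *cm_id,
                         struct ib_cm_event *event)
{
        if (event->event == IB_CM_REJ_RECEIVED &&
            event->param.rej_rcvd.reason == IB_CM_REJ_STALE_CONN) {
                /* the id is on its way out; the app should tear
                 * down its QP and decide whether to retry
                 */
        }
        return 0;
}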

What I don't see is the issuing of "DREQ, with DREQ:remote QPN set to the remote QPN
from the REQ/REP" as stated in 12.9.8.3.1 (below). Is it really missing, or am I
reading the code wrong?

Also, it's quite clear to me that from the application's point of view there are stale
connection cases which the CM cannot catch, e.g. a client DREQ that did not arrive at
the server-side CM, after which the client's next REQ uses a different QPN, etc. My
understanding is that in such cases the responsibility to close the stale connection/QP
falls on the server app.

Or.

12.9.8.3.1 REQ RECEIVED / REP RECEIVED
(RC, UC) A CM may receive a REQ/REP specifying a remote QPN in REQ:local QPN/REP:local QPN
that the CM already considers connected to a local QP. A local CM may receive such a REQ/REP
if its local QP has a stale connection, as described in section 12.4.1. When a CM receives
such a REQ/REP it shall abort the connection establishment by issuing REJ to the REQ/REP.
It shall then issue DREQ, with DREQ:remote QPN set to the remote QPN from the REQ/REP, until
DREP is received or Max Retries is exceeded, and place the local QP in the TimeWait state.
Sean Hefty
2007-12-17 19:57:59 UTC
Post by Or Gerlitz
Looking at the code, I see that the CM sends a reject message with the reason
IB_CM_REJ_STALE_CONN when it gets a REQ or REP whose <QPN, CA GUID> pair is already
present in the remote-qpn rb tree (and in another case which I don't fully understand).
IB_CM_REJ_STALE_CONN is sent in the following situations:

* Remote ID in REQ matches a connection that is in timewait. This is treated as
a duplicate REQ that was processed after the connection had been terminated.

* Remote QPN in REQ or REP matches an existing connection, and REQ/REP was not
detected as a duplicate.
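
In cm_match_req() terms the two cases look roughly like this (a simplified
sketch, not the exact kernel code -- the real path also cleans up the
timewait_info and handles refcounting):

/* case 1: remote ID already in the remote-id rb tree (timewait) */
timewait_info = cm_insert_remote_id(cm_id_priv->timewait_info);
if (timewait_info) {
        /* duplicate REQ processed after the connection ended */
        cm_issue_rej(work->port, work->mad_recv_wc,
                     IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
                     NULL, 0);
        return NULL;
}

/* case 2: <remote QPN, CA GUID> already in the remote-qpn rb tree */
timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
if (timewait_info) {
        /* stale connection: QPN matches, REQ is not a duplicate */
        cm_issue_rej(work->port, work->mad_recv_wc,
                     IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
                     NULL, 0);
        return NULL;
}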
Post by Or Gerlitz
On the other side, when the CM receives a reject message with that reason, the local
handle (id) is moved to the timewait state; my understanding is that it will sit there
for a while, then a reject/stale-connection callback will be delivered to the user and
the id will be removed.
Correct.
Post by Or Gerlitz
What I don't see is the issuing of "DREQ, with DREQ:remote QPN set to the remote QPN
from the REQ/REP" as stated in 12.9.8.3.1 (below). Is it really missing, or am I
reading the code wrong?
This is missing. But neither the DREQ nor the DREP generated in this case
drives the state machines; both messages are simply generated and then consumed
by the CM. (I don't even think it's clear whether the local and remote IDs in
the DREQ/DREP are relative to the stale connection or to the new connection
request/reply.)
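
If someone did want to add it, I would expect something along these lines right
where the stale connection is detected (hypothetical sketch only --
cm_issue_dreq() is a made-up helper that does not exist in cm.c, and the ID
ambiguity above would still need to be resolved):

/* after rejecting the new REQ/REP as stale... */
cm_issue_rej(work->port, work->mad_recv_wc,
             IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ, NULL, 0);
/* ...also issue the 12.9.8.3.1 DREQ, with DREQ:remote QPN taken
 * from the REQ/REP; cm_issue_dreq() is hypothetical
 */
cm_issue_dreq(work->port, remote_qpn);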
Post by Or Gerlitz
Also, it's quite clear to me that from the application's point of view there are stale
connection cases which the CM cannot catch, e.g. a client DREQ that did not arrive at
the server-side CM, after which the client's next REQ uses a different QPN, etc. My
understanding is that in such cases the responsibility to close the stale connection/QP
falls on the server app.
Correct - keep-alive messages are still needed by apps to know whether their
connections are still valid. IMO, stale connection detection becomes less
useful as the number of systems being connected to increases.

- Sean
Or Gerlitz
2007-12-18 08:53:36 UTC
Post by Sean Hefty
* Remote ID in REQ matches a connection that is in timewait. This is treated as
a duplicate REQ that was processed after the connection had been terminated.
* Remote QPN in REQ or REP matches an existing connection, and REQ/REP was not
detected as a duplicate.
OK, thanks for the clarification.
Post by Sean Hefty
Post by Or Gerlitz
On the other side, when the CM receives a reject message with that reason, the local
handle (id) is moved to the timewait state; my understanding is that it will sit there
for a while, then a reject/stale-connection callback will be delivered to the user and
the id will be removed.
Correct.
I don't see what the user can do in the case where the CM detects a remote
QPN match; if they continue to use the same QPN this will happen in an
endless loop, correct?
Post by Sean Hefty
This is missing. But neither the DREQ nor the DREP generated in this case
drives the state machines; both messages are simply generated and then consumed
by the CM. (I don't even think it's clear whether the local and remote IDs in
the DREQ/DREP are relative to the stale connection or to the new connection
request/reply.)
I agree that it's quite unclear from the spec whether the IDs to be used in
the DREQ are those of the new connection or those of the stale one.
Specifically, those of the stale connection might not exist anymore in the CM
that gets the DREQ, in which case it would just be dropped, so there's no real
gain in implementing this.
Post by Sean Hefty
Correct - keep-alive messages are still needed by apps to know whether their
connections are still valid. IMO, stale connection detection becomes less
useful as the number of systems being connected to increases.
Is there anything the IB stack can do here to make app coding simpler?
In the past I suggested that the IB CM use an inform-info "GID out"
registration to catch remote ports going down, but thinking about it again:
when a port goes down, an RC QP pair doesn't break unless there was in-flight
data, so if the CM delivered a disconnect event it might be a false alarm...
and this registration would put load on the SA, so it does not scale well,
unless we make it a feature of the CM which users would enable on target
nodes and not on initiators...

Or.
Sean Hefty
2007-12-18 17:15:16 UTC
Post by Or Gerlitz
I don't see what the user can do in the case where the CM detects a remote
QPN match; if they continue to use the same QPN this will happen in an
endless loop, correct?
I guess so.
Post by Or Gerlitz
Is there anything the IB stack can do here to make apps coding simpler?
Not explicitly. Although after thinking about it more, I do like the idea of
using LAP/APR messages as a sort of keep-alive.
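
Roughly, the app would periodically re-send its current path and treat a
missing APR as a dead peer (a sketch; ib_send_cm_lap() is the existing API,
the surrounding bookkeeping is hypothetical app code):

/* app-level keep-alive sketch: probe the peer by sending the
 * current path as a LAP; an IB_CM_APR_RECEIVED event in the cm
 * handler means the peer is alive, a timeout without one means
 * the connection is probably stale
 */
static void send_keepalive(struct ib_cm_id *cm_id,
                           struct ib_sa_path_rec *cur_path)
{
        int ret = ib_send_cm_lap(cm_id, cur_path, NULL, 0);

        if (ret)
                pr_debug("keep-alive LAP failed: %d\n", ret);
}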

- Sean
Or Gerlitz
2007-12-19 11:20:45 UTC
Post by Sean Hefty
Post by Or Gerlitz
I don't see what the user can do in the case where the CM detects a remote
QPN match; if they continue to use the same QPN this will happen in an
endless loop, correct?
I guess so.
So in the case of a lost DREQ etc., in cm_match_req() we will pass the check
for duplicate REQs but hit the check for stale connections, and this can
happen in an endless loop? This seems like a bug to me.

Can't the CM use the remote QPN database to synthesize a disconnect on
the stale connection in that case?
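
I.e., something along these lines in the stale-connection branch (hypothetical
sketch; cm_find_remote_qpn() is a made-up lookup over the existing remote-qpn
rb tree):

/* hypothetical: besides rejecting the new REQ, find the cm_id
 * that owns the stale <QPN, CA GUID> entry and push a disconnect
 * through its normal state machine, so the server app gets an
 * event and releases the stale QP
 */
stale_id_priv = cm_find_remote_qpn(remote_qpn, remote_ca_guid);
if (stale_id_priv)
        ib_send_cm_dreq(&stale_id_priv->id, NULL, 0);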
Post by Sean Hefty
Post by Or Gerlitz
Is there anything the IB stack can do here to make app coding simpler?
Not explicitly. Although after thinking about it more, I do like the idea of
using LAP/APR messages as a sort of keep-alive.
Yes, this seems able to solve the keep-alive issue in a generic fashion for
all ULPs using the IB CM. Will you be able to look into this during the next
few weeks or so?

Or.
Sean Hefty
2007-12-19 17:57:27 UTC
Post by Or Gerlitz
So in the case of a lost DREQ etc., in cm_match_req() we will pass the check
for duplicate REQs but hit the check for stale connections, and this can
happen in an endless loop? This seems like a bug to me.
This problem isn't limited to stale connections. If a client tries to
connect, gets a reject for whatever reason, ignores the reject, and then
tries to reconnect with the same parameters, they've put themselves into an
endless loop.
Post by Or Gerlitz
Yes, this seems able to solve the keep-alive issue in a generic fashion for
all ULPs using the IB CM. Will you be able to look into this during the next
few weeks or so?
This method can be used by apps today. The only enhancement that I can see
being made is having the CM automatically send the messages at regular
intervals. But I hesitate to add this to the CM, since it doesn't have
knowledge of the traffic occurring over the QP, and it may interfere with an
app that actually wants to change the alternate path information.
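
The automatic version would be roughly a per-id timer along these lines
(hypothetical sketch; none of this exists in the CM, and the keepalive_work,
keepalive_interval, and cur_path fields are made up):

/* hypothetical CM-internal keep-alive: periodically re-send the
 * current path as a LAP; the problem is exactly here -- the CM
 * cannot distinguish this LAP from one the app sends when it
 * really wants to change the alternate path
 */
static void cm_keepalive_work(struct work_struct *work)
{
        struct cm_id_private *cm_id_priv =
                container_of(work, struct cm_id_private,
                             keepalive_work.work);

        ib_send_cm_lap(&cm_id_priv->id,
                       &cm_id_priv->cur_path, /* made-up field */
                       NULL, 0);
        schedule_delayed_work(&cm_id_priv->keepalive_work,
                              cm_id_priv->keepalive_interval);
}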

- Sean
Or Gerlitz
2007-12-20 13:41:52 UTC
Post by Sean Hefty
Post by Or Gerlitz
So in the case of a lost DREQ etc., in cm_match_req() we will pass the check
for duplicate REQs but hit the check for stale connections, and this can
happen in an endless loop? This seems like a bug to me.
This problem isn't limited to stale connections. If a client tries to
connect, gets a reject for whatever reason, ignores the reject, and then
tries to reconnect with the same parameters, they've put themselves into an
endless loop.
I don't follow: if they don't ignore the reject but reuse the same QP for
their successive connection requests, each new REQ will pass the ID check
(duplicate REQs) but will fail on the remote QPN check, correct? So what can
a client do to avoid falling into that? What does it mean to not ignore the
reject? Note that even if, on getting a reject, they release the QP and
allocate a new one, they can get the same QP number.
Post by Sean Hefty
Post by Or Gerlitz
Yes, this seems able to solve the keep-alive issue in a generic fashion for
all ULPs using the IB CM. Will you be able to look into this during the next
few weeks or so?
This method can be used by apps today. The only enhancement that I can see
being made is having the CM automatically send the messages at regular
intervals. But I hesitate to add this to the CM, since it doesn't have
knowledge of the traffic occurring over the QP, and it may interfere with an
app that actually wants to change the alternate path information.
You mean one side sends a LAP message with the current path and the peer
replies with an APR message confirming it is fine? I guess this LAP sending
has to be carried out by both sides, correct? And it's not supported for
RDMA-CM users...

As for your comments: assuming an app must notify the CM when it no longer
uses a QP (and if it doesn't, we declare that an RTFM bug), then as long as
the QP is alive from the CM's point of view it is perfectly fine to send
these LAPs; doing this once every few seconds or tens of seconds will not
create heavy load, I think. As for the point about interfering with apps
that want to use LAP/APR for an APM implementation over their protocols, we
can let the CM consumer specify whether they want the CM to issue keep-alives
for them, and the frequency at which to send the messages.
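
E.g., a knob along these lines (hypothetical API sketch; ib_cm_set_keepalive()
does not exist):

/* hypothetical opt-in: ask the CM to issue LAP keep-alives on
 * this id every interval_ms milliseconds; 0 disables it (the
 * default, so initiators are unaffected)
 */
int ib_cm_set_keepalive(struct ib_cm_id *cm_id,
                        unsigned int interval_ms);

/* a target-side ULP would then call, once connected: */
ret = ib_cm_set_keepalive(cm_id, 10 * MSEC_PER_SEC);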

Or.
