Discussion:
callout_drain either broken or man page needs updating
Matthew Macy
2016-07-15 03:45:28 UTC
Upon updating my drm-next branch to the latest -CURRENT, callout_drain() returning 1 no longer means that the callout was in fact pending when it was called.


This little bit of code will panic because dwork->wq is NULL, because the callout was _not_ in fact enqueued. So either it is no longer possible to reliably query whether a callout was pending while clearing it and we're OK with that, or glebius' last commit needs some further re-work.



#define	del_timer_sync(timer)	(callout_drain(&(timer)->timer_callout) == 1)

static inline bool
flush_delayed_work(struct delayed_work *dwork)
{

	if (del_timer_sync(&dwork->timer))
		linux_queue_work(dwork->cpu, dwork->wq, &dwork->work);
	return (flush_work(&dwork->work));
}
Hans Petter Selasky
2016-07-15 04:25:41 UTC
Post by Matthew Macy
glebius last commit needs some further re-work.
Hi,

Glebius' commit needs to be backed out, at least the API change that
changes the return value of callout_stop() when the callout is
scheduled and being serviced, simply because there is code out there,
as Matthew and others have discovered, that is "refcounting" on the
callout_reset() and expecting that a subsequent callout_stop() will
return 1 to "unref".

If you consider this impossible, maybe a fourth return value is needed
for CANCELLED and DRAINING.

Further, getting the callouts straight in the TCP stack is a matter of
doing the locking correctly, which some have called "my magic bullet",
and not a matter of the return values. In the following revision,
https://svnweb.freebsd.org/changeset/base/302768 , I've proposed adding
a new callout API that accepts a locking function, so that the callout
code can run its cancelled checks at the right place in situations where
more than one lock is needed.
Post by Matthew Macy
void
tcp_timer_2msl(void *xtp)
{
	struct tcpcb *tp = xtp;
	struct inpcb *inp;

	CURVNET_SET(tp->t_vnet);
#ifdef TCPDEBUG
	int ostate;

	ostate = tp->t_state;
#endif
	INP_INFO_RLOCK(&V_tcbinfo);
	inp = tp->t_inpcb;
	KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp));
	INP_WLOCK(inp);
	tcp_free_sackholes(tp);
	if (callout_pending(&tp->t_timers->tt_2msl) ||
	    !callout_active(&tp->t_timers->tt_2msl)) {
Here we have a custom in-house race check that doesn't affect the return
values of callout_reset() or callout_stop().
Post by Matthew Macy
		INP_WUNLOCK(tp->t_inpcb);
		INP_INFO_RUNLOCK(&V_tcbinfo);
		CURVNET_RESTORE();
		return;

static void
tcp_timer_2msl_lock(void *xtp, int do_lock)
{
	struct tcpcb *tp = xtp;
	struct inpcb *inp;

	inp = tp->t_inpcb;
	if (do_lock) {
		CURVNET_SET(tp->t_vnet);
		INP_INFO_RLOCK(&V_tcbinfo);
		INP_WLOCK(inp);
	} else {
		INP_WUNLOCK(inp);
		INP_INFO_RUNLOCK(&V_tcbinfo);
		CURVNET_RESTORE();
	}
}

callout_init_lock_function(&callout, &tcp_timer_2msl_lock,
    CALLOUT_RETURNUNLOCKED);
Post by Matthew Macy
	CC_UNLOCK(cc);
	if (c_lock != NULL) {
		if (have locking function)
			tcp_timer_2msl_lock(c_arg, 1);
		else
			class->lc_lock(c_lock, lock_status);
		/*
		 * The callout may have been cancelled
		 * while we switched locks.
		 */
Actually, "CC_LOCK(cc)" should be in front of cc_exec_cancel() to avoid
races when testing, setting and clearing this variable, like done in hps_head.
Post by Matthew Macy
		if (cc_exec_cancel(cc, direct)) {
			if (have locking function)
				tcp_timer_2msl_lock(c_arg, 0);
			else
				class->lc_unlock(c_lock);
			goto skip;
		}
		cc_exec_cancel(cc, direct) = true;
	....
	if ((c_iflags & CALLOUT_RETURNUNLOCKED) == 0) {
		if (have locking function)
			...
		else
			class->lc_unlock(c_lock);
	}
The whole point of this is to make the cancelled check atomic:

1) Lock TCP
2) Lock CC_LOCK()
3) change callout state

--HPS
Matthew Macy
2016-07-15 05:14:46 UTC
Post by Hans Petter Selasky
Post by Matthew Macy
glebius last commit needs some further re-work.
Hi,
Glebius commit needs to be backed out, at least the API change that
changes the return value when calling callout_stop() when the callout is
scheduled and being serviced. Simply because there is code out there,
like Matthew and others have discovered that is "refcounting" on the
callout_reset() and expecting that a subsequent callout_stop() will
return 1 to "unref".
Yes. This is the cause of the "refcnt 0 on LLE at boot..." regression.

-M
_______________________________________________
https://lists.freebsd.org/mailman/listinfo/freebsd-current
Gleb Smirnoff
2016-07-15 08:43:03 UTC
On Thu, Jul 14, 2016 at 10:14:46PM -0700, Matthew Macy wrote:
M> > On 07/15/16 05:45, Matthew Macy wrote:
M> > > glebius last commit needs some further re-work.
M> >
M> > Glebius commit needs to be backed out, at least the API change that
M> > changes the return value when calling callout_stop() when the callout is
M> > scheduled and being serviced. Simply because there is code out there,
M> > like Matthew and others have discovered that is "refcounting" on the
M> > callout_reset() and expecting that a subsequent callout_stop() will
M> > return 1 to "unref".
M>
M> Yes. This is the cause of the "refcnt 0 on LLE at boot..." regression.

No, it isn't. The regression is caused by an unintentional change of the return
value for a never-scheduled callout. The fix is now being tested; see PR 210884.

The piece of ND6 code that Hans quotes isn't affected by the change of return
value for a scheduled+running callout, since ND6 doesn't create callouts in
this tricky state.
--
Totus tuus, Glebius.
Hans Petter Selasky
2016-07-15 13:33:40 UTC
Post by Gleb Smirnoff
M> > > glebius last commit needs some further re-work.
M> >
M> > Glebius commit needs to be backed out, at least the API change that
M> > changes the return value when calling callout_stop() when the callout is
M> > scheduled and being serviced. Simply because there is code out there,
M> > like Matthew and others have discovered that is "refcounting" on the
M> > callout_reset() and expecting that a subsequent callout_stop() will
M> > return 1 to "unref".
M>
M> Yes. This is the cause of the "refcnt 0 on LLE at boot..." regression.
No it isn't. The regression is caused by unintentional change of return
value for never scheduled callout. The fix is now being tested, see PR 210884.
The piece of ND6 code that Hans quotes isn't affected by change of return
value for scheduled+running callout, since ND6 doesn't create callouts in
this tricky state.
Hi,

Can you explain this a bit more, Gleb? Can't user-space tools like
"route" delete LLE entries at _any_ time?

From what I can see, there is nothing preventing
"nd6_llinfo_settimer_locked()" from running concurrently with
"nd6_llinfo_timer()". Even though the delay is in the hz range, this
doesn't prevent the race I'm pointing at.


And what about the pending check in "kern/subr_taskqueue.c"?

Won't it be off-by-one in case the callout is scheduled while it is being
serviced?

pending = !!(callout_stop(&timeout_task->c) > 0);

How can this not happen?

--HPS
Matthew Macy
2016-07-15 18:34:44 UTC
Post by Gleb Smirnoff
M> > > glebius last commit needs some further re-work.
M> >
M> > Glebius commit needs to be backed out, at least the API change that
M> > changes the return value when calling callout_stop() when the callout is
M> > scheduled and being serviced. Simply because there is code out there,
M> > like Mattew and others have discovered that is "refcounting" on the
M> > callout_reset() and expecting that a subsequent callout_stop() will
M> > return 1 to "unref".
M>
M> Yes. This is the cause of the "refcnt 0 on LLE at boot..." regression.
I misread his comment on the reason for the failure. But the failure is caused by a regression in callout_stop().
Post by Gleb Smirnoff
No it isn't. The regression is caused by unintentional change of return
value for never scheduled callout. The fix is now being tested, see PR 210884.
Thanks. Let me know when I can update.
-M
Larry Rosenman
2016-07-15 18:38:52 UTC
---- On Fri, 15 Jul 2016 01:43:03 -0700 Gleb Smirnoff
Post by Gleb Smirnoff
M> > > glebius last commit needs some further re-work.
M> >
M> > Glebius commit needs to be backed out, at least the API change that
M> > changes the return value when calling callout_stop() when the callout is
M> > scheduled and being serviced. Simply because there is code out there,
M> > like Matthew and others have discovered that is "refcounting" on the
M> > callout_reset() and expecting that a subsequent callout_stop() will
M> > return 1 to "unref".
M>
M> Yes. This is the cause of the "refcnt 0 on LLE at boot..." regression.
I misread his comment on the reason for the failure. But the failure is caused by a regression in callout_stop.
Post by Gleb Smirnoff
No it isn't. The regression is caused by unintentional change of return
value for never scheduled callout. The fix is now being tested, see PR 210884.
Thanks. Let me know when I can update.
-M
URL: https://svnweb.freebsd.org/changeset/base/302894
has the fix in HEAD.

(It's a one-liner).
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: ***@lerctr.org
US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281