Skip Menu | Logged in as guest | Logout
 
The Basics
Id: 131780
Status: open
Priority: 0/
Queue: openafs-bugs

Dates
Created: Tue Dec 03 17:42:19 2013
Starts: Not set
Started: Not set
Last Contact: Sun Feb 08 04:38:02 2015
Due: Not set
Closed: Not set
Updated: Sun Feb 08 04:38:02 2015 by andersk

History Brief headersFull headers
CC: andersk@mit.edu
Subject: linux: getcwd sometimes fails
Download (untitled)
text/plain 970b
In some cases, the cache manager on a linux OpenAFS client can end up in a state where getcwd returns failure, which causes lots of application breakage. We believe this is the result of having two dentries referring to the same directory inode, which is bad (http://lwn.net/Articles/545119/).

Previous jabber discussions at:
http://conference.openafs.org/openafs@conference.openafs.org/2013-06-11.txt
http://conference.openafs.org/openafs@conference.openafs.org/2013-06-16.txt

A patch that tried to fix it was at http://gerrit.openafs.org/#change,10444 but this proved to not be the root cause.

This is something of an architectural problem, so a proper solution may involve rather broad sweeping changes.

Reproducing the misbehavior seems to be very easy; something like 'ls /afs/athena.mit.edu/user/a/n/andersk; ls /afs/.athena.mit.edu/user/a/n/andersk' seems to create a directory inode with two dentries pointing to it (http://scripts.mit.edu/trac/ticket/387)
Download (untitled)
text/html 1.3k
This is a known limitation of the OpenAFS cache manager on Linux.   Linux does not permit a directory to have two parents.  At the same time, the OpenAFS cache manager does not expose AFS mount points to the VFS as mount points where the device (st_dev) on each side of the mount point is the same.   That coupled with the ability to create two AFS mount points to the same volume or to cross the same mount point by RW path and RO path at the same time results in an unresolvable conflict with Linux VFS requirements.

Treating every AFS volume as a separate device is required to conform with Linux VFS requirements.  However, that breaks an important assumption: that a file is known to be in AFS if st_dev from /afs matches the st_dev from any other object.

The in-kernel AFS module kafs does treat each AFS volume as a separate device.  It also has support for the Linux auto mounter and being GPL can support features such as notification hooks and take advantage of the in-kernel crypto, etc.  

It has long been my personal opinion that OpenAFS developer community should stop putting effort into supporting Linux with the OpenAFS cache manager and should instead focus on adding the missing functionality to the Linux kafs module.
 



Download (untitled)
text/plain 182b
Andrew Deason notes on jabber that there was some previous email discussion of this general issue, starting at https://lists.openafs.org/pipermail/openafs-devel/2012-June/018886.html
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Sun, 12 Oct 2014 23:10:43 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 484b
The getcwd() ENOENTs had disappeared on scripts.mit.edu with
http://gerrit.openafs.org/10804. However, they returned when we upgraded
from 1.6.8 to 1.6.10pre1.

I was suspicious of http://gerrit.openafs.org/11358 because it seems to
undo the effect of 10804. So I reverted 11358 on the primary server,
which hasn’t seen the problem since.

I’m going to go revert 11358 on more of our servers with a log message in
the differing case (code == 0). Any other ideas?

Anders
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Mon, 13 Oct 2014 00:02:24 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 801b
On Sun, 12 Oct 2014, Anders Kaseorg wrote:
> The getcwd() ENOENTs had disappeared on scripts.mit.edu with
> http://gerrit.openafs.org/10804. However, they returned when we upgraded
> from 1.6.8 to 1.6.10pre1.
>
> I was suspicious of http://gerrit.openafs.org/11358 because it seems to
> undo the effect of 10804. So I reverted 11358 on the primary server,
> which hasn’t seen the problem since.
>
> I’m going to go revert 11358 on more of our servers with a log message in
> the differing case (code == 0). Any other ideas?

One thing I quickly discovered is that this case happens very often on our
servers, like a few times per minute on average. Should I be surprised to
see force_drop = 1 with code == 0 happening so often without (AFAICT) any
replica volumes involved?

Anders
Download (untitled)
text/html 1.1k
On Mon Oct 13 00:02:35 2014, andersk@MIT.EDU wrote:

> One thing I quickly discovered is that this case happens very often on our
> servers, like a few times per minute on average. Should I be surprised to
> see force_drop = 1 with code == 0 happening so often without (AFAICT) any
> replica volumes involved?

The case in which it triggers is if tvc != vcp.   In which case the current code

 1. sets force_drop = 1
 2. goto bad_dentry
 3. as a side effect valid = 0

One thing jumps out at me, in the case of code == 0, didn't we get back a valid tvc that requires a call to afs_PutVCache(tvc)?
afs_PutVCache(tvc) is not called when jumping to bad_dentry

dp->d_name.name is hashed to tvc which we have determined is wrong so we are going to forcibly unhash it.
The second question that comes to mind is: What is going to add an entry that refers to vcp?   Should we be doing that since we want to forcibly update the value that afs_lookup(dp->d_name.name) returns?

Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Mon, 13 Oct 2014 09:14:45 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 3.1k
On Mon, 13 Oct 2014 01:00:35 -0400
Jeffrey Altman via RT <openafs-bugs@openafs.org> wrote:

> <URL: https://rt.central.org/rt/Ticket/Display.html?id=131780 >
>
> On Mon Oct 13 00:02:35 2014, andersk@MIT.EDU wrote:
>
> > One thing I quickly discovered is that this case happens very often
> > on our servers, like a few times per minute on average. Should I be
> > surprised to see force_drop = 1 with code == 0 happening so often
> > without (AFAICT) any replica volumes involved?
>
> The case in which it triggers is if tvc != vcp. In which case the
> current code

So what happens is: at some point we look up 'foo' in /afs/cell/dir and
get back vcache 'vcp'. Later on for revalidate we again look up 'foo' in
/afs/cell/dir and get back some other vcache 'tvc'. In your situation
I've gathered that you expect that to be the same file (iirc it's a
mountpoint?)

So what you would want to log is the values of tvc and vcp, since you
don't expect them to be different. You'd ideally want to see the
contents of those structures, too, to see why we're getting different
results, but there's a lot of stuff in struct vcache. (Maybe log at
least the fid and the vcache states? tvc->f.fid and tvc->states)

Just printing the pointer values would at least say something, though.
It would indicate if this is happening for the same vcache every time,
or if we're constantly shifting to a new vcache.

Hmm, once guess is that 'vcp' is a directory and 'tvc' is a mountpoint
object, and to fix it we just need to evaludate the tvc mntpt and it
will become vcp. Are you running with any kind of fakestat?

Also, I don't think it's surprising that you see it a lot quickly in
succession. When this happens once, in your situation we've seen that
you have some process that is holding open a reference to
/afs/cell/dir/foo; once we've determined that it should be "force
dropped", we will keep doing so every single time that process accesses
something under /afs/cell/dir/foo. (The force_drop prevents new accesses
on the file, but we cannot destroy existing references; the fact that
you see the (deleted) shows that you are seeing a preexisting reference
we could not remove.)

> One thing jumps out at me, in the case of code == 0, didn't we get
> back a valid tvc that requires a call to afs_PutVCache(tvc)?
> afs_PutVCache(tvc) is not called when jumping to bad_dentry

Yes it is. Line 1296 in 1.6.10pre1.

> dp->d_name.name is hashed to tvc which we have determined is wrong so
> we are going to forcibly unhash it. The second question that comes to
> mind is: What is going to add an entry that refers to vcp? Should we
> be doing that since we want to forcibly update the value that
> afs_lookup(dp->d_name.name) returns?

But afs_lookup(pvcp, dp->d_name.name) already returns vcp. If you mean
to update the inode/vcache that dp points to, that may be what d_move
can do in this situation, but iirc there are some difficulties with
locking in that approach. The dentry will be updated to point to the new
file on the next actual lookup, since the old/bad dentry will not be
hashed anymore, so linux will call the real lookup function to find one.

--
Andrew Deason
adeason@sinenomine.net
Download (untitled)
text/html 111b


That line does not execute when tvc != vcp because goto bad_dentry;


Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Mon, 13 Oct 2014 10:36:11 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 615b
On Mon, 13 Oct 2014 11:22:08 -0400
Jeffrey Altman via RT <openafs-bugs@openafs.org> wrote:

> <URL: https://rt.central.org/rt/Ticket/Display.html?id=131780 >
>
> >
>
> That line does not execute when tvc != vcp because goto bad_dentry;

...
Line 1246) goto bad_dentry;
...
Line 1293) done:
Line 1294) /* Clean up */
Line 1295) if (tvc)
Line 1296) afs_PutVCache(tvc);
...
Line 1317) bad_dentry:
Line 1318) if (have_submounts(dp))
Line 1319) valid = 1;
Line 1320) else
Line 1321) valid = 0;
Line 1322) goto done;
...


--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Wed, 15 Oct 2014 00:24:04 -0400
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 289b
On 10/13/2014 10:14 AM, Andrew Deason via RT wrote:
> Hmm, once guess is that 'vcp' is a directory and 'tvc' is a mountpoint
> object, and to fix it we just need to evaludate the tvc mntpt and it
> will become vcp. Are you running with any kind of fakestat?

Yes, -fakestat-all.

Anders
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Wed, 15 Oct 2014 09:06:43 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 540b
On Wed, 15 Oct 2014 00:24:21 -0400
Anders Kaseorg via RT <openafs-bugs@openafs.org> wrote:

> On 10/13/2014 10:14 AM, Andrew Deason via RT wrote:
> > Hmm, once guess is that 'vcp' is a directory and 'tvc' is a mountpoint
> > object, and to fix it we just need to evaludate the tvc mntpt and it
> > will become vcp. Are you running with any kind of fakestat?
>
> Yes, -fakestat-all.

Is using that required? Would it be possible to run without fakestat to
see if this goes away or changes at all?

--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Wed, 15 Oct 2014 22:25:11 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 4.4k
On Wed, 15 Oct 2014 00:24:21 -0400
Anders Kaseorg via RT <openafs-bugs@openafs.org> wrote:

> On 10/13/2014 10:14 AM, Andrew Deason via RT wrote:
> > Hmm, once guess is that 'vcp' is a directory and 'tvc' is a mountpoint
> > object, and to fix it we just need to evaludate the tvc mntpt and it
> > will become vcp. Are you running with any kind of fakestat?
>
> Yes, -fakestat-all.

So, I can reproduce _a_ way of getting this to happen, which I'll
describe below. I don't know if it's what is actually happening in your
situation, so if you would like to know what was actually happening (and
I would, at least), it would still be helpful to get some more
information from some additional log messages. Specifically, the fids
for tvc and vcp, as well as their mvstat, when tvc && tvc != vcp.

The attached patch adds some logging to provide that information, as
well as effectively reverting gerrit 11358 (so you can run with this
without having things break again). Or implement this however you want;
I'm just attaching this to clearly illustrate what I'm after and what
you can do to get it.


Anyway, the motivation for my suspicions above is that the code in our
d_revalidate routine "sometimes" converts vcp from a mtpt object to a
dir. And when we perform the afs_lookup later on, that code also
"sometimes" converts its returned vcache from a mtpt to a dir. This was
concerning because of the duplication in logic and the specifics of the
"sometimes". After looking at it more closely I cannot actually think of
any situation where we consistently get different outcomes, but those
decisions are made in different points in time, which opens up the
possibility of a race.

Specifically, we are looking at if the mtpt has the CMValid flag set. In
order to trigger this race, we either need to clear or set the CMValid
flag after the check in our d_revalidate occurs, but before the check
near the end of afs_lookup. Getting CMValid to be cleared seems easier
than getting it set, since I think getting it set cannot be done while
we are running d_revalidate (but I haven't tried looking into this
much).

The way I have managed to see this happen works like this:

- Someone has /afs/cell/user/adeason as their pwd. The dir and mtpt and
everything is fully cached and has CMValid set.

- Either /afs/cell/user changes, or some relevant callbacks expire;
regardless, something happens such that we need to call afs_lookup in
our d_revalidate.

- Something triggers a lookup for /afs/cell/user/adeason, causing a
d_revalidate to occur. Since CMValid is set, we convert 'adeason' the
mtpt to 'adeason' the dir. Later on we call afs_lookup.

- While afs_lookup is running, something causes CMValid to be cleared.
This can be caused by a volume release, callbacks expiring, or vldb
cache entries expiring. It can also occur by running 'fs checkv',
which is convenient for artificially reproducing the issue.

- The afs_lookup from above sees that CMValid is cleared, and so
returns 'adeason' the mtpt to our d_revalidate. The d_revalidate code
sees that the vcaches do not match, so we d_drop.

- The process with /afs/cell/user/adeason as their pwd now sees ENOENT
when looking up their working directory.

I'm not sure how plausible this is for your situation. I can reproduce
this reliably by adding artificial sleeps and such, but without those
hitting that race window seems pretty coincidental. But iirc, this issue
doesn't happen immediately/constantly; it does take a bit of time...?
This is maybe explainable in that if something is accessing
/afs/cell/user/adeason pretty frequently, as soon as some callbacks
expire we require the afs_lookup and clear CMValid, so every time that
expiration happens maybe we'd have a decent chance at triggering the
race.

But I'm not really sure, so that's why I'd like to get more information.

A real fix for this would probably involve ust converting tvc or vcp
from a mtpt to a dir, if their mvstats mismatch. Ideally we could just
work with mtpt objects and not deal with converting them or something,
but I expect doing that would be more involved.

Oh, and of course, all of this goes away without fakestat. This stuff
with converting mtpts and CMValid and such is only applicable when we
are encountering a mountpoint where fakestat applies. (So, all mtpts
with -fakestat-all, and cellular mountpoints with -fakestat.)

[Note: sorry if I skipped over anything or appear brief; my laptop lost
power during the initial draft and I had to write it all again.]

--
Andrew Deason
adeason@sinenomine.net

Message body is not shown because sender requested not to inline it.

Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Thu, 16 Oct 2014 02:53:17 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 1.8k
On Wed, 15 Oct 2014, Andrew Deason via RT wrote:
> So, I can reproduce _a_ way of getting this to happen, which I'll
> describe below. I don't know if it's what is actually happening in your
> situation, so if you would like to know what was actually happening (and
> I would, at least), it would still be helpful to get some more
> information from some additional log messages. Specifically, the fids
> for tvc and vcp, as well as their mvstat, when tvc && tvc != vcp.
>
> The attached patch adds some logging to provide that information, as
> well as effectively reverting gerrit 11358 (so you can run with this
> without having things break again). Or implement this however you want;
> I'm just attaching this to clearly illustrate what I'm after and what
> you can do to get it.

I added a dentry_path_raw call to get the actual path, and installed this
on one of our servers. A few minutes later, here are the first few
resulting messages. They’re from files, not directories.

[ 227.529670] afs: RT 131780: got back a different vcache for
/afs/sipb.mit.edu/contrib/newsdiffs/newsdiffs/articles/2014-10/.git/packed-refs:
vcp: 536893496.548368.7372916 mvstat 0, tvc: 536893496.548534.7373283
mvstat 0
[ 408.613006] afs: RT 131780: got back a different vcache for
/afs/athena.mit.edu/activity/s/snowriders/web_scripts/forums/cache/sql_27c41f255b262f652cc9713e34bc4d20.php:
vcp: 537622467.4702.1067464 mvstat 0, tvc: 537622467.4702.1067496 mvstat 0
[ 625.653276] afs: RT 131780: got back a different vcache for
/afs/sipb.mit.edu/contrib/newsdiffs/newsdiffs/articles/2014-10/.git/packed-refs:
vcp: 536893496.548534.7373283 mvstat 0, tvc: 536893496.548260.7373349
mvstat 0

> Oh, and of course, all of this goes away without fakestat.

Yeah, we’ll probably switch down to -fakestat after we’re done debugging.

Anders
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Thu, 16 Oct 2014 03:08:52 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 512b
On Thu, 16 Oct 2014, Anders Kaseorg wrote:
> I added a dentry_path_raw call to get the actual path, and installed
> this on one of our servers. A few minutes later, here are the first few
> resulting messages. They’re from files, not directories.

In fact, I can trigger this message on any file just by reading it,
renaming another file over it from a different client, and reading it
again. Is that expected? Should I add additional checks, like maybe
vcp->mvstat != 0 || tvc->mvstat != 0?

Anders
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Thu, 16 Oct 2014 04:18:37 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 1.2k
On Thu, 16 Oct 2014, Anders Kaseorg wrote:
> In fact, I can trigger this message on any file just by reading it,
> renaming another file over it from a different client, and reading it
> again. Is that expected? Should I add additional checks, like maybe
> vcp->mvstat != 0 || tvc->mvstat != 0?

I decided to just filter out regular files.

if (force_drop && tvc && tvc != vcp &&
!(dp->d_inode && S_ISREG(dp->d_inode->i_mode))) {
/* workaround and logging for GCO RT 131780 */
char buf[256];
char *path = dentry_path_raw(dp, buf, sizeof(buf));
afs_warn("afs: RT 131780: got back a different vcache for /afs%s: "
"vcp: %u.%u.%u mvstat %d, tvc: %u.%u.%u mvstat %d\n",
IS_ERR(path) ? "/(bad path)" : path,
(unsigned)vcp->f.fid.Fid.Volume,
(unsigned)vcp->f.fid.Fid.Vnode,
(unsigned)vcp->f.fid.Fid.Unique,
(int)vcp->mvstat,
(unsigned)tvc->f.fid.Fid.Volume,
(unsigned)tvc->f.fid.Fid.Vnode,
(unsigned)tvc->f.fid.Fid.Unique,
(int)tvc->mvstat);
force_drop = 0;
}

Deployed on all servers. No messages yet, except from a similar test
where I renamed a directory from a different client. I’ll let you know if
I see something.

Anders
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Thu, 16 Oct 2014 09:51:48 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 1.4k
On Thu, 16 Oct 2014 03:09:01 -0400
Anders Kaseorg via RT <openafs-bugs@openafs.org> wrote:

>
> <URL: https://rt.central.org/rt/Ticket/Display.html?id=131780 >
>
> On Thu, 16 Oct 2014, Anders Kaseorg wrote:
> > I added a dentry_path_raw call to get the actual path, and installed
> > this on one of our servers. A few minutes later, here are the first few
> > resulting messages. They’re from files, not directories.
>
> In fact, I can trigger this message on any file just by reading it,
> renaming another file over it from a different client, and reading it
> again. Is that expected?

Yes, that's the situation where it's supposed to trigger (where you
could get incorrect results without gerrit 11358). Sorry, for some
reason I thought that wouldn't be as common for you, but that's
obviously why you got a ton of messages before.

Restricting this to non-files is good; you could also probably restrict
it to mountpoints if you're confident that's where the problem was being
seen (or if there's any other factor you can use to restrict it to the
places where you saw the ENOENT problem, you can do that). Detecting a
mountpoint can be done by seeing if both tvc and vcp have mvstat 1 or 2.

What you have now is probably good, but you can probably still trigger
it via the same renaming thing, just with a dir. That's probably not
common since you can't clobber a dir with a rename like you can with
files.

--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Thu, 16 Oct 2014 12:48:53 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 2.5k
Here’s the output from cats-whiskers.mit.edu since last night.

[ 9628.094976] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/course/11/11.309j: vcp: 537721959.3874.27611 mvstat 1, tvc: 538064149.1.1 mvstat 2
[ 9927.969863] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/activity/b/beta: vcp: 536920918.1276.18056 mvstat 1, tvc: 537229438.1.1 mvstat 2
[11373.951663] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/w: vcp: 536955440.1.1 mvstat 2, tvc: 537870333.8.12639 mvstat 1
[15832.372285] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/j: vcp: 536873032.1.1 mvstat 2, tvc: 537870333.348.9094 mvstat 1
[22044.074396] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/c: vcp: 536873011.1.1 mvstat 2, tvc: 537870333.334.9087 mvstat 1
[22060.890220] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/a: vcp: 537845135.1.1 mvstat 2, tvc: 537870333.12.9085 mvstat 1
[22060.941567] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/b: vcp: 536873008.1.1 mvstat 2, tvc: 537870333.332.9086 mvstat 1
[22060.947941] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/b: vcp: 536873008.1.1 mvstat 2, tvc: 537870333.332.9086 mvstat 1
[22091.997311] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/j: vcp: 536873032.1.1 mvstat 2, tvc: 537870333.348.9094 mvstat 1
[23369.997891] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/b: vcp: 536873008.1.1 mvstat 2, tvc: 537870333.332.9086 mvstat 1
[24883.070670] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/c: vcp: 536873011.1.1 mvstat 2, tvc: 537870333.334.9087 mvstat 1
[25066.167329] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/b: vcp: 536873008.1.1 mvstat 2, tvc: 537870333.332.9086 mvstat 1
[26595.207897] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/j/h/jhawk: vcp: 537048845.1.1 mvstat 2, tvc: 536873032.1368.17332 mvstat 1
[26642.805286] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/j: vcp: 536873032.1.1 mvstat 2, tvc: 537870333.348.9094 mvstat 1
[28545.007769] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/j: vcp: 536873032.1.1 mvstat 2, tvc: 537870333.348.9094 mvstat 1
[30339.324735] afs: RT 131780: got back a different vcache for /afs/athena.mit.edu/user/j/h/jhawk: vcp: 537048845.1.1 mvstat 2, tvc: 536873032.1368.17332 mvstat 1

Anders
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Thu, 16 Oct 2014 12:22:30 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 1.2k
On Thu, 16 Oct 2014 12:49:08 -0400
Anders Kaseorg via RT <openafs-bugs@openafs.org> wrote:

> Here’s the output from cats-whiskers.mit.edu since last night.

That looks consistent with what I was suggesting (though some of these
are the "reverse"/harder way, not exactly sure how those happened, but
the result is the same). Just to double check, can you confirm the
following information:

/afs/athena.mit.edu/course/11/ is volume 537721959
/afs/athena.mit.edu/course/11/11.309j/ is volume 538064149
/afs/athena.mit.edu/activity/b/ is volume 536920918
/afs/athena.mit.edu/activity/b/beta/ is volume 537229438
/afs/athena.mit.edu/user/ is volume 537870333
/afs/athena.mit.edu/user/w/ is volume 536955440
/afs/athena.mit.edu/user/j/ is volume 536873032
/afs/athena.mit.edu/user/c/ is volume 536873011
/afs/athena.mit.edu/user/a/ is volume 537845135
/afs/athena.mit.edu/user/b/ is volume 536873008
/afs/athena.mit.edu/user/j/ is volume 536873032
/afs/athena.mit.edu/user/j/h/jhawk is volume 537048845

You could check that the 'mvstat 1' FIDs also match the mountpoints in
there, but that's pretty annoying to do, so I wouldn't bother; the
volume ids are pretty good confirmation. (Unless you have and know how
to use volscan for this? That's probably the easiest way.)

--
Andrew Deason
adeason@sinenomine.net
Download (untitled)
text/plain 267b
On Thu Oct 16 13:22:38 2014, adeason wrote:
> volume ids are pretty good confirmation. (Unless you have and know how
> to use volscan for this? That's probably the easiest way.)

None of us on this ticket have login access to the athena fileservers, so that's moot.
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Thu, 16 Oct 2014 15:01:55 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 1.9k
On Thu, 16 Oct 2014, Andrew Deason via RT wrote:
> That looks consistent with what I was suggesting (though some of these
> are the "reverse"/harder way, not exactly sure how those happened, but
> the result is the same). Just to double check, can you confirm the
> following information:
>
> /afs/athena.mit.edu/course/11/ is volume 537721959
> /afs/athena.mit.edu/course/11/11.309j/ is volume 538064149
> /afs/athena.mit.edu/activity/b/ is volume 536920918
> /afs/athena.mit.edu/activity/b/beta/ is volume 537229438
> /afs/athena.mit.edu/user/ is volume 537870333
> /afs/athena.mit.edu/user/w/ is volume 536955440
> /afs/athena.mit.edu/user/j/ is volume 536873032
> /afs/athena.mit.edu/user/c/ is volume 536873011
> /afs/athena.mit.edu/user/a/ is volume 537845135
> /afs/athena.mit.edu/user/b/ is volume 536873008
> /afs/athena.mit.edu/user/j/ is volume 536873032
> /afs/athena.mit.edu/user/j/h/jhawk is volume 537048845

Yes, according to fs listvol,

File /afs/athena.mit.edu/course/11/ (537721959.27.8256) contained in volume 537721959
File /afs/athena.mit.edu/course/11/11.309j/ (538064149.1.1) contained in volume 538064149
File /afs/athena.mit.edu/activity/b/ (536920918.5.3014) contained in volume 536920918
File /afs/athena.mit.edu/activity/b/beta/ (537229438.1.1) contained in volume 537229438
File /afs/athena.mit.edu/user/ (537870333.15.9083) contained in volume 537870333
File /afs/athena.mit.edu/user/w/ (536955440.1.1) contained in volume 536955440
File /afs/athena.mit.edu/user/j/ (536873032.1.1) contained in volume 536873032
File /afs/athena.mit.edu/user/c/ (536873011.1.1) contained in volume 536873011
File /afs/athena.mit.edu/user/a/ (537845135.1.1) contained in volume 537845135
File /afs/athena.mit.edu/user/b/ (536873008.1.1) contained in volume 536873008
File /afs/athena.mit.edu/user/j/ (536873032.1.1) contained in volume 536873032
File /afs/athena.mit.edu/user/j/h/jhawk (537048845.1.1) contained in volume 537048845

Anders
Download (untitled)
text/html 770b
The function definition for afs_lookup() differs between the kernel and ukernel cases.  The ukernel definition includes an additional flags parameter which can be set to either zero or AFS_LOOKUP_NOEVAL.   One possible partial fix would be to build afs_lookup() with the flags parameter in all cases and then use it in afs_linux_dentry_revalidate() .   If tvc->mvstat indicates a mount point, then call afs_lookup() with AFS_LOOKUP_NOEVAL to ensure that the mount point is returned.   If tvc->mvstat is not a mount point, call afs_lookup() without AFS_LOOKUP_NOEVAL and assume that the failure to resolve the root directory (when it occurs) indicates a network failure (or other issue) that can be ignored.

Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Tue, 28 Oct 2014 01:01:23 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 1.4k
On Thu, 23 Oct 2014 15:41:27 -0400
Jeffrey Altman via RT <openafs-bugs@openafs.org> wrote:

> The function definition for afs_lookup() differs between the kernel
> and ukernel cases. The ukernel definition includes an additional flags
> parameter which can be set to either zero or AFS_LOOKUP_NOEVAL. One
> possible partial fix would be to build afs_lookup() with the flags
> parameter in all cases and then use it in
> afs_linux_dentry_revalidate() . If tvc->mvstat indicates a mount
> point, then call afs_lookup() with AFS_LOOKUP_NOEVAL to ensure that
> the mount point is returned. If tvc->mvstat is not a mount point, call
> afs_lookup() without AFS_LOOKUP_NOEVAL and assume that the failure to
> resolve the root directory (when it occurs) indicates a network
> failure (or other issue) that can be ignored.

No, I don't think that's as helpful. The case that is causing the
problem here is when _revalidate does resolve the mtpt, but afs_lookup
does not (at least, assuming what I said before is correct). Adding the
AFS_LOOKUP_NOEVAL flag doesn't help there.

But my original suggestion was unnecessarily involved. We can just check
the returned tvc against dp->d_inode as well as vcp. This is in gerrit
11559.

If Anders or anyone else reading this wants to apply this to 1.6.10,
apply these two patches (the latter is for master, but it applies fine):

<http://gerrit.openafs.org/11558>
<http://gerrit.openafs.org/11559>

--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Tue, 28 Oct 2014 08:50:33 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Anders Kaseorg <andersk@mit.edu>
Download (untitled)
text/plain 501b
On Tue, 28 Oct 2014, Andrew Deason via RT wrote:
> But my original suggestion was unnecessarily involved. We can just check
> the returned tvc against dp->d_inode as well as vcp. This is in gerrit
> 11559.
>
> If Anders or anyone else reading this wants to apply this to 1.6.10,
> apply these two patches (the latter is for master, but it applies fine):
>
> <http://gerrit.openafs.org/11558>
> <http://gerrit.openafs.org/11559>

Thanks. Applying on scripts.mit.edu; will report back soon.

Anders
Download (untitled)
text/plain 254b
We have another (set of) data point, from the athena dialups, which are running 1.6.9 from Ubuntu.

The user is seeing:

ls -l .
ls: cannot access 1: No such file or directory
[...]
d????????? ? ? ? ? ? 1/

where 1 is most definitely not a mountpoint.
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Wed, 29 Oct 2014 17:46:27 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 826b
On Wed, 29 Oct 2014 18:35:49 -0400
"kaduk@mit.edu via RT" <openafs-bugs@openafs.org> wrote:

> We have another (set of) data point, from the athena dialups, which
> are running 1.6.9 from Ubuntu.
>
> The user is seeing:
>
> ls -l .
> ls: cannot access 1: No such file or directory
> [...]
> d????????? ? ? ? ? ? 1/
>
> where 1 is most definitely not a mountpoint.

I don't think that's the same thing. A d_drop on a dir should not cause
that; that's a failure during lookup (or getattr). That can happen in
some other situations like maybe VNOVNODE I think, or it could be some
other bug.

If you can get that to happen when you're looking at it, I'd look at
trying to trace what code paths we hit when that happens. fstrace or
linux's kernel function tracing thing or something.

--
Andrew Deason
adeason@sinenomine.net
RT-Send-CC: jweiss@mit.edu
Download (untitled)
text/plain 380b
On Wed Oct 29 18:46:36 2014, adeason wrote:
> If you can get that to happen when you're looking at it, I'd look at
> trying to trace what code paths we hit when that happens. fstrace or
> linux's kernel function tracing thing or something.

Is "that" just "the ls which fails to resolve the directory in question"?
We had a machine stuck in that state today, so it may be doable.
Subject: Re: [rt.central.org #131780] linux: getcwd sometimes fails
Date: Thu, 30 Oct 2014 00:36:40 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
Download (untitled)
text/plain 754b
On Wed, 29 Oct 2014 22:39:39 -0400
"kaduk@mit.edu via RT" <openafs-bugs@openafs.org> wrote:

> On Wed Oct 29 18:46:36 2014, adeason wrote:
> > If you can get that to happen when you're looking at it, I'd look at
> > trying to trace what code paths we hit when that happens. fstrace or
> > linux's kernel function tracing thing or something.
>
> Is "that" just "the ls which fails to resolve the directory in question"?
> We had a machine stuck in that state today, so it may be doable.

Yes; presumably a failing lstat(). It could go in another ticket, since
I don't think it's the same thing (it could always be put back in here
if I'm wrong). Or it could stay here for just a little more until we
know more.

--
Andrew Deason
adeason@sinenomine.net
Download (untitled)
text/html 1k
It has recently become clear that this problem is not nearly as resolved as we thought.  While debugging a reported problem with the Ruby on Rails installer, we noticed that getcwd() is still frequently failing on the /afs/athena.mit.edu mount point.

$ cd /afs/athena.mit.edu
$ dash -c 'pwd -P'
dash: 1: pwd: getcwd() failed: No such file or directory

We added monitoring using this test case 48 hours ago, and since then it’s happened ten times across the five scripts.mit.edu servers (we are responding to these failures by rebooting).

They are running OpenAFS 1.6.11pre1 plus Gerrit 11616, 11464, 11694, 11641, 11659, on Linux 3.17.8 and Fedora 20.  afsd options are -afsdb -dynroot -fakestat -stat 25000 -daemons 100 -volumes 4000 -files 400000 -chunksize 19.

We also noticed that the same test case currently fails on several of the athena.dialup.mit.edu machines (OpenAFS 1.6.9, Linux 3.13, Ubuntu 14.04), so at least it isn’t just us.
Download (untitled)
text/html 267b
I have good news and bad news.  Both of them are that after upgrading two of the Scripts servers to kernel 3.18.5, they reliably start seeing the getcwd problem on /afs/athena.mit.edu/contrib/scripts within minutes of every boot.

Anders