The Basics
Id: 131852
Status: open
Priority: 0/
Queue: openafs-bugs

Dates
Created: Mon Apr 14 09:49:44 2014
Starts: Not set
Started: Not set
Last Contact: Tue Apr 29 17:01:17 2014
Due: Not set
Closed: Tue Apr 15 13:46:41 2014
Updated: Tue Apr 29 17:01:17 2014 by adeason

History
Subject: crash in 1.6.7 dafileserver
Date: Mon, 14 Apr 2014 09:49:30 -0400 (EDT)
To: openafs-bugs@openafs.org
From: Eric Sturdivant <sturdiva@umd.edu>

Attempted to upgrade to 1.6.7, and immediately get a crash in the
fileserver:

(gdb) where
#0 0x00000000 in ?? ()
#1 0xff1787e4 in krb5_storage_seek (sp=0x27569f50, offset=0, whence=1) at store.c:173
#2 0xff1511f0 in fkt_next_entry_int (context=0x27569f50, id=0x2756a3b8, entry=0xf7ffb1e8, cursor=0xf7ffb1dc, start=0x0, end=0x0) at keytab_file.c:457
#3 0xff15160c in fkt_next_entry (context=0x27569f50, id=0x2756a3b8, entry=0xf7ffb1e8, cursor=0xf7ffb1dc) at keytab_file.c:512
#4 0xff14f2d4 in krb5_kt_next_entry (context=0x27569f50, id=0x2756a3b8, entry=0xf7ffb1e8, cursor=0xf7ffb1dc) at keytab.c:769
#5 0x00098810 in pick_principal ()
#6 0x000988e8 in pick_enctype_and_principal ()
#7 0x00098cd0 in get_credv5_akimpersonate ()
#8 0x00097edc in K5Auth ()
#9 0x00097fb0 in GenericAuth ()
#10 0x00098188 in afsconf_ClientAuthSecure ()
#11 0x0003e334 in hpr_Initialize ()
#12 0x0003e58c in hpr_NameToId ()
#13 0x00043024 in MapName_r ()
#14 0x00043590 in h_FindClient_r ()
#15 0x0002ac8c in CallPreamble ()
#16 0x000310a4 in SRXAFS_FetchStatus ()
#17 0x00089678 in _RXAFS_FetchStatus ()
#18 0x0008ed00 in RXAFS_ExecuteRequest ()
#19 0x000afd3c in rxi_ServerProc ()
#20 0x00092494 in rx_ServerProc ()
#21 0x00091db0 in server_entry ()
#22 0xfedcaee8 in _lwp_start () from /lib/libc.so.1


Previous working version was 1.6.2


--
Eric Sturdivant
University of Maryland
Division of Information Technology
Enterprise Unix Services
Subject: Re: [rt.central.org #131852] New Ticket: crash in 1.6.7 dafileserver
Date: Tue, 15 Apr 2014 12:38:42 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
On Mon, 14 Apr 2014 09:49:44 -0400
Eric Sturdivant via RT <openafs-bugs@openafs.org> wrote:

> Attempted to upgrade to 1.6.7, and immediately get a crash in the
> fileserver:

fyi, this code path was not touched with 1.6.7, but it was introduced in
1.6.5. If you just want the security fixes for 1.6.7, you should be able
to apply them directly to 1.6.2 in the meantime.

> (gdb) where
> #0 0x00000000 in ?? ()
> #1 0xff1787e4 in krb5_storage_seek (sp=0x27569f50, offset=0, whence=1) at
> store.c:173

This is a crash in libkrb5. That may or may not be our fault, but can
you say what version of libkrb5 this is?

--
Andrew Deason
adeason@sinenomine.net
Download (untitled)
text/html 101b
This is a crash in Heimdal 1.5.2 caused by the use of a keytab file that does not exist.

Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Tue, 15 Apr 2014 13:47:19 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
On Tue, 15 Apr 2014 13:46:41 -0400
Jeffrey Altman via RT <openafs-bugs@openafs.org> wrote:

> This is a crash in Heimdal 1.5.2 caused by the use of a keytab file
> that does not exist.

Do you mean to say that it's a known issue for Heimdal, and there's a
fix for it?

And so I assume we should just test to see if the file exists
beforehand. Simple enough.
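
A minimal sketch of such a pre-check, assuming the keytab is named either by a plain path or with a "FILE:" prefix (as in the "FILE:/etc/krb5.keytab" default). The helper name keytab_file_exists is hypothetical; it is not an existing OpenAFS or Heimdal function, and this is not the fix that was actually applied.

```c
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of OpenAFS or Heimdal): returns 1 if
 * `name` refers to an existing regular file, 0 otherwise. An optional
 * "FILE:" prefix is skipped before the check.
 */
static int keytab_file_exists(const char *name)
{
    struct stat st;

    if (name == NULL)
        return 0;
    if (strncmp(name, "FILE:", 5) == 0)
        name += 5;
    if (stat(name, &st) != 0)
        return 0;               /* missing or inaccessible */
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

A caller could skip the keytab path entirely when this returns 0. The file can still disappear between the check and the open, so this only narrows the window; it does not replace correct error handling in the library.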

--
Andrew Deason
adeason@sinenomine.net
Download (untitled)
text/html 707b
On Tue Apr 15 14:47:30 2014, adeason wrote:
> On Tue, 15 Apr 2014 13:46:41 -0400
> Jeffrey Altman via RT <openafs-bugs@openafs.org> wrote:
>
> > This is a crash in Heimdal 1.5.2 caused by the use of a keytab file
> > that does not exist.
>
> Do you mean to say that it's a known issue for Heimdal, and there's a
> fix for it?

It is not a previously known issue for Heimdal 1.5.x. It is a bug.

Either Heimdal needs to create an empty file when asked to resolve the keytab,
or it needs to check for existence before attempting to read from it.



Download (untitled)
text/plain 843b
I cannot reproduce this, at least based on the simple explanation
provided above. If there is no rxkad.keytab file, krb5_kt_start_seq_get
returns ENOENT, so we never get to the code path indicated in this
ticket. I see this when I try it, and it's pretty easy to trace through
the code:

krb5_kt_start_seq_get (akimpersonate.c:428) ->
fkt_start_seq_get (keytab.c:739) ->
fkt_start_seq_get_int (keytab_file.c:438) ->
open (keytab_file.c:382)

The open call around there seems correct to me:

    c->fd = open (d->filename, flags);
    if (c->fd < 0) {
        ret = errno;
        krb5_set_error_message(context, ret,
                               N_("keytab %s open failed: %s", ""),
                               d->filename, strerror(ret));
        return ret;
    }

So... am I missing something?

--
Andrew Deason
adeason@sinenomine.net
Download (untitled)
text/plain 549b
On Mon Apr 14 09:49:44 2014, sturdiva@umd.edu wrote:
> (gdb) where
> #0 0x00000000 in ?? ()
> #1 0xff1787e4 in krb5_storage_seek (sp=0x27569f50, offset=0, whence=1) at
> store.c:173
> #2 0xff1511f0 in fkt_next_entry_int (context=0x27569f50, id=0x2756a3b8,
> entry=0xf7ffb1e8, cursor=0xf7ffb1dc, start=0x0, end=0x0) at
> keytab_file.c:457

Also just noticed, 'sp' and 'context' are pointing to the same thing; that's pretty strange. It'd be
pretty interesting to know what 'print *context' looks like.

--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 12:37:13 -0400 (EDT)
To: Andrew Deason via RT <openafs-bugs@openafs.org>
From: Eric Sturdivant <sturdiva@umd.edu>
On Wed, 23 Apr 2014, Andrew Deason via RT wrote:

>
> <URL: https://rt.central.org/rt/Ticket/Display.html?id=131852 >
>
> On Mon Apr 14 09:49:44 2014, sturdiva@umd.edu wrote:
>> (gdb) where
>> #0 0x00000000 in ?? ()
>> #1 0xff1787e4 in krb5_storage_seek (sp=0x27569f50, offset=0, whence=1) at
>> store.c:173
>> #2 0xff1511f0 in fkt_next_entry_int (context=0x27569f50, id=0x2756a3b8,
>> entry=0xf7ffb1e8, cursor=0xf7ffb1dc, start=0x0, end=0x0) at
>> keytab_file.c:457
>
> Also just noticed, 'sp' and 'context' are pointing to the same thing; that's pretty strange. It'd be
> pretty interesting to know what 'print *context' looks like.
>
>


(gdb) up
#1 0xff1787e4 in krb5_storage_seek (sp=0x27570158, offset=0, whence=1) at
store.c:173
173 return (*sp->seek)(sp, offset, whence);


(gdb) up
#2 0xff1511f0 in fkt_next_entry_int (context=0x27570158, id=0x27571558,
entry=0xf7ffb1e8, cursor=0xf7ffb1dc, start=0x0,
end=0x0) at keytab_file.c:457
457 pos = krb5_storage_seek(cursor->sp, 0, SEEK_CUR);


(gdb) print *context
$1 = {etypes = 0x0, etypes_des = 0x0, as_etypes = 0x0, tgs_etypes = 0x0,
permitted_enctypes = 0x0, default_realms = 0x0,
max_skew = 300, kdc_timeout = 3, max_retries = 3, kdc_sec_offset = 0,
kdc_usec_offset = 0, cf = 0x26aade28,
et_list = 0x27570828, warn_dest = 0x0, debug_dest = 0x0, cc_ops =
0x275708c8, num_cc_ops = 5, http_proxy = 0x0,
time_fmt = 0xff185198 "%Y-%m-%dT%H:%M:%S", log_utc = 0, default_keytab =
0xff185148 "FILE:/etc/krb5.keytab",
default_keytab_modify = 0x0, use_admin_kdc = 0, extra_addresses = 0x0,
scan_interfaces = 1, srv_lookup = 1,
srv_try_txt = 0, fcache_vno = 0, num_kt_types = 6, kt_types =
0x275713c8, date_fmt = 0xff1851c0 "%Y-%m-%d",
error_string = 0x27571598 "keytab /usr/afs/etc/rxkad.keytab open failed:
Error 0", error_code = 0,
ignore_addresses = 0x0, default_cc_name = 0x0, default_cc_name_env =
0x0, default_cc_name_set = 0, mutex = 0x159ce0,
large_msg_size = 1400, flags = 7, send_to_kdc = 0x0, hx509ctx =
0x26ff4858}


--
Eric Sturdivant
University of Maryland
Division of Information Technology
Enterprise Unix Services
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 12:17:46 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
On Thu, 24 Apr 2014 12:37:23 -0400
Eric Sturdivant via RT <openafs-bugs@openafs.org> wrote:

> error_string = 0x27571598 "keytab /usr/afs/etc/rxkad.keytab open failed:
> Error 0", error_code = 0,

Pretty bizarre; yeah, that would do it. Heimdal maybe redeclaring errno
masking the 'real' one? Or 'open' getting defined to something that
clears errno? Just guessing.
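
To illustrate why "Error 0" would do it: a function that returns errno after a failed call reports success whenever errno reads back as 0, and the caller presumably carries on with a cursor that was never set up. The sketch below shows one defensive pattern against that. The name open_checked and the EIO fallback are assumptions made for illustration; this is not Heimdal's code.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical wrapper: opens `path` and returns 0 on success, or a
 * nonzero error code on failure. If errno unexpectedly reads as 0
 * after a failed open(), EIO is substituted (an arbitrary choice) so
 * that a failure is never reported as success.
 */
static int open_checked(const char *path, int flags, int *fd_out)
{
    int fd = open(path, flags);

    if (fd < 0) {
        int ret = errno;        /* capture immediately after the call */
        if (ret == 0)
            ret = EIO;          /* assumption: any nonzero code will do */
        return ret;
    }
    *fd_out = fd;
    return 0;
}
```

This only masks the symptom. If errno reads as 0 here, something is wrong with how errno is declared or built, and that is still worth tracking down.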

Can you provide the output of:

(gdb) disassemble fkt_start_seq_get_int

Did you build this heimdal yourself, or get it from somewhere?

(And yeah, I realize this is a heimdal bug, but I think OpenAFS would
still like to know how common it is, and if I can reproduce it
non-artificially, etc...)

--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 13:24:02 -0400 (EDT)
To: Andrew Deason via RT <openafs-bugs@openafs.org>
From: Eric Sturdivant <sturdiva@umd.edu>
On Thu, 24 Apr 2014, Andrew Deason via RT wrote:

>
> <URL: https://rt.central.org/rt/Ticket/Display.html?id=131852 >
>
> On Thu, 24 Apr 2014 12:37:23 -0400
> Eric Sturdivant via RT <openafs-bugs@openafs.org> wrote:
>
>> error_string = 0x27571598 "keytab /usr/afs/etc/rxkad.keytab open failed:
>> Error 0", error_code = 0,
>
> Pretty bizarre; yeah, that would do it. Heimdal maybe redeclaring errno
> masking the 'real' one? Or 'open' getting defined to something that
> clears errno? Just guessing.
>
> Can you provide the output of:
>
> (gdb) disassemble fkt_start_seq_get_int
>
> Did you build this heimdal yourself, or get it from somewhere?
>
> (And yeah, I realize this is a heimdal bug, but I think OpenAFS would
> still like to know how common it is, and if I can reproduce it
> non-artificially, etc...)
>
>

Dump of assembler code for function fkt_start_seq_get_int:
0xff150dbc <+0>: save %sp, -112, %sp
0xff150dc0 <+4>: sethi %hi(0x4e800), %l7
0xff150dc4 <+8>: add %l7, 0x2d8, %l7 ! 0x4ead8 <fs_stateReadV+52>
0xff150dc8 <+12>: call 0xff115f3c <__sparc_get_pc_thunk.l7>
0xff150dcc <+16>: nop
0xff150dd0 <+20>: st %i0, [ %fp + 0x44 ]
0xff150dd4 <+24>: st %i1, [ %fp + 0x48 ]
0xff150dd8 <+28>: st %i2, [ %fp + 0x4c ]
0xff150ddc <+32>: st %i3, [ %fp + 0x50 ]
0xff150de0 <+36>: st %i4, [ %fp + 0x54 ]
0xff150de4 <+40>: ld [ %fp + 0x48 ], %g1
0xff150de8 <+44>: ld [ %g1 + 0x2c ], %g1
0xff150dec <+48>: st %g1, [ %fp + -4 ]
0xff150df0 <+52>: ld [ %fp + -4 ], %g1
0xff150df4 <+56>: ld [ %g1 ], %g1
0xff150df8 <+60>: mov %g1, %o0
0xff150dfc <+64>: ld [ %fp + 0x4c ], %o1
0xff150e00 <+68>: call 0xff1a0930 <open64@plt>
0xff150e04 <+72>: nop
0xff150e08 <+76>: mov %o0, %g2
0xff150e0c <+80>: ld [ %fp + 0x54 ], %g1
0xff150e10 <+84>: st %g2, [ %g1 ]
0xff150e14 <+88>: ld [ %fp + 0x54 ], %g1
0xff150e18 <+92>: ld [ %g1 ], %g1
0xff150e1c <+96>: cmp %g1, 0
0xff150e20 <+100>: bge %icc, 0xff150e88 <fkt_start_seq_get_int+204>
0xff150e24 <+104>: nop
0xff150e28 <+108>: sethi %hi(0), %g1
0xff150e2c <+112>: xor %g1, 0x50, %g1
0xff150e30 <+116>: ld [ %l7 + %g1 ], %g1
0xff150e34 <+120>: ld [ %g1 ], %g1
0xff150e38 <+124>: st %g1, [ %fp + -8 ]
0xff150e3c <+128>: ld [ %fp + -4 ], %g1
0xff150e40 <+132>: ld [ %g1 ], %i5
0xff150e44 <+136>: ld [ %fp + -8 ], %o0
0xff150e48 <+140>: call 0xff1a00fc <strerror@plt>
0xff150e4c <+144>: nop
0xff150e50 <+148>: mov %o0, %g2
0xff150e54 <+152>: ld [ %fp + 0x44 ], %o0
0xff150e58 <+156>: ld [ %fp + -8 ], %o1
0xff150e5c <+160>: sethi %hi(0x18000), %g1
0xff150e60 <+164>: xor %g1, -456, %g1
0xff150e64 <+168>: add %l7, %g1, %g1
0xff150e68 <+172>: mov %g1, %o2
0xff150e6c <+176>: mov %i5, %o3
0xff150e70 <+180>: mov %g2, %o4
0xff150e74 <+184>: call 0xff19f9f4 <krb5_set_error_message@plt>
0xff150e78 <+188>: nop
0xff150e7c <+192>: ld [ %fp + -8 ], %g1
0xff150e80 <+196>: b %xcc, 0xff151160 <fkt_start_seq_get_int+932>
0xff150e84 <+200>: nop
0xff150e88 <+204>: ld [ %fp + 0x54 ], %g1
0xff150e8c <+208>: ld [ %g1 ], %g1
0xff150e90 <+212>: mov %g1, %o0
0xff150e94 <+216>: call 0xff1a01bc <rk_cloexec@plt>
0xff150e98 <+220>: nop
0xff150e9c <+224>: ld [ %fp + 0x54 ], %g1
0xff150ea0 <+228>: ld [ %g1 ], %g2
0xff150ea4 <+232>: ld [ %fp + -4 ], %g1
0xff150ea8 <+236>: ld [ %g1 ], %g1
0xff150eac <+240>: ld [ %fp + 0x44 ], %o0
0xff150eb0 <+244>: mov %g2, %o1
0xff150eb4 <+248>: ld [ %fp + 0x50 ], %o2
0xff150eb8 <+252>: mov %g1, %o3
0xff150ebc <+256>: call 0xff1a0c18 <_krb5_xlock@plt>
0xff150ec0 <+260>: nop
0xff150ec4 <+264>: st %o0, [ %fp + -8 ]
0xff150ec8 <+268>: ld [ %fp + -8 ], %g1
0xff150ecc <+272>: cmp %g1, 0
0xff150ed0 <+276>: be %icc, 0xff150ef8 <fkt_start_seq_get_int+316>
0xff150ed4 <+280>: nop
0xff150ed8 <+284>: ld [ %fp + 0x54 ], %g1
0xff150edc <+288>: ld [ %g1 ], %g1
0xff150ee0 <+292>: mov %g1, %o0
0xff150ee4 <+296>: call 0xff1a01d4 <close@plt>
0xff150ee8 <+300>: nop
0xff150eec <+304>: ld [ %fp + -8 ], %g1
0xff150ef0 <+308>: b %xcc, 0xff151160 <fkt_start_seq_get_int+932>
0xff150ef4 <+312>: nop
0xff150ef8 <+316>: ld [ %fp + 0x54 ], %g1
0xff150efc <+320>: ld [ %g1 ], %g1
0xff150f00 <+324>: mov %g1, %o0
0xff150f04 <+328>: call 0xff1a0cf0 <krb5_storage_from_fd@plt>
0xff150f08 <+332>: nop
0xff150f0c <+336>: mov %o0, %g2
0xff150f10 <+340>: ld [ %fp + 0x54 ], %g1
0xff150f14 <+344>: st %g2, [ %g1 + 4 ]
0xff150f18 <+348>: ld [ %fp + 0x54 ], %g1
0xff150f1c <+352>: ld [ %g1 + 4 ], %g1
0xff150f20 <+356>: cmp %g1, 0
0xff150f24 <+360>: bne %icc, 0xff150f84 <fkt_start_seq_get_int+456>
0xff150f28 <+364>: nop
0xff150f2c <+368>: ld [ %fp + 0x54 ], %g1
0xff150f30 <+372>: ld [ %g1 ], %g1
0xff150f34 <+376>: ld [ %fp + 0x44 ], %o0
0xff150f38 <+380>: mov %g1, %o1
0xff150f3c <+384>: call 0xff1a0c24 <_krb5_xunlock@plt>
0xff150f40 <+388>: nop
0xff150f44 <+392>: ld [ %fp + 0x54 ], %g1
0xff150f48 <+396>: ld [ %g1 ], %g1
0xff150f4c <+400>: mov %g1, %o0
0xff150f50 <+404>: call 0xff1a01d4 <close@plt>
0xff150f54 <+408>: nop
0xff150f58 <+412>: ld [ %fp + 0x44 ], %o0
0xff150f5c <+416>: mov 0xc, %o1
0xff150f60 <+420>: sethi %hi(0x18000), %g1
0xff150f64 <+424>: xor %g1, -832, %g1
0xff150f68 <+428>: add %l7, %g1, %g1
0xff150f6c <+432>: mov %g1, %o2
0xff150f70 <+436>: call 0xff19f9f4 <krb5_set_error_message@plt>
0xff150f74 <+440>: nop
0xff150f78 <+444>: mov 0xc, %g1 ! 0xc
0xff150f7c <+448>: b %xcc, 0xff151160 <fkt_start_seq_get_int+932>
0xff150f80 <+452>: nop
0xff150f84 <+456>: ld [ %fp + 0x54 ], %g1
0xff150f88 <+460>: ld [ %g1 + 4 ], %g1
0xff150f8c <+464>: mov %g1, %o0
0xff150f90 <+468>: sethi %hi(0x96c73800), %g1
0xff150f94 <+472>: or %g1, 0x2b6, %o1 ! 0x96c73ab6
0xff150f98 <+476>: call 0xff1a0c9c <krb5_storage_set_eof_code@plt>
0xff150f9c <+480>: nop
0xff150fa0 <+484>: ld [ %fp + 0x54 ], %g1
0xff150fa4 <+488>: ld [ %g1 + 4 ], %g2
0xff150fa8 <+492>: add %fp, -9, %g1
0xff150fac <+496>: mov %g2, %o0
0xff150fb0 <+500>: mov %g1, %o1
0xff150fb4 <+504>: call 0xff1a0cfc <krb5_ret_int8@plt>
0xff150fb8 <+508>: nop
0xff150fbc <+512>: st %o0, [ %fp + -8 ]
0xff150fc0 <+516>: ld [ %fp + -8 ], %g1
0xff150fc4 <+520>: cmp %g1, 0
0xff150fc8 <+524>: be %icc, 0xff151028 <fkt_start_seq_get_int+620>
0xff150fcc <+528>: nop
0xff150fd0 <+532>: ld [ %fp + 0x54 ], %g1
0xff150fd4 <+536>: ld [ %g1 + 4 ], %g1
0xff150fd8 <+540>: mov %g1, %o0
0xff150fdc <+544>: call 0xff19fc64 <krb5_storage_free@plt>
0xff150fe0 <+548>: nop
0xff150fe4 <+552>: ld [ %fp + 0x54 ], %g1
0xff150fe8 <+556>: ld [ %g1 ], %g1
0xff150fec <+560>: ld [ %fp + 0x44 ], %o0
0xff150ff0 <+564>: mov %g1, %o1
0xff150ff4 <+568>: call 0xff1a0c24 <_krb5_xunlock@plt>
0xff150ff8 <+572>: nop
0xff150ffc <+576>: ld [ %fp + 0x54 ], %g1
0xff151000 <+580>: ld [ %g1 ], %g1
0xff151004 <+584>: mov %g1, %o0
0xff151008 <+588>: call 0xff1a01d4 <close@plt>
0xff15100c <+592>: nop
0xff151010 <+596>: ld [ %fp + 0x44 ], %o0
0xff151014 <+600>: call 0xff19f9b8 <krb5_clear_error_message@plt>
0xff151018 <+604>: nop
0xff15101c <+608>: ld [ %fp + -8 ], %g1
0xff151020 <+612>: b %xcc, 0xff151160 <fkt_start_seq_get_int+932>
0xff151024 <+616>: nop
0xff151028 <+620>: ldub [ %fp + -9 ], %g1
0xff15102c <+624>: sll %g1, 0x18, %g1
0xff151030 <+628>: sra %g1, 0x18, %g1
0xff151034 <+632>: cmp %g1, 5
0xff151038 <+636>: be %icc, 0xff15109c <fkt_start_seq_get_int+736>
0xff15103c <+640>: nop
0xff151040 <+644>: ld [ %fp + 0x54 ], %g1
0xff151044 <+648>: ld [ %g1 + 4 ], %g1
0xff151048 <+652>: mov %g1, %o0
0xff15104c <+656>: call 0xff19fc64 <krb5_storage_free@plt>
0xff151050 <+660>: nop
0xff151054 <+664>: ld [ %fp + 0x54 ], %g1
0xff151058 <+668>: ld [ %g1 ], %g1
0xff15105c <+672>: ld [ %fp + 0x44 ], %o0
0xff151060 <+676>: mov %g1, %o1
0xff151064 <+680>: call 0xff1a0c24 <_krb5_xunlock@plt>
0xff151068 <+684>: nop
0xff15106c <+688>: ld [ %fp + 0x54 ], %g1
0xff151070 <+692>: ld [ %g1 ], %g1
0xff151074 <+696>: mov %g1, %o0
0xff151078 <+700>: call 0xff1a01d4 <close@plt>
0xff15107c <+704>: nop
0xff151080 <+708>: ld [ %fp + 0x44 ], %o0
0xff151084 <+712>: call 0xff19f9b8 <krb5_clear_error_message@plt>
0xff151088 <+716>: nop
0xff15108c <+720>: sethi %hi(0x96c73800), %g1
0xff151090 <+724>: or %g1, 0x2d5, %g1 ! 0x96c73ad5
0xff151094 <+728>: b %xcc, 0xff151160 <fkt_start_seq_get_int+932>
0xff151098 <+732>: nop
0xff15109c <+736>: ld [ %fp + 0x54 ], %g1
0xff1510a0 <+740>: ld [ %g1 + 4 ], %g2
0xff1510a4 <+744>: add %fp, -10, %g1
0xff1510a8 <+748>: mov %g2, %o0
0xff1510ac <+752>: mov %g1, %o1
0xff1510b0 <+756>: call 0xff1a0cfc <krb5_ret_int8@plt>
0xff1510b4 <+760>: nop
0xff1510b8 <+764>: st %o0, [ %fp + -8 ]
0xff1510bc <+768>: ld [ %fp + -8 ], %g1
0xff1510c0 <+772>: cmp %g1, 0
0xff1510c4 <+776>: be %icc, 0xff151124 <fkt_start_seq_get_int+872>
0xff1510c8 <+780>: nop
0xff1510cc <+784>: ld [ %fp + 0x54 ], %g1
0xff1510d0 <+788>: ld [ %g1 + 4 ], %g1
0xff1510d4 <+792>: mov %g1, %o0
0xff1510d8 <+796>: call 0xff19fc64 <krb5_storage_free@plt>
0xff1510dc <+800>: nop
0xff1510e0 <+804>: ld [ %fp + 0x54 ], %g1
0xff1510e4 <+808>: ld [ %g1 ], %g1
0xff1510e8 <+812>: ld [ %fp + 0x44 ], %o0
0xff1510ec <+816>: mov %g1, %o1
0xff1510f0 <+820>: call 0xff1a0c24 <_krb5_xunlock@plt>
0xff1510f4 <+824>: nop
0xff1510f8 <+828>: ld [ %fp + 0x54 ], %g1
0xff1510fc <+832>: ld [ %g1 ], %g1
0xff151100 <+836>: mov %g1, %o0
0xff151104 <+840>: call 0xff1a01d4 <close@plt>
0xff151108 <+844>: nop
0xff15110c <+848>: ld [ %fp + 0x44 ], %o0
0xff151110 <+852>: call 0xff19f9b8 <krb5_clear_error_message@plt>
0xff151114 <+856>: nop
0xff151118 <+860>: ld [ %fp + -8 ], %g1
0xff15111c <+864>: b %xcc, 0xff151160 <fkt_start_seq_get_int+932>
0xff151120 <+868>: nop
0xff151124 <+872>: ldub [ %fp + -10 ], %g1
0xff151128 <+876>: sll %g1, 0x18, %g1
0xff15112c <+880>: sra %g1, 0x18, %g2
0xff151130 <+884>: ld [ %fp + 0x48 ], %g1
0xff151134 <+888>: st %g2, [ %g1 + 0x30 ]
0xff151138 <+892>: ld [ %fp + 0x54 ], %g1
0xff15113c <+896>: ld [ %g1 + 4 ], %g2
0xff151140 <+900>: ld [ %fp + 0x48 ], %g1
0xff151144 <+904>: ld [ %g1 + 0x30 ], %g1
0xff151148 <+908>: ld [ %fp + 0x44 ], %o0
0xff15114c <+912>: mov %g2, %o1
0xff151150 <+916>: mov %g1, %o2
0xff151154 <+920>: call 0xff150d00 <storage_set_flags>
0xff151158 <+924>: nop
0xff15115c <+928>: clr %g1 ! 0x0
0xff151160 <+932>: mov %g1, %i0
0xff151164 <+936>: rett %i7 + 8
0xff151168 <+940>: nop




We built it locally, but the only patches we have applied relate to
password quality checks and password lifetime, nothing that should have an
impact on this.

--
Eric Sturdivant
University of Maryland
Division of Information Technology
Enterprise Unix Services
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 12:30:52 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
On Thu, 24 Apr 2014 13:24:16 -0400
Eric Sturdivant via RT <openafs-bugs@openafs.org> wrote:

> We built it locally, but the only patches we have applied relate to
> password quality checks and password lifetime, nothing that should
> have an impact on this.

It's more about build environment and such rather than patches; or more
generally I was wondering why I couldn't reproduce it here. The missing
piece was probably just "sparc". Solaris I assume? 10 or 11?

--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 13:37:30 -0400 (EDT)
To: Andrew Deason via RT <openafs-bugs@openafs.org>
From: Eric Sturdivant <sturdiva@umd.edu>
On Thu, 24 Apr 2014, Andrew Deason via RT wrote:

>
> <URL: https://rt.central.org/rt/Ticket/Display.html?id=131852 >
>
> On Thu, 24 Apr 2014 13:24:16 -0400
> Eric Sturdivant via RT <openafs-bugs@openafs.org> wrote:
>
>> We built it locally, but the only patches we have applied relate to
>> password quality checks and password lifetime, nothing that should
>> have an impact on this.
>
> It's more about build environment and such rather than patches; or more
> generally I was wondering why I couldn't reproduce it here. The missing
> piece was probably just "sparc". Solaris I assume? 10 or 11?
>
>

solaris 10


--
Eric Sturdivant
University of Maryland
Division of Information Technology
Enterprise Unix Services
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 13:43:46 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>

...or maybe it's a bug in gcc? (I assume heimdal was compiled with gcc;
what version?)

On Thu, 24 Apr 2014 13:24:16 -0400
Eric Sturdivant via RT <openafs-bugs@openafs.org> wrote:

> Dump of assembler code for function fkt_start_seq_get_int:
> 0xff150dbc <+0>: save %sp, -112, %sp
> 0xff150dc0 <+4>: sethi %hi(0x4e800), %l7
> 0xff150dc4 <+8>: add %l7, 0x2d8, %l7 ! 0x4ead8
> <fs_stateReadV+52>
> 0xff150dc8 <+12>: call 0xff115f3c <__sparc_get_pc_thunk.l7>
> 0xff150dcc <+16>: nop
[...]
> 0xff150e00 <+68>: call 0xff1a0930 <open64@plt>
> 0xff150e04 <+72>: nop
[...]
> 0xff150e28 <+108>: sethi %hi(0), %g1
> 0xff150e2c <+112>: xor %g1, 0x50, %g1
> 0xff150e30 <+116>: ld [ %l7 + %g1 ], %g1
> 0xff150e34 <+120>: ld [ %g1 ], %g1
> 0xff150e38 <+124>: st %g1, [ %fp + -8 ]
> 0xff150e3c <+128>: ld [ %fp + -4 ], %g1
> 0xff150e40 <+132>: ld [ %g1 ], %i5
> 0xff150e44 <+136>: ld [ %fp + -8 ], %o0
> 0xff150e48 <+140>: call 0xff1a00fc <strerror@plt>
> 0xff150e4c <+144>: nop

I don't know a whole lot about sparc, but that really doesn't seem
right. If I'm reading that correctly, it looks like it's trying to use
%l7 to calculate the location of errno, but %l7 was set 2 function calls
ago. That's a "local"/"temp" register, so its value can be overwritten
across calls.

If that's where the problem is, it's a bug in gcc's PIC handling on
SPARC, I think. Or something in that area. Maybe it's possible it's
still a bug in heimdal because it's passing flags to gcc that cause it
to do this or something; this behavior is probably affected by
optimization flags.

--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 15:26:48 -0400 (EDT)
To: Andrew Deason via RT <openafs-bugs@openafs.org>
From: Eric Sturdivant <sturdiva@umd.edu>

From the heimdal config.log:



configure:3183: checking for gcc
configure:3199: found /usr/local/bin/gcc
configure:3210: result: gcc
configure:3439: checking for C compiler version
configure:3448: gcc --version >&5
gcc (GCC) 4.6.1
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.

configure:3459: $? = 0
configure:3448: gcc -v >&5
Reading specs from
/afs/.glue.umd.edu/system/1.5/@sys/usr/local/gcc/4.6.1/bin/../lib/gcc/sparc-sun-solaris2.10/4.6.1/specs
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/afs/.glue.umd.edu/system/1.5/@sys/usr/local/gcc/4.6.1/bin/../libexec/gcc/sparc-sun-solaris2.10/4.6.1/lto-wrapper
Target: sparc-sun-solaris2.10
Configured with: ../src/configure --prefix=/usr/local/gcc/4.6.1
--enable-languages=c,c++,fortran,java --with-gmp=/usr/local/gmp
--with-mpfr=/usr/local/mpfr --with-mpc=/usr/local/mpc
Thread model: posix
gcc version 4.6.1 (GCC)


On Thu, 24 Apr 2014, Andrew Deason via RT wrote:

>
> <URL: https://rt.central.org/rt/Ticket/Display.html?id=131852 >
>
>
> ...or maybe it's a bug in gcc? (I assume heimdal was compiled with gcc;
> what version?)
>
> On Thu, 24 Apr 2014 13:24:16 -0400
> Eric Sturdivant via RT <openafs-bugs@openafs.org> wrote:
>
>> Dump of assembler code for function fkt_start_seq_get_int:
>> 0xff150dbc <+0>: save %sp, -112, %sp
>> 0xff150dc0 <+4>: sethi %hi(0x4e800), %l7
>> 0xff150dc4 <+8>: add %l7, 0x2d8, %l7 ! 0x4ead8
>> <fs_stateReadV+52>
>> 0xff150dc8 <+12>: call 0xff115f3c <__sparc_get_pc_thunk.l7>
>> 0xff150dcc <+16>: nop
> [...]
>> 0xff150e00 <+68>: call 0xff1a0930 <open64@plt>
>> 0xff150e04 <+72>: nop
> [...]
>> 0xff150e28 <+108>: sethi %hi(0), %g1
>> 0xff150e2c <+112>: xor %g1, 0x50, %g1
>> 0xff150e30 <+116>: ld [ %l7 + %g1 ], %g1
>> 0xff150e34 <+120>: ld [ %g1 ], %g1
>> 0xff150e38 <+124>: st %g1, [ %fp + -8 ]
>> 0xff150e3c <+128>: ld [ %fp + -4 ], %g1
>> 0xff150e40 <+132>: ld [ %g1 ], %i5
>> 0xff150e44 <+136>: ld [ %fp + -8 ], %o0
>> 0xff150e48 <+140>: call 0xff1a00fc <strerror@plt>
>> 0xff150e4c <+144>: nop
>
> I don't know a whole lot about sparc, but that really doesn't seem
> right. If I'm reading that correctly, it looks like it's trying to use
> %l7 to calculate the location of errno, but %l7 was set 2 function calls
> ago. That's a "local"/"temp" register, so its value can be overwritten
> across calls.
>
> If that's where the problem is, it's a bug in gcc's PIC handling on
> SPARC, I think. Or something in that area. Maybe it's possible it's
> still a bug in heimdal because it's passing flags to gcc that cause it
> to do this or something; this behavior is probably affected by
> optimization flags.
>
>

--
Eric Sturdivant
University of Maryland
Division of Information Technology
Enterprise Unix Services
Download (untitled)
text/html 124b
i remember an issue with errno, gcc, and thread support producing bogus results, but am having little luck finding it.
Download (untitled)
text/html 146b
it was basically this
https://groups.google.com/forum/#!topic/memcached/7KYial6Bbzk

of course, that is a much older gcc.
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 17:02:38 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
On Thu, 24 Apr 2014 17:19:21 -0400
D Brashear via RT <openafs-bugs@openafs.org> wrote:

> it was basically this
> https://groups.google.com/forum/#!topic/memcached/7KYial6Bbzk
>
> of course, that is a much older gcc.

Yes, but I don't think the version of gcc matters. In that thread, and I
think also here, the code just isn't getting built with _REENTRANT, so
we get the normal non-threaded errno declaration.

I thought that stuff with 'calculating' the errno address was getting
the TLS errno, but no, I think that's just PIC stuff (%l7 is the GOT
base or whatever; and that happens with other code, so for whatever
reason it's okay to keep %l7 across function calls, apparently).
Building with threading properly seems to always give you an actual
function call to calculate it.

I can reproduce the underlying issue with a small test program. If you
call krb5_kt_start_seq_get against a nonexistent file from the main
thread, you get ENOENT returned. If you call it from another thread, you
get 0 (because open64 is setting the per-thread errno).
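
The same effect can be sketched without libkrb5 at all. This is not the test program mentioned above; it is a standalone illustration using plain open(), and the function names and the path are made up for the example. Code compiled with the threaded errno should see ENOENT in the second thread, while code compiled against the plain global errno would be expected to see 0 there.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/*
 * Thread body: try to open a path that should not exist and record
 * the errno value this thread observes afterwards.
 */
static void *probe_errno(void *arg)
{
    int *seen = arg;
    int fd;

    errno = 0;
    fd = open("/nonexistent-dir/rxkad.keytab", O_RDONLY);
    if (fd >= 0) {
        close(fd);
        *seen = -1;             /* unexpected: the path exists */
    } else {
        *seen = errno;
    }
    return NULL;
}

/*
 * Returns the errno value seen in a secondary thread after a failed
 * open(), or -1 if the thread could not be run or the path exists.
 */
static int errno_seen_in_thread(void)
{
    pthread_t tid;
    int seen = -1;

    if (pthread_create(&tid, NULL, probe_errno, &seen) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return seen;
}
```

Comparing the result of errno_seen_in_thread() against ENOENT gives a quick yes/no answer for a given set of compiler flags. Depending on the platform, the thread library may need to be requested explicitly when linking.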

So, heimdal needs to be building with _REENTRANT or some other defines,
or it is effectively unusable from threaded code. The relevant defines
are from errno.h on my solaris box:

#if defined(_REENTRANT) || defined(_TS_ERRNO) || _POSIX_C_SOURCE - 0 >= 199506L
extern int *___errno();
#define errno (*(___errno()))
#else
extern int errno;
[...]

So I assume if you enable enough "extensions" or whatever, you get the
function call errno even without _REENTRANT.
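
A rough way to see which branch a given set of compiler flags selects is to evaluate the same preprocessor condition in the code being built. The sketch below copies the condition from the header quoted above; both function names are hypothetical, and on other platforms these macros may not control errno in the same way (the C standard leaves it unspecified whether errno is a macro at all).

```c
#include <errno.h>

/*
 * Returns 1 if the condition quoted from the Solaris errno.h holds
 * for this translation unit, 0 otherwise.
 */
static int threaded_errno_selected(void)
{
#if defined(_REENTRANT) || defined(_TS_ERRNO) || (_POSIX_C_SOURCE - 0 >= 199506L)
    return 1;
#else
    return 0;
#endif
}

/*
 * Returns 1 if errno is a preprocessor macro in this translation
 * unit, 0 if it is an ordinary identifier.
 */
static int errno_is_macro(void)
{
#ifdef errno
    return 1;
#else
    return 0;
#endif
}
```

Printing both values from a file compiled with the same flags as the library shows whether those flags select the function-call errno.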

--
Andrew Deason
adeason@sinenomine.net
Subject: Re: [rt.central.org #131852] crash in 1.6.7 dafileserver
Date: Thu, 24 Apr 2014 17:30:22 -0500
To: <openafs-bugs@openafs.org>
From: Andrew Deason <adeason@sinenomine.net>
On Thu, 24 Apr 2014 18:02:49 -0400
Andrew Deason via RT <openafs-bugs@openafs.org> wrote:

>
> <URL: https://rt.central.org/rt/Ticket/Display.html?id=131852 >
>
> On Thu, 24 Apr 2014 17:19:21 -0400
> D Brashear via RT <openafs-bugs@openafs.org> wrote:
>
> > it was basically this
> > https://groups.google.com/forum/#!topic/memcached/7KYial6Bbzk
> >
> > of course, that is a much older gcc.
>
> Yes, but I don't think the version of gcc matters. In that thread, and I
> think also here, the code just isn't getting built with _REENTRANT, so
> we get the normal non-threaded errno declaration.

...and this appears to be because heimdal keeps track of a
PTHREAD_CFLAGS variable to say e.g. -pthreads (gcc) or -mt (sunwspro),
but doesn't use it anywhere. It just adds e.g. -pthread to LIBS, but
that means compiling each individual .c file doesn't get -pthreads, so
we don't get the threaded errno, as we've seen here.

--
Andrew Deason
adeason@sinenomine.net
Download (untitled)
text/plain 266b
I can reproduce the underlying Heimdal issue on Solaris and AIX, and cannot reproduce it on
Linux and FreeBSD.

Here's a little test program that could be handy if we want to notify others and have a way for
them to check.

--
Andrew Deason
adeason@sinenomine.net
Subject: heimdaltest.c
Download heimdaltest.c
application/octet-stream 1.4k

Message body not shown because it is not plain text.