KerberosViaSSHNoAFS (2008-10-29, JanIven)
---++ AFS (and X11) broken when logging in via SSH

Starting around the last week of June, we have seen several support requests along the lines of "no AFS token after SSH" or "no X11 auth forwarding after SSH". There are several overlapping and possibly related symptoms. The typical user experience is along the lines of
<pre>
/usr/bin/X11/xauth: timeout in locking authority file /afs/cern.ch/user/f/foo/.Xauthority
hepix: E: /usr/bin/fs returned error, no tokens?
</pre>
Unfortunately X11 forwarding also appears to be the very first write access to AFS, so several causes lead to similar symptoms. In all cases, the user *needs to supply* =ssh -vv ...= output; diagnosing these problems without it makes no sense at all.

This is possibly related to upgrading the last AFS KDC to Heimdal-1.

Different scenarios / symptoms:

---+++ corrupted .Xauthority-c

%X% *Update: This issue is believed to have been fixed*, with no new cases since mid-July, and no persistent =.Xauthority-c= files to be found...

Several users somehow ended up with a permanent =.Xauthority-c= file in their home directory. This appears to be due to some AFS corruption (the file can be "listed" but not stat'ed, removed or recreated). Bernard says that the "salvage" message somehow links this file to a file under =~/.gconf= (where short-lived lock files are created, which have occasionally upset AFS in the past). The only servers apparently affected are =afs22= and =afs91= (both on 1.4.4). The similarly-sized/-used =afs36= runs 1.4.6 and hasn't had an issue yet. [FIXME - to be confirmed?]

=/usr/afs/bin/salvager -part /vicepad -vol 1933776333 -showlog -nowrite= gives something like:
<PRE>
@(#) OpenAFS 1.4.4 built 2007-03-27 4294967295 0
07/04/2008 12:01:51 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager -part /vicepad -vol 1933776333 -showlog)
07/04/2008 12:01:51 2 nVolumesInInodeFile 64
07/04/2008 12:01:51 CHECKING CLONED VOLUME 1933780659.
07/04/2008 12:01:51 user.abenelli.backup (1933780659) updated 07/03/2008 15:54
07/04/2008 12:01:51 SALVAGING VOLUME 1933776333.
07/04/2008 12:01:51 user.abenelli (1933776333) updated 07/04/2008 12:01
...
07/04/2008 12:01:51 totalInodes 5367
07/04/2008 12:01:51 dir vnode 1: invalid entry: ./.Xauthority-c (vnode 5402, unique 851942)
07/04/2008 12:01:51 dir vnode 1: ./.Xauthority-c (vnode 5402): unique changed from 851942 to 0 -- deleted
07/04/2008 12:01:51 Found 23 orphaned files and directories (approx. 6356 KB)
07/04/2008 12:01:51 Salvaged user.abenelli (1933776333): 4871 files, 397709 blocks
</PRE>

%X% *Warning*: if two files claim the same vnode, the salvager will destroy the content of one of them at random!

%X% *NOT a workaround*: =afs_admin salvage $HOME= will make this file "normal" again, after which it can be removed and X11 forwarding works again (but some data may be lost, see the warning above).

_Current suspicion:_ AFS fileserver bug; possibly related to locking; possibly fixed in 1.4.6. *FIXED*.

---+++ SSH-1, Kerberos-5 TGT only

=ssh -1= with _only_ a Kerberos5 TGT (i.e. no Krb4 TGT or AFS token) on the sender side will not get AFS tokens on the destination.
<pre>
ssh(7727) debug1: Trying Kerberos v5 authentication.
ssh(7727) debug3: Trying to reverse map address 137.138.4.22.
ssh(7727) debug1: Kerberos v5 authentication accepted.
ssh(7727) debug1: Kerberos v5 TGT forwarded (foo@CERN.CH).
ssh(7727) debug1: Requesting compression at level 6.
</pre>
If the server happens to be running in debug mode, we also get (on the client):
<pre>
user_pty(11384) debug3: Cannot get AFS token via Krb5/MIT
</pre>
This issue is understood: the Kerberos ticket file name (via !KRB5CCNAME) is not transferred to PAM from the "unprivileged" ssh process that received the forwarded ticket. =pam_krb5afs= then says (in debug mode)
<pre>
Jul 4 11:00:44 lxcert-amd64 sshd[11383]: pam_krb5[11383]: no v5 creds for user 'foo', skipping session setup
Jul 4 11:00:44 lxcert-amd64 sshd[11383]: pam_krb5[11383]: pam_open_session returning 0 (Success)
</pre>
The CERN sshd is capable of receiving forwarded AFS tokens, and of converting forwarded Krb4 TGTs into AFS tokens, but *not* of doing this for Krb5 (this is not part of the MIT library or =krbafs=, and the daemon is not linked with either Heimdal or the =minikafs= library).

*Workaround/Solutions*:
   * use =ssh -2 ...=: the Kerberos5 TGT is transferred on a different code path (GSSAPI) that actually ends up in the right place on the receiver, or
   * ensure that the sender has a valid Kerberos4 TGT and/or AFS token (and that these also get passed over the SSH connection - but this will happen automatically if the Kerberos5 TGT is being transferred).

---+++ "temporary AFS token" gets dropped - FIXED.

%X% *FIXED*

Symptom is =dmesg= output like
<pre>
afs: Tokens for user of AFS id 1234 for cell cern.ch are discarded (rxkad error=19270407)
</pre>
which means (=translate_et=)
<pre>
19270407 (rxk).7 = security object was passed a bad ticket
</pre>
In other words, the client kernel module (= SSH server) loaded the token, tried to access something on AFS, was then told by the AFS server that the token is useless, and decided to remove it again. This seems to mostly affect "non-CERN" client machines (SL4, SL5) [FIXME - true?]

This is the most troublesome kind of ticket: it appears that this really is a recent change in behaviour, and the Linux ssh/sshd haven't been updated for some time.
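When triaging log excerpts sent in by users, the =translate_et= lookup can be approximated offline. The following is a minimal Python sketch and not an existing CERN tool: the function name =explain_discard=, the regular expression and the table base 19270400 (inferred from the =19270407 (rxk).7= output) are assumptions, and the message table holds only the two =translate_et= results quoted on this page.

```python
import re

# Assumption: base of the "rxk" error table, inferred from "19270407 (rxk).7".
RXKAD_BASE = 19270400

# Only the two messages quoted on this page; other offsets are left unknown.
KNOWN_MESSAGES = {
    7: "security object was passed a bad ticket",
    10: "sealed data inconsistent",
}

# Matches the kernel message format shown in the dmesg excerpts.
DISCARD_RE = re.compile(
    r"Tokens for user of AFS id (\S+) for cell (\S+) "
    r"are discarded \(rxkad error=(\d+)\)"
)


def explain_discard(line):
    """Return details of a 'tokens discarded' log line, or None if no match."""
    match = DISCARD_RE.search(line)
    if match is None:
        return None
    code = int(match.group(3))
    offset = code - RXKAD_BASE
    return {
        "afs_id": match.group(1),
        "cell": match.group(2),
        "code": code,
        "offset": offset,
        "message": KNOWN_MESSAGES.get(offset),  # None if not in the table
    }


if __name__ == "__main__":
    sample = ("afs: Tokens for user of AFS id 1234 for cell cern.ch "
              "are discarded (rxkad error=19270407)")
    print(explain_discard(sample))
```

Offsets that are not in the table come back with =message= set to =None=; =translate_et= remains the authoritative source for those.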
We seem to have several variants of this dropped-token problem:

---++++ for SSH sessions

In these cases the AFS token has been created from a forwarded KRB5 TGT via =pam_krb5afs= ([FIXME] - true for all cases?). Invoking "!GetToken" and "!SetToken" (same machine or other machine) on the soon-to-be-evicted token gives a working token. Invoking =afs5log -5=, =aklog= or =afs5log= on the forwarded Krb5 TGT yields a working token.

---++++ for "native" =klog=

This has also been reported by offsite users invoking =klog= directly on their non-CERNified machines:
   * "SL5, kernel 2.6.18-92.1.6.el5 and openafs 1.4.7 (openafs-1.4.7-68.SL5.i686)"
   * "64bit linux box, with a 2.6.23.9 kernel, using openafs version 1.4.7."

%X% *Update: FIXED*, was due to a bug in Heimdal padding tickets to 48 bytes.

---++++ for "native" =kinit=

One instance seen (CT560274). The error message in =/var/log/messages= is something like:
<pre>
Oct 29 13:47:23 HOST kernel: afs: Tokens for user of AFS id USERID for cell cern.ch are discarded (rxkad error=19270410)
</pre>
=translate_et= says
<pre>
19270410 (rxk).10 = sealed data inconsistent
</pre>
It looks like the AFS token is usable for a short time, but then gets thrown away by the kernel.

---+++ no AFS token after Public key authentication

This is old, has never worked and is well documented (e.g. on Q&A), but some recent calls appear to fall into this category. It surprises people who usually use Kerberos authentication/GSSAPI but have a "working" pubkey setup that kicks in whenever their credentials have expired, or who normally use pubkey with AFS token forwarding (an SSH-1 speciality) and have no valid AFS token at that moment.

Symptoms:
<pre>
ssh(7846) debug1: Trying RSA authentication with key '/home/foo/.ssh/identity'
ssh(7846) debug1: Received RSA challenge from server.
ssh(7846) debug1: Remote: RSA authentication accepted.
ssh(7846) debug1: RSA authentication accepted by server.
</pre>
or
<pre>
ssh(7874) debug1: Authentication succeeded (publickey).
</pre>

*Solutions*:
   * don't use Pubkey auth, or
   * don't expect write access to your AFS directory, and don't expect X11 to work (use =ssh -x=)

---+++ bad ~/.ssh/rc prevents X11 forwarding

(one case so far, AFS access actually works in this case)

If the user has a =~/.ssh/rc= file, normal X11 credential forwarding is broken unless that script is prepared to handle the X11 cookie itself.

Symptoms:
<pre>
ssh(7878) debug2: x11_get_proto: /usr/bin/X11/xauth list :0.0 2>/dev/null
ssh(7878) debug1: Requesting X11 forwarding with authentication spoofing.
...
ssh(7878) debug2: X11 auth data does not match fake data.
ssh(7878) X11 connection rejected because of wrong authentication.
ssh(7878) debug2: X11 rejected 1 i0/o0
</pre>
Since the file needs to be accessible by =sshd=, it is likely to be at least listable (if not readable/executable) for unauthenticated AFS users, which is easy to check for.

*Solution*:
   * get rid of =~/.ssh/rc= for a test
   * if really required: read the X11 cookie from STDIN and pass it via =/usr/bin/X11/xauth add "$DISPLAY" $cookie= if =$DISPLAY= is set, and make sure the user knows that we normally don't support such bricolage.

---++ Misc

There appears to be a small difference in the AFS token format, as stored in the kernel and obtainable via !GetToken: in some cases this says =Unix UID 1234=, in others =AFS ID 1234=. The =tokens= command actually expects these formats and translates the first into =User's (AFS ID 1234) tokens for afs@cern.ch [Expires ..]= and the second into =Tokens for afs@cern.ch [Expires Jul 5 ..]=. Repeated ssh logins into a machine from the same KRB5 TGT may get one or the other, apparently at random...
The origin is [[http://debathena.mit.edu/trac/browser/branches/vendor/third/openafs/src/auth/ktc.c#L648][OpenAFS src/auth/ktc.c:648, ktc_GetToken()]]
<pre>
500     struct ClearToken ct;
        (some fuzzing with copying over into ct, to be looked at)
636     if (ct.AuthHandle == -1) {
637         ct.AuthHandle = 999;
638     }
639     atoken->kvno = ct.AuthHandle;
648     if ((atoken->kvno == 999) ||    /* old style bcrypt ticket */
649         (ct.BeginTimestamp &&       /* new w/ prserver lookup */
650          (((ct.EndTimestamp - ct.BeginTimestamp) & 1) == 1))) {
651         sprintf(aclient->name, "AFS ID %d", ct.ViceId);
652     } else {
653         sprintf(aclient->name, "Unix UID %d", ct.ViceId);
</pre>
and that in turn apparently takes the "!AuthHandle" from something provided by the client. On that subject, http://osdir.com/ml/file-systems.openafs.general/2003-06/msg00290.html says
<pre>
The short answer is "it doesn't mean a thing".
</pre>
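As a reading aid, the branch in the =ktc_GetToken()= excerpt can be restated as a small Python sketch. This is only a paraphrase of the quoted lines: the function name =token_client_name= and its parameter names are hypothetical, and the copying into =ct= that the excerpt skips is not modelled.

```python
def token_client_name(auth_handle, begin_ts, end_ts, vice_id):
    """Mimic the name selection in the quoted ktc.c lines (hypothetical helper)."""
    # Lines 636-639: an AuthHandle of -1 is mapped to kvno 999.
    kvno = 999 if auth_handle == -1 else auth_handle

    # Line 650: test whether the token lifetime in seconds is odd.
    odd_lifetime = ((end_ts - begin_ts) & 1) == 1

    # Lines 648-653: kvno 999, or a non-zero start time with an odd
    # lifetime, gives "AFS ID"; everything else gives "Unix UID".
    if kvno == 999 or (begin_ts != 0 and odd_lifetime):
        return "AFS ID %d" % vice_id
    return "Unix UID %d" % vice_id


if __name__ == "__main__":
    # Same ViceId, lifetimes differing by one second.
    print(token_client_name(5, 1000, 1001, 1234))
    print(token_client_name(5, 1000, 1002, 1234))
```

With a kvno other than 999, the label depends only on whether the token lifetime is an odd or even number of seconds. That may be why repeated logins appear to get one form or the other at random, but this link is a guess and has not been verified.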
Topic revision: r5 - 2008-10-29 - JanIven
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.