GSSAPI should be preferred over using Kerberos5 directly (or even Kerberos4), but the current GSSAPI libraries resolve the server hostname on their own, independently of the already-established connection. This breaks authentication to DNS-load-balanced machines:
- client wants to connect to server LXPLUS
- client resolves LXPLUS to 1.2.3.4 (which is really LXPLUS001)
- client connects to 1.2.3.4
- client initiates GSSAPI authentication for LXPLUS
- GSSAPI forward-resolves LXPLUS, DNS returns 1.2.3.5 (next server in alias list)
- GSSAPI backward-resolves 1.2.3.5, DNS returns LXPLUS002
- GSSAPI obtains credentials for LXPLUS002
- client presents these via the established connection to LXPLUS001
- server (LXPLUS001) rejects them: "bad" = not for me.
This behaviour is mandated in RFC1964 - Kerberos for GSSAPI:
When a reference to a name of this type is resolved, the "hostname"
is canonicalized by attempting a DNS lookup and using the fully-
qualified domain name which is returned, or by using the "hostname"
as provided if the DNS lookup fails. The canonicalization operation
also maps the host's name into lower-case characters.
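The failure sequence described above (client and GSSAPI each doing their own lookup of a round-robin alias) can be mimicked offline. This is only a toy model in POSIX shell, not how any resolver or GSSAPI library is implemented: the helper names resolve_alias and reverse_lookup are made up for illustration, and the strict rotation through the alias list is an assumption.

```shell
# Toy model of the failure sequence; no real DNS or Kerberos involved.
ips="1.2.3.4 1.2.3.5"   # addresses behind the alias LXPLUS (example values)
query=0                 # number of forward lookups answered so far

# Forward lookup of the alias: stores the next address of the list in $answer.
# Call it directly, not in $(...), so that the counter survives.
resolve_alias() {
    set -- $ips
    shift $(( query % $# ))
    answer=$1
    query=$(( query + 1 ))
}

# Reverse lookup: address -> real host name.
reverse_lookup() {
    case "$1" in
        1.2.3.4) echo LXPLUS001 ;;
        1.2.3.5) echo LXPLUS002 ;;
    esac
}

resolve_alias; connected_ip=$answer   # lookup made by the client before connecting
resolve_alias; gssapi_ip=$answer      # independent lookup made inside GSSAPI

connected_host=$(reverse_lookup "$connected_ip")   # host at the other end of the connection
ticket_host=$(reverse_lookup "$gssapi_ip")         # host the service ticket is issued for

if [ "$connected_host" = "$ticket_host" ]; then
    result=accepted
else
    result=rejected
fi
echo "connected to $connected_host, ticket for $ticket_host: $result"
```

With the two example addresses this prints "connected to LXPLUS001, ticket for LXPLUS002: rejected", which is the mismatch from the list above.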
At CERN, we have seen two applications affected: SSH and CVS.
SSH
(see also more extensive documentation on SSH and Kerberos under
http://linux.web.cern.ch/linux/documentation/kerberos-access.shtml)
DNS aliases affected: LXPLUS, LX64SLC4, LX32SLC4, LXPARC, ISSCVS
The issue is tracked "upstream" at Bug#1008.
Client-side patches to acquire credentials for the currently connected IP address exist (and CERN's SLC4 openssh version is patched), but these may not get accepted upstream since, confusingly, RFC4462 - GSSAPI over SSH states that
Implementations of mechanisms conforming to this document MUST NOT
use the results of insecure DNS queries to construct the targ_name.
so it looks like this won't be resolved anytime soon for the general public.
However, server-side patches also exist that allow the SSH server to accept any service ticket it can decrypt (i.e. any ticket for which it has an entry in /etc/krb5.keytab). These are part of Simon Wilkinson's GSSAPI SSH patch set.
This would mean that either all servers in a set should have the keytab entries for all other servers as well (impractical for dynamically changing clusters), or that the client needs to be told to request credentials for a common service, i.e. by changing the DNS reverse resolution of the cluster members to the cluster name.
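Whether a cluster alias is affected can be checked by comparing the reverse names of all addresses behind it. A rough sketch in POSIX shell, assuming a glibc system where "getent ahostsv4" and "getent hosts" are available; the helper names forward_lookup, reverse_lookup and check_alias are hypothetical.

```shell
# Print every IPv4 address the name currently resolves to (one per line).
forward_lookup() {
    getent ahostsv4 "$1" | awk '{print $1}' | sort -u
}

# Print the canonical name that reverse DNS returns for an address.
reverse_lookup() {
    getent hosts "$1" | awk '{print $2}'
}

# check_alias NAME: print the distinct reverse names behind NAME and
# return 0 only if there is exactly one of them.
check_alias() {
    names=$(for ip in $(forward_lookup "$1"); do reverse_lookup "$ip"; done | sort -u)
    count=$(printf '%s\n' "$names" | grep -c . || true)
    printf '%s\n' "$names"
    [ "$count" -eq 1 ]
}
```

Running "check_alias lxplus.cern.ch" would list one name if all members reverse-resolve to the cluster name, and several names (with a non-zero exit status) if they resolve to individual hosts.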
Some recent SSH clients also support the
GSSAPITrustDns yes
directive, which fools GSSAPI into reverse-resolving the correct hostname.
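For clients that support it, the directive could be enabled per host in the client configuration. A sketch only: the per-user file ~/.ssh/config and the host pattern lxplus.cern.ch are assumptions (adjust to the alias actually used), and GSSAPIAuthentication is the usual OpenSSH switch for GSSAPI user authentication.

```
# ~/.ssh/config (sketch; requires a client built with GSSAPITrustDns support)
Host lxplus.cern.ch
    GSSAPIAuthentication yes
    GSSAPITrustDns yes
```

A client without this support will typically reject the unknown option, so it is safer to keep it in a host-specific block rather than in the global defaults.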
CVS
DNS alias: ISSCVS
CERN's version of cvs in SLC4 contains a small patch that gets around the problem by initiating the DNS resolution from the connected IP address instead of the requested host name. This works fine, but unpatched clients will fail to authenticate.
Error messages for these cases look like
GSSAPI authentication failed: An invalid name was supplied
or
GSSAPI authentication failed: lxcvsXY.cern.ch Miscellaneous failure/Unknown code krb5 144
Server-side, CERN's CVS version allows specifying which Kerberos service to use (default is cvs/hostname), thanks to a patch. The option is called
--GSS-service=cvs/isscvs.cern.ch
Still, the client has to present a credential that the server knows about. Changing the DNS reverse resolution of the cluster members to the cluster name again resolves this issue (adding all keytab entries does not, since the command-line option decides which entry to use, and that option is given before the client connects or presents its service ticket).
It initially appeared as if SLC5/RHEL5 (which use a newer cvs client than SLC4) were no longer affected by this issue, but it is now confirmed to still be a problem there, i.e. a client patch is required for CVS+ISSCVS. This has been filed with Red Hat for both RHEL4 and RHEL5. The corresponding RPMs are available from http://linuxsoft.cern.ch/cern/ (SLC5/i386, SLC5/x86_64, SLC4/i386, SLC4/x86_64 - but please check yourself whether newer versions are available).
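Reproducer (needs a CERN Kerberos TGT):
TMP=/tmp; count=0; ret=0
# Repeat the checkout until it fails; each attempt may land on a different host behind the alias.
while [ "$ret" -eq 0 ]; do
    kinit -R    # renew the TGT
    cvs -t -d :gserver:isscvs.cern.ch/local/reps/elfms co CVSROOT > "$TMP/.out" 2>&1
    ret=$?; count=$((count + 1))
done
echo "after $count loops"
# -E is needed so that "|" is treated as alternation
grep -E "Connecting|error" "$TMP/.out"
This should keep running for a long time on a patched machine; with a buggy CVS client it will bomb out after a few iterations.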
Other
In general, a GSSAPI service can choose to accept any decryptable ticket by acquiring its credentials with name=GSS_C_NO_NAME in gss_acquire_cred(), or by passing creds=GSS_C_NO_CREDENTIAL to gss_accept_sec_context().