FreeIPA Identity Management planet - technical blogs

June 01, 2019

Luc de Louw

OpenID and SAML authentication with Keycloak and FreeIPA

Not every web application can handle Kerberos SSO, but some provide OpenID and/or SAML. This is where Keycloak comes into the game. You can use Keycloak to federate users from different sources. This guide shows how to integrate Keycloak and … Continue reading

The post OpenID and SAML authentication with Keycloak and FreeIPA appeared first on Luc de Louw's Blog.

by Luc de Louw at June 01, 2019 08:43 PM

May 28, 2019

Fraser Tweedale

A Distinguished Name is not a string

A Distinguished Name is not a string

Distinguished Names (DNs) are used to identify entities in LDAP databases and X.509 certificates. Although DNs are often presented as strings, they have a complex structure. Because numerous formal and ad-hoc serialisations have been devised, and because ad-hoc or buggy parsers are prevalent, treating DNs as strings in the internals of a program inevitably leads to errors. In fact, dangerous security issues can arise!

In this post I will explain the structure of DNs, review the common serialisation regimes, and review some DN-related bugs in projects I worked on. I’ll conclude with my best practices recommendations for working with DNs.

DN structure

DNs are defined by the ITU-T X.501 standard as ASN.1 objects:

Name ::= CHOICE {
  -- only one possibility for now --
  rdnSequence RDNSequence }

RDNSequence ::= SEQUENCE OF RelativeDistinguishedName

DistinguishedName ::= RDNSequence

RelativeDistinguishedName ::=
  SET SIZE (1..MAX) OF AttributeTypeAndValue

AttributeTypeAndValue ::= SEQUENCE {
  type  ATTRIBUTE.&id({SupportedAttributes}),
  value ATTRIBUTE.&Type({SupportedAttributes}{@type}),
  ... }

The AttributeTypeAndValue definition refers to some other definitions. In effect, type is an object identifier (OID) of some supported attribute, and the syntax of value is determined by type. The term attribute-value assertion (AVA) is a common synonym for AttributeTypeAndValue.

Applications define a bounded set of supported attributes. For example the X.509 certificate standard suggests a minimal set of supported attributes, and an LDAP server’s schema defines all the attribute types understood by that server. Depending on the application, a program might fail to process a DN with an unrecognised attribute type, or it might process it just fine, treating the corresponding value as opaque data.

Whereas the order of AVAs within an RDN is insignificant (it is a SET), the order of RDNs within the DN is significant. If you view the list left-to-right, then the root is on the left. X.501 formalises it thus:

Each initial sub-sequence of the name of an object is also the name of an object. The sequence of objects so identified, starting with the root and ending with the object being named, is such that each is the immediate superior of that which follows it in the sequence.

This also means that the empty DN is a valid DN.

Comparing DNs

Testing DNs for equality is an important operation. For example, when constructing an X.509 certification path, we have to find a trusted CA certificate based on the certificate chain presented by an entity (e.g. a TLS server), then verify that the chain is complete by ensuring that each Issuer DN, starting from the end entity certificate, matches the Subject DN of the certificate “above” it, all the way up to a trusted CA certificate. (Then the signatures must be verified, and several more checks performed).

Continuing with this example, if an implementation falsely determines that two equal DNs (under X.500) are unequal, then it will fail to construct the certification path and reject the certificate. This is not good. But even worse would be if it decided that two unequal DNs are in fact equal! Similarly, if you are issuing certificates or creating LDAP objects or anything else, a user could exploit bugs in your DN handling code to cause you to issue certificates, or create objects, that you did not intend.

Having motivated the importance of correct DN comparison, well, how do you compare DNs correctly?

First, the program must represent the DNs according to their true structure: a list of sets (RDNs) of attribute-value pairs (AVAs). If the DNs are not already represented this way in the program, they must be parsed or processed—correctly.

Now that the structure is correct, AVAs can be compared for equality. Each attribute type defines an equality matching rule that says how values should be compared. In some cases this is just binary matching. In other cases, normalisation or other rules must be applied to the values. For example, some string types may be case insensitive.

A notable case is the DirectoryString syntax used by several attribute types in X.509:

DirectoryString ::= CHOICE {
    teletexString       TeletexString   (SIZE (1..MAX)),
    printableString     PrintableString (SIZE (1..MAX)),
    universalString     UniversalString (SIZE (1..MAX)),
    utf8String          UTF8String      (SIZE (1..MAX)),
    bmpString           BMPString       (SIZE (1..MAX)) }

DirectoryString supports a choice of string encodings. Values using the PrintableString or UTF8String encoding must be preprocessed using the LDAP Internationalized String Preparation rules (RFC 4518), including case folding and insignificant whitespace compression.

Taking the DN as a whole, two DNs are equal if they have the same RDNs in the same order, and two RDNs are equal if they have the same AVAs in any order (i.e. sets of equal size, with each AVA in one set having a matching AVA in the other set).
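
To make this concrete, here is a minimal sketch in Python of structural DN comparison. It assumes the DNs have already been parsed into a list of RDN sets; real code must dispatch on each attribute type's equality matching rule, where this sketch only case-folds and compresses whitespace as a stand-in.

def normalise_ava(ava):
    attr_type, value = ava
    # Stand-in for the real matching rules (e.g. RFC 4518 string preparation):
    # case fold and compress insignificant whitespace.
    return (attr_type.lower(), " ".join(value.casefold().split()))

def dn_equal(dn_a, dn_b):
    # RDN order is significant; AVA order within an RDN is not.
    return len(dn_a) == len(dn_b) and all(
        {normalise_ava(ava) for ava in rdn_a} == {normalise_ava(ava) for ava in rdn_b}
        for rdn_a, rdn_b in zip(dn_a, dn_b)
    )

# The example DN from above, root first.
dn = [
    {("C", "AU"), ("ST", "Queensland")},
    {("O", "Acme, Inc.")},
    {("CN", "CA")},
]
assert dn_equal(dn, [
    {("ST", "queensland"), ("C", "au")},  # different AVA order and case
    {("O", "Acme,  Inc.")},               # insignificant whitespace differs
    {("CN", "ca")},
])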

Ultimately this means that, despite X.509 certificates using the Distinguished Encoding Rules (DER) for serialisation, there can still be multiple ways to represent equivalent data (by using different string encodings). Therefore, binary matching of serialised DNs, or even binary matching of individual attribute values, is incorrect behaviour and may lead to failures.

String representations

Several string representations of DNs, both formally-specified and ad-hoc, are in widespread use. In this section I’ll list some of the more important ones.

Because DNs are ordered, one of the most obvious characteristics of a string representation is whether it lists the RDNs in forward or reverse order, i.e. with the root at the left or right. Some popular libraries and programs differ in this regard.

As we look at some of these common implementations, we’ll use the following DN as an example:

SEQUENCE (3 elem)
  SET (2 elem)
    SEQUENCE (2 elem)
      OBJECT IDENTIFIER 2.5.4.6 countryName
      PrintableString AU
    SEQUENCE (2 elem)
      OBJECT IDENTIFIER 2.5.4.8 stateOrProvinceName
      PrintableString Queensland
  SET (1 elem)
    SEQUENCE (2 elem)
      OBJECT IDENTIFIER 2.5.4.10 organizationName
      PrintableString Acme, Inc.
  SET (1 elem)
    SEQUENCE (2 elem)
      OBJECT IDENTIFIER 2.5.4.3 commonName
      PrintableString CA

RFC 4514

CN=CA,O=Acme\, Inc.,C=AU+ST=Queensland
CN=CA,O=Acme\2C Inc.,C=AU+ST=Queensland

RFC 4514 defines the string representation of distinguished names used in LDAP. As such, there is widespread library support for parsing and printing DNs in this format. The RDNs are in reverse order, separated by ,. Special characters are escaped using backslash (\), and can be represented using the escaped character itself (e.g. \,) or two hex nibbles (\2C). The AVAs within a multi-valued RDN are separated by +, in any order.

Due to the multiple ways of escaping special characters, this is not a distinguished encoding.
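
As a quick illustration (a sketch assuming python-ldap is available), both escape forms above decode to the same underlying value when parsed:

# Sketch: parsing the two RFC 4514 strings above with python-ldap.
# str2dn returns a list of RDNs, each a list of (type, value, flags) tuples.
from ldap.dn import str2dn

dn1 = str2dn(r"CN=CA,O=Acme\, Inc.,C=AU+ST=Queensland")
dn2 = str2dn(r"CN=CA,O=Acme\2C Inc.,C=AU+ST=Queensland")

assert dn1 == dn2                     # the escapes decode to the same value
assert dn1[1][0][1] == "Acme, Inc."   # the O attribute value, unescaped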

This format is used by GnuTLS, OpenLDAP and FreeIPA, among other projects.

RFC 1485

CN=CA,O="Acme, Inc.",C=AU+ST=Queensland

RFC 1485 is a predecessor of a predecessor (RFC 1779) of a predecessor (RFC 2253) of RFC 4514. There are some differences from RFC 4514. For example, special character escapes are not supported; quotes must be used. This format is still relevant today because NSS uses it for pretty-printing and parsing DNs.

OpenSSL

OpenSSL prints DNs in its own special way. Unlike most other implementations, it works with DNs in forward order (root at left). The pretty-print looks like:

C = AU + ST = Queensland, O = "Acme, Inc.", CN = CA

The format when parsing is different again. Some commands need a flag to enable support for multi-valued RDNs; e.g. openssl req -multivalue-rdn ....

/C=AU+ST=Queensland/O=Acme, Inc./CN=CA

OpenSSL can also read DNs from a config file where AVAs are given line by line (see config and x509v3_config(5)). But this is not a DN string representation per se so I won’t cover it here.

Bugs, bugs, bugs

Here are three interesting bugs I discovered, related to DN string encoding.

389 DS #49543: certmap fails when Issuer DN has comma in name

389 DS supports TLS certificate authentication for binding to LDAP. Different certificate mapping (certmap) policies can be defined for different CAs. The issuer DN in the client certificate is used to look up a certmap configuration. Unfortunately, a string comparison was used to perform this lookup. 389 uses NSS, which serialised the DN using RFC 1485 syntax. If this disagreed with how the DN in the certmap configuration appeared (after normalisation), the lookup—hence the LDAP bind—would fail. The normalisation function was also buggy.

The fix was to parse the certmap DN string into an NSS CERTName using the CERT_AsciiToName routine, then compare the Issuer DN from the certificate against it using the NSS DN comparison routine (CERT_CompareName). The buggy normalisation routine was deleted.

Certmonger #90: incorrect DN in CSR

Certmonger stores tracking request configuration in a flat text file. This configuration includes the string representation of the DN, ostensibly in RFC 4514 syntax. When constructing a CSR for the tracking request, it parsed the DN then used the result to construct an OpenSSL X509_NAME, which would be used in OpenSSL routines to create the CSR.

Unfortunately, the DN parsing implementation—a custom routine in Certmonger itself—was busted. A DN string like:

CN=IPA RA,O=Acme\, Inc.,ST=Massachusetts,C=US

Resulted in a CSR with the following DN:

CN=IPA RA,CN=Inc.,O=Acme\\,ST=Massachusetts,C=US

The fix was to remove the buggy parser and use the OpenLDAP ldap_str2dn routine instead. This was a joint effort between Rob Crittenden and myself.

FreeIPA #7750: invalid modlist when attribute encoding can vary

FreeIPA’s LDAP library, ipaldap, uses python-ldap for the low-level handling and provides a lot of useful functionality on top. One useful thing it does is keep track of the original attribute values for an object, so that we can perform changes locally and efficiently produce a list of modifications (modlist) for when we want to update the object at the server.

ipaldap did not take into account the possibility of the attribute encoding returned by python-ldap differing from the attribute encoding produced by FreeIPA. A disagreement could arise when DN attribute values contained special characters requiring escaping. For example, python-ldap escaped characters using hex encoding:

CN=CA,O=Red Hat\2C Inc.,L=Brisbane,C=AU

The representation produced by python-ldap is recorded as the original value of the attribute. However, if you wrote the same attribute value back, it would pass through FreeIPA’s encoding routine, which might encode it differently and record it as a new value:

CN=CA,O=Red Hat\, Inc.,L=Brisbane,C=AU

When you go to update the object, the modlist would look like:

[ (ldap.MOD_ADD, 'ipacaissuerdn',
    [b'CN=CA,O=Red Hat\, Inc.,L=Brisbane,C=AU'])
, (ldap.MOD_DELETE, 'ipacaissuerdn',
    [b'CN=CA,O=Red Hat\2C Inc.,L=Brisbane,C=AU'])
]

Though encoded differently, these are the same value, but that in itself is not the problem. The problem is that the server also has the same value, and processing the MOD_ADD first results in an attributeOrValueExists error. You can’t add a value that’s already there!

The ideal fix for this would be to update ipaldap to record all values as ASN.1 data or DER, rather than strings. But that would be a large and risky change. Instead, we work around the issue by always putting deletes before adds in the modlist. LDAP servers process changes in the order they are presented (389 DS does so atomically). So deleting an attribute value then adding it straight back is a safe, albeit inefficient, workaround.
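
In python-ldap terms, the workaround amounts to something like the following sketch (the real ipaldap code is more involved):

import ldap

def deletes_first(modlist):
    # Order MOD_DELETE operations before MOD_ADD so the stale encoding is
    # removed before the differently-encoded replacement value is added.
    return sorted(modlist, key=lambda mod: 0 if mod[0] == ldap.MOD_DELETE else 1)

modlist = deletes_first([
    (ldap.MOD_ADD, 'ipacaissuerdn',
        [b'CN=CA,O=Red Hat\\, Inc.,L=Brisbane,C=AU']),
    (ldap.MOD_DELETE, 'ipacaissuerdn',
        [b'CN=CA,O=Red Hat\\2C Inc.,L=Brisbane,C=AU']),
])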

Discussion

So you have to compare or handle some DNs. What do you do? My recommendations are:

  • If you need to print/parse DNs as strings, use RFC 4514 if possible, because it has the most widespread library support.
  • Don’t write your own DN parsing code. This is where security vulnerabilities are most likely. Use existing library routines for parsing DNs. If you have no other choice, take extreme care and, if possible, use a parser combinator library or parser generator to make the definitions more declarative and reduce the likelihood of error.
  • Always decode attribute values (if the DN parsing routine doesn’t do it for you). This avoids confusion where attribute values could be encoded in different ways (due to escaped characters or differing string encodings).
  • Use established library routines for comparing DNs using the internal DN structures, not strings.

Above all, just remember: a Distinguished Name is not a string, so don’t treat it like a string. For sure it’s more work, but DNs need special treatment or bugs will certainly arise.

That’s not to say that “native” DN parsing and comparison routines are bug-free. They are not. A common error is equal DNs comparing as unequal due to differing attribute string encodings (e.g. PrintableString versus UTF8String). I have written about this in a previous post. In Dogtag we’ve encountered this kind of bug quite a few times. In these situations the DN comparison should be fixed, but it may be a satisfactory workaround to serialise both DNs and perform a string comparison.

Another common issue is lack of support for multi-valued RDNs. A few years ago we wanted to switch FreeIPA’s certificate handling from python-nss to the cryptography library. I had to add support for multi-valued RDNs before we could make the switch.

A final takeaway for authors of standards. Providing multiple ways to serialise the same value leads to incompatibilities and bugs. For sure, there is a tradeoff between usability, implementation complexity and risk of interoperability issues and bugs. RFC 4514 would be less human-friendly if it only permitted hex-escapes. But implementations would be simpler and the interop/bug risk would be reduced. It’s important to think about these tradeoffs and the consequences, especially for standards and protocols relating to security.

May 28, 2019 12:00 AM

May 24, 2019

Fraser Tweedale

Fixing expired system certificates in FreeIPA

Fixing expired system certificates in FreeIPA

In previous posts I outlined and demonstrated the pki-server cert-fix tool. This tool is part of Dogtag PKI. I also discussed what additional functionality would be needed to successfully use this tool in a FreeIPA environment.

This post details the result of the effort to make cert-fix useful for FreeIPA administrators. We implemented a wrapper program, ipa-cert-fix, which performs FreeIPA-specific steps in addition to executing pki-server cert-fix.

What does ipa-cert-fix do?

In brief, the steps performed by ipa-cert-fix are:

  1. Inspect the deployment to work out which certificates need renewing. This includes both Dogtag system certificates and FreeIPA-specific certificates (HTTP, LDAP, KDC and IPA RA).
  2. Print intentions and await operator confirmation.
  3. Invoke pki-server cert-fix to renew expired certificates, including FreeIPA-specific certificates.
  4. Install renewed FreeIPA-specific certificates to their respective locations.
  5. If any shared certificates were renewed (Dogtag system certificates excluding HTTP, and IPA RA), import them to the LDAP ca_renewal subtree and set the caRenewalMaster configuration to be the current server. This allows CA replicas to pick up the renewed shared certificates.
  6. Restart FreeIPA (ipactl restart).

Demonstration

For this demonstration I used a deployment with the following characteristics:

  • Two servers, f29-0 and f29-1, with CA on both.
  • f29-0 is the current CA renewal master.
  • A KRA instance is installed on f29-1.
  • The deployment was created on 2019-05-24, so most of the certificates expire on or before 2021-05-24 (the exception being the CA certificate).

On both servers I disabled chronyd and put the clock forward 27 months, so that all the certificates (except the IPA CA itself) are expired:

[f29-1] ftweedal% sudo systemctl stop chronyd
[f29-1] ftweedal% date
Fri May 24 12:01:16 AEST 2019
[f29-1] ftweedal% sudo date 082412012021
Tue Aug 24 12:01:00 AEST 2021

We want to perform this step on all machines in the topology. After all, we are simulating the passage of time.

After ipactl restart the Dogtag CA did not start, and we cannot communicate with FreeIPA due to the expired HTTP certificate:

[f29-1] ftweedal% sudo ipactl status
Directory Service: RUNNING
krb5kdc Service: RUNNING
kadmin Service: RUNNING
httpd Service: RUNNING
ipa-custodia Service: RUNNING
pki-tomcatd Service: STOPPED
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful

[f29-1] ftweedal% ipa user-find
ipa: ERROR: cannot connect to 'https://f29-1.ipa.local/ipa/json':
  [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed:
  certificate has expired (_ssl.c:1056)

Fixing the first server

I will repair f29-1 first, so that we can see why resetting the CA renewal master is an important step performed by ipa-cert-fix.

I ran ipa-cert-fix as root. It analyses the server, then prints a warning and the list of certificates to be renewed, and asks for confirmation:

[f29-1] ftweedal% sudo ipa-cert-fix

                          WARNING

ipa-cert-fix is intended for recovery when expired certificates
prevent the normal operation of FreeIPA.  It should ONLY be used
in such scenarios, and backup of the system, especially certificates
and keys, is STRONGLY RECOMMENDED.


The following certificates will be renewed:

Dogtag sslserver certificate:
  Subject: CN=f29-1.ipa.local,O=IPA.LOCAL 201905222205
  Serial:  13
  Expires: 2021-05-12 05:55:47

Dogtag subsystem certificate:
  Subject: CN=CA Subsystem,O=IPA.LOCAL 201905222205
  Serial:  4
  Expires: 2021-05-11 12:07:11

Dogtag ca_ocsp_signing certificate:
  Subject: CN=OCSP Subsystem,O=IPA.LOCAL 201905222205
  Serial:  2
  Expires: 2021-05-11 12:07:11

Dogtag ca_audit_signing certificate:
  Subject: CN=CA Audit,O=IPA.LOCAL 201905222205
  Serial:  5
  Expires: 2021-05-11 12:07:12

Dogtag kra_transport certificate:
  Subject: CN=KRA Transport Certificate,O=IPA.LOCAL 201905222205
  Serial:  268369921
  Expires: 2021-05-12 06:00:10

Dogtag kra_storage certificate:
  Subject: CN=KRA Storage Certificate,O=IPA.LOCAL 201905222205
  Serial:  268369922
  Expires: 2021-05-12 06:00:10

Dogtag kra_audit_signing certificate:
  Subject: CN=KRA Audit,O=IPA.LOCAL 201905222205
  Serial:  268369923
  Expires: 2021-05-12 06:00:11

IPA IPA RA certificate:
  Subject: CN=IPA RA,O=IPA.LOCAL 201905222205
  Serial:  7
  Expires: 2021-05-11 12:07:47

IPA Apache HTTPS certificate:
  Subject: CN=f29-1.ipa.local,O=IPA.LOCAL 201905222205
  Serial:  12
  Expires: 2021-05-23 05:54:11

IPA LDAP certificate:
  Subject: CN=f29-1.ipa.local,O=IPA.LOCAL 201905222205
  Serial:  11
  Expires: 2021-05-23 05:53:58

IPA KDC certificate:
  Subject: CN=f29-1.ipa.local,O=IPA.LOCAL 201905222205
  Serial:  14
  Expires: 2021-05-23 05:57:50

Enter "yes" to proceed:

Observe that the KRA certificates are included (we are on f29-1). I type “yes” and continue. After a few minutes the process has completed:

Proceeding.
Renewed Dogtag sslserver certificate:
  Subject: CN=f29-1.ipa.local,O=IPA.LOCAL 201905222205
  Serial:  268369925
  Expires: 2023-08-14 02:19:33

... (9 certificates elided)

Renewed IPA KDC certificate:
  Subject: CN=f29-1.ipa.local,O=IPA.LOCAL 201905222205
  Serial:  268369935
  Expires: 2023-08-25 02:19:42

Becoming renewal master.
The ipa-cert-fix command was successful

As suggested by the expiry dates, it took about 11 seconds to renew all 11 certificates. So why did the whole run take so long? The pki-server cert-fix command, which is part of Dogtag and invoked by ipa-cert-fix, restarts the Dogtag instance as its final step. Although a new LDAP certificate was issued, it has not yet been installed in 389’s certificate database. Dogtag fails to start; it cannot talk to LDAP because of the expired certificate, and the restart operation hangs for a while. ipa-cert-fix knows to expect this and ignores the pki-server cert-fix failure when the LDAP certificate needs renewal.

ipa-cert-fix also reported that it was setting the renewal master (because shared certificates were renewed). Let’s check the server status and verify the configuration.

[f29-1] ftweedal% sudo ipactl status
Directory Service: RUNNING
krb5kdc Service: RUNNING
kadmin Service: RUNNING
httpd Service: RUNNING
ipa-custodia Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful

The server is up and running.

[f29-1] ftweedal% kinit admin
Password for admin@IPA.LOCAL:
Password expired.  You must change it now.
Enter new password:
Enter it again:

Passwords have expired (due to time-travel).

[f29-1] ftweedal% ipa config-show |grep renewal
  IPA CA renewal master: f29-1.ipa.local

f29-1 has indeed become the renewal master. Oh, and the HTTP and LDAP certificates have been fixed.

[f29-1] ftweedal% ipa cert-show 1 | grep Subject
  Subject: CN=Certificate Authority,O=IPA.LOCAL 201905222205

And the IPA framework can talk to Dogtag. This proves that the IPA RA and Dogtag HTTPS and subsystem certificates are valid.

Fixing subsequent servers

Jumping back onto f29-0, let’s look at the Certmonger request statuses:

[f29-0] ftweedal% sudo getcert list \
                  | egrep '^Request|status:|subject:'
Request ID '20190522120745':
        status: CA_UNREACHABLE
        subject: CN=IPA RA,O=IPA.LOCAL 201905222205
Request ID '20190522120831':
        status: CA_UNREACHABLE
        subject: CN=CA Audit,O=IPA.LOCAL 201905222205
Request ID '20190522120832':
        status: CA_UNREACHABLE
        subject: CN=OCSP Subsystem,O=IPA.LOCAL 201905222205
Request ID '20190522120833':
        status: CA_UNREACHABLE
        subject: CN=CA Subsystem,O=IPA.LOCAL 201905222205
Request ID '20190522120834':
        status: MONITORING
        subject: CN=Certificate Authority,O=IPA.LOCAL 201905222205
Request ID '20190522120835':
        status: CA_UNREACHABLE
        subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205
Request ID '20190522120903':
        status: CA_UNREACHABLE
        subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205
Request ID '20190522120932':
        status: CA_UNREACHABLE
        subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205
Request ID '20190522120940':
        status: CA_UNREACHABLE
        subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205

The MONITORING request is the CA certificate. All the other requests are stuck in CA_UNREACHABLE.

The Certmonger tracking requests need to communicate with LDAP to retrieve the shared certificates. So we have to run ipactl restart with --force to ignore individual service startup failures (Dogtag will fail):

[f29-0] ftweedal% sudo ipactl restart --force
Skipping version check
Starting Directory Service
Starting krb5kdc Service
Starting kadmin Service
Starting httpd Service
Starting ipa-custodia Service
Starting pki-tomcatd Service
Starting ipa-otpd Service
ipa: INFO: The ipactl command was successful

[f29-0] ftweedal% sudo ipactl status
Directory Service: RUNNING
krb5kdc Service: RUNNING
kadmin Service: RUNNING
httpd Service: RUNNING
ipa-custodia Service: RUNNING
pki-tomcatd Service: STOPPED
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful

Now Certmonger is able to renew the shared certificates by retrieving the new certificate from LDAP. The IPA-managed certificates are also able to be renewed by falling back to requesting them from another CA server (the already repaired f29-1). After a short wait, getcert list shows that all but one of the certificates have been renewed:

[f29-0] ftweedal% sudo getcert list \
                  | egrep '^Request|status:|subject:'
Request ID '20190522120745':
        status: MONITORING
        subject: CN=IPA RA,O=IPA.LOCAL 201905222205
Request ID '20190522120831':
        status: MONITORING
        subject: CN=CA Audit,O=IPA.LOCAL 201905222205
Request ID '20190522120832':
        status: MONITORING
        subject: CN=OCSP Subsystem,O=IPA.LOCAL 201905222205
Request ID '20190522120833':
        status: MONITORING
        subject: CN=CA Subsystem,O=IPA.LOCAL 201905222205
Request ID '20190522120834':
        status: MONITORING
        subject: CN=Certificate Authority,O=IPA.LOCAL 201905222205
Request ID '20190522120835':
        status: CA_UNREACHABLE
        subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205
Request ID '20190522120903':
        status: MONITORING
        subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205
Request ID '20190522120932':
        status: MONITORING
        subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205
Request ID '20190522120940':
        status: MONITORING
        subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205

The final CA_UNREACHABLE request is the Dogtag HTTP certificate. We can now run ipa-cert-fix on f29-0 to repair this certificate:

[f29-0] ftweedal% sudo ipa-cert-fix

                          WARNING

ipa-cert-fix is intended for recovery when expired certificates
prevent the normal operation of FreeIPA.  It should ONLY be used
in such scenarios, and backup of the system, especially certificates
and keys, is STRONGLY RECOMMENDED.


The following certificates will be renewed:

Dogtag sslserver certificate:
  Subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205
  Serial:  3
  Expires: 2021-05-11 12:07:11

Enter "yes" to proceed: yes
Proceeding.
Renewed Dogtag sslserver certificate:
  Subject: CN=f29-0.ipa.local,O=IPA.LOCAL 201905222205
  Serial:  15
  Expires: 2023-08-14 04:25:05

The ipa-cert-fix command was successful

All done?

Yep. A subsequent execution of ipa-cert-fix shows that there is nothing to do, and exits:

[f29-0] ftweedal% sudo ipa-cert-fix
Nothing to do.
The ipa-cert-fix command was successful

Feature status

Contrary to the usual procedure for FreeIPA (and Red Hat projects in general), ipa-cert-fix was developed “downstream-first”. It has been merged to the ipa-4-6 branch, but there might not even be another upstream release from that branch. But there might be a future RHEL release based on that branch (the savvy reader might infer a high degree of certainty, given we actually bothered to do that…)

In the meantime, work to forward-port the feature to master and newer branches is ongoing. I hope that it will be merged in the next week or so.

May 24, 2019 12:00 AM

May 07, 2019

Rob Crittenden

How do I revert back to using IPA-issued Web & LDAP certs?

For IPA v4.6.x.

So you have an IPA installation with a CA, and you decided you don’t want your users to have to install the IPA CA certificate(s), so instead you want to use certificates for the Web and LDAP from some known 3rd-party issuer. Sure, that works fine. You’d do something like:

Install the 3rd party CA chain and update your IPA master:

# ipa-cacert-manage install /path/to/root.pem -t CT,,
# ipa-cacert-manage install intermediate.cert.pem -t CT,,
# ipa-certupdate

Install the 3rd-party provided server certificate. In this case I have it as two separate files, the public cert and the private key.

# ipa-server-certinstall --dirman-password password -w -d --pin '' server.cert.pem server.cert.key root.pem \
intermediate.cert.pem

Great. IPA is working fine and my users don’t need to import the IPA CA.

Two years later…

My 3rd party certs are expiring soon, I don’t want to pay for new ones, and I want to switch back to using IPA-issued certificates. We can use certmonger for that. This assumes that your CA is still up and functioning properly.

I’d start by backing up the two NSS databases. It is safest to do this offline (ipactl stop). You need to copy *.db from /etc/dirsrv/slapd-EXAMPLE-TEST and /etc/httpd/alias someplace safe, then restart the world (ipactl start).

First the web server:

# ipa-getcert request -d /etc/httpd/alias -n Server-Cert -K HTTP/`hostname` -D `hostname` -C /usr/libexec/ipa/certmonger/restart_httpd -p /etc/httpd/alias/pwdfile.txt

Edit /etc/httpd/conf.d/nss.conf and replace the value of NSSNickname with Server-Cert.

Wait a bit to be sure the cert is issued. You can run this to see the status:

# ipa-getcert list -d /etc/httpd/alias -n Server-Cert

Now the LDAP server:

# ipa-getcert request -d /etc/dirsrv/slapd-EXAMPLE-TEST -n Server-Cert -D `hostname` -K ldap/`hostname` -C "/usr/libexec/ipa/certmonger/restart_dirsrv EXAMPLE-TEST" -p /etc/dirsrv/slapd-EXAMPLE-TEST/pwdfile.txt

Similarly wait for it to be issued. To track the status:

# ipa-getcert list -d /etc/dirsrv/slapd-EXAMPLE-TEST -n Server-Cert

Once it is issued run:

# ipactl stop

Now edit /etc/dirsrv/slapd-EXAMPLE-TEST/dse.ldif. We could do this while the server is online, but we need to restart anyway and your favorite editor is easier than ldapmodify. Replace the value of nsSSLPersonalitySSL with Server-Cert.

Now restart the world:

# ipactl start

Connect to each port if you want to confirm that the certificate and chain are correct, e.g.

# openssl s_client -host `hostname` -port 443
CONNECTED(00000003)
depth=1 O = EXAMPLE.TEST, CN = Certificate Authority
verify return:1
depth=0 O = EXAMPLE.TEST, CN = ipa.example.test
verify return:1
---
Certificate chain
0 s:/O=EXAMPLE.TEST/CN=ipa.example.test
i:/O=EXAMPLE.TEST/CN=Certificate Authority
1 s:/O=EXAMPLE.TEST/CN=Certificate Authority
i:/O=EXAMPLE.TEST/CN=Certificate Authority
---
...

by rcritten at May 07, 2019 02:43 AM

April 27, 2019

William Brown

Implementing Webauthn - a series of complexities …

Implementing Webauthn - a series of complexities …

I have recently started to work on a rust webauthn library, to allow servers to be implemented. However, in this process I have noticed a few complexities to an API that should have so much promise for improving the state of authentication. So far I can say I have not found any cryptographic issues, but the design of the standard does raise questions about the ability for people to correctly implement Webauthn servers.

Odd structure decisions

Webauthn is made up of multiple encoding standards. There is a good reason for this, which is that the JSON parts are for the web browser, and the CBOR parts are for CTAP and the authenticator device.

However, I quickly noticed an issue in the Attestation Object, as described here https://w3c.github.io/webauthn/#sctn-attestation . Can you see the problem?

The problem is that the Authenticator Data relies on hand-parsing bytes, and has two structures that are concatenated with no length. This means:

  • You have to hand parse bytes 0 -> 36
  • You then have to CBOR deserialise the Attested Cred Data (if present)
  • You then need to serialise the ACD back to bytes and record that length (if your library doesn’t tell you how much data it consumed while parsing).
  • Then you need to CBOR deserialise the Extensions.

What’s more insulting about this situation is that the Authenticator Data literally is part of the AttestationObject, which is already provided as CBOR! There seems to be no obvious reason for this to require hand-parsing: the Authenticator Data, which will be signature checked, has its byte form checked, so you could have the AttestationObject store authDataBytes, and then you could CBOR decode the nested structure (allowing the hashing of the bytes later).

There are many risks here, because now you have to length check all the parameters yourself, which people could get wrong, when CBOR would handle this correctly for you and give a good level of assurance that the structure is well formed. I also trust the CBOR parser authors to do proper length checks, compared to my crappy byte parsing code!
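
To illustrate what the hand-parsing looks like, here is a rough sketch in Python (not the Rust library) of the fixed 37-byte prefix; the field layout is taken from the WebAuthn specification, and the variable-length tail still needs a CBOR decoder that reports how many bytes it consumed:

import struct

def parse_auth_data_prefix(data: bytes):
    # Fixed prefix: rpIdHash (32 bytes), flags (1 byte), signCount (4 bytes, big-endian).
    if len(data) < 37:
        raise ValueError("authenticator data too short")
    rp_id_hash = data[0:32]
    flags = data[32]
    (sign_count,) = struct.unpack(">I", data[33:37])
    has_attested_cred_data = bool(flags & 0x40)  # AT flag
    has_extensions = bool(flags & 0x80)          # ED flag
    # data[37:] holds the attested credential data and/or extensions as
    # concatenated CBOR values, with no length prefix to delimit them.
    return rp_id_hash, flags, sign_count, has_attested_cred_data, has_extensions, data[37:]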

Confusing Naming Conventions and Layout

The entire standard is full of various names and structures which are complex and arbitrarily nested, and it is hard to see why they are designed this way. Perhaps it’s a legacy compatibility issue? More likely I think it’s object-oriented programming leaking into the specification, which is a paradigm that is not universally applicable.

Regardless, it would be good if the structures were flatter, and named better. There are many confusing structure names throughout the standard, and it can sometimes be hard to identify what you require and don’t require.

Additionally, the naming of fields and their use relies on abbreviations to save bandwidth, but this makes them hard to follow. I did honestly get confused about the difference between rp (the relying party name) and rp_id, where the challenge provides rp, and the browser response uses rp_id.

It can be easy to point fingers and say “ohh William, you’re just not reading it properly and are stupid”. Am I? Or is it that humans find it really hard to parse data like this, and our brains are better suited to other tasks? Human factors are important to consider in specification design: the naming of values, the consistency of their use, and appropriate communication as to how they are used properly. I’m finding this to be a barrier to correct implementation now (especially as the signature verification section is very fragmented and hard to follow …).

Crypto Steps seem complex or too static

There are a lot of possible choices here - there are 6 attestation formats and 5 attestation types. As some formats only do some types, there are then 11 verification paths you need to implement for all possible authenticators. I think this level of complexity will lead to mistakes over a large number of possible code branch paths, or lacking support for some device types which people may not have access to.

I think it may have been better to limit the attestation format to one well-defined format, and within that to limit the attestation types available, to suit a broader range of uses.

It feels a lot like these choices are part of internal Google/MS/other decisions for high-security or custom devices which will be used internally. It’s leaked into the spec, and it raises questions about the ability for people to meaningfully implement the full specification for all possible devices, let alone correctly.

Some parts even omit details in a cryptographic operation, such as here in verification step 2, where it doesn’t even list what format the bytes are in. (Hint: it’s DER x509).

What would I change?

  • Be more specific

There should be no assumptions about format types or about what is in the bytes. Be verbose, detailed and without ambiguity.

  • Use type safe, length checked structures.

I would probably make the entire thing a single CBOR structure which contains other nested structures as required. We should never have to hand-parse bytes in 2019, especially when there is a great deal of evidence to show the risks of expecting people to do this.

  • Don’t assume object orientation

I think simpler, flatter structures in the json/cbor would have helped, and been clearer to implement, rather than the really complex maze of types currently involved.

Summary

Despite these concerns, I still think webauthn is a really good standard, and I really do think it will become the future of authentication. I’m hoping to help make that a reality in opensource and I hope that in the future I can contribute to further development and promotion of webauthn.

April 27, 2019 02:00 PM

The Case for Ethics in OpenSource

The Case for Ethics in OpenSource

For a long time there have been incidents in technology which have caused negative effects on people - from leaks of private data, to interfaces that are not accessible, to issues like UIs doing things that may try to subvert a person’s intent. I’m sure there are many more: we could be here all day listing the various issues that exist in technology, from small to great.

The theme however is that these issues continue to happen: we continue to make decisions in applications that can have consequences to humans.

Software is pointless without people. People create software, people deploy software, people interact with software, and even software indirectly can influence people’s lives. At every layer people exist, and all software will affect them in some ways.

I think that today, we have made a lot of progress in our communities around the deployment of codes of conduct. These are great, and really help us to discuss the decisions and actions we take within our communities - with the people who create the software. I would like this to go further, where we can have a framework to discuss the effect of the software we write on people: the people that deploy, interact with and are influenced by our work.

Disclaimers

I’m not a specialist in ethics or morality: I’m not a registered or certified engineer in the legal sense. Finally, like all humans I am a product of my experiences which causes all my view points to be biased through the lens of my experience.

Additionally, I specialise in Identity Management software, so many of the ideas and issues I have encountered are really specific to this domain - which means I may overlook the issues in other areas. I also have a “security” mindset which also factors into my decisions too.

Regardless, I hope that this is a starting point to receive further input and advice from others, and a place where we can begin to improve.

The Problem

Let’s consider some issues and possible solutions in work that I’m familiar with - identity management software. Let’s list a few “features”. (Please don’t email me about how these are wrong, I know they are …)

  • Storing usernames as first and last name
  • Storing passwords in cleartext.
  • Deleting an account sets a flag to mark deletion
  • Names are used as the primary key
  • We request sex on signup
  • To change account details, you have to use a command line tool

Now “technically”, none of these decisions are incorrect at all. There is literally no bad technical decision here, and everything is “technically correct” (not always the best kind of correct).

What do we want to achieve?

I don’t believe it’s correct to dictate a set of rules that people should follow. People will be fatigued, and will find the process too hard. We need to trust that people can learn and want to improve. Instead I believe it’s important that we provide points that people can consider in a discussion around the development of software. The same way we discuss technical implementation details, we should discuss the potential human impact of every change we make. To realise this, we need a short list of important factors that relate to humans.

I think the following points are important to consider when designing software. These relate to general principles which I have learnt and researched.

People should be respected to have:

  • Informed consent
  • Choice over how they are identified
  • Ability to be forgotten
  • Individual Autonomy
  • Freedom from Harmful Discrimination
  • Privacy
  • Ability to meaningfully access and use software

There is already some evidence in research papers showing that there are strong reasons for moral positions in software: for example, to prevent harm coming to people, to respect people’s autonomy and to conform to privacy legislation (source).

Let’s apply these

Given our set of “features”, let’s now discuss these with the above points in mind.

  • Storing usernames as first and last name

This point clearly is in violation of the ability to choose how people are identified - some people may only have a single name, some may have multiple family names. On a different level this also violates the harmful discrimination rule due to the potential to disrespect individuals with cultures that have different name schemes compared to western/English societies.

A better way to approach this is to use “displayName” as a free-text UTF8 case-sensitive field, and to allow substring search over the content (rather than attempting to sort by first/last name, which also has a stack of issues).

  • Storing passwords in cleartext.

This one is a violation of privacy, in that we risk the exposure of a password which may have been reused (we can’t really stop password reuse, we need to respect human behaviour). Not only that, some people may assume we DO hash these correctly, so we are also violating informed consent, as we didn’t disclose how we store these details.

A better thing here is to hash the password, or at least to disclose how it will be stored and used.

  • Deleting an account sets a flag to mark deletion

This violates the ability to be forgotten, because we aren’t really deleting the account. It also breaks informed consent, because we are being “deceptive” about what our software is actually doing compared to the intent of the user’s request.

A better thing is to just delete the account, or, if that is not possible, delete all user data and leave a tombstone in place that represents “an account was here, but no details associated”.

  • Names are used as the primary key

This violates choice over identification, especially for women who divorce, individuals who are transitioning, or just people who want to change their name in general. The reason for the name change doesn’t matter - what matters is that we need to respect people’s right to identification.

A better idea is to use a UUID/ID number as the primary key, and allow the name to be changed at any point in time.

  • To change account details, you have to use a command line tool

This violates a user’s ability to meaningfully access and use software - remember, people come from many walks of life and all have different skill sets, but using command line tools is not something we can universally expect.

A proper solution here is, at minimum, a web/graphical self-management portal that is easy to access and follows proper UX/UI design rules, and, for a business deployment, a service desk with humans involved that can support and help people change details on their account on their behalf if the person is unable to self-support via the web service.

Proposal

I think that OpenSource should aim to have a code of ethics - the same way we have a code of conduct to guide our behaviour internally in a project, we should have a framework to promote discussion of the rights of the people who use, interact with and are affected by our work. We should not focus on technical matters only, but should be promoting people at the core of all our work. Every decision we make is not just technical, but social.

I’m sure that there are more points that could be considered than what I have listed here: I’d love to hear feedback to william at blackhats.net.au. Thanks!

April 27, 2019 02:00 PM

April 12, 2019

William Brown

Using Rust Generics to Enforce DB Record State

Using Rust Generics to Enforce DB Record State

In a database, entries go through a lifecycle which represents what attributes they have, their db record keys, and whether they have passed schema checking.

I’m currently working on a (private in 2019, public in July 2019) project which is a NoSQL database written in Rust. To help us manage the correctness and lifecycle of database entries, I have been using advice from the Rust Embedded Group’s Book.

As I have mentioned in the past, state machines are a great way to design code, so let’s plot out the state machine we have for Entries:

Entry State Machine

The lifecyle is:

  • A new entry is submitted by the user for creation
  • We schema check that entry
  • If it passes schema, we commit it and assign internal ID’s
  • When we search the entry, we retrieve it by internal ID’s
  • When we modify the entry, we need to recheck its schema before we commit it back
  • When we delete, we just remove the entry.

This leads to a state machine of:

                    |
             (create operation)
                    |
                    v
            [ New + Invalid ] -(schema check)-> [ New + Valid ]
                                                      |
                                               (send to backend)
                                                      |
                                                      v    v-------------\
[Commited + Invalid] <-(modify operation)- [ Commited + Valid ]          |
          |                                          ^   \       (write to backend)
          \--------------(schema check)-------------/     ---------------/

This is a bit rough - The version on my whiteboard was better :)

The main observation is that we are focused only on the committability and validity of entries - not on where they are or whether the commit was a success.

Entry Structs

So to make these states work we have the following structs:

struct EntryNew;
struct EntryCommited;

struct EntryValid;
struct EntryInvalid;

struct Entry<STATE, VALID> {
    state: STATE,
    valid: VALID,
    // Other db junk goes here :)
}

We can then use these to establish the lifecycle with functions (similar) to this:

impl Entry<EntryNew, EntryInvalid> {
    fn new() -> Self {
        Entry {
            state: EntryNew,
            valid: EntryInvalid,
            ...
        }
    }

}

impl<STATE> Entry<STATE, EntryInvalid> {
    fn validate(self, schema: Schema) -> Result<Entry<STATE, EntryValid>, ()> {
        if schema.check(self) {
            Ok(Entry {
                state: self.state,
                valid: EntryValid,
                ...
            })
        } else {
            Err(())
        }
    }

    fn modify(&mut self, ...) {
        // Perform any modifications on the entry you like, only works
        // on invalidated entries.
    }
}

impl<STATE> Entry<STATE, EntryValid> {
    fn seal(self) -> Entry<EntryCommited, EntryValid> {
        // Assign internal id's etc
        Entry {
            state: EntryCommited,
            valid: EntryValid,
        }
    }

    fn compare(&self, other: Entry<STATE, EntryValid>) -> ... {
        // Only allow compares on schema validated/normalised
        // entries, so that checks don't have to be schema aware
        // as the entries are already in a comparable state.
    }
}

impl Entry<EntryCommited, EntryValid> {
    fn invalidate(self) -> Entry<EntryCommited, EntryInvalid> {
        // Invalidate an entry, to allow modifications to be performed
        // note that modifications can only be applied once an entry is created!
        Entry {
            state: self.state,
            valid: EntryInvalid,
        }
    }
}

Importantly, this allows us to control when we apply search terms, send entries to the backend for storage, and more. The benefit is that this is compile time checked, so you can never send an entry to a backend that is not schema checked, never run comparisons or searches on entries that aren’t schema checked, and you can only modify or delete something once it’s created. For example, other parts of the code now have:

impl BackendStorage {
    // Can only create if no db id's are assigned, IE it must be new.
    fn create(&self, ..., entry: Entry<EntryNew, EntryValid>) -> Result<...> {
    }

    // Can only modify IF it has been created, and is validated.
    fn modify(&self, ..., entry: Entry<EntryCommited, EntryValid>) -> Result<...> {
    }

    // Can only delete IF it has been created and committed.
    fn delete(&self, ..., entry: Entry<EntryCommited, EntryValid>) -> Result<...> {
    }
}

impl Filter<STATE> {
    // Can only apply filters (searches) if the entry is schema checked. This has an
    // important behaviour, where we can schema normalise. Consider a case-insensitive
    // type, we can schema-normalise this on the entry, then our compare can simply
    // be a string.compare, because we assert both entries *must* have been through
    // the normalisation routines!
    fn apply_filter(&self, ..., entry: &Entry<STATE, EntryValid>) -> Result<bool, ...> {
    }
}

Using this with Serde?

I have noticed that when we serialise the entry, the valid/state fields are not compiled away - because they have to be serialised, the compiler can’t eliminate them despite their empty content.

A future cleanup will be to have a serialised DBEntry form such as the following:

struct DBEV1 {
    // entry data here
}

enum DBEntryVersion {
    V1(DBEV1)
}

struct DBEntry {
    data: DBEntryVersion
}

impl From<Entry<EntryNew, EntryValid>> for DBEntry {
    fn from(e: Entry<EntryNew, EntryValid>) -> Self {
        // assign db id's, and return a serialisable entry.
    }
}

impl From<Entry<EntryCommited, EntryValid>> for DBEntry {
    fn from(e: Entry<EntryCommited, EntryValid>) -> Self {
        // Just translate the entry to a serialisable form
    }
}

This way we still have the zero-cost state on Entry, but we are able to move to a versioned serialised structure, and we minimise the run time cost.

Testing the Entry

To help with testing, I needed to be able to shortcut and move between any state of the entry so I could quickly make fake entries, so I added some unsafe methods:

#[cfg(test)]
unsafe fn to_new_valid(self) -> Entry<EntryNew, EntryValid> {
    Entry {
        state: EntryNew,
        valid: EntryValid
    }
}

These allow me to setup and create small unit tests where I may not have a full backend or schema infrastructure, so I can test specific aspects of the entries and their lifecycle. It’s limited to test runs only, and marked unsafe. It’s not “technically” memory unsafe, but it’s unsafe from the view of “it could absolutely mess up your database consistency guarantees” so you have to really want it.

Summary

Using state machines like this really helped me to clean up my code and make stronger assertions about the correctness of entry lifecycles. It also means that when I and future contributors work on the code base, we’ll have compile time checks to ensure we are doing the right thing - preventing data corruption and inconsistency.

April 12, 2019 02:00 PM

April 07, 2019

William Brown

Debugging MacOS bluetooth audio stutter

Debugging MacOS bluetooth audio stutter

I was noticing that audio to my bluetooth headphones from my iPhone was always flawless, but I started to notice stutter and drops from my MBP. After exhausting some basic ideas, I was stumped.

To the duck duck go machine: I searched for known bluetooth issues. Nothing appeared.

However, I then decided to debug the issue - thankfully there was plenty of advice on this matter. Press shift + option while clicking bluetooth in the menu-bar, and then you have a debug menu. You can also open Console.app and search for “bluetooth” to see all the bluetooth related logs.

I noticed that when the audio stutter occurred, the following pattern was observed.

default     11:25:45.840532 +1000   wirelessproxd   About to scan for type: 9 - rssi: -90 - payload: <00000000 00000000 00000000 00000000 00000000 0000> - mask: <00000000 00000000 00000000 00000000 00000000 0000> - peers: 0
default     11:25:45.840878 +1000   wirelessproxd   Scan options changed: YES
error       11:25:46.225839 +1000   bluetoothaudiod Error sending audio packet: 0xe00002e8
error       11:25:46.225899 +1000   bluetoothaudiod Too many outstanding packets. Drop packet of 8 frames (total drops:451 total sent:60685 percentDropped:0.737700) Outstanding:17

There was always a scan, just before the stutter initiated. So what was scanning?

I searched for the error related to packets, and there were a lot of false leads. From weird apps to dodgy headphones. In this case I could eliminate both as the headphones worked with other devices, and I don’t have many apps installed.

So I went back and thought about what macOS services could be the problem, and I found that AirDrop would scan periodically for other devices to send and receive files. Disabling AirDrop from the sharing menu in System Preferences cleared my audio right up.

April 07, 2019 02:00 PM

April 02, 2019

William Brown

GDB autoloads for 389 DS

GDB autoloads for 389 DS

I’ve been writing a set of extensions to help make debugging 389-ds a bit easier. Thanks to the magic of python, writing GDB extensions is really easy.
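
To give an idea of how little boilerplate is involved, a hypothetical command (not one of the actual 389-ds helpers) can be registered like this:

import gdb

class DsHello(gdb.Command):
    """A hypothetical ds- command, just to show the shape of an extension."""

    def __init__(self):
        super(DsHello, self).__init__("ds-hello", gdb.COMMAND_USER)

    def invoke(self, arg, from_tty):
        # A real extension would inspect threads or process memory here.
        gdb.write("hello from a ds- extension\n")

DsHello()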

On OpenSUSE, when you start your DS instance under GDB, all of the extensions are automatically loaded. This will help make debugging a breeze.

zypper in 389-ds gdb
gdb /usr/sbin/ns-slapd
GNU gdb (GDB; openSUSE Tumbleweed) 8.2
(gdb) ds-
ds-access-log  ds-backtrace
(gdb) set args -d 0 -D /etc/dirsrv/slapd-<instance name>
(gdb) run
...

All the extensions are under the ds- namespace, so they are easy to find. There are some new ones on the way, which I’ll discuss here too:

ds-backtrace

As DS is a multithreaded process, it can be really hard to find the active thread involved in a problem. So we provided a command that knows how to fold duplicated stacks, and to highlight idle threads that you can (generally) skip over.

===== BEGIN ACTIVE THREADS =====
Thread 37 (LWP 70054))
Thread 36 (LWP 70053))
Thread 35 (LWP 70052))
Thread 34 (LWP 70051))
Thread 33 (LWP 70050))
Thread 32 (LWP 70049))
Thread 31 (LWP 70048))
Thread 30 (LWP 70047))
Thread 29 (LWP 70046))
Thread 28 (LWP 70045))
Thread 27 (LWP 70044))
Thread 26 (LWP 70043))
Thread 25 (LWP 70042))
Thread 24 (LWP 70041))
Thread 23 (LWP 70040))
Thread 22 (LWP 70039))
Thread 21 (LWP 70038))
Thread 20 (LWP 70037))
Thread 19 (LWP 70036))
Thread 18 (LWP 70035))
Thread 17 (LWP 70034))
Thread 16 (LWP 70033))
Thread 15 (LWP 70032))
Thread 14 (LWP 70031))
Thread 13 (LWP 70030))
Thread 12 (LWP 70029))
Thread 11 (LWP 70028))
Thread 10 (LWP 70027))
#0  0x00007ffff65db03c in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1  0x00007ffff66318b0 in PR_WaitCondVar () at /usr/lib64/libnspr4.so
#2  0x00000000004220e0 in [IDLE THREAD] connection_wait_for_new_work (pb=0x608000498020, interval=4294967295) at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:970
#3  0x0000000000425a31 in connection_threadmain () at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:1536
#4  0x00007ffff6637484 in None () at /usr/lib64/libnspr4.so
#5  0x00007ffff65d4fab in start_thread () at /lib64/libpthread.so.0
#6  0x00007ffff6afc6af in clone () at /lib64/libc.so.6

This example shows that there are 17 idle threads (look at frame 2) here, that all share the same trace.

ds-access-log

The access log is buffered before writing, so if you have a coredump, and want to see the last few events before they were written to disk, you can use this to display the content:

(gdb) ds-access-log
===== BEGIN ACCESS LOG =====
$2 = 0x7ffff3c3f800 "[03/Apr/2019:10:58:42.836246400 +1000] conn=1 fd=64 slot=64 connection from 127.0.0.1 to 127.0.0.1
[03/Apr/2019:10:58:42.837199400 +1000] conn=1 op=0 BIND dn=\"\" method=128 version=3
[03/Apr/2019:10:58:42.837694800 +1000] conn=1 op=0 RESULT err=0 tag=97 nentries=0 etime=0.0001200300 dn=\"\"
[03/Apr/2019:10:58:42.838881800 +1000] conn=1 op=1 SRCH base=\"\" scope=2 filter=\"(objectClass=*)\" attrs=ALL
[03/Apr/2019:10:58:42.839107600 +1000] conn=1 op=1 RESULT err=32 tag=101 nentries=0 etime=0.0001070800
[03/Apr/2019:10:58:42.840687400 +1000] conn=1 op=2 UNBIND
[03/Apr/2019:10:58:42.840749500 +1000] conn=1 op=2 fd=64 closed - U1
", '\276' <repeats 3470 times>

At the end the line that repeats shows the log is “empty” in that segment of the buffer.

ds-entry-print

This command shows the in-memory entry. It is common to see Slapi_Entry * pointers in the codebase, so being able to display these is really helpful for isolating what’s occurring on the entry. Your first argument should be the Slapi_Entry pointer.

(gdb) ds-entry-print ec
Display Slapi_Entry: cn=config
cn: config
objectClass: top
objectClass: extensibleObject
objectClass: nsslapdConfig
nsslapd-schemadir: /opt/dirsrv/etc/dirsrv/slapd-standalone1/schema
nsslapd-lockdir: /opt/dirsrv/var/lock/dirsrv/slapd-standalone1
nsslapd-tmpdir: /tmp
nsslapd-certdir: /opt/dirsrv/etc/dirsrv/slapd-standalone1
...

April 02, 2019 02:00 PM

March 24, 2019

Alexander Bokovoy

Lost in (Kerberos) service translation?

A year ago Brian J. Atkisson from Red Hat IT filed a bug against FreeIPA asking to remove a default [domain_realm] mapping section from the krb5.conf configuration file generated during installation of a FreeIPA client. The bug is still open and I’d like to use this opportunity to discuss some less known aspects of a Kerberos service principal resolution.

When an application uses Kerberos to authenticate to a remote service, it needs to talk to a Kerberos key distribution center (KDC) to obtain a service ticket for that remote service. There are multiple ways an application can construct the name of a service, but in a simplified view it boils down to taking the remote service’s host name and attaching it to a service type name. Type names are customary and depend on the established tradition for the protocol in use. For example, browsers universally assume that the component HTTP/ is used in the service name; to authenticate to the server www.example.com they ask a KDC for a service ticket for the HTTP/www.example.com principal. When an LDAP client talks to an LDAP server ldap.example.com and uses SASL GSSAPI authentication, it asks the KDC for a service ticket for ldap/ldap.example.com. Sometimes these conventions are written down in a corresponding RFC document, sometimes not, but in either case both client and server are assumed to follow them.

There are, however, a few more moving parts at play. The host name part of a service principal might come from an interaction with a user. For a browser, this would be the server name from a URL entered by the user, and the browser would need to construct the target service principal from it. The host name part might be incomplete in some cases: if you only have a single DNS domain in use, server names are unique within that domain, and your users might find it handy to address a server by just the first label of its DNS name. Such an approach was certainly very popular among system administrators who relied on the ability of a Kerberos library to expand the short name into a fully qualified one.

Let’s look into that. The Kerberos configuration file, krb5.conf, lets us declare, for any application, that a hostname passed down to the library should be canonicalized. This option, dns_canonicalize_hostname, allows us to say “I want to connect to a server bastion” and let libkrb5 expand that into the bastion.example.com host name. While this behavior is handy, it relies on DNS. A downside of disabling hostname canonicalization is that short hostnames are not expanded and might not be recognized when the request reaches the KDC. Finally, there is the possibility of DNS hijacking. For Kerberos, spoofed DNS responses aren’t too problematic, since a fake KDC or a fake service wouldn’t gain much knowledge, but even in a normal situation the latency of DNS responses can be a considerable problem.
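
For illustration, this is how the option looks in the [libdefaults] section of krb5.conf; the realm name is only an example, and setting the option to false disables the DNS expansion described above:

[libdefaults]
   default_realm = EXAMPLE.COM
   # false disables DNS expansion of short hostnames into fully qualified ones
   dns_canonicalize_hostname = false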

Another part of the equation is finding out which Kerberos realm the target service principal belongs to. If you have a single Kerberos realm, this might not be an issue: by setting the default_realm option in krb5.conf we can make sure a client always assumes the only realm we have. However, if there are multiple Kerberos realms, it is important to map the target service principal to the target realm on the client side, before a request is issued to a KDC.

There might be multiple Kerberos realms in existence at any site. For example, FreeIPA deployment provides one. If FreeIPA has established a trust to an Active Directory forest, then that forest would represent another Kerberos realm. Potentially, even more than one as each Active Directory domain in an Active Directory forest is a separate Kerberos realm in itself.

The Kerberos protocol defines that the realm in which the application server is located must be determined by the client (RFC 4120 section 3.3.1). The specification also defines several strategies by which a client may map the hostname of the application server to the realm it believes the server belongs to.

Domain to realm mapping

Let us stop and think a bit at this point. A Kerberos client has full control over deciding which realm a particular application server belongs to. If it decides that the application server is in a different realm than the client itself, then it needs to ask its own KDC for a cross-realm ticket-granting ticket. Then, with the cross-realm TGT in possession, the client can ask a KDC of the application server’s realm for the actual service ticket.

As a client, we want to be sure we are talking to the correct KDC. As mentioned earlier, relying too heavily on DNS is not always particularly secure. As a result, the krb5 library provides a way to control how a particular hostname is mapped to a realm. The search mechanism for realm mapping is pluggable and by default includes:

  • registry-based search on WIN32 (does nothing for Linux)
  • profile-based search: uses [domain_realm] section in krb5.conf to do actual mapping
  • dns-based search that can be disabled with dns_lookup_realm = false
  • domain-based search: it is disabled by default and can be enabled with realm_try_domains = ... option in krb5.conf

The order of the searches is important. It is hard-coded in the krb5 library and depends on which operation is performed. For realm selection it is hard-coded that profile-based search is done before DNS-based search, and domain-based search is done last.

When a [domain_realm] section exists in krb5.conf, it will be used to map a hostname of the application server to a realm. The mapping table in this section is typically built up from host and domain maps:

[domain_realm]
   www.example.com = EXAMPLE.COM
   .dev.example.com = DEVEXAMPLE.COM
   .example.com = EXAMPLE.COM

The mapping above says that www.example.com is explicitly mapped to the EXAMPLE.COM realm, all machines in the DNS zone dev.example.com are mapped to the DEVEXAMPLE.COM realm, and the rest of the hosts in the DNS zone example.com are mapped to EXAMPLE.COM. This mapping only applies to hostnames, so a hostname foo.bar.example.com would not be mapped to any realm by this scheme.

Profile-based search is visible in the Kerberos trace output as a selection of the realm right at the beginning of a request for a service ticket to a host-based service principal:

[root@client ~]# kinit -k
[root@client ~]# KRB5_TRACE=/dev/stderr kvno -S cifs client.example.com
[30798] 1552847822.721561: Getting credentials host/client.example.com@EXAMPLE.COM -> cifs/client.example.com@EXAMPLE.COM using ccache KEYRING:persistent:0:0
...

The difference here is that for a service principal not mapped by profile-based search there is no assumed realm, and the target principal is constructed without a realm:

[root@client ~]# kinit -k
[root@client ~]# KRB5_TRACE=/dev/stderr kvno -S ldap dc.ad.example.com
[30684] 1552841274.602324: Getting credentials host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@ using ccache KEYRING:persistent:0:0

DNS-based search is activated when the dns_lookup_realm option is set to true in krb5.conf and profile-based search did not return any results. The Kerberos library will issue a number of DNS queries for TXT records starting with _kerberos, to discover which Kerberos realm is responsible for the DNS host of the application server. The library performs these searches for the hostname itself first and then for each domain component of the hostname, until it finds an answer or has processed all domain components.

If we have www.example.com as the hostname, the Kerberos library would issue a DNS query for the TXT record _kerberos.www.example.com to find the name of the Kerberos realm of www.example.com. If that fails, the next try is for the TXT record _kerberos.example.com, and so on, until all domain components are processed.
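
This lookup chain can be reproduced with an ordinary DNS client such as dig (the host and realm names are, as before, only examples):

$ dig +short _kerberos.www.example.com TXT
$ dig +short _kerberos.example.com TXT
"EXAMPLE.COM"

The first query that returns a TXT record decides the realm; here there is no per-host record, so the record of the parent domain wins.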

It should be noted that this algorithm is only implemented in the MIT and Heimdal Kerberos libraries. Microsoft’s Active Directory implementation does not allow querying a _kerberos.$hostname DNS TXT record to find out which realm a target application server belongs to. Instead, Windows environments delegate the discovery process to their domain controllers.

The DNS canonicalization feature (or the lack of it) also affects DNS-based search, since without it we wouldn’t know which realm to map a non-fully-qualified hostname to. When the dns_canonicalize_hostname option is set to false, the Kerberos client sends the request to the KDC with the default realm attached to the non-fully-qualified hostname. Most likely such a service principal won’t be recognized by the KDC and will be reported as not found.

To help in these situations, the FreeIPA KDC supports Kerberos principal aliases. One can use the following ipa command to add aliases to hosts. Remember that a host principal is really host/<hostname>:

$ ipa help host-add-principal
Usage: ipa [global-options] host-add-principal HOSTNAME KRBPRINCIPALNAME... [options]

Add new principal alias to host entry
Options:
  -h, --help    show this help message and exit
  --all         Retrieve and print all attributes from the server. Affects
                command output.
  --raw         Print entries as stored on the server. Only affects output
                format.
  --no-members  Suppress processing of membership attributes.

$ ipa host-add-principal bastion.example.com host/bastion
-------------------------------------------
Added new aliases to host "bastion.example.com"
-------------------------------------------
  Host name: bastion.example.com
  Principal alias: host/bastion.example.com@EXAMPLE.COM, host/bastion@EXAMPLE.COM

and for other Kerberos service principals the corresponding command is ipa service-add-principal:

$ ipa help service-add-principal
Usage: ipa [global-options] service-add-principal CANONICAL-PRINCIPAL PRINCIPAL... [options]

Add new principal alias to a service
Options:
  -h, --help    show this help message and exit
  --all         Retrieve and print all attributes from the server. Affects
                command output.
  --raw         Print entries as stored on the server. Only affects output
                format.
  --no-members  Suppress processing of membership attributes.

$ ipa service-show HTTP/bastion.example.com
  Principal name: HTTP/bastion.example.com@EXAMPLE.COM
  Principal alias: HTTP/bastion.example.com@EXAMPLE.COM
  Keytab: False
  Managed by: bastion.example.com
  Groups allowed to create keytab: admins
[root@nyx ~]# ipa service-add-principal HTTP/bastion.example.com HTTP/bastion
---------------------------------------------------------------------------------
Added new aliases to the service principal "HTTP/bastion.example.com@EXAMPLE.COM"
---------------------------------------------------------------------------------
  Principal name: HTTP/bastion.example.com@EXAMPLE.COM
  Principal alias: HTTP/bastion.example.com@EXAMPLE.COM, HTTP/bastion@EXAMPLE.COM

Finally, domain-based search is activated when realm_try_domains = ... is specified. In this case the Kerberos library tries heuristics based on the hostname of the target application server, using as many of the hostname’s trailing domain components as the realm_try_domains option allows it to cut off. More about that later.
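
As a rough sketch (the option value and hostname are illustrative; the exact semantics are documented in krb5.conf(5)), domain-based search could be enabled like this:

[libdefaults]
   # 0 = try the host's own domain as the realm name, 1 = also try its parent, and so on
   realm_try_domains = 1

With a target of HTTP/app.ipa.example.com this would consider IPA.EXAMPLE.COM and then EXAMPLE.COM as candidate realms.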

However, there is another mechanism employed by the MIT Kerberos library. When an MIT Kerberos client is unable to find out the realm on its own, starting with MIT krb5 1.6 the client will issue a request without a known realm to its own KDC. A KDC (which must be MIT krb5 1.7 or later) can opt to recognize the hostname against its own [domain_realm] mapping table and choose to issue a referral to the appropriate service realm.

The latter approach only works if the KDC has been configured to allow such referrals to be issued and if the client is asking for a host-based service. The FreeIPA KDC allows this behavior by default. For trusted Active Directory realms there is also support from SSSD on IPA masters: SSSD automatically generates [domain_realm] and [capaths] sections for all known trusted realms so that the KDC is able to respond with referrals.

However, care should be taken by the application itself on the client side when constructing such a Kerberos principal. For example, with the kvno utility, a request kvno -S service hostname asks for a referral while kvno service/hostname does not. The former constructs a host-based principal while the latter does not.

When looking at the Kerberos trace, we can see the difference. Below host/client.example.com is asking for a service ticket to ldap/dc.ad.example.com as a host-based principal, without knowing which realm the application server’s principal belongs to:

[root@client ~]# kinit -k
[root@client ~]# KRB5_TRACE=/dev/stderr kvno -S ldap dc.ad.example.com
[30684] 1552841274.602324: Getting credentials host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@ using ccache KEYRING:persistent:0:0
[30684] 1552841274.602325: Retrieving host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@ from KEYRING:persistent:0:0 with result: -1765328243/Matching credential not found
[30684] 1552841274.602326: Retrying host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@EXAMPLE.COM with result: -1765328243/Matching credential not found
[30684] 1552841274.602327: Server has referral realm; starting with ldap/dc.ad.example.com@EXAMPLE.COM
[30684] 1552841274.602328: Retrieving host/client.example.com@EXAMPLE.COM -> krbtgt/EXAMPLE.COM@EXAMPLE.COM from KEYRING:persistent:0:0 with result: 0/Success
[30684] 1552841274.602329: Starting with TGT for client realm: host/client.example.com@EXAMPLE.COM -> krbtgt/EXAMPLE.COM@EXAMPLE.COM
[30684] 1552841274.602330: Requesting tickets for ldap/dc.ad.example.com@EXAMPLE.COM, referrals on
[30684] 1552841274.602331: Generated subkey for TGS request: aes256-cts/A93C
[30684] 1552841274.602332: etypes requested in TGS request: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac, camellia128-cts, camellia256-cts
[30684] 1552841274.602334: Encoding request body and padata into FAST request
[30684] 1552841274.602335: Sending request (965 bytes) to EXAMPLE.COM
[30684] 1552841274.602336: Initiating TCP connection to stream ip.ad.dr.ess:88
[30684] 1552841274.602337: Sending TCP request to stream ip.ad.dr.ess:88
[30684] 1552841274.602338: Received answer (856 bytes) from stream ip.ad.dr.ess:88
[30684] 1552841274.602339: Terminating TCP connection to stream ip.ad.dr.ess:88
[30684] 1552841274.602340: Response was from master KDC
[30684] 1552841274.602341: Decoding FAST response
[30684] 1552841274.602342: FAST reply key: aes256-cts/D1E2
[30684] 1552841274.602343: Reply server krbtgt/AD.EXAMPLE.COM@EXAMPLE.COM differs from requested ldap/dc.ad.example.com@EXAMPLE.COM
[30684] 1552841274.602344: TGS reply is for host/client.example.com@EXAMPLE.COM -> krbtgt/AD.EXAMPLE.COM@EXAMPLE.COM with session key aes256-cts/470F
[30684] 1552841274.602345: TGS request result: 0/Success
[30684] 1552841274.602346: Following referral TGT krbtgt/AD.EXAMPLE.COM@EXAMPLE.COM
[30684] 1552841274.602347: Requesting tickets for ldap/dc.ad.example.com@AD.EXAMPLE.COM, referrals on
[30684] 1552841274.602348: Generated subkey for TGS request: aes256-cts/F0C6
[30684] 1552841274.602349: etypes requested in TGS request: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac, camellia128-cts, camellia256-cts
[30684] 1552841274.602351: Encoding request body and padata into FAST request
[30684] 1552841274.602352: Sending request (921 bytes) to AD.EXAMPLE.COM
[30684] 1552841274.602353: Sending DNS URI query for _kerberos.AD.EXAMPLE.COM.
[30684] 1552841274.602354: No URI records found
[30684] 1552841274.602355: Sending DNS SRV query for _kerberos._udp.AD.EXAMPLE.COM.
[30684] 1552841274.602356: SRV answer: 0 0 88 "dc.ad.example.com."
[30684] 1552841274.602357: Sending DNS SRV query for _kerberos._tcp.AD.EXAMPLE.COM.
[30684] 1552841274.602358: SRV answer: 0 0 88 "dc.ad.example.com."
[30684] 1552841274.602359: Resolving hostname dc.ad.example.com.
[30684] 1552841274.602360: Resolving hostname dc.ad.example.com.
[30684] 1552841274.602361: Initiating TCP connection to stream ano.ther.add.ress:88
[30684] 1552841274.602362: Sending TCP request to stream ano.ther.add.ress:88
[30684] 1552841274.602363: Received answer (888 bytes) from stream ano.ther.add.ress:88
[30684] 1552841274.602364: Terminating TCP connection to stream ano.ther.add.ress:88
[30684] 1552841274.602365: Sending DNS URI query for _kerberos.AD.EXAMPLE.COM.
[30684] 1552841274.602366: No URI records found
[30684] 1552841274.602367: Sending DNS SRV query for _kerberos-master._tcp.AD.EXAMPLE.COM.
[30684] 1552841274.602368: No SRV records found
[30684] 1552841274.602369: Response was not from master KDC
[30684] 1552841274.602370: Decoding FAST response
[30684] 1552841274.602371: FAST reply key: aes256-cts/10DE
[30684] 1552841274.602372: TGS reply is for host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@AD.EXAMPLE.COM with session key aes256-cts/24D1
[30684] 1552841274.602373: TGS request result: 0/Success
[30684] 1552841274.602374: Received creds for desired service ldap/dc.ad.example.com@AD.EXAMPLE.COM
[30684] 1552841274.602375: Storing host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@ in KEYRING:persistent:0:0
[30684] 1552841274.602376: Also storing host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@AD.EXAMPLE.COM based on ticket
[30684] 1552841274.602377: Removing host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@AD.EXAMPLE.COM from KEYRING:persistent:0:0
ldap/dc.ad.example.com@: kvno = 28

However, when a host-based principal is not used in the request, the lookup fails:

[root@client ~]# kinit -k
[root@client ~]# KRB5_TRACE=/dev/stderr kvno ldap/dc.ad.example.com
[30695] 1552841932.100975: Getting credentials host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@EXAMPLE.COM using ccache KEYRING:persistent:0:0
[30695] 1552841932.100976: Retrieving host/client.example.com@EXAMPLE.COM -> ldap/dc.ad.example.com@EXAMPLE.COM from KEYRING:persistent:0:0 with result: -1765328243/Matching credential not found
[30695] 1552841932.100977: Retrieving host/client.example.com@EXAMPLE.COM -> krbtgt/EXAMPLE.COM@EXAMPLE.COM from KEYRING:persistent:0:0 with result: 0/Success
[30695] 1552841932.100978: Starting with TGT for client realm: host/client.example.com@EXAMPLE.COM -> krbtgt/EXAMPLE.COM@EXAMPLE.COM
[30695] 1552841932.100979: Requesting tickets for ldap/dc.ad.example.com@EXAMPLE.COM, referrals on
[30695] 1552841932.100980: Generated subkey for TGS request: aes256-cts/27DA
[30695] 1552841932.100981: etypes requested in TGS request: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac, camellia128-cts, camellia256-cts
[30695] 1552841932.100983: Encoding request body and padata into FAST request
[30695] 1552841932.100984: Sending request (965 bytes) to EXAMPLE.COM
[30695] 1552841932.100985: Initiating TCP connection to stream ip.ad.dr.ess:88
[30695] 1552841932.100986: Sending TCP request to stream ip.ad.dr.ess:88
[30695] 1552841932.100987: Received answer (461 bytes) from stream ip.ad.dr.ess:88
[30695] 1552841932.100988: Terminating TCP connection to stream ip.ad.dr.ess:88
[30695] 1552841932.100989: Response was from master KDC
[30695] 1552841932.100990: Decoding FAST response
[30695] 1552841932.100991: TGS request result: -1765328377/Server ldap/dc.ad.example.com@EXAMPLE.COM not found in Kerberos database
[30695] 1552841932.100992: Requesting tickets for ldap/dc.ad.example.com@EXAMPLE.COM, referrals off
[30695] 1552841932.100993: Generated subkey for TGS request: aes256-cts/C1BF
[30695] 1552841932.100994: etypes requested in TGS request: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac, camellia128-cts, camellia256-cts
[30695] 1552841932.100996: Encoding request body and padata into FAST request
[30695] 1552841932.100997: Sending request (965 bytes) to EXAMPLE.COM
[30695] 1552841932.100998: Initiating TCP connection to stream ip.ad.dr.ess:88
[30695] 1552841932.100999: Sending TCP request to stream ip.ad.dr.ess:88
[30695] 1552841932.101000: Received answer (461 bytes) from stream ip.ad.dr.ess:88
[30695] 1552841932.101001: Terminating TCP connection to stream ip.ad.dr.ess:88
[30695] 1552841932.101002: Response was from master KDC
[30695] 1552841932.101003: Decoding FAST response
[30695] 1552841932.101004: TGS request result: -1765328377/Server ldap/dc.ad.example.com@EXAMPLE.COM not found in Kerberos database
kvno: Server ldap/dc.ad.example.com@EXAMPLE.COM not found in Kerberos database while getting credentials for ldap/dc.ad.example.com@EXAMPLE.COM

As you can see, our client asked for a service ticket for a non-host-based service principal from outside our realm, this was not accepted by the KDC, and resolution failed.

Mixed realm deployments

The behavior above is predictable. However, client-side processing of the target realm goes wrong when a client needs to request a service ticket for a service principal located in a trusted realm but situated in a DNS zone belonging to our own realm. This might sound like an odd complication, but it is a typical situation for deployments where FreeIPA trusts an Active Directory forest. In such cases customers often want to place Linux machines right in the DNS zones associated with Active Directory domains.

Since Microsoft’s Active Directory implementation, unlike MIT Kerberos or Heimdal, does not support a per-host Kerberos realm hint, such a request from a Windows client will always fail. It is not possible to obtain a service ticket in this situation from Windows machines.

However, when both realms trusting each other are MIT Kerberos, their KDCs and clients can be configured for a selective realm discovery.

As explained at FOSDEM 2018 and devconf.cz 2019, Red Hat IT moved from an old plain Kerberos realm to the FreeIPA deployment. This is a situation where we have EXAMPLE.COM and IPA.EXAMPLE.COM trusting each other and systems migrating to IPA.EXAMPLE.COM over a long period of time. We want to continue providing services in the example.com DNS zone but use the IPA.EXAMPLE.COM realm. Our clients are in both Kerberos realms, but over time they will all eventually migrate to IPA.EXAMPLE.COM.

Working with such a situation can be tricky. Let’s start with a simple example.

Suppose our client’s krb5.conf has a [domain_realm] section that looks like this:

[domain_realm]
   client.example.com = EXAMPLE.COM
   .example.com = EXAMPLE.COM

If we need to ask for an HTTP/app.example.com service ticket for the application server hosted on app.example.com, the Kerberos library on the client will map HTTP/app.example.com to EXAMPLE.COM and will not attempt to request a referral from a KDC. If our application server is enrolled into the IPA.EXAMPLE.COM realm, a client with such configuration will never try to discover HTTP/app.example.com@IPA.EXAMPLE.COM and will never be able to authenticate to app.example.com with Kerberos.

There are two possible solutions here. We can either add an explicit mapping of the host app.example.com to IPA.EXAMPLE.COM in the client’s [domain_realm] section in krb5.conf, or remove the .example.com mapping entry from [domain_realm] on the client side completely and rely on KDC referrals or DNS-based search.

The first solution does not scale and is a management burden. Updating all clients whenever an application server is migrated to the new realm sounds like a nightmare if the majority of your clients are laptops. You’d really want them to delegate to the KDC or do DNS-based search instead.

Of course, there is a simple solution: add a _kerberos.app.example.com TXT record pointing to IPA.EXAMPLE.COM in DNS and let clients use it. This assumes that clients do not have the .example.com = EXAMPLE.COM mapping rule.
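
In BIND zone-file syntax such a record could look like the line below (the names are illustrative); if the zone is managed by FreeIPA DNS, ipa dnsrecord-add can create it:

_kerberos.app.example.com. IN TXT "IPA.EXAMPLE.COM"

$ ipa dnsrecord-add example.com _kerberos.app --txt-rec=IPA.EXAMPLE.COM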

Unfortunately, it is more complicated. As Robbie Harwood, the Fedora and RHEL maintainer of MIT Kerberos, explained to me, the problem is what happens when there is inadequate DNS information, e.g. DNS-based search failed. A client falls back to heuristics (domain-based search), and these differ depending on which MIT Kerberos version is in use. Since MIT Kerberos 1.16 the heuristics prefer mapping HTTP/app.ipa.example.com into IPA.EXAMPLE.COM over EXAMPLE.COM, and prefer EXAMPLE.COM to failure. However, there is no way to map HTTP/app.example.com to IPA.EXAMPLE.COM with these heuristics.

Domain-based search gives us another heuristic based on the realm. It is tunable via the realm_try_domains option, but it also affects how the MIT Kerberos library chooses a credentials cache from a credentials cache collection (the KEYRING:, DIR:, and KCM: ccache types). This logic has been present since MIT Kerberos 1.12, but it also wouldn’t help us map HTTP/app.example.com to IPA.EXAMPLE.COM.

After some discussions, Robbie and I came to the conclusion that changing the order in which these methods are applied by the MIT Kerberos library could perhaps help. As mentioned in the “Domain to realm mapping” section, the current order is hard-coded: for realm selection, profile-based search is done before DNS-based search, and domain-based search is done last. Ideally, the choice of ordering could be given to administrators. However, there aren’t many reasonable orders out there. Perhaps allowing just two options would be enough:

  • prioritizing DNS search over a profile search
  • prioritizing a profile search over DNS search

Until that is done, we are left with the following recommendations for deployments that mix Kerberos principals from multiple realms in the same DNS domain:

  • make sure you don’t use [domain_realm] mapping for mixed realm domains
  • make sure you have a _kerberos.$hostname TXT record set per host/domain with the right realm name. Remember that Kerberos realm names are case-sensitive and almost everywhere uppercase, so be sure the value of the TXT record is correct.

March 24, 2019 07:13 AM

March 18, 2019

Fraser Tweedale

cert-fix redux

A few weeks ago I analysed the Dogtag pki-server cert-fix tool, which is intended to assist with recovery in scenarios where expired certificates inhibit Dogtag’s normal operation. Unfortunately, there were some flawed assumptions and feature gaps that limited the usefulness of the tool, especially in FreeIPA contexts.

In this post, I provide an update on changes that are being made to the tool to address those shortcomings.

Recap

Recapping the shortcomings in brief:

  1. When TLS client certificate authentication is used to authenticate to the LDAP database (the default for FreeIPA), an expired subsystem certificate causes authentication failure and Dogtag cannot start.
  2. When Dogtag is configured to use TLS or STARTTLS when connecting to the database, an expired LDAP service certificate causes connection failure.
  3. cert-fix uses an admin or agent certificate to perform authenticated operations against Dogtag. An expired certificate causes authentication failure, and certificate renewal fails.
  4. An expired CA certificate is not handled. Due to longer validity periods, and because externally-signed CA certificates expire at different times from Dogtag system certificates, this scenario is less common, but it still occurs.
  5. The need to renew non-system certificates. Apart from system certificates, in order for correct operation of Dogtag it may be necessary to renew some other certificates, such as an expired LDAP service certificate, or an expired agent certificate (e.g. IPA RA). cert-fix did not provide a way to do this.

To address issues #1 and #2, cert-fix now switches the deployment to use password authentication to LDAP, over an insecure connection on port 389. The original database configuration is restored when cert-fix finishes.

The subsystem certificate is used by Dogtag to authenticate to LDAP. Switching to password authentication works around the expired subsystem certificate. Furthermore if the subsystem certificate gets renewed, the new certificate gets imported into the pkidbuser LDAP entry so that authentication will work (389 DS requires an exact certificate match in the userCertificate attribute of the user).

If the LDAP service certificate is expired, this procedure works around that but does not renew it. That is problem #5, and it is addressed separately.

Switching Dogtag to password authentication to LDAP means resetting the pkidbuser account password. We use the ldappasswd program to do this. The LDAP password modify extended operation requires confidentiality (i.e. TLS or STARTTLS); an expired LDAP service certificate inhibits this. Therefore we use LDAPI and autobind. The LDAPI socket is specified via the --ldapi-socket option.

FreeIPA always configures LDAPI and root autobind to the cn=Directory Manager LDAP account. For standalone Dogtag installations these may need to be configured before running cert-fix.
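
A quick way to verify the LDAPI and autobind setup is to ask the directory server who we are over the LDAPI socket; with root autobind in place this should report cn=Directory Manager. The socket path below follows the FreeIPA naming convention and is only an example:

[root@f27-1 ~]# ldapwhoami -H ldapi://%2Fvar%2Frun%2Fslapd-IPA-LOCAL.socket -Y EXTERNAL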

Resolving expired agent certificate (issue #3)

Instead of using a certificate to authenticate the agent, cert-fix resets the password of the agent account and uses that password to authenticate the agent. The password is randomly generated and forgotten after cert-fix terminates.

The agent account to use is now specified via the --agent-uid option. NSSDB-related options for specifying the agent certificate and NSSDB passphrase have been removed.

Renewing other certificates (issue #5)

cert-fix learned the --extra-cert option, which gives the serial number of an extra certificate to renew. The option can be given multiple times to specify multiple certificates. Each certificate gets renewed and output in /etc/pki/<instance-dir>/certs/<serial>-renewed.crt. If a non-existing serial number is specified, an error is printed but processing continues.

This facility allows operators (or wrapper tools) to renew other essential certificates alongside the Dogtag system certificates. Further actions are needed to put those new certificates in the right places. But it is fair, in order to keep the cert-fix tool simple, to put this burden back on the operator. In any case, we intend to write a supplementary tool for FreeIPA that wraps cert-fix and takes care of working out which extra certificates to renew, and putting them in the right places.
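
As a purely hypothetical example of such a follow-up action: if the extra certificate with serial 29 were the LDAP server certificate, stored under the nickname Server-Cert in the NSS database of a 389 DS instance at /etc/dirsrv/slapd-IPA-LOCAL (nickname and paths are assumptions, not part of cert-fix), it could be swapped in manually with certutil:

# delete the expired certificate, then import the renewed one under the same nickname
certutil -d /etc/dirsrv/slapd-IPA-LOCAL -D -n Server-Cert
certutil -d /etc/dirsrv/slapd-IPA-LOCAL -A -n Server-Cert -t ",," -a -i /etc/pki/pki-tomcat/certs/29-renewed.crt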

New or changed assumptions

The changes discussed above abolish some assumptions that were previously made by cert-fix, and establish some new assumptions.

Abolished:

  • A valid admin certificate is no longer needed
  • A valid LDAP service certificate is no longer needed
  • When Dogtag is configured to use certificate authentication to LDAP, a valid subsystem certificate is no longer needed

New:

  • cert-fix must be run as root.
  • LDAPI must be configured, with root autobinding to cn=Directory Manager or other account with privileges on o=ipaca subtree, including password reset privileges.
  • The password of the specified agent account will be reset. If needed, it can be changed back afterwards (manually; successful execution of cert-fix proves that the operator has privileges to do this).
  • If Dogtag was configured to use TLS certificate authentication to bind to LDAP, the password on the pkidbuser account will be reset. (If password authentication was already used, the password does not get reset).
  • LDAPI is used (via ldappasswd), hence the requirement to run as root with autobind configured

Demo

Here I’ll put the full command and command output for an execution of the cert-fix tool, and break it up with commentary. I will renew the subsystem certificate, and additionally the certificate with serial number 29 (which happens to be the LDAP certificate):

[root@f27-1 ~]# pki-server cert-fix \
    --agent-uid admin \
    --ldapi-socket /var/run/slapd-IPA-LOCAL.socket \
    --cert subsystem \
    --extra-cert 29

There is no longer any need to set up an NSSDB with an agent certificate, a considerable UX improvement! A further improvement was to default the log verbosity to INFO, so we can see progress and observe (at a high level) what cert-fix is doing, without specifying -v / --verbose.

INFO: Loading password config: /etc/pki/pki-tomcat/password.conf
INFO: Fixing the following system certs: ['subsystem']
INFO: Renewing the following additional certs: ['29']
SASL/EXTERNAL authentication started
SASL username: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
SASL SSF: 0

Preliminaries. The tool loads information about the Dogtag instance, states its intentions and verifies that it can authenticate to LDAP.

INFO: Stopping the instance to proceed with system cert renewal
INFO: Configuring LDAP password authentication
INFO: Setting pkidbuser password via ldappasswd
SASL/EXTERNAL authentication started
SASL username: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
SASL SSF: 0
INFO: Selftests disabled for subsystems: ca
INFO: Resetting password for uid=admin,ou=people,o=ipaca
SASL/EXTERNAL authentication started
SASL username: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
SASL SSF: 0

cert-fix stopped Dogtag, changed the database connection configuration, reset the agent password and suppressed the Dogtag self-tests.

INFO: Starting the instance
INFO: Sleeping for 10 seconds to allow server time to start...

cert-fix starts Dogtag then sleeps for a bit. The sleep was added to avoid races against Dogtag startup that sometimes caused the tool to fail. It’s a bit of a hack, but 10 seconds should hopefully be enough.

INFO: Requesting new cert for subsystem
INFO: Getting subsystem cert info for ca
INFO: Trying to setup a secure connection to CA subsystem.
INFO: Secure connection with CA is established.
INFO: Placing cert creation request for serial: 34
INFO: Request ID: 38
INFO: Request Status: complete
INFO: Serial Number: 0x26
INFO: Issuer: CN=Certificate Authority,O=IPA.LOCAL 201903151111
INFO: Subject: CN=CA Subsystem,O=IPA.LOCAL 201903151111
INFO: New cert is available at: /etc/pki/pki-tomcat/certs/subsystem.crt
INFO: Requesting new cert for 29; writing to /etc/pki/pki-tomcat/certs/29-renewed.crt
INFO: Trying to setup a secure connection to CA subsystem.
INFO: Secure connection with CA is established.
INFO: Placing cert creation request for serial: 29
INFO: Request ID: 39
INFO: Request Status: complete
INFO: Serial Number: 0x27
INFO: Issuer: CN=Certificate Authority,O=IPA.LOCAL 201903151111
INFO: Subject: CN=f27-1.ipa.local,O=IPA.LOCAL 201903151111
INFO: New cert is available at: /etc/pki/pki-tomcat/certs/29-renewed.crt

Certificate requests were issued and completed successfully.

INFO: Stopping the instance
INFO: Getting subsystem cert info for ca
INFO: Getting subsystem cert info for ca
INFO: Updating CS.cfg with the new certificate
INFO: Importing new subsystem cert into uid=pkidbuser,ou=people,o=ipaca
SASL/EXTERNAL authentication started
SASL username: gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth
SASL SSF: 0
modifying entry "uid=pkidbuser,ou=people,o=ipaca"

Dogtag was stopped, and the new subsystem cert was updated in CS.cfg. It was also imported into the pkidbuser entry to ensure LDAP TLS client authentication continues to work. No further action is taken in relation to the extra cert(s).
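
To confirm the import, one could read back the userCertificate attribute of the pkidbuser entry over LDAPI (the socket path matches the one passed to cert-fix above):

[root@f27-1 ~]# ldapsearch -H ldapi://%2Fvar%2Frun%2Fslapd-IPA-LOCAL.socket -Y EXTERNAL \
    -b uid=pkidbuser,ou=people,o=ipaca userCertificate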

INFO: Selftests enabled for subsystems: ca
INFO: Restoring previous LDAP configuration
INFO: Starting the instance with renewed certs

Self-tests are re-enabled and the previous LDAP configuration restored. Python context managers are used to ensure that these steps are performed even when a fatal error occurs.

The end.

Conclusion

The problem of an expired CA certificate (issue #4) has not yet been addressed. It is not the highest priority but it would be nice to have. It is still believed to be a low-effort change so it is likely to be implemented at some stage.

More extensive testing of the tool is needed for renewing system certificates for other Dogtag subsystems—in particular the KRA subsystem.

The enhancements discussed in this post make the cert-fix tool a viable MVP for expired certificate recovery without time-travel. The enhancements are still in review, yet to be merged. That will hopefully happen soon (within a day or so of this post). We are also making a significant effort to backport cert-fix to some earlier branches and make it available on older releases.

As mentioned earlier in the post, we intend to implement a FreeIPA-specific wrapper for cert-fix that can take care of the additional steps required to renew and deploy expired certificates that are part of the FreeIPA system, but are not Dogtag system certificates handled directly by cert-fix. These include LDAP and Apache HTTPD certificates, the IPA RA agent certificate and the Kerberos PKINIT certificate.

March 18, 2019 12:00 AM

Powered by Planet