FreeIPA Identity Management planet - technical blogs

May 31, 2018

Fraser Tweedale

Replacing a lost or broken CA in FreeIPA

This is a long post. If you just want some steps to follow feel free to skip ahead.

Every now and then we have a customer case or a question on the freeipa-users mailing list about replacing a lost CA. Usually the scenario goes something like this:

  • FreeIPA was installed (with a CA)
  • Replicas were created, but without the CA role
  • The original CA master was decommissioned or failed

A variation on this is the removal of the Dogtag instance on the only CA master in a deployment. This is less common, because it’s a deliberate act rather than an oversight. This action might be performed to clean up a partially-completed but failed execution of ipa-ca-install, leaving the deployment in an inconsistent state.

In either case, the deployment is left without a CA. There might be a backup of the original CA keys that can be used to restore a CA, or there might not.

In this post I will focus on the total loss of a CA. What is required to bring up a new CA in an existing IPA deployment, after the original CA is destroyed? I’m going to break a test installation as described above, then work out how to fix it. The goal is to produce a recovery procedure for administrators in this situation.

Prevention is better than cure

Before I go ahead and delete the CA from a deployment, let’s talk about prevention. Losing your CA is a Big Deal. Therefore it’s essential not to leave your deployment with only one CA master. In earlier times, FreeIPA did not do anything to detect that there was only one CA master and make a fuss about it. This was poor UX that left many users and customers in a precarious situation, and ultimately led to higher support costs for Red Hat.

Today we have some safeguards in place. In the topology Web UI we detect a single-CA topology and warn about it. ipa-replica-install alerts the administrator if there is only one CA in the topology and suggests installing the CA role on the new replica. ipa-server-install --uninstall warns when you are uninstalling the last instance of some server role; this check includes the CA role. Eventually, FreeIPA will have some health check tools that will check for many kinds of problems, including this one.
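Until such health checks exist, a single-CA topology can also be spotted with a small script. The following is only a sketch: it assumes the output format of ipa server-role-find --role "CA server" shown later in this post, and the function names enabled_ca_servers and has_redundant_ca are made up for illustration.

```shell
# Reads `ipa server-role-find --role "CA server"` output on stdin and
# prints the name of every server whose role status is "enabled".
enabled_ca_servers() {
    awk '
        $1 == "Server" && $2 == "name:" { name = $3 }
        $1 == "Role" && $2 == "status:" && $3 == "enabled" { print name }
    '
}

# Reads the same output on stdin.
# Exit status 0 if at least two servers have the CA role enabled, 1 otherwise.
has_redundant_ca() {
    enabled_ca_servers | awk 'END { exit (NR >= 2 ? 0 : 1) }'
}
```

It could be used from cron or a monitoring check, for example: ipa server-role-find --role "CA server" | has_redundant_ca || echo "WARNING: fewer than two CA servers".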

Assumptions and starting environment

I’ve made some assumptions that reduce the number of steps or remove potential pitfalls:

  • The Subject DN of the replacement CA will be different from the original CA. The key will be different, so this will avoid problems with programs that can’t handle a same subject, different key scenario. It also avoids the need to force the new CA to start issuing certificates from some serial number higher than any that were previously issued.
  • We’ll use self-signed CAs. I can’t think of any problems that would arise doing this with an externally-signed CA. But there will be fewer steps and it will keep the post focused. The recovery procedure will not be substantially different for externally-signed CAs.

For the environment, I’m using builds of the FreeIPA master branch, somewhere around the v4.7 pre-release. Master and replica machines are both running Fedora 28.

There are two servers in the topology. f28-1.ipa.local was the original server and is the only server with the CA role. The replica f28-0.ipa.local was created from f28-1, without a CA. The CA subject DN is CN=Certificate Authority,O=IPA.LOCAL 201805171453. The Kerberos realm name is IPA.LOCAL.

Success criteria

How do we know when the deployment is repaired? I will use the following success criteria:

  1. The CA role is installed on a server (in our case, f28-0). That server is configured as the CA renewal master.
  2. The new CA certificate is present in the LDAP trust store.
  3. The old certificate remains in the LDAP trust store, so that certificates issued by the old CA are still trusted.
  4. Certificates can be issued via ipa cert-request.
  5. Existing HTTP and LDAP certificates, issued by the old CA, can be successfully renewed by Certmonger using the new CA.
  6. A CA replica can be created.

Deleting the CA

Now I will remove f28-1 from the topology. Recent versions of FreeIPA are aware of which roles (e.g. CA, DNS, etc) are installed on which servers. In this case, the program correctly detects that this server contains the only CA instance, and aborts:

# ipa-server-install --uninstall

This is a NON REVERSIBLE operation and will delete all data and configuration!
It is highly recommended to take a backup of existing data and configuration
  using ipa-backup utility before proceeding.

Are you sure you want to continue with the uninstall procedure? [no]: y
ipapython.admintool: ERROR    Server removal aborted:
  Deleting this server is not allowed as it would leave your
  installation without a CA.
ipapython.admintool: ERROR    The ipa-server-install command failed.
  See /var/log/ipaserver-uninstall.log for more information

The --ignore-last-of-role option suppresses this check. When we add that option, the deletion of the server succeeds:

# ipa-server-install --uninstall --ignore-last-of-role

This is a NON REVERSIBLE operation and will delete all data and configuration!
It is highly recommended to take a backup of existing data and configuration
  using ipa-backup utility before proceeding.

Are you sure you want to continue with the uninstall procedure? [no]: y
Deleted IPA server "f28-1.ipa.local"
Shutting down all IPA services
Configuring certmonger to stop tracking system certificates for KRA
Configuring certmonger to stop tracking system certificates for CA
Unconfiguring CA
Unconfiguring web server
Unconfiguring krb5kdc
Unconfiguring kadmin
Unconfiguring directory server
Unconfiguring ipa-custodia
Unconfiguring ipa-otpd
Removing IPA client configuration
Removing Kerberos service principals from /etc/krb5.keytab
Disabling client Kerberos and LDAP configurations
Redundant SSSD configuration file /etc/sssd/sssd.conf was moved to /etc/sssd/sssd.conf.deleted
Restoring client configuration files
Unconfiguring the NIS domain.
nscd daemon is not installed, skip configuration
nslcd daemon is not installed, skip configuration
Systemwide CA database updated.
Client uninstall complete.
The ipa-client-install command was successful

Switching back to f28-0 (the CA-less replica), we can see that f28-1 is gone for good, and there is no server with the CA server role installed:

% ipa server-find
1 IPA server matched
  Server name: f28-0.ipa.local
  Min domain level: 0
  Max domain level: 1
Number of entries returned 1

% ipa server-role-find --role "CA server"
1 server role matched
  Server name: f28-0.ipa.local
  Role name: CA server
  Role status: absent
Number of entries returned 1

And because of this, we cannot issue certificates:

% ipa cert-request --principal alice alice.csr
ipa: ERROR: CA is not configured

OK, time to fix the deployment!

Fixing the deployment

The first thing we’ll try is just running ipa-ca-install. This command installs the CA role on an existing server. I expect it to fail, but it might hint at some of the repairs that need to be performed.

# ipa-ca-install --subject-base "O=IPA.LOCAL NEW CA"
Directory Manager (existing master) password: XXXXXXXX

Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

Certificate with nickname IPA.LOCAL IPA CA is present in
/etc/dirsrv/slapd-IPA-LOCAL/, cannot continue.

We will not follow the advice about uninstalling the server. But the second message tells us something useful: we need to rename the CA certificate in /etc/dirsrv/slapd-IPA-LOCAL.

In fact, there are lots of places we need to rename the old CA certificate, including the LDAP certificate store. I’ll actually start there.

LDAP certificate store

FreeIPA has an LDAP-based store of trusted CA certificates used by clients and servers. The ipa-certupdate command reads certificates from this trust store and adds them to system trust stores and server certificate databases.

CA certificates are stored under cn=certificates,cn=ipa,cn=etc,{basedn}. The cn of each certificate entry is based on the Subject DN. The FreeIPA CA is the one exception: its cn is always {REALM} IPA CA. What are the current contents of the LDAP trust store?

% ldapsearch -LLL -D "cn=Directory Manager" -wXXXXXXXX \
    -b "cn=certificates,cn=ipa,cn=etc,dc=ipa,dc=local" \
    -s one ipaCertIssuerSerial cn
dn: cn=IPA.LOCAL IPA CA,cn=certificates,cn=ipa,cn=etc,dc=ipa,dc=local
ipaCertIssuerSerial: CN=Certificate Authority,O=IPA.LOCAL 201805171453;1

We see only the FreeIPA CA certificate, as expected. We must move this entry aside. We do still want to keep it in the trust stores so certificates that were issued by this CA will still be trusted. I used the ldapmodrdn command to rename this entry, with the new cn based on the Subject DN of the old CA.

% ldapmodrdn -D "cn=Directory Manager" -wXXXXXXXX -r \
    "cn=IPA.LOCAL IPA CA,cn=certificates,cn=ipa,cn=etc,dc=ipa,dc=local" \
    "cn=CN\=Certificate Authority\,O\=IPA.LOCAL 201805171453"

% ldapsearch -LLL -D "cn=Directory Manager" -wXXXXXXXX \
    -b "cn=certificates,cn=ipa,cn=etc,dc=ipa,dc=local" \
    -s one ipaCertIssuerSerial cn
dn: cn=CN\3DCertificate Authority\2CO\3DIPA.LOCAL 201805171453,cn=certificates,cn=
ipaCertIssuerSerial: CN=Certificate Authority,O=IPA.LOCAL 201805171453;1
cn: CN=Certificate Authority,O=IPA.LOCAL 201805171453

For the ldapmodrdn command, note the escaping of the = and , characters in the DN. This is important.
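To avoid escaping the subject DN by hand, the new RDN value can be generated. This is a minimal sketch with a hypothetical helper name (escape_rdn_value). It backslash-escapes the characters that appear in the example above (= and ,) plus a few other DN special characters, and it does not handle leading or trailing spaces or a leading #.

```shell
# Backslash-escape characters that have special meaning inside an
# RDN value, so that a whole subject DN can be used as the value of cn.
escape_rdn_value() {
    # Double any existing backslashes first, then escape the specials.
    printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/[=,+"<>;]/\\&/g'
}
```

For the old CA in this post, "cn=$(escape_rdn_value 'CN=Certificate Authority,O=IPA.LOCAL 201805171453')" yields the same new RDN that was typed manually in the ldapmodrdn command above.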

Removing CA entries

There are a bunch of CA entries in the FreeIPA directory. The cn=ipa entry is the main IPA CA. In addition, there can be zero or more lightweight sub-CAs in a FreeIPA deployment.

# ipa ca-find
2 CAs matched
  Name: ipa
  Description: IPA CA
  Authority ID: a0e7a855-aac2-40fc-8e86-cf1a7429f28c
  Subject DN: CN=Certificate Authority,O=IPA.LOCAL 201805171453
  Issuer DN: CN=Certificate Authority,O=IPA.LOCAL 201805171453

  Name: test1
  Authority ID: ac7e6def-acd8-4d19-ab3e-60067c17ba81
  Subject DN: CN=test1
  Issuer DN: CN=Certificate Authority,O=IPA.LOCAL 201805171453
Number of entries returned 2

These entries will all need to be removed:

# ipa ca-find --pkey-only --all \
    | grep dn: \
    | awk '{print $2}' \
    | xargs ldapdelete -D "cn=Directory Manager" -wXXXXXXXX

# ipa ca-find
0 CAs matched
Number of entries returned 0
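One caveat with the pipeline above: awk '{print $2}' and xargs both split on whitespace, so a CA entry whose DN contains a space would be mangled. A slightly more defensive sketch follows. It assumes only that the output contains lines of the form dn: <DN>, as the grep above implies; extract_dns is a made-up name.

```shell
# Reads `ipa ca-find --pkey-only --all` output on stdin and prints one
# complete DN per line, keeping any spaces inside the DN intact.
extract_dns() {
    sed -n 's/^[[:space:]]*dn:[[:space:]]*//p'
}
```

The DNs can then be handed to ldapdelete one at a time, for example: ipa ca-find --pkey-only --all | extract_dns | while IFS= read -r dn; do ldapdelete -D "cn=Directory Manager" -wXXXXXXXX "$dn"; done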


ipa-ca-install complained about the presence of a certificate with nickname IPA.LOCAL IPA CA in the /etc/dirsrv/slapd-IPA-LOCAL NSS certificate database (NSSDB). What are the current contents of this NSSDB?

# certutil -d /etc/dirsrv/slapd-IPA-LOCAL -L

Certificate Nickname                 Trust Attributes

IPA.LOCAL IPA CA                     CT,C,C
Server-Cert                          u,u,u

There are two certificates: the old CA certificate and the server certificate.

With the CA certificate having been renamed in the LDAP trust store, I’ll now run ipa-certupdate and see what happens in the NSSDB.

# ipa-certupdate
trying https://f28-0.ipa.local/ipa/session/json
[try 1]: Forwarding 'ca_is_enabled/1' to json server
Systemwide CA database updated.
Systemwide CA database updated.
The ipa-certupdate command was successful

Nothing failed! That is encouraging. But certutil still shows the same output as above. So we must find another way to change the nickname in the NSSDB. Lucky for us, certutil has a rename option:

# certutil --rename --help
--rename        Change the database nickname of a certificate
   -n cert-name      The old nickname of the cert to rename
   --new-n new-name  The new nickname of the cert to rename
   -d certdir        Cert database directory (default is ~/.netscape)
   -P dbprefix       Cert & Key database prefix

# certutil -d /etc/dirsrv/slapd-IPA-LOCAL --rename \
    -n 'IPA.LOCAL IPA CA' --new-n 'OLD IPA CA'

# certutil -d /etc/dirsrv/slapd-IPA-LOCAL -L

Certificate Nickname                 Trust Attributes

OLD IPA CA                           CT,C,C
Server-Cert                          u,u,u

I also performed this rename in /etc/ipa/nssdb. On Fedora 28, Apache uses OpenSSL instead of NSS. But on older versions there is also an Apache NSSDB at /etc/httpd/alias; the rename will need to be performed there, too.
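Since the same rename has to be repeated for several databases, it can help to generate the commands first and review them before running anything. The sketch below only prints certutil commands. The helper names are hypothetical, and it assumes that the directory server NSSDB path is derived from the realm by replacing dots with dashes, as is the case for IPA.LOCAL in this post.

```shell
# Print the directory server NSSDB path for a realm,
# e.g. IPA.LOCAL -> /etc/dirsrv/slapd-IPA-LOCAL
dirsrv_nssdb() {
    printf '/etc/dirsrv/slapd-%s\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# Print (but do not run) one certutil rename command per NSSDB.
# $1 = realm, $2 = new nickname for the old CA certificate
print_rename_commands() {
    for db in "$(dirsrv_nssdb "$1")" /etc/ipa/nssdb /etc/httpd/alias; do
        printf "certutil -d %s --rename -n '%s IPA CA' --new-n '%s'\n" \
            "$db" "$1" "$2"
    done
}
```

Running print_rename_commands IPA.LOCAL 'OLD IPA CA' prints three commands. The /etc/httpd/alias one only applies where Apache still uses NSS.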

ipa-ca-install, attempt 2

Now that the certificates have been renamed in the LDAP trust store and NSSDBs, let’s try ipa-ca-install again:

# ipa-ca-install --ca-subject 'CN=IPA.LOCAL NEW CA'
Directory Manager (existing master) password: XXXXXXXX

The CA will be configured with:
Subject base: O=IPA.LOCAL
Chaining:     self-signed

Continue to configure the CA with these values? [no]: y
Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes
  [1/28]: configuring certificate server instance
  [2/28]: exporting Dogtag certificate store pin
  [3/28]: stopping certificate server instance to update CS.cfg
  [4/28]: backing up CS.cfg
  [5/28]: disabling nonces
  [6/28]: set up CRL publishing
  [7/28]: enable PKIX certificate path discovery and validation
  [8/28]: starting certificate server instance
  [9/28]: configure certmonger for renewals
  [10/28]: requesting RA certificate from CA
  [error] DBusException: org.fedorahosted.certmonger.duplicate:
          Certificate at same location is already used by request
          with nickname "20180530050017".

Well, we have made progress. Installation got a fair way along, but failed because there was already a Certmonger tracking request for the IPA RA certificate.

Certmonger tracking requests

We have to clean up the Certmonger tracking request for the IPA RA certificate. The ipa-ca-install failure helpfully told us the ID of the problematic request. But if we wanted to nail it on the first try we’d have to look it up. We can ask Certmonger to show the tracking request for the certificate file at /var/lib/ipa/ra-agent.pem, where the IPA RA certificate is stored:

# getcert list -f /var/lib/ipa/ra-agent.pem
Number of certificates and requests being tracked: 4.
Request ID '20180530050017':
        status: MONITORING
        stuck: no
        key pair storage: type=FILE,location='/var/lib/ipa/ra-agent.key'
        certificate: type=FILE,location='/var/lib/ipa/ra-agent.pem'
        CA: dogtag-ipa-ca-renew-agent
        issuer: CN=Certificate Authority,O=IPA.LOCAL 201805171453
        subject: CN=IPA RA,O=IPA.LOCAL 201805171453
        expires: 2020-05-06 14:55:30 AEST
        key usage: digitalSignature,keyEncipherment,dataEncipherment
        eku: id-kp-serverAuth,id-kp-clientAuth
        pre-save command: /usr/libexec/ipa/certmonger/renew_ra_cert_pre
        post-save command: /usr/libexec/ipa/certmonger/renew_ra_cert
        track: yes
        auto-renew: yes

Then we can stop tracking it:

# getcert stop-tracking -i 20180530050017
Request "20180530050017" removed.
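When several tracking requests need to be removed, the request IDs can be pulled out of the getcert list output instead of being copied by hand. This is a rough sketch: request_ids_for_ca is a made-up helper that selects requests by the value of their CA: field. The RA certificate above uses dogtag-ipa-ca-renew-agent; whether other requests on your system use the same value should be checked against your own getcert list output.

```shell
# Reads `getcert list` output on stdin and prints the request ID of
# every tracking request whose "CA:" field equals $1.
request_ids_for_ca() {
    awk -v ca="$1" -v q="'" '
        # Header line looks like: Request ID <quote>20180530050017<quote>:
        $1 == "Request" && $2 == "ID" { split($0, parts, q); id = parts[2] }
        $1 == "CA:" && $2 == ca { print id }
    '
}
```

For example: getcert list | request_ids_for_ca dogtag-ipa-ca-renew-agent | while read -r id; do getcert stop-tracking -i "$id"; done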

Now, before we can run ipa-ca-install again, we have an unwanted pki-tomcat instance sitting around. We need to explicitly remove it using pkidestroy:

# pkidestroy -s CA -i pki-tomcat
Log file: /var/log/pki/pki-ca-destroy.20180530165156.log
Loading deployment configuration from /var/lib/pki/pki-tomcat/ca/registry/ca/deployment.cfg.
Uninstalling CA from /var/lib/pki/pki-tomcat.
pkidestroy  : WARNING  ....... this 'CA' entry will NOT be deleted from security domain 'IPA'!
pkidestroy  : WARNING  ....... security domain 'IPA' may be offline or unreachable!
pkidestroy  : ERROR    ....... subprocess.CalledProcessError:  Command '['/usr/bin/sslget', '-n', 'subsystemCert cert-pki-ca', '-p', '7Zc^NEd1%~@rGO%d{)%K:$S5L[^1F1K.!@5oWgZ]e', '-d', '/etc/pki/pki-tomcat/alias', '-e', 'name="/var/lib/pki/pki-tomcat"&type=CA&list=caList&host=f28-0.ipa.local&sport=443&ncsport=443&adminsport=443&agentsport=443&operation=remove', '-v', '-r', '/ca/agent/ca/updateDomainXML', 'f28-0.ipa.local:443']' returned non-zero exit status 3.!
pkidestroy  : WARNING  ....... Directory '/etc/pki/pki-tomcat/alias' is either missing or is NOT a directory!

Uninstallation complete.

ipa-ca-install, attempt 3

Here we go again!

# ipa-ca-install --ca-subject 'CN=IPA.LOCAL NEW CA'
  [10/28]: requesting RA certificate from CA
  [11/28]: setting audit signing renewal to 2 years
  [12/28]: restarting certificate server
  [13/28]: publishing the CA certificate
  [14/28]: adding RA agent as a trusted user
  [15/28]: authorizing RA to modify profiles
  [16/28]: authorizing RA to manage lightweight CAs
  [17/28]: Ensure lightweight CAs container exists
  [18/28]: configure certificate renewals
  [19/28]: configure Server-Cert certificate renewal
  [20/28]: Configure HTTP to proxy connections
  [21/28]: restarting certificate server
  [22/28]: updating IPA configuration
  [23/28]: enabling CA instance
  [24/28]: migrating certificate profiles to LDAP
  [error] RemoteRetrieveError: Failed to authenticate to CA REST API

Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

Unexpected error - see /var/log/ipareplica-ca-install.log for details:
RemoteRetrieveError: Failed to authenticate to CA REST API

Dang! This time the installation failed due to an authentication failure between the IPA framework and Dogtag. This authentication uses the IPA RA certificate. It turns out that Certmonger did not request a new RA certificate. Instead, it tracked the preexisting RA certificate issued by the old CA:

# openssl x509 -text < /var/lib/ipa/ra-agent.pem |grep Issuer
      Issuer: O = IPA.LOCAL 201805171453, CN = Certificate Authority

The IPA framework presents the old RA certificate when authenticating to the new CA. The new CA does not recognise it, so authentication fails. Therefore we need to remove the IPA RA certificate and key before installing a new CA:

# rm -fv /var/lib/ipa/ra-agent.*
removed '/var/lib/ipa/ra-agent.key'
removed '/var/lib/ipa/ra-agent.pem'

Because installation got a fair way along before failing, we also need to:

  • pkidestroy the Dogtag instance (as before)
  • remove Certmonger tracking requests for the RA certificate (as before)
  • remove Certmonger tracking requests for Dogtag system certificates
  • run ipa-certupdate to remove the new CA certificate from trust stores

Also, the deployment now believes that the CA role has been installed on f28-0:

# ipa server-role-find --role 'CA server'
1 server role matched
  Server name: f28-0.ipa.local
  Role name: CA server
  Role status: enabled
Number of entries returned 1

Note Role status: enabled above. We need to remove this record that the CA role is installed on f28-0. Like so:

# ldapdelete -D "cn=Directory Manager" -wXXXXXXXX \

# ipa server-role-find --role 'CA server'
1 server role matched
  Server name: f28-0.ipa.local
  Role name: CA server
  Role status: absent
Number of entries returned 1

Having performed these cleanup tasks, we will try again to install the CA.

ipa-ca-install, attempt 4

# ipa-ca-install --ca-subject 'CN=IPA.LOCAL NEW CA'
  [24/28]: migrating certificate profiles to LDAP
  [25/28]: importing IPA certificate profiles
  [26/28]: adding default CA ACL
  [27/28]: adding 'ipa' CA entry
  [28/28]: configuring certmonger renewal for lightweight CAs
Done configuring certificate server (pki-tomcatd).

Hooray! We made it.


Let’s revisit each of the success criteria and see whether the goal has been achieved.

1. CA role installed and configured as renewal master

# ipa server-role-find --role 'CA server'
1 server role matched
  Server name: f28-0.ipa.local
  Role name: CA server
  Role status: enabled
Number of entries returned 1

# ipa config-show |grep CA
  Certificate Subject base: O=IPA.LOCAL
  IPA CA servers: f28-0.ipa.local
  IPA CA renewal master: f28-0.ipa.local

Looks like this criterion has been met.

2 & 3. LDAP trust store

# ldapsearch -LLL -D cn="Directory manager" -wXXXXXXXX \
    -b "cn=certificates,cn=ipa,cn=etc,dc=ipa,dc=local" \
    -s one ipaCertIssuerSerial cn
dn: cn=CN\3DCertificate Authority\2CO\3DIPA.LOCAL 201805171453,cn=certificates
ipaCertIssuerSerial: CN=Certificate Authority,O=IPA.LOCAL 201805171453;1
cn: CN=Certificate Authority,O=IPA.LOCAL 201805171453

dn: cn=IPA.LOCAL IPA CA,cn=certificates,cn=ipa,cn=etc,dc=ipa,dc=local
ipaCertIssuerSerial: CN=IPA.LOCAL NEW CA;1

The old and new CA certificates are present in the LDAP trust store. The new CA certificate has the appropriate cn value. These criteria have been met.
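This check can also be scripted. The sketch below reads the ldapsearch output shown above and tests whether a given issuer DN appears in an ipaCertIssuerSerial value. The function names are hypothetical, and it assumes values are not folded across lines or base64-encoded (with OpenLDAP's ldapsearch, -o ldif-wrap=no should prevent folding).

```shell
# Reads LDIF on stdin and prints the issuer part of every
# ipaCertIssuerSerial value (the text before the final ";serial").
trust_store_issuers() {
    sed -n 's/^ipaCertIssuerSerial: \(.*\);[0-9A-Fa-f]*$/\1/p'
}

# Reads LDIF on stdin. Exit status 0 if issuer DN $1 is present.
issuer_is_trusted() {
    trust_store_issuers | grep -Fxq -- "$1"
}
```

Both criteria hold if issuer_is_trusted succeeds for the old subject DN as well as for CN=IPA.LOCAL NEW CA.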

4. CA can issue certificates

# ipa cert-request --principal alice alice.csr
  Issuing CA: ipa
  Certificate: MIIC0zCCAbugAwIBAgIBCDAN...
  Subject: CN=alice,OU=pki-ipa,O=IPA
  Not Before: Thu May 31 05:14:42 2018 UTC
  Not After: Sun May 31 05:14:42 2020 UTC
  Serial number: 8
  Serial number (hex): 0x8

The certificate was issued by the new CA. Success.

5. Can renew HTTP and LDAP certificates

Because we are still trusting the old CA, there is no immediate need to renew the HTTP and LDAP certificates. But they will eventually expire, so we need to ensure that renewal works. getcert resubmit is used to initiate a renewal:

# getcert resubmit -i 20180530045952
Resubmitting "20180530045952" to "IPA".

# sleep 10

# getcert list -i 20180530045952
Number of certificates and requests being tracked: 9.
Request ID '20180530045952':
        status: MONITORING
        stuck: no
        key pair storage: type=FILE,location='/var/lib/ipa/private/httpd.key',pinfile='/var/lib/ipa/passwds/f28-0.ipa.local-443-RSA'
        certificate: type=FILE,location='/var/lib/ipa/certs/httpd.crt'
        CA: IPA
        issuer: CN=IPA.LOCAL NEW CA
        subject: CN=f28-0.ipa.local,OU=pki-ipa,O=IPA
        expires: 2020-05-31 15:24:05 AEST
        key usage: digitalSignature,nonRepudiation,keyEncipherment,dataEncipherment
        eku: id-kp-serverAuth,id-kp-clientAuth
        pre-save command: 
        post-save command: /usr/libexec/ipa/certmonger/restart_httpd
        track: yes
        auto-renew: yes

The renewal succeeded. Using openssl s_client we can see that the HTTP server is now presenting a certificate chain ending with the new CA certificate:

# echo | openssl s_client -showcerts \
    -connect f28-0.ipa.local:443 -servername f28-0.ipa.local \
    | grep s:
verify return:1
depth=0 O = IPA, OU = pki-ipa, CN = f28-0.ipa.local
verify return:1
 0 s:/O=IPA/OU=pki-ipa/CN=f28-0.ipa.local

So we are looking good against this criterion too.

6. A CA replica can be created

f28-1 was removed from the deployment at the beginning. To test CA replica installation, I enrolled it again using ipa-client-install, then executed ipa-replica-install --setup-ca. Installation completed successfully:

# ipa-replica-install --setup-ca
Password for admin@IPA.LOCAL:
Run connection check to master
Connection check OK
Configuring directory server (dirsrv). Estimated time: 30 seconds
  [1/41]: creating directory server instance
  [26/26]: configuring certmonger renewal for lightweight CAs
Done configuring certificate server (pki-tomcatd).
Configuring Kerberos KDC (krb5kdc)
  [1/1]: installing X509 Certificate for PKINIT
Full PKINIT configuration did not succeed
The setup will only install bits essential to the server functionality
You can enable PKINIT after the setup completed using 'ipa-pkinit-manage'
Done configuring Kerberos KDC (krb5kdc).
Applying LDAP updates
Upgrading IPA:. Estimated time: 1 minute 30 seconds
  [1/9]: stopping directory server
  [2/9]: saving configuration
  [3/9]: disabling listeners
  [4/9]: enabling DS global lock
  [5/9]: starting directory server
  [6/9]: upgrading server
  [7/9]: stopping directory server
  [8/9]: restoring configuration
  [9/9]: starting directory server
Restarting the KDC

We have a clean sweep of the success criteria. Mission accomplished.

Recovery procedure, summarised

Distilling the trial-and-error exploration above down to the essential steps, we end up with the following procedure. Not every step is necessary in every case, and most steps do not necessarily have to be performed in the order shown here.

  1. Delete CA entries:

    # ipa ca-find --pkey-only --all \
        | grep dn: \
        | awk '{print $2}' \
        | xargs ldapdelete -D "cn=Directory Manager" -wXXXXXXXX
  2. Destroy the existing Dogtag instance, if present:

    # pkidestroy -s CA -i pki-tomcat
  3. Delete the CA server role entry for the current host, if present. For example:

    # ldapdelete -D "cn=Directory Manager" -wXXXXXXXX
  4. Move aside the old IPA CA certificate in the LDAP certificate store. By convention, the new RDN should be based on the subject DN. For example:

    % ldapmodrdn -D "cn=Directory Manager" -wXXXXXXXX -r \
        "cn=IPA.LOCAL IPA CA,cn=certificates,cn=ipa,cn=etc,dc=ipa,dc=local" \
        "cn=CN\=Certificate Authority\,O\=IPA.LOCAL 201805171453"
  5. Rename the IPA CA certificate nickname in the NSSDBs at /etc/dirsrv/slapd-{REALM}, /etc/ipa/nssdb and, if relevant, /etc/httpd/alias. Example command:

    # certutil -d /etc/dirsrv/slapd-IPA-LOCAL --rename \
        -n 'IPA.LOCAL IPA CA' --new-n 'OLD IPA CA'
  6. Remove Certmonger tracking requests for all Dogtag system certificates, and remove the tracking request for the IPA RA certificate:

    # for ID in ... ; \
        do getcert stop-tracking -i "$ID" ; \
        done
  7. Delete the IPA RA certificate and key:

    # rm -fv /var/lib/ipa/ra-agent.*
    removed '/var/lib/ipa/ra-agent.key'
    removed '/var/lib/ipa/ra-agent.pem'
  8. Run ipa-certupdate.
  9. Run ipa-ca-install.


The procedure developed in this post should cover most cases of CA installation failure or loss of the only CA master in a deployment. Inevitably the differences between versions of FreeIPA mean that the procedure may vary, depending on which version(s) you are using.

In this procedure, the new CA is installed with a different Subject DN. Conceptually, this is not essential. But reusing the same subject DN could cause problems for some programs. I wrote about this in an earlier post. Furthermore, to keep the CA subject DN the same would involve extra steps to ensure that serial numbers were not re-used. I am not interested in investigating how to pull this off. Just choose a new DN!

One feature request we sometimes receive is a CA uninstaller. The steps outlined in this post would suffice to uninstall a CA and erase knowledge of it from a deployment (apart from the CA certificate itself, which you would probably want to keep).

Looking ahead, I (or maybe someone else) could gather the cleanup steps into an easy to use script. Administrators or support personnel who have run into problems can execute the script to quickly restore their server to a state where the CA can (hopefully) successfully be installed.

May 31, 2018 12:00 AM

May 26, 2018

Fabiano Fidencio

openSUSE Conference 2018

This year's openSUSE Conference was held in Prague and, thanks to both my employer and the openSUSE Conference organizers, I was able to spend almost a full day there.

I headed to Prague with a Fleet Commander talk accepted and, as openSUSE Leap 15.0 was released yesterday, also with the idea of showing an unattended ("express") installation of the "as fresh as possible" Leap 15.0 happening on GNOME Boxes.

The conference was not so big, which made it easy to spot some old friends (Fridrich Strba, seriously? Meeting you after almost 7 years ... I have no words to describe my happiness on seeing you there!) and some known faces (such as Scott, whom I only meet at conferences :-)). I also met some people who either helped me a lot in the past (here I can mention the whole autoyast team, who gave me a lot of support when I was writing the autoinst.xml for libosinfo, which provides the support for openSUSE's express installations via GNOME Boxes) or who have some interest in some of the work I've been doing (such as Richard Brown, who's a well-known figure around the SUSE/openSUSE community, a GNOME Boxes user and also an enthusiastic supporter of our work done in libosinfo/osinfo-db).

About the talks ...

I re-shaped the very same Fleet Commander talk presented at FOSDEM'18 and also prepared a demo to show the audience the magic happening on a CentOS 7.5 environment. We had around 20 people in the audience, and the talk went considerably well considering that the demo just exploded. After leaving the conference room I took some time to debug what happened. It seems that my master machine just hung at some point, so the client machine wasn't able to download the desktop-profiles data from it (and it hung for so long that the DataProvider was marked as offline). As I didn't have time to do a "live-debug" session, I ended up proceeding with the rest of the talk. (Curiously, when writing this blog post I logged in to the client machine in order to debug the issue and the first thing that I saw was the "pink background"!!!) We even got a few questions! :-)
Sincerely, thanks to everyone who attended the talk!
I'm taking as an action item from this to write a blog post on how to debug those issues (end-to-end) because, due to the number of components involved, something can go wrong in different parts, different projects and so on.

By the end of my Fleet Commander talk, I took 5 minutes to say that I'm also a libosinfo maintainer (with a strong interest in the "tooling" part of the virtualization world :-)) and to mention that during the trip from Brno to Prague I had crafted some patches adding support for openSUSE Leap 15.0, which was released just yesterday, and that I'd like to show them an express installation performed via GNOME Boxes. In order to do so, I booted the ISO, set up my username and password, clicked on "Create" and left my laptop on the presentation desk till the end of the next presenter's talk (who was Carlos Soriano, presenting a nice "DevOps for GNOME with Flatpak" talk). At the end of Carlos' talk, I got back the mic and the screen and showed people that the installation had just worked. :-) The patches enabling this were submitted and hopefully we'll have them in both Fedora and openSUSE packages by Wednesday! :-)

So, summing up ... half of the demos worked, I've left both demos with action items (write a troubleshooting page and upstream the patches, which is already done) and I've met some really nice people in an equally nice environment!

Looking forward to attending the next openSUSE Conference and thanks a lot for having me there!

by (Fabiano Fidêncio) at May 26, 2018 08:36 PM

May 11, 2018

Fraser Tweedale

Certificate renewal and revocation in FreeIPA

A recent FreeIPA ticket has prompted a discussion about what revocation behaviour should occur upon certificate renewal. The ticket reported a regression: when renewing a certificate, ipa cert-request was no longer revoking the old certificate. But is revoking the certificate the correct behaviour in the first place?

This post discusses the motivations and benefits of automatically revoking a principal’s certificates when a new certificate is issued. It is assumed that subjects of certificates are FreeIPA principals. Conclusions do not necessarily apply to other environments or use cases.

Description of current behaviour

Notwithstanding the purported regression mentioned above, the current behaviour of FreeIPA is:

  • for host and service principals: when a new certificate is issued, revoke previous certificate(s)
  • for user principals: never automatically revoke certificates

The revocation behaviour that occurs during ipa cert-request is actually defined in ipa {host,service}-mod. That is, when a userCertificate attribute value is removed, the removed certificates get revoked.

One certificate per service: a bad assumption?

The automatic revocation regime makes a big assumption. Host or service principals are assumed to need only one certificate. This is usually the case. But it is not inconceivable that a service may need multiple certificates for different purposes. The current (intended) behaviour prevents a service from possessing multiple valid (non-revoked) certificates concurrently.

Certificate issuance scenarios

Let us abandon the assumption that a host or service only needs one certificate at a time. There are three basic scenarios where cert-request would be invoked to issue a certificate to a particular principal. In each scenario, there are different motivations and consequences related to revocation. We will discuss each scenario in turn.

Certificate for new purpose (non-renewal)

A certificate is being requested for some new purpose. The subject may already have certs issued to it for other purposes. Existing certificates should not be revoked. FreeIPA’s revocation behaviour excludes this use case for host and service certificates.

Renewal due to impending expiry

A certificate may be requested to renew an existing certificate. After the new certificate is issued, it does no harm to revoke the old certificate. But it is not necessary to revoke it; it will expire soon.

Renewal for other reasons

A certificate could be renewed in advance of its expiration time for any number of reasons (e.g. re-key due to compromise, add a Subject Alternative Name, etc.). Conservatively, we’ll lump all the possible reasons together and say that it is necessary to revoke the certificate that is being replaced.

What if the subject possesses multiple certificates for different purposes? Right now, for host and service principals we revoke them all.

Proposed changes

A common theme is emerging. When we request a certificate, we want to revoke at most one certificate, i.e. the certificate being renewed (if any). This suggestion is applicable to service/host certificates as well as user certificates. It would admit the multiple certificates for different purposes use case for all principal types.

How do we get there from where we are now?

Observe that ipa cert-request currently does not know (a) whether the request is a renewal or (b) what certificate is being renewed. Could we make cert-request smart enough to guess what it should do? Fuzzy heuristics could be employed to make a guess, e.g. by examining certificate attributes, validity period, the subject public key, the profile and CA that were used, and so on. The guessing logic would be complex, and could not guarantee a correct answer. It is not the right approach.

Perhaps we could remove all revocation behaviour from ipa cert-request. This would actually be a matter of suppressing the revocation behaviour of ipa {host,service}-mod. Revocation has always been available via the ipa cert-revoke command. This approach makes revocation a separate, explicit step.

Note that renewals via Certmonger could perform revocation via ipa cert-revoke in the renewal helper. If you had to re-key or reissue a certificate via getcert resubmit, it could revoke the old certificate automatically. The nice thing here is that there is no guesswork involved. Certmonger knows what cert it is tracking so it can nominate the certificate to revoke and leave the subject’s other certificates alone.

A nice middle ground might be to add a new option to ipa cert-request to specify the certificate that is being renewed/replaced, so that cert-request can revoke just that certificate, and remove it from the subject principal’s LDAP entry. The command might look something like:

% ipa cert-request /path/to/req.csr \
    --principal HTTP/ \
    --replace "CN=Certificate Authority,O=EXAMPLE.COM;42"

The replace option specifies the issuer and serial number of the certificate being replaced. After the new certificate is issued, ipa cert-request would attempt to revoke the specified certificate, and remove it from the principal’s userCertificate attribute. Certmonger would be able to supply the replace option (or whatever we call it).
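
To illustrate the "revoke at most one certificate" idea, here is a minimal sketch of how an issuer-and-serial value in the format shown above could be parsed and matched. The `--replace` option is only a proposal, and the names `CertRef`, `parse_replace` and `select_cert_to_revoke` are hypothetical:

```python
from typing import NamedTuple, Optional, Sequence


class CertRef(NamedTuple):
    issuer: str
    serial: int


def parse_replace(value: str) -> CertRef:
    """Split 'ISSUER_DN;SERIAL' on the last semicolon."""
    issuer, sep, serial = value.rpartition(";")
    if not sep or not issuer:
        raise ValueError("expected 'ISSUER_DN;SERIAL'")
    return CertRef(issuer.strip(), int(serial))


def select_cert_to_revoke(
    current: Sequence[CertRef], replace: Optional[str]
) -> Optional[CertRef]:
    """Return at most one certificate: the one named, if the subject holds it."""
    if replace is None:
        return None  # not a renewal: revoke nothing
    target = parse_replace(replace)
    return target if target in current else None


issuer = "CN=Certificate Authority,O=EXAMPLE.COM"
held = [CertRef(issuer, 41), CertRef(issuer, 42)]
print(select_cert_to_revoke(held, issuer + ";42"))
```

The subject's other certificate (serial 41 in this example) is left alone, which is the point of the proposal.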

For any of the above suggestions it would be necessary to prominently and clearly outline the changes in release notes. The change in revocation behaviour could catch users off guard. It is important not to rush any changes through. We’ll need to engage with our user base to explain the changes, and outline steps to preserve the existing revocation behaviour if so desired.

ipa {host,service}-mod changes

Another (independent) enhancement to consider is an option to suppress the revocation behaviour of ipa {host,service}-mod, so that certificates could be removed from host/service entries without revoking them. A simple --no-revoke flag would suffice.


In this post I discussed how the current revocation behaviour of FreeIPA prevents hosts and services from using multiple certificates for different purposes. This is not the majority use case but I feel that we should support the use case. And we can, with a refinement of ipa cert-request behaviour.

We ought to make it possible to revoke only the certificate being renewed. We can do this by preventing ipa cert-request from revoking certs and requiring a separate call to ipa cert-revoke. Alternatively, we can add an option to ipa cert-request for explicitly specifying the certificate(s) to revoke. In either case, the Certmonger renewal helpers can be changed to ensure that renewals via Certmonger revoke the old certificate (while leaving the subject’s other certificates alone!)

What do you think of the changes I’ve suggested? You can contribute to the discussion on the freeipa-devel mailing list.

May 11, 2018 12:00 AM

April 28, 2018

Fabiano Fidencio

Reporting Issues!

It's a recurring thing on the #sssd channel that people show up with different kinds of questions, expecting straightforward answers about the problem they're having.

While I understand the expectations, it's not that easy for anyone there to help you without knowing context, without seeing logs, without seeing configuration files.

We (as the SSSD team) have written down some documents that may help you, and I sincerely suggest that people take a look at those documents first. So, if you're having an issue, please, go through:

In our "User facing documentation" we have material about:

All of those should be useful to, at least, give us meaningful information so that we can start helping you!

Also, keep in mind that we do not spend our entire day checking #sssd. We still have to fix the bugs we have, come up with new cool stuff and whatnot. It means that just dropping a message on #sssd may not be the best way to get a quick answer (although sometimes it may work!).

So, please, do not be afraid to file an issue, following the instructions provided here. File the issue, provide us as much info as possible and drop us a ping on IRC in case we don't reply to your bug report quickly enough.

Last but not least, please, do not be this person:
21:27 -!- person [---] has joined #sssd
21:27 <person> Does there exist software worse than SSSD?
21:27 <person> I really don't think so
21:27 -!- person [---] has left #sssd []

Although I can understand how gratifying it is to tell us how bad our software is (mind that I also do not think it's great), we'd like to hear from the user what makes it so bad and then, hopefully, be able to improve it somehow. But for doing this, we need meaningful bug reports, patience from our users to wait till the issue is fixed and to deal with us with some back and forth of testing packages and, mainly, understanding that we may have been busy with a bunch of different issues (which, please, does not mean that your issue is not important to us ... it is, we just need enough time to get to it!).

by (Fabiano Fidêncio) at April 28, 2018 08:11 AM

April 25, 2018

William Brown

AD directory admins group setup

Recently I have been reading many of the Microsoft Active Directory best practices for security and hardening. These are great resources, and very well written. The major theme of the articles is “least privilege”, where accounts like Administrators or Domain Admins are overused and lead to further compromise.

A suggestion that is put forward by the author is to have a group that has no other permissions but to manage the directory service. This should be used to temporarily make a user an admin, then after a period of time they should be removed from the group.

This way you have no Administrators or Domain Admins, but you have an AD only group that can temporarily grant these permissions when required.

I want to explore how to create this and configure the correct access controls to enable this scheme.

Create our group

First, let’s create a “Directory Admins” group which will contain our members that have the rights to modify or grant other privileges.

# /usr/local/samba/bin/samba-tool group add 'Directory Admins'
Added group Directory Admins

It’s a really good idea to add this to the “Denied RODC Password Replication Group” to limit the risk of these accounts being compromised during an attack. Additionally, you probably want to make your “admin storage” group also a member of this, but I’ll leave that to you.

# /usr/local/samba/bin/samba-tool group addmembers "Denied RODC Password Replication Group" "Directory Admins"

Now that we have this, let’s add a member to it. I strongly advise you create special accounts just for the purpose of directory administration - don’t use your daily account for this!

# /usr/local/samba/bin/samba-tool user create da_william
User 'da_william' created successfully
# /usr/local/samba/bin/samba-tool group addmembers 'Directory Admins' da_william
Added members to group Directory Admins

Configure the permissions

Now we need to configure the correct dsacls to allow Directory Admins full control over directory objects. It may be possible to constrain this to modification of the cn=builtin and cn=users containers only, as directory admins might not need that much control over things like DNS modification.

If you want to constrain these permissions, only apply the following to cn=builtin instead - or even just the target groups like Domain Admins.

First we need the objectSID of our Directory Admins group so we can build the ACE.

# /usr/local/samba/bin/samba-tool group show 'directory admins' --attributes=cn,objectsid
dn: CN=Directory Admins,CN=Users,DC=adt,DC=blackhats,DC=net,DC=au
cn: Directory Admins
objectSid: S-1-5-21-2488910578-3334016764-1009705076-1104

Now with this we can construct the ACE.

(A;CI;RPWPLCLORC;;;S-1-5-21-2488910578-3334016764-1009705076-1104)

This permission grants:

  • RP: read property
  • WP: write property
  • LC: list child objects
  • LO: list objects
  • RC: read control
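
As an illustration, the ACE can be assembled from its parts. This is a minimal sketch: the helper name `build_ace` is hypothetical, and the field order assumed here is the usual SDDL one (type; flags; rights; object GUID; inherit object GUID; SID):

```python
def build_ace(ace_type, flags, rights, sid, object_guid="", inherit_object_guid=""):
    """Join the six SDDL ACE fields with semicolons and wrap them in parentheses."""
    fields = [ace_type, flags, "".join(rights), object_guid, inherit_object_guid, sid]
    return "({})".format(";".join(fields))


# objectSid of the Directory Admins group, as shown by samba-tool above.
DIRECTORY_ADMINS_SID = "S-1-5-21-2488910578-3334016764-1009705076-1104"

# Allow ("A"), inherited by child containers ("CI"), with the five rights listed above.
ace = build_ace("A", "CI", ["RP", "WP", "LC", "LO", "RC"], DIRECTORY_ADMINS_SID)
print(ace)
```

The two GUID fields stay empty here, which is why the string contains three consecutive semicolons before the SID.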

It could be possible to expand these rights: it depends if you want directory admins to be able to do “day to day” ad control jobs, or if you just use them for granting of privileges. That’s up to you. An expanded ACE might be:

# Same as Enterprise Admins

Now let’s actually apply this and do a test:

# /usr/local/samba/bin/samba-tool dsacl set --sddl='(A;CI;RPWPLCLORC;;;S-1-5-21-2488910578-3334016764-1009705076-1104)' --objectdn='dc=adt,dc=blackhats,dc=net,dc=au'
# /usr/local/samba/bin/samba-tool group addmembers 'directory admins' administrator -U 'da_william%...'
Added members to group directory admins
# /usr/local/samba/bin/samba-tool group listmembers 'directory admins' -U 'da_william%...'
# /usr/local/samba/bin/samba-tool group removemembers 'directory admins' administrator -U 'da_william%...'
Removed members from group directory admins
# /usr/local/samba/bin/samba-tool group listmembers 'directory admins' -U 'da_william%...'

It works!


With these steps we have created a secure account that has limited admin rights, able to temporarily promote users with privileges for administrative work - and able to remove it once the work is complete.
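
The "promote, do the work, demote" pattern can be wrapped so that the removal always happens. A minimal sketch, assuming the same samba-tool path and `-U` credentials style used above; the helper `temporary_member` and its injectable `run` parameter are hypothetical:

```python
import subprocess
from contextlib import contextmanager

SAMBA_TOOL = "/usr/local/samba/bin/samba-tool"


@contextmanager
def temporary_member(group, member, credentials, run=subprocess.check_call):
    """Add `member` to `group`, then remove it again even if the work fails."""
    auth = ["-U", credentials]
    run([SAMBA_TOOL, "group", "addmembers", group, member] + auth)
    try:
        yield
    finally:
        run([SAMBA_TOOL, "group", "removemembers", group, member] + auth)


# Dry run: print the commands instead of executing them.
with temporary_member("Directory Admins", "administrator", "da_william%...", run=print):
    print("... perform the administrative work here ...")
```

Passing `run=print` gives a dry run; leaving the default executes the commands for real, so try it against a test domain first.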

April 25, 2018 02:00 PM

April 19, 2018

William Brown

Understanding AD Access Control Entries

A few days ago I set out to work on making samba 4 my default LDAP server. In the process I was forced to learn about Active Directory Access controls. I found that while there was significant documentation around the syntax of these structures, very little existed explaining how to use them effectively.

What’s in an ACE?

If you look at the ACL of an entry in AD you’ll see something like:


This seems very confusing and complex (and someone should write a tool to explain these … maybe me). But once you can see the structure it starts to make sense.

Most of the access controls you are viewing here are DACLs or Discretionary Access Control Lists. These make up the majority of the output after ‘O:DAG:DAD:AI’. TODO: What does ‘O:DAG:DAD:AI’ mean completely?
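
As far as the SDDL format goes, the prefix appears to break down as O:DA (owner is Domain Admins), G:DA (primary group is Domain Admins) and D:AI (a DACL follows, with the auto-inherited flag set). A minimal sketch that splits such a string, assuming only the owner, group and DACL parts are present (any SACL part is ignored); the function name is hypothetical:

```python
import re

SDDL_RE = re.compile(
    r"^O:(?P<owner>.+?)G:(?P<group>.+?)D:(?P<dacl_flags>[A-Z]*)(?P<aces>(?:\([^)]*\))*)"
)


def split_sddl(sddl):
    """Return (owner, group, dacl_flags, [ace, ...]) for a simple SDDL string."""
    match = SDDL_RE.match(sddl)
    if match is None:
        raise ValueError("unrecognised SDDL string")
    aces = re.findall(r"\(([^)]*)\)", match.group("aces"))
    return match.group("owner"), match.group("group"), match.group("dacl_flags"), aces


owner, group, flags, aces = split_sddl("O:DAG:DAD:AI(A;;RPLCLORC;;;AN)(A;CI;RPWPLCLORC;;;AU)")
print(owner, group, flags, aces)
```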

After that there are many ACEs defined in SDDL (Security Descriptor Definition Language). The structure is as follows:


Each of these fields can take various values. These interact to form the access control rules that allow or deny access. Thankfully, you don’t need to adjust many fields to make useful ACE entries.
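
To make the field layout concrete, here is a minimal sketch that splits one ACE into named fields. The field names follow the usual SDDL ACE layout and are an assumption on my part; conditional ACEs with extra fields are not handled:

```python
from typing import NamedTuple


class Ace(NamedTuple):
    ace_type: str
    ace_flags: str
    rights: str
    object_guid: str
    inherit_object_guid: str
    account_sid: str


def parse_ace(text):
    """Split '(type;flags;rights;object_guid;inherit_object_guid;sid)' into fields."""
    body = text.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    fields = body.split(";")
    if len(fields) != 6:
        raise ValueError("expected 6 semicolon-separated fields, got %d" % len(fields))
    return Ace(*fields)


ace = parse_ace("(A;CI;RPLCLORC;;;AN)")
print(ace.ace_type, ace.ace_flags, ace.rights, ace.account_sid)
```

Empty fields come back as empty strings, which matches how most simple ACEs leave the two GUID fields unset.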

MS maintains a document of these field values here.

They also maintain a list of well-known SID values here.

I want to cover some common values you may see though:


Most of the types you’ll see are “A” and “OA”. These mean the ACE allows an access by the SID.


These change the behaviour of the ACE. Common values you may want to set are CI and OI. These determine that the ACE should be inherited to child objects. As far as the MS docs say, these behave the same way.

If you see ID in this field it means the ACE has been inherited from a parent object. In this case the inherit_object_guid field will be set to the guid of the parent that set the ACE. This is great, as it allows you to backtrace the origin of access controls!


This is the important part of the ACE - it determines what access the SID has over this object. The MS docs are very comprehensive about what these do, but common values are:

  • RP: read property
  • WP: write property
  • CR: control rights
  • CC: child create (create new objects)
  • DC: delete child
  • LC: list child objects
  • LO: list objects
  • RC: read control
  • WO: write owner (change the owner of an object)
  • WD: write dac (allow writing ACE)
  • SW: self write
  • SD: standard delete
  • DT: delete tree

I’m not 100% sure of all the subtle behaviours of these, because they are not documented that well. If someone can help explain these to me, it would be great.


We will skip some fields and go straight to SID. This is the SID of the object that is allowed the rights from the rights field. This field can take the full SID of the object, or it can take a “well known” alias for a SID. For example, ‘AN’ means “anonymous users” and ‘AU’ means “authenticated users”.


I won’t claim to be an AD ACE expert, but I did find the docs hard to interpret at first. Having a breakdown and explanation of the behaviour of the fields can help others, and I really want to hear from people who know more about this topic than me so that I can expand this resource to help others really understand how AD ACEs work.

April 19, 2018 02:00 PM

April 17, 2018

William Brown

Making Samba 4 the default LDAP server

Earlier this year Andrew Bartlett set me the challenge: how could we make Samba 4 the default LDAP server in use for Linux and UNIX systems? I’ve finally decided to tackle this, and write up some simple changes we can make, and decide on some long term goals to make this a reality.

What makes a unix directory anyway?

Great question - this is such a broad topic, even I don’t know if I can single out what it means. For the purposes of this exercise I’ll treat it as “what would we need from my previous workplace”. My previous workplace had a dedicated set of 389 Directory Server machines that served lookups mainly for email routing, application authentication and more. They didn’t really process a great deal of login traffic as the majority of the workstations were Windows - thus connected to AD.

What it did show was that Linux clients and applications:

  • Want to use anonymous binds and searches - Applications and clients are NOT domain members - they just want to do searches
  • The content of anonymous lookups should be “public safe” information. (IE nothing private)
  • LDAPS is a must for binds
  • MemberOf and group filtering is very important for access control
  • sshPublicKey and userCertificate;binary are important for 2fa/secure logins

This seems like a pretty simple list - but it’s not the model Samba 4 or AD ship with.

You’ll also want to harden a few default settings. These include:

  • Disable Guest
  • Disable 10 machine join policy

AD works under the assumption that all clients are authenticated via kerberos, and that kerberos is the primary authentication and trust provider. As a result, AD often ships with:

  • Disabled anonymous binds - All clients are domain members or service accounts
  • No anonymous content available to search
  • No LDAPS (GSSAPI is used instead)
  • no sshPublicKey or userCertificates (pkinit instead via krb)
  • Access control is a much more complex topic than just “matching an ldap filter”.

As a result, it takes a bit of effort to change Samba 4 to work in a way that suits both, securely.

Isn’t anonymous binding insecure?

Let’s get this one out the way - no it’s not. In every pen test I have seen, if you can get access to a domain joined machine, you probably have a good chance of taking over the domain in various ways. Domain joined systems and krb allow lateral movement and other issues that are beyond the scope of this document.

The lack of anonymous lookup is more about preventing information disclosure - security via obscurity. But it doesn’t take long to realise that this is trivially defeated (get one user account, guest account, domain member and you can search …).

As a result, in some cases it may be better to allow anonymous lookups because then you don’t have spurious service accounts, you have a clear understanding of what is and is not accessible as readable data, and you don’t need every machine on the network to be domain joined - you prevent a possible foothold of lateral movement.

So anonymous binding is just fine, as the unix world has shown for a long time. That’s why I have very few concerns about enabling it. Your safety is in the access controls for searches, not in blocking anonymous reads outright.

Installing your DC

If you run Fedora, as I do, you will need to build and install Samba from source so you can access the Heimdal Kerberos functions. Fedora’s samba 4 ships ADDC support now, but lacks some features like RODC that you may want. In the future I expect this will change though.

These documents will help guide you:


build steps

install a domain

I strongly advise you use options similar to:

/usr/local/samba/bin/samba-tool domain provision --server-role=dc --use-rfc2307 --dns-backend=SAMBA_INTERNAL --realm=SAMDOM.EXAMPLE.COM --domain=SAMDOM --adminpass=Passw0rd

Allow anonymous binds and searches

Now that you have a working domain controller, we should test you have working ldap:

/usr/local/samba/bin/samba-tool forest directory_service dsheuristics 0000002 -H ldaps://localhost --simple-bind-dn=''
ldapsearch -b DC=samdom,DC=example,DC=com -H ldaps://localhost -x

You can see the domain object but nothing else. Many other blogs and sites recommend a blanket “anonymous read all” access control, but I think that’s too broad. A better approach is to add the anonymous read to only the few containers that require it.

/usr/local/samba/bin/samba-tool dsacl set --objectdn=DC=samdom,DC=example,DC=com --sddl='(A;;RPLCLORC;;;AN)' --simple-bind-dn="" --password=Passw0rd
/usr/local/samba/bin/samba-tool dsacl set --objectdn=CN=Users,DC=samdom,DC=example,DC=com --sddl='(A;CI;RPLCLORC;;;AN)' --simple-bind-dn="" --password=Passw0rd
/usr/local/samba/bin/samba-tool dsacl set --objectdn=CN=Builtin,DC=samdom,DC=example,DC=com --sddl='(A;CI;RPLCLORC;;;AN)' --simple-bind-dn="" --password=Passw0rd

In AD, groups and users are found in cn=users, and some groups are in cn=builtin. So we allow read to the root domain object, then we set a read on cn=users and cn=builtin that inherits to its child objects. The attribute policies are derived elsewhere, so we can assume that things like kerberos data and password material are safe with these simple changes.

Configuring LDAPS

This is a reasonably simple exercise. Given a CA cert, key and cert we can place these in the correct locations samba expects. By default this is the private directory. In a custom install, that’s /usr/local/samba/private/tls/, but for distros I think it’s /var/lib/samba/private. Simply replace ca.pem, cert.pem and key.pem with your files and restart.

Adding schema

To allow adding schema to samba 4 you need to reconfigure the dsdb config on the schema master. To show the current schema master you can use:

/usr/local/samba/bin/samba-tool fsmo show -H ldaps://localhost --simple-bind-dn='' --password=Password1

Look for the value:

SchemaMasterRole owner: CN=NTDS Settings,CN=LDAPKDC,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au

Note the CN=LDAPKDC component: that’s the hostname of the current schema master.

On the schema master we need to adjust the smb.conf. The change you need to make is:

    dsdb:schema update allowed = yes

Now restart the instance and we can update the schema. The following LDIF should work if you replace the domain DN (shown here as DC=adt,DC=blackhats,DC=net,DC=au in place of ${DOMAINDN}) with your namingContext. You can apply it with ldapmodify:

dn: CN=sshPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: attributeSchema
cn: sshPublicKey
name: sshPublicKey
lDAPDisplayName: sshPublicKey
description: MANDATORY: OpenSSH Public key
oMSyntax: 4
isSingleValued: FALSE
searchFlags: 8

dn: CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: classSchema
cn: ldapPublicKey
name: ldapPublicKey
description: MANDATORY: OpenSSH LPK objectclass
lDAPDisplayName: ldapPublicKey
subClassOf: top
objectClassCategory: 3
defaultObjectCategory: CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
mayContain: sshPublicKey

dn: CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: modify
replace: auxiliaryClass
auxiliaryClass: ldapPublicKey
auxiliaryClass: posixAccount
auxiliaryClass: shadowAccount

sudo ldapmodify -f sshpubkey.ldif -D '' -w Password1 -H ldaps://localhost
adding new entry "CN=sshPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

adding new entry "CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

modifying entry "CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"
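
The ${DOMAINDN} substitution mentioned above can be done with a template rather than by hand-editing the LDIF. A minimal sketch using an abbreviated fragment of the schema LDIF; the function name `render_ldif` and the example DN are assumptions:

```python
from string import Template

# Abbreviated fragment of the LDIF above, with the domain DN as a placeholder.
LDIF_TEMPLATE = Template("""\
dn: CN=sshPublicKey,CN=Schema,CN=Configuration,${DOMAINDN}
changetype: add
objectClass: top
objectClass: attributeSchema
cn: sshPublicKey
lDAPDisplayName: sshPublicKey
""")


def render_ldif(domain_dn):
    """Fill in the namingContext; substitute() raises KeyError on a missing value."""
    return LDIF_TEMPLATE.substitute(DOMAINDN=domain_dn)


ldif = render_ldif("DC=samdom,DC=example,DC=com")
print(ldif)
```

The rendered text can then be written to a file and applied with ldapmodify as shown above.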

To my surprise, userCertificate already exists! The reason I missed it is a subtle AD schema behaviour: the LDAP attribute name is stored in lDAPDisplayName and may not be the same as the CN of the schema element. As a result, you can find this with:

ldapsearch -H ldaps://localhost -b CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au -x -D '' -W '(attributeId='

This doesn’t solve my issues: because I am a long time user of 389-ds, I need some ns compat attributes. Here I add the nsUniqueId value so that I can keep some compatibility.

dn: CN=nsUniqueId,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: attributeSchema
attributeID: 2.16.840.1.113730.3.1.542
cn: nsUniqueId
name: nsUniqueId
lDAPDisplayName: nsUniqueId
description: MANDATORY: nsUniqueId compatability
oMSyntax: 4
isSingleValued: TRUE
searchFlags: 9

dn: CN=nsOrgPerson,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: classSchema
governsID: 2.16.840.1.113730.3.2.334
cn: nsOrgPerson
name: nsOrgPerson
description: MANDATORY: Netscape DS compat person
lDAPDisplayName: nsOrgPerson
subClassOf: top
objectClassCategory: 3
defaultObjectCategory: CN=nsOrgPerson,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
mayContain: nsUniqueId

dn: CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: modify
replace: auxiliaryClass
auxiliaryClass: ldapPublicKey
auxiliaryClass: posixAccount
auxiliaryClass: shadowAccount
auxiliaryClass: nsOrgPerson

Now with this you can extend your users with the required data for SSH, certificates and maybe 389-ds compatibility.

/usr/local/samba/bin/samba-tool user edit william  -H ldaps://localhost --simple-bind-dn=''


Out of the box a number of the unix attributes are not indexed by Active Directory. To fix this you need to update the search flags in the schema.

Again, temporarily allow changes:

    dsdb:schema update allowed = yes

Now we need to add some indexes for common types. Note that in the nsUniqueId schema I already added the search flags. We also want these values to be preserved if they become tombstones so we can recover them.

/usr/local/samba/bin/samba-tool schema attribute modify uid --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify nsUniqueId --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify uidnumber --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify gidnumber --searchflags=9
# Preserve on tombstone but don't index
/usr/local/samba/bin/samba-tool schema attribute modify x509-cert --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify sshPublicKey --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify gecos --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify loginShell --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify home-directory --searchflags=24
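
The searchflags values above are bit masks. A minimal sketch that decodes them, covering only the low five bits as I understand them from the AD schema documentation (the descriptions are paraphrased and the function name is hypothetical):

```python
SEARCH_FLAGS = [
    (0x01, "index the attribute"),
    (0x02, "index the attribute per container"),
    (0x04, "ambiguous name resolution (ANR)"),
    (0x08, "preserve on tombstone"),
    (0x10, "copy when the object is copied"),
]


def decode_search_flags(value):
    """Return the description of every known bit set in `value`."""
    return [name for bit, name in SEARCH_FLAGS if value & bit]


for flags in (8, 9, 24):
    print(flags, decode_search_flags(flags))
```

So 9 means "index and preserve on tombstone", 8 means "preserve only", and 24 adds the copy bit to the preserve bit.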

AD Hardening

We want to harden a few default settings that could be considered insecure. First, let’s stop “any user from being able to domain join machines”.

/usr/local/samba/bin/samba-tool domain settings account_machine_join_quota 0 -H ldaps://localhost --simple-bind-dn=''

Now let’s disable the Guest account

/usr/local/samba/bin/samba-tool user disable Guest -H ldaps://localhost --simple-bind-dn=''

I plan to write a more complete samba-tool extension for auditing these and more options, so stay tuned!

SSSD configuration

Now that our directory service is configured, we need to configure our clients to utilise it correctly.

Here is my SSSD configuration, which supports sshPublicKey distribution, userCertificate authentication on workstations and SID -> uid mapping. In the future I want to explore sudo rules in LDAP with AD, and maybe even HBAC rules rather than GPO.

Please refer to my other blog posts on configuration of the userCertificates and sshKey distribution.

ignore_group_members = False

# There is a bug in SSSD where this actually means "ipv6 only".
# lookup_family_order=ipv6_first
cache_credentials = True
id_provider = ldap
auth_provider = ldap
access_provider = ldap
chpass_provider = ldap
ldap_search_base = dc=blackhats,dc=net,dc=au

# This prevents an infinite referral loop.
ldap_referrals = False
ldap_id_mapping = True
ldap_schema = ad
# Rather than being in domain users group, create a user private group
# automatically on login.
# This is very important as a security setting on unix!!!
# See this bug if it doesn't work correctly.
auto_private_groups = true

ldap_uri = ldaps://
ldap_tls_reqcert = demand
ldap_tls_cacert = /etc/pki/tls/certs/bh_ldap.crt

# Workstation access
ldap_access_filter = (memberOf=CN=Workstation Operators,CN=Users,DC=blackhats,DC=net,DC=au)

ldap_user_member_of = memberof
ldap_user_gecos = cn
ldap_user_uuid = objectGUID
ldap_group_uuid = objectGUID
# This is really important as it allows SSSD to respect nsAccountLock
ldap_account_expire_policy = ad
ldap_access_order = filter, expire
# Setup for ssh keys
ldap_user_ssh_public_key = sshPublicKey
# This does not require ;binary tag with AD.
ldap_user_certificate = userCertificate
# This is required for the homeDirectory to be looked up in the sssd schema
ldap_user_home_directory = homeDirectory

services = nss, pam, ssh, sudo
config_file_version = 2
certificate_verification = no_verification

domains =
homedir_substring = /home

pam_cert_auth = True







With these simple changes we can easily make samba 4 able to perform the roles of other unix focused LDAP servers. This allows stateless clients, secure ssh key authentication, certificate authentication and more.

Some future goals to improve this include:

  • CLI tools to manage sshPublicKeys and userCertificates easily
  • Ship samba 4 with schema templates that can be used
  • Schema querying (what objectclass takes this attribute?)
  • Group editing (same as samba-tool user edit)
  • Security auditing tools
  • user/group modification commands
  • Refactor and improve the cli tools python to be api driven - move the logic from netcmd into samdb so that samdb can be an API that python can consume more easily. Prevent duplication of logic.

The goal is so that an admin never has to see an LDIF ever again.

April 17, 2018 02:00 PM

April 10, 2018

Nathaniel McCallum

/boot on btrfs Subvolume

Historically, installing Fedora with /boot on a btrfs subvolume was impossible. The reason for this was actually somewhat trivial: grubby had a bug where it failed to detect the path to the kernel and initramfs when a btrfs subvolume was used. Fortunately, that bug has now been fixed.

However, there is one minor issue that still prevents you from installing stock Fedora in this configuration. Due to the aforementioned grubby bug, Anaconda excluded btrfs subvolumes from its internal whitelist for /boot. I have filed a trivial patch against Anaconda which enables this functionality. However, the Anaconda team does not have the time to review this patch for Fedora 28. You can follow this Anaconda effort at this bug as well.

The good news is that we can bypass this blocker using the magic of Anaconda update images! For your convenience, I have created these update images for Fedora 27. You can find them here. I will update them for Fedora 28 once it is released.

Installing Fedora 27 with /boot on a btrfs Subvolume

Begin by downloading your Fedora 27 media of choice and booting it. When you get to the bootloader menu, press tab.

Fedora 27 Bootloader Menu

This brings up the kernel command line interface. We will simply add the URL to the Anaconda update image as you see in the image below. Please note that you will need to adjust the architecture in the URL for your installation target.

Adding the Anaconda Update Image

Triple-check that you type this URL correctly as this will be the most common source of errors in this process. When you are sure it is correct, press Enter to continue. If you mistype it, Anaconda will fail to download the image and will give you an error.

From here you will continue the install as normal until you come to the main Anaconda menu screen shown below. You will need to choose the “Installation Source” option.

Main Anaconda Menu

Once in the Installation Source submenu, you will need to uncheck the checkbox shown in the image below. Doing this enables Anaconda to download and install the latest version of the packages (in our case, grubby) from the internet rather than the version that is on the install media. This step is necessary for Fedora 27 since we need the fixed grubby. However, this should be unnecessary for Fedora 28 which should contain a new enough version.

Installation Source

That’s it! Just continue the install like normal and put your /boot partition on a btrfs subvolume. Everything should work correctly from here.


We’d love to hear your feedback! If you encounter problems - or simply want to report your success - please leave a comment on one of the relevant bugs above.

April 10, 2018 03:36 PM

April 05, 2018

Adam Young

Recursive DNS and FreeIPA

DNS is essential to Kerberos. Kerberos Identity for servers is based around host names, and if you don’t have a common view between client and server, you will not be able to access your remote systems. Since DNS is an essential part of FreeIPA, BIND is one of the services integrated into the IPA server.

When a user wants to visit a public website, like this one, they click a link or type that URL into their browser’s navigation bar. The browser then requests the IP address for the hostname inside the URL from the operating system via a library call. On a Linux-based system, the operating system makes the DNS call to the server specified in /etc/resolv.conf. But what happens if the DNS server does not know the answer? It depends on how it is configured. In the simple case, where the server is not allowed to make additional calls, it returns a response that indicates the record is not found.

Since IPA is supposed to be the one-source-of-truth for a client system, it is common practice to register the IPA server as the sole DNS resolver. As such, it cannot just short-circuit the request. Instead, it performs a recursive search, passing the query to the machines it has set up as Forwarders. For example, I will often set up a sample server that points to the Google resolver. Or, now that CloudFlare has DNS privacy enabled, I might use that.

This is fine inside controlled environments, but is sub-optimal if the DNS portion of the IPA server is accessible on the public internet. It turns out that forwarding requests allows a DNS server to be used to attack these DNS servers via a distributed denial of service attack. In this attack, the attacker sends the request to all DNS servers that are acting as forwarders, and these forwarders hammer on the central DNS servers.

If you have set up a FreeIPA server on the public internet, you should plan on disabling Recursive DNS queries. You do this by editing the file /etc/named.conf and setting the values:

allow-recursion {"none";};
recursion no;

And restarting the named service.

And then everything breaks. All of your IPA clients can no longer resolve anything except the entries you have in your IPA server.

The fix for that is to add the (former) DNS forward address as a nameserver entry in /etc/resolv.conf on each machine, including your IPA server. Yes, it is a pain, but it limits the query capacity to only requests local to those machines. For example, if my IPA server is on (yes I know this is not routable, just for example) my resolv.conf would look like.


If you wonder whether your nameserver has this problem, use this site to test it.

by Adam Young at April 05, 2018 05:38 PM

Red Hat Blog

Ultimate Guide to Red Hat Summit 2018 Labs: Hands-on with RHEL

This year you’ve got a lot of decisions to make before you go to Red Hat Summit in San Francisco, CA from 8-10 May 2018.

There are breakout sessions, birds-of-a-feather sessions, mini sessions, panels, workshops, and instructor-led labs that you’re trying to juggle into your daily schedule. To help with these plans, let’s try to provide an overview of the labs in this series.

In this article let’s examine a track focusing only on Red Hat Enterprise Linux (RHEL). It’s a selection of labs where you’ll get hands-on with package management, OS security, dig into RHEL internals, build a RHEL image for the cloud and more.

The following hands-on labs are on the agenda, so let’s look at the details of each one.

From source to RPM in 120 minutes

In this lab, we’ll learn best practises for packaging software using the Red Hat Enterprise Linux native packaging format, RPM. We’ll cover how to properly build software from source code into RPM packages, create RPM packages from precompiled binaries, and automate RPM builds from source code version control systems (such as git) for use in CI/DevOps environments. And finally, we’ll hear tips and tricks from lessons learned, such as how to set up and work with pristine build environments and why such things are important to software packaging.
Presenters: Adam Miller, Red Hat, Rob Marti


The definitive Red Hat Enterprise Linux 7 hands-on lab

In this hands-on lab, Red Hat field solution architects will lead attendees through a mix of self-paced and instructor-led exercises covering the core and updated capabilities of Red Hat Enterprise Linux 7.4, including:

– Service startup and management with systemd
– Performance tuning & monitoring with performance co-pilot, tuned, and numad
– Storage management with ssm (System Storage Manager)
– Data de-duplication and compression with Permabit
– Network interface management with nmcli (Network Manager CLI)
– Dynamic firewall with firewalld
– System administration with Cockpit
– System security audit with oscap-workbench (OpenSCAP)
– System backup and recovery with rear (Relax and Recover)

Presenters: Christoph Doerbeck, Red Hat, Matthew St Onge, Red Hat, Joe Hackett, Red Hat, Rob Wilmoth, Red Hat, Eddie Chen, Red Hat

Defend yourself using built-in Red Hat Enterprise Linux security technologies

In this lab, you’ll learn about the built-in security technologies available to you in Red Hat Enterprise Linux.

You will use OpenSCAP to scan and remediate against vulnerabilities and configuration security baselines. You will then block possible attacks from vulnerabilities using Security-Enhanced Linux (SELinux) and use Network Bound Disk Encryption to securely decrypt your encrypted boot volumes unattended. You will also use USBGuard to implement basic white listing and black listing to define which USB devices are and are not authorized and how a USB device may interact with your system. Throughout your investigation of the security issues in your systems, you will utilize the improved audit logs and automate as many of your tasks as possible using Red Hat Ansible Automation. Finally, you will make multiple configuration changes to your systems across different versions of Red Hat Enterprise Linux running in your environment, in an automated fashion, using the Systems Roles feature.

Presenters: Lucy Kerner, Red Hat, Miroslav Grepl, Red Hat, Paul Moore, Red Hat, Martin Preisler, Red Hat, Peter Beniaris

Up and Running with Red Hat Identity Management

Red Hat identity management (IdM) can play a central role in user authentication, authorization, and control. It can manage critical security components such as SSH keys, host-based access controls, and SELinux contexts—in a standalone environment or in a trust relationship with a Microsoft Active Directory domain controller.

In this lab, you’ll learn:

– Basic installation and configuration of IdM
– Configuration of an IdM replica
– Joining of clients to the IdM domain
– Basic user and host management activities
– sudo setup
– SSH key management

Attendees will leave with a lab that can be repeated in their own environments and form the basics for a rudimentary environment.

Presenters: James Wildman, Red Hat, Chuck Mattern, Red Hat

Building a Red Hat Enterprise Linux gold image for Azure

In this practical lab, we’ll walk through the process of how to build a standard gold image for Red Hat Enterprise Linux that works in the Microsoft Azure public cloud. Customize the image for best practises, install the Windows Azure Live Agent, and convert the disk image to the Windows Azure format. This is necessary if you want to use Cloud Access with Azure, and not the marketplace.
Presenters: Dan Kinkead, Red Hat, James Read, Red Hat, Goetz Rieger, Red Hat, El Bedri Khaled, Microsoft

Stay tuned for more Red Hat Summit 2018 Labs and watch for more online under the tag #RHSummitLabs.

by Eric D. Schabell at April 05, 2018 05:00 AM

March 26, 2018

Jakub Hrozek

IPA sudo rules for local users

Central identity management solutions like IPA offer a very nice capability – in addition to storing users and groups on the server, also certain kinds of policies can be defined in one place and used by all the machines in the domain.

Sudo rules are one of the most common policies that administrators define for their domain in place of distributing or, worse, hand-editing /etc/sudoers. But what if the administrator wanted to take advantage of the central management of sudo rules, but apply these sudo rules to a user who is not part of the domain and only exists in the traditional UNIX files, like /etc/passwd?

This blog post illustrates several strategies for achieving that.

Option 1: Move the user to IPA

This might be the easiest solution from the point of view of policy management that allows you to really keep everything in one place. But it might not always be possible to move the user. A common issue arises with UIDs and file ownership. If the UIDs are somewhat consistent on the majority of hosts, perhaps it would be possible to use the ID views feature of IPA to present the UID a particular system expects. If it’s not feasible to move the user to the IPA domain, there are other options to consider.

Option 2: Use the files provider of SSSD

One of the core design principles of SSSD is that a user and its policies must both be served from SSSD and even from the same SSSD domain. This prevents unexpected results in case of overlapping user names between SSSD domains or between the data SSSD is handling and the rest of the system.

But at the same time, local users are by definition anchored in /etc/passwd. To be able to serve the local users (and more, for example better performance), SSSD has shipped a files data provider since its 1.15 release. At the moment (as of 1.16.1), the files provider mostly mirrors the contents of /etc/passwd and /etc/group and presents them via the SSSD interfaces. And that’s precisely what is needed to be able to serve the sudo rules as well.

Let’s consider an example. A client is enrolled in an IPA domain and there is a user called lcluser on the client. We want to let the user run sudo less to be able to display the contents of any file on the client.

Let’s start with adding the sudo rule on the IPA server. First, we add the less command itself:
$ ipa sudocmd-add --desc='For reading log files' /usr/bin/less

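Then we add the rule, allow the less command in it, and link the user:
$ ipa sudorule-add readfiles
$ ipa sudorule-add-allow-command readfiles --sudocmds=/usr/bin/less
$ ipa sudorule-add-user readfiles --users=lcluser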

Note that lcluser doesn’t exist in IPA, yet we were able to add the user with the regular sudorule-add-user command. When we inspect the sudo rule, we can see that IPA detected that the user is external to its directory and stored the username in a special attribute, externalUser, unlike the IPA user admin, which is linked to its LDAP object.
$ ipa sudorule-show --all --raw readfiles
dn: ipaUniqueID=31051b50-30d5-11e8-b1a7-5254004e66c1,cn=sudorules,cn=sudo,dc=ipa,dc=test
cn: readfiles
ipaenabledflag: TRUE
hostcategory: all
externaluser: lcluser
memberuser: uid=admin,cn=users,cn=accounts,dc=ipa,dc=test
ipaUniqueID: 31051b50-30d5-11e8-b1a7-5254004e66c1
memberallowcmd: ipaUniqueID=179de976-30d5-11e8-8e98-5254004e66c1,cn=sudocmds,cn=sudo,dc=ipa,dc=test
objectClass: ipaassociation
objectClass: ipasudorule

Finally, let’s link the sudo rule with the host on which we want the rule to apply:
$ ipa sudorule-add-host readfiles --hosts ipa.client.test

Now, we can configure the IPA client. We begin by defining a new SSSD domain that will mirror the local users, but we also add a ‘sudo_provider’:
[domain/files]
id_provider = files
sudo_provider = ldap
ldap_uri = ldap://unidirect.ipa.test

The reason it is preferable to use the IPA sudo provider over the LDAP provider is that the LDAP provider relies on schema conversion on the IPA server from IPA’s own schema into the schema normally used when storing sudo rules in LDAP as described in man sudoers.ldap. This conversion is done by the slapi-nis plugin and can be somewhat costly in large environments.

Finally, we enable the domain in the [sssd] section:
services = nss, pam, ssh, sudo, ifp
domains = files, ipa.test

Now, we should be able to run sudo as “lcluser”:
$ su - lcluser
[lcluser@client ~]$ sudo -l
[sudo] password for lcluser:
Matching Defaults entries for lcluser on client:
XAUTHORITY", secure_path=/sbin\:/bin\:/usr/sbin\:/usr/bin

User lcluser may run the following commands on client:
(root) /usr/bin/less

Woo, it works!

Option 3: Use the proxy SSSD provider proxying to files

The files provider was introduced to SSSD in the 1.15 version. If you are running an older SSSD version, this provider might not be available. Still, you can use SSSD, just with a somewhat more convoluted configuration using id_provider=proxy. All the server-side steps can be taken from the previous section; only the client-side configuration differs, and will look like this:

id_provider = proxy
proxy_lib_name = files
proxy_pam_target = none

sudo_provider = ldap
ldap_uri = ldap://unidirect.ipa.test

Using the files provider is preferable, because it actively watches for changes to /etc/passwd and /etc/group using inotify, so the provider receives any change made to the files instantly. The proxy provider reaches out to the files as if they were a generic data source. Also, as you can see, the configuration is not as nice, but it’s a valid option for older releases.

Option 4: Use the LDAP connector built into sudo

If using SSSD is not an option at all, because perhaps you are configuring a non-Linux client, you can still point sudo to the IPA LDAP server manually. I’m not going to go into details, because the setup is nicely described in “man sudoers.ldap”.

Unlike all the above options, this doesn’t give you any caching, though.

by jhrozek at March 26, 2018 10:33 AM

Fraser Tweedale

Can we teach an old Dogtag new tricks?

Dogtag is a very old program. It started at Netscape. It is old enough to vote. Most of it was written in the early days of Java, long before generics or first-class functions. A lot of it has hardly been touched since it was first written.

Old code often follows old practices that are no longer reasonable. This is not an indictment of the original programmers! The capabilities of our tools usually improve over time (certainly true for Java). The way we solve problems often improves over time too, through better libraries and APIs. And back in the ’90s sites like Stack Overflow didn’t exist and there wasn’t as much free software to learn from. Also, observe that Dogtag is still here, 20 years on, used by customers and being actively developed. This is a huge credit to the original developers and everyone who worked on Dogtag in the meantime.

But we cannot deny that today we have a lot of very old Java code that follows outdated practices and is difficult to reason about and maintain. And maintain it we must. Bugs must be fixed, and new features will be developed. Can Dogtag’s code be modernised? Should it be modernised?

Costs of change, costs of avoiding change

One option is to accept and embrace the status quo. Touch the old code as little as possible. Make essential fixes only. Do not refactor classes or interfaces. When writing new code, use the existing interfaces, even if they allow (or demand) unsafe use.

There is something to be said for this approach. Dogtag has bugs, but it is “battle hardened”. It is used by large organisations in security-critical infrastructure. Changing things introduces a risk of breaking things. The bigger the change, the bigger the risk. And Dogtag users are some of the biggest, most security-conscious and risk-conscious organisations out there.

On the other hand, persisting with the old code has some drawbacks too. First, there are certainly undiscovered bugs. Avoiding change except when there is a known defect means those bugs will stay hidden—until they manifest themselves in an unpleasant way! Second, old interfaces that require, for example, unsafe mutation of objects, can lead to new bugs when we do fix bugs or implement new features. Finally, existing code that is difficult to reason about, and interfaces that are difficult to use, slow down fixes and new development.

Case study: ACLs

Dogtag uses access control lists (ACLs) to govern what users can do in the system. The text representation of an ACL (with wrapping and indentation for presentation only) looks like:
  :allow (list,read) user="anybody"
    ;allow (create,modify,delete) group="Administrators"
  :Administrators may create and modify lightweight authorities

The fields are:

  1. Name of the ACL
  2. List of permissions covered by the ACL
  3. List of ACL entries. Each entry either grants or denies the listed permissions to users matching an expression
  4. Comment

The above ACL grants lightweight CA read permission to all users, while only members of the Administrators group can create, modify or delete them. A typical Dogtag CA subsystem might have around 60 such ACLs. The authorisation subsystem is responsible for loading and enforcing ACLs.

I have touched the ACL machinery a few times in the last couple of years. Most of the changes were bug fixes but I also implemented a small enhancement for merging ACLs with the same name. These were tiny changes; most ACL code is unchanged from prehistoric (pre-Git repo) times. The implementation has several significant issues. Let’s look at a few aspects.

Broken parsing

The ACL.parseACL method (source) converts the textual representation of an ACL into an internal representation. It’s about 100 lines of Java. Internally it calls ACLEntry.parseACLEntry which is another 40 lines.

The implementation is ad-hoc and inflexible. Fields are found by scanning for delimiters, and their contents are handled in a variety of ways. For fields that can have multiple values, StringTokenizer is used, as in the following (simplified) example:

StringTokenizer st = new StringTokenizer(entriesString, ";");
while (st.hasMoreTokens()) {
    String entryString = st.nextToken();
    ACLEntry entry = ACLEntry.parseACLEntry(acl, entryString);
    if (entry == null)
        throw new EACLsException("failed to parse ACL entries");
}

So what happens if you have an ACL like the following? Note the semicolon in the group name.

certificate:issue:allow (read) group="sysadmin;pki"
  :PKI sysadmins can read certificates

The current parser will either fail, or succeed but yield an ACL that makes no sense (I’m not quite sure which). I found a similar issue in real world use where group names contained a colon. The parser was scanning forward for a colon to determine the end of the ACL entries field:

int finalDelimIdx = unparsedInput.indexOf(":");
String entriesString = unparsedInput.substring(0, finalDelimIdx);

This was fixed by scanning backwards from the end of the string for the final colon:

int finalDelimIdx = unparsedInput.lastIndexOf(":");
String entriesString = unparsedInput.substring(0, finalDelimIdx);

Now colons in group names work as expected. But it is broken in a different way: if the comment contains a colon, parsing will fail. These kinds of defects are symptomatic of the ad-hoc, brittle parser implementation.
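To make the delimiter problems concrete, here is a minimal, self-contained sketch. The class AclDelimiterDemo and its method names are hypothetical and are not part of Dogtag; they only reproduce the scanning strategies shown in the simplified snippets above, applied to made-up input strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class (not Dogtag code): reproduces the three
// delimiter-scanning strategies discussed above on made-up inputs.
final class AclDelimiterDemo {

    // Split the entries field on ';', as the StringTokenizer loop does.
    static List<String> splitEntries(String entriesString) {
        List<String> entries = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(entriesString, ";");
        while (st.hasMoreTokens()) {
            entries.add(st.nextToken());
        }
        return entries;
    }

    // Forward scan: treat the first colon as the end of the entries field.
    static String entriesByFirstColon(String unparsedInput) {
        return unparsedInput.substring(0, unparsedInput.indexOf(":"));
    }

    // Backward scan: treat the last colon as the end of the entries field.
    static String entriesByLastColon(String unparsedInput) {
        return unparsedInput.substring(0, unparsedInput.lastIndexOf(":"));
    }

    public static void main(String[] args) {
        // A semicolon inside a quoted group name yields two bogus entries.
        for (String e : splitEntries("allow (read) group=\"sysadmin;pki\"")) {
            System.out.println("entry: " + e);
        }

        // A colon inside a group name cuts the forward scan short,
        // while the backward scan recovers the whole entries field.
        String colonInGroup =
            "allow (read) group=\"pki:admins\":PKI admins can read";
        System.out.println(entriesByFirstColon(colonInGroup));
        System.out.println(entriesByLastColon(colonInGroup));

        // A colon inside the comment leaks part of the comment into
        // the entries field when scanning backwards.
        String colonInComment =
            "allow (read) user=\"anybody\":Note: anybody can read";
        System.out.println(entriesByLastColon(colonInComment));
    }
}
```

Neither scan direction is right for every input, because the delimiters are searched for without any awareness of quoting or of which field is being read.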

Incomplete parsing

The ACLEntry.parseACLEntry method does not actually parse the access expressions. An ACL expression can look like:

user="caadmin" || group="Administrators"

The expression is saved in the ACLEntry as-is, i.e. as a string. Parsing is deferred to ACL evaluation. Parsing work is repeated every time the entry is evaluated. The deferral also means that invalid expressions are silently allowed and can only be noticed when they are evaluated. The effect of an invalid expression depends on the kind of syntax error, and the behaviour of the access evaluator.

Access evaluator expressions

The code that parses access evaluator expressions (e.g. user="bob") will accept any of =, !=, > or <, even when the nominated access evaluator does not handle the given operator. For example, user>"bob" will be accepted, but the user access evaluator only handles = and !=. It is up to each access evaluator to handle invalid operators appropriately. This is a burden on the programmer. It’s also confusing for users in that semantically invalid expressions like user>"bob" do not result in an error.

Furthermore, the set of access evaluator operators is not extensible. Dogtag administrators can write their own access evaluators and configure Dogtag to use them. But these can only use the =, !=, > or < operators. If you need more than four operators, need non-binary operators, or would prefer different operator symbols, too bad.
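One possible remedy, sketched below, is to let each access evaluator declare the operators it understands and to reject anything else when the expression is parsed. This is only an illustration under assumptions: the class EvaluatorExpressionCheck, its registry and its regular expression are hypothetical and do not correspond to Dogtag's actual classes. Only the user evaluator's operators (= and !=) come from the description above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch (not Dogtag code): validate the operator of an
// access evaluator expression at parse time instead of at evaluation.
final class EvaluatorExpressionCheck {

    // Hypothetical registry: evaluator name -> operators it handles.
    // Other evaluators would register their own operator sets here.
    private static final Map<String, Set<String>> SUPPORTED = new HashMap<>();
    static {
        SUPPORTED.put("user", new HashSet<>(Arrays.asList("=", "!=")));
    }

    // evaluator name, operator, double-quoted value
    private static final Pattern EXPR = Pattern.compile(
        "\\s*(\\w+)\\s*(!=|=|<|>)\\s*\"([^\"]*)\"\\s*");

    // Throws IllegalArgumentException for malformed expressions, unknown
    // evaluators, and operators the named evaluator does not handle.
    static void validate(String expr) {
        Matcher m = EXPR.matcher(expr);
        if (!m.matches()) {
            throw new IllegalArgumentException("malformed expression: " + expr);
        }
        Set<String> operators = SUPPORTED.get(m.group(1));
        if (operators == null) {
            throw new IllegalArgumentException("unknown evaluator: " + m.group(1));
        }
        if (!operators.contains(m.group(2))) {
            throw new IllegalArgumentException(
                "evaluator '" + m.group(1) + "' does not handle '" + m.group(2) + "'");
        }
    }

    static boolean isValid(String expr) {
        try {
            validate(expr);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("user=\"bob\""));   // accepted
        System.out.println(isValid("user!=\"bob\""));  // accepted
        System.out.println(isValid("user>\"bob\""));   // rejected at parse time
    }
}
```

The operator symbols are still fixed by the regular expression here, so this addresses the silent acceptance of invalid operators but not the extensibility problem.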

ACL evaluation

The AAclAuthz class (source) contains around 400 lines of code for evaluating ACLs for a given user and permissions. (This includes the expression parsing discussed above.) In addition, the typical access evaluator class (UserAccessEvaluator, GroupAccessEvaluator, etc.) has about 20 to 40 lines of code dealing with evaluation. The logic is not straightforward to follow.

There is at least one major bug in this code. There is a global configuration that controls whether an ACL’s allow rules or deny rules are processed first. The default is deny,allow, but if you change it to allow,deny, then a matching allow rule will cause denial! Observe (example simplified and commentary added by me):

if (order.equals("deny")) {
    // deny,allow, the default
    entries = getDenyEntries(nodes, perm);
} else {
    // allow,deny
    entries = getAllowEntries(nodes, perm);
}

while (entries.hasMoreElements()) {
    ACLEntry entry = entries.nextElement();
    if (evaluateExpressions(
            entry.getAttributeExpressions())) {
        // if we are in allow,deny mode, we just hit
        // a matching *allow* rule, and deny access
        throw new EACLsException("permission denied");
    }
}

The next step of this routine is to process the next set of rules. Like above, if we are in allow,deny mode and encounter a matching deny rule, access will be granted.

This is a serious bug! It completely reverses the meaning of ACLs. In most cases the environment will be completely broken. It also poses a security issue. Because of how broken this setting is, the Dogtag team thinks that it’s unlikely that anyone is running in allow,deny mode. But we can’t be sure, so the bug was assigned CVE-2018-1080.

This defect is present in the initial commit in the Dogtag Git repository (2008). It might have been present in the original implementation. But whenever it was introduced, the problem was not noticed. Several developers who made small changes over the years to the ACL code (logging, formatting, etc) did not notice it. Including me, until very recently.

How has this bug existed for so long? There are several possible factors:

  • Lack of tests, or at least lack of testing in allow,deny mode
  • Verbose, hard to read code makes it hard to notice a bug that might be more obvious in “pseudo-code”.
  • Boolean blindness. A boolean is just a bit, divorced from the context that constructed it. This can lead to misinterpretation. In this case, the boolean result of evaluateExpressions was misinterpreted as allow|deny; the correct interpretation is match|no-match.
  • Lack of code review. Perhaps peer code review was not practiced when the original implementation was written. Today all patches are reviewed by another Dogtag developer before being merged (we use Gerrit for that). There is a chance (but not a guarantee) we might have noticed that bug. Maybe a systematic review of old code is warranted.
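The boolean blindness point can be illustrated with a small Java sketch in which a rule reports either "no match" or an explicit result, rather than a bare boolean. Everything here is hypothetical (the class AclResultSketch, its enums, and the use of a Predicate over a user name as a stand-in for expression evaluation). It assumes that the first matching rule wins and that access is denied when nothing matches. It is not the fix that was applied to Dogtag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch (not Dogtag code): rule evaluation that returns an
// explicit result type, so "matched" cannot be confused with "allowed".
final class AclResultSketch {
    enum RuleType { ALLOW, DENY }
    enum RuleOrder { ALLOW_DENY, DENY_ALLOW }
    enum Result { ALLOWED, DENIED }

    static final class Rule {
        final RuleType type;
        final Predicate<String> matches; // stand-in for expression evaluation

        Rule(RuleType type, Predicate<String> matches) {
            this.type = type;
            this.matches = matches;
        }

        // Empty when the rule does not match; otherwise the result is
        // determined by the rule's own type.
        Optional<Result> evaluate(String user) {
            if (!matches.test(user)) {
                return Optional.empty();
            }
            return Optional.of(
                type == RuleType.ALLOW ? Result.ALLOWED : Result.DENIED);
        }
    }

    static Result evaluate(RuleOrder order, String user, List<Rule> rules) {
        RuleType first =
            (order == RuleOrder.ALLOW_DENY) ? RuleType.ALLOW : RuleType.DENY;
        List<Rule> ordered = new ArrayList<>();
        for (Rule r : rules) if (r.type == first) ordered.add(r);
        for (Rule r : rules) if (r.type != first) ordered.add(r);

        for (Rule r : ordered) {
            Optional<Result> result = r.evaluate(user);
            if (result.isPresent()) {
                return result.get(); // the first matching rule wins
            }
        }
        return Result.DENIED; // deny if no rule matched
    }

    // Made-up rules: allow alice, deny everyone.
    static List<Rule> sampleRules() {
        return Arrays.asList(
            new Rule(RuleType.ALLOW, u -> u.equals("alice")),
            new Rule(RuleType.DENY, u -> true));
    }

    public static void main(String[] args) {
        System.out.println(evaluate(RuleOrder.ALLOW_DENY, "alice", sampleRules()));
        System.out.println(evaluate(RuleOrder.DENY_ALLOW, "alice", sampleRules()));
        System.out.println(evaluate(RuleOrder.ALLOW_DENY, "bob", sampleRules()));
    }
}
```

Because the result of a matching rule is derived from the rule's type in exactly one place, changing the processing order cannot invert the meaning of a match.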

A better way?

So, looking at one small but important part of Dogtag, we see an old, broken implementation. Some of these problems can be fixed easily (the allow,deny bug). Others require more work (fixing the parsing, extensible access evaluator operators).

Is it worth fixing the non-critical issues? Taking Java as an assumption, it is debatable. The implementation could be cleaned up, type safety improved, bugs fixed. But Java being what it is, even if a lot of the parsing complexity was handled by libraries, the result would still be fairly verbose. Readability and maintainability would still be limited, because of the limitations of Java itself.

So let’s refine our assumption. Instead of Java, we will assume JVM. This opens up to us a bunch of languages that target the JVM, and libraries written using those languages. Dogtag will probably never leave the JVM, for various reasons. But there’s no technical reason we can’t replace old, worn out parts made of Java with new implementations written using languages that have more to offer in terms of correctness, readability and maintainability.

There are many languages that target the JVM and interoperate with Java. One such language is Haskell, an advanced, pure functional programming (FP) language. JVM support for Haskell comes in the guise of Eta. Eta is a fork of GHC (the most popular Haskell compiler) version 7.10, so any pure Haskell code that worked with GHC 7.10 will work with Eta. I won’t belabour any more gory details of the toolchain right now. Instead, we can dive right into a prototype of ACLs written in Haskell/Eta.

I Haskell an ACL

I assembled a Haskell prototype (source code) of the ACL machinery in one day. Much of this time was spent reading the Java implementation so I could preserve its semantics.

The prototype is not complete. It does not support serialisation of ACLs or the hierarchical nature of ACL evaluation (i.e. checking an authorisation on resource would check ACLs named, and foo). It does support parsing and evaluation. We shall see that it resolves the problems in the Java implementation discussed above.

The implementation is about 250 lines of code, roughly ⅓ the size of the Java implementation. It is much easier to read and reason about. Let’s look at a few highlights.

The definitions of the ACL data type, and its constituents, are straightforward:

type Permission = Text  -- type synonym, for convenience

data ACLRuleType = Allow | Deny
  deriving (Eq) -- auto-derive an equality
                -- test (==) for this type

-- a record type with 3 fields
data ACLRule = ACLRule
  { aclRuleType :: ACLRuleType
  , aclRulePermissions :: [Permission]
  , aclRuleExpression :: ACLExpression
  }

data ACL = ACL
  { aclName :: Text
  , aclPermissions :: [Permission]
  , aclRules :: [ACLRule]
  , aclDescription :: Text
  }

The definition of the ACL parser follows the structure of the data type. This aids readability and assists reasoning about correctness:

acl :: [Parser AccessEvaluator] -> Parser ACL
acl ps = ACL
  <$> takeWhile1 (/= ':') <* char ':'
  <*> (permission `sepBy1` char ',') <* char ':'
  <*> (rule ps `sepBy1` spaced (char ';')) <* char ':'
  <*> takeText

Each line is a parser for one of the fields of the ACL data type. The <$> and <*> infix functions combine these smaller parsers into a parser for the whole ACL type. permission and rule are parsers for the Permission and ACLRule data types, respectively. The sepBy1 combinator turns a parser for a single thing into a parser for a list of things.

Note that several of these combinators are not specific to parsers but are derived from, or part of, a common abstraction that parsers happen to inhabit. The actual parser library used is incidental. A simple parser type and all the combinators used in this ACL implementation, written from scratch, would take all of 50 lines.

The [Parser AccessEvaluator] argument (named ps) is a list of parsers for AccessEvaluator. This provides the access evaluator extensibility we desire while ensuring that invalid expressions are rejected. The details are down inside the implementation of rule and are not discussed here.

Next we’ll look at how ACLs are evaluated:

data ACLRuleOrder = AllowDeny | DenyAllow

data ACLResult = Allowed | Denied

evaluateACL
  :: ACLRuleOrder
  -> AuthenticationToken
  -> Permission
  -> ACL
  -> ACLResult
evaluateACL order tok perm (ACL _ _ rules _ ) =
  fromMaybe Denied result  -- deny if no rules matched
  where
    permRules =
      filter (elem perm . aclRulePermissions) rules

    orderedRules = case order of
      DenyAllow -> denyRules <> allowRules
      AllowDeny -> allowRules <> denyRules
    denyRules =
      filter ((== Deny) . aclRuleType) permRules
    allowRules =
      filter ((== Allow) . aclRuleType) permRules

    -- the first matching rule wins
    result = getFirst
      (foldMap (First . evaluateRule tok) orderedRules)

Given an ACLRuleOrder, an AuthenticationToken bearing user data, a Permission on the resource being accessed and an ACL for that resource, evaluateACL returns an ACLResult (either Allowed or Denied). The implementation filters rules for the given permission, orders the rules according to the ACLRuleOrder, and returns the result of the first matching rule, or Denied if no rules were matched.

evaluateRule
  :: AuthenticationToken
  -> ACLRule
  -> Maybe ACLResult
evaluateRule tok (ACLRule ruleType _ expr) =
  if evaluateExpression tok expr
    then Just (result ruleType)
    else Nothing
  where
    result Deny = Denied
    result Allow = Allowed

Could the allow,deny bug from the Java implementation occur here? It cannot. Instead of the rule evaluator returning a boolean as in the Java implementation, evaluateRule returns a Maybe ACLResult. If a rule does not match, its result is Nothing. If it does match, the result is Just Denied for Deny rules, or Just Allowed for Allow rules. The first Just result encountered is used directly. It’s still possible to mess up the implementation, for example:

result Deny = Allowed
result Allow = Denied

But this kind of error is less likely to occur and more likely to be noticed. Boolean blindness is not a factor.

Benefits of FP for prototyping

There are benefits to using functional programming for prototyping or re-implementing parts of a system written in less expressive languages.

First, a tool like Haskell lets you express the nature of a problem succinctly, and leverage the type system as a design tool as you work towards a solution. The solution can then be translated into Java (or Python, or whatever). Because of the less powerful (or nonexistent) type system, there will be a trade-off. You will either have to throw away some of the type safety, or incur additional complexity to keep it (how much complexity depends on the target language). It would be better if we didn’t have to make this trade-off (e.g. by using Eta). But the need to make the trade-off does not diminish the usefulness of FP as a design tool.

It’s also a great way of learning about an existing part of Dogtag, and checking assumptions. And for finding bugs, and opportunities for improving type safety, APIs or performance. I learned a lot about Dogtag’s ACL implementation by reading the code to understand the problem, then solving the problem using FP. Later, I was able to translate some aspects of the Haskell implementation (e.g. using sum types to represent ACL rule types and the evaluation order setting) back into the Java implementation (as enum types). This improved type safety and readability.

Going forward, for significant new code and for fixes or refactorings in isolated parts of Dogtag’s implementation, I will spend some time representing the problems and designing solutions in Haskell. The resulting programs will be useful artifacts in their own right; a kind of documentation.

Where to from here?

I’ve demonstrated some of the benefits of the Haskell implementation of ACLs. If the Dogtag development team were to agree that we should begin using FP in Dogtag itself, what would the next steps be?

Eta is not yet packaged for Fedora, let alone RHEL. So as a first step we would have to talk to product managers and release engineers about bringing Eta into RHEL. This is probably the biggest hurdle. One team asking for a large and rather green toolchain that’s not used anywhere else (yet) to be brought into RHEL, where it will have to be supported forever, is going to raise eyebrows.

If we clear that hurdle, then comes the work of packaging Eta. Someone (me) will have to become the package maintainer. And by the way, Eta is written in (GHC) Haskell, so we’ll also need to package GHC for RHEL (or RHEL-extras). Fortunately, GHC is packaged for Fedora, so there is less to do there.

The final stage would be integrating Eta into Dogtag. The build system will need to be updated, and we’ll need to work out how we want to use Eta-based functions and objects from Java (and vice-versa). For the ACLs system, we might want to make the old and new implementations available side by side, for a while. We could even run both implementations simultaneously in a sanity check mode, checking that results are consistent and emitting a warning when they diverge.
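The sanity check mode could look roughly like the following sketch. It is an assumption-laden illustration: SideBySideCheck is a hypothetical class, both implementations are modelled as plain functions from a query to a result, and warnings are collected in a list rather than written to Dogtag's logs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch (not Dogtag code): run the legacy and the candidate
// implementation on every query and record a warning when they diverge.
final class SideBySideCheck<Q, R> {
    private final Function<Q, R> legacy;
    private final Function<Q, R> candidate;
    private final List<String> warnings = new ArrayList<>();

    SideBySideCheck(Function<Q, R> legacy, Function<Q, R> candidate) {
        this.legacy = legacy;
        this.candidate = candidate;
    }

    R evaluate(Q query) {
        R expected = legacy.apply(query);
        R actual = candidate.apply(query);
        if (!Objects.equals(expected, actual)) {
            warnings.add("implementations diverge for " + query
                + ": legacy=" + expected + ", candidate=" + actual);
        }
        return expected; // the legacy answer stays authoritative
    }

    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        // Two made-up implementations that disagree about "anna".
        SideBySideCheck<String, String> check = new SideBySideCheck<>(
            user -> user.equals("alice") ? "ALLOWED" : "DENIED",
            user -> user.startsWith("a") ? "ALLOWED" : "DENIED");

        System.out.println(check.evaluate("alice"));
        System.out.println(check.evaluate("anna"));
        System.out.println(check.warnings());
    }
}
```

Returning the legacy result keeps behaviour unchanged while the new implementation is being validated; the roles could be swapped once confidence has been established.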


This post started with a discussion of the costs and risks of making (or avoiding) significant changes in a legacy system. We then looked in detail at the ACLs implementation in Dogtag, noting some of its problems.

We examined a prototype (re)implementation of ACLs in Haskell, noting several advantages over the legacy implementation. FP’s usefulness as a design tool was discussed. Then we discussed the possibility of using FP in Dogtag itself. What would it take to start using Haskell in Dogtag, via the Eta compiler which targets the JVM? There are several hurdles, technical and non-technical.

Is it worth all this effort, just to be in a position where we can (re)write even a small component of Dogtag in a language other than Java? A language that assists the programmer in writing correct, readable and maintainable software? In answering this question, the costs and risks of persisting with legacy languages and APIs must be considered. I believe the answer is “yes”.

March 26, 2018 12:00 AM

March 15, 2018

Fraser Tweedale

DN attribute value encoding in X.509

X.509 certificates use the X.500 Distinguished Name (DN) data type to represent issuer and subject names. X.500 names may contain a variety of fields including CommonName, OrganizationName, Country and so on. This post discusses how these values are encoded and compared, and problematic circumstances that can arise.

ASN.1 string types and encodings

ASN.1 offers a large number of string types, including:

  • NumericString
  • PrintableString
  • IA5String
  • UTF8String
  • BMPString
  • …several others

When serialising an ASN.1 object, each of these string types has a different tag. Some of the types have a shared representation for serialisation but differ in which characters they allow. For example, NumericString and PrintableString are both represented in DER using one byte per character. But NumericString only allows digits (0-9) and SPACE, whereas PrintableString also admits ASCII letters and a small set of punctuation characters. In contrast, BMPString uses two bytes to represent each character; it is equivalent to UTF-16BE. UTF8String, unsurprisingly, uses UTF-8.
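To make the difference concrete, here is a minimal sketch that encodes the same value under three of these types. The helper names (`der_length`, `der_string`) are made up for illustration, and no character-set validation is attempted; only the tag numbers and the content encodings come from the ASN.1 standards.

```python
# DER universal tag numbers for some ASN.1 string types.
TAGS = {
    "UTF8String": 0x0C,
    "NumericString": 0x12,
    "PrintableString": 0x13,
    "IA5String": 0x16,
    "BMPString": 0x1E,
}

# Python codec that produces the content octets for each type.
CODECS = {
    "UTF8String": "utf-8",
    "NumericString": "ascii",
    "PrintableString": "ascii",
    "IA5String": "ascii",
    "BMPString": "utf-16-be",
}


def der_length(n: int) -> bytes:
    """Encode a DER definite length (short form below 128, else long form)."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_string(type_name: str, value: str) -> bytes:
    """Encode value as tag, length, content. Allowed characters are not checked."""
    content = value.encode(CODECS[type_name])
    return bytes([TAGS[type_name]]) + der_length(len(content)) + content


printable = der_string("PrintableString", "Example")
utf8 = der_string("UTF8String", "Example")
bmp = der_string("BMPString", "Example")

for label, encoded in [("PrintableString", printable), ("UTF8String", utf8), ("BMPString", bmp)]:
    print(f"{label:16} {encoded.hex()}")
```

For an ASCII-only value the PrintableString and UTF8String encodings differ only in the tag byte, while the BMPString content is twice as long.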

ASN.1 string types for X.509 name attributes

Each of the various X.509 name attribute types uses a specific ASN.1 string type. Some types have a size constraint. For example:

X520countryName      ::= PrintableString (SIZE (2))
DomainComponent      ::= IA5String
X520CommonName       ::= DirectoryString (SIZE (1..64))
X520OrganizationName ::= DirectoryString (SIZE (1..64))

Hold on, what is DirectoryString? It is not a universal ASN.1 type; it is specified as a sum of string types:

DirectoryString ::= CHOICE {
    teletexString     TeletexString,
    printableString   PrintableString,
    universalString   UniversalString,
    utf8String        UTF8String,
    bmpString         BMPString }

Note that a size constraint on DirectoryString propagates to each of the cases. The constraint gives a maximum length in characters, not bytes.
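The character-versus-byte distinction matters as soon as a value contains non-ASCII characters. The following sketch uses a hypothetical helper, `within_size`, and counts Unicode code points, which is adequate for characters in the Basic Multilingual Plane:

```python
def within_size(value: str, lower: int, upper: int) -> bool:
    """Check an ASN.1 SIZE constraint, counted in characters rather than bytes."""
    return lower <= len(value) <= upper


# 64 copies of U+00E9 (e with acute accent): 64 characters, but more octets.
name = "\u00e9" * 64

char_count = len(name)
utf8_octets = len(name.encode("utf-8"))
bmp_octets = len(name.encode("utf-16-be"))

print(char_count, utf8_octets, bmp_octets)
print(within_size(name, 1, 64))
print(within_size(name + "x", 1, 64))
```

The 64-character value satisfies SIZE (1..64) even though its UTF-8 and BMPString encodings both occupy 128 octets.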

Most X.509 attribute types use DirectoryString, including common name (CN), organization name (O), organizational unit (OU), locality (L), state or province name (ST). For these attribute types, which encoding should be used? RFC 5280 provides some guidance:

The DirectoryString type is defined as a choice of PrintableString,
TeletexString, BMPString, UTF8String, and UniversalString.  CAs
conforming to this profile MUST use either the PrintableString or
UTF8String encoding of DirectoryString, with two exceptions.

The current version of X.509 only allows PrintableString and UTF8String. Earlier versions allowed any of the types in DirectoryString. The exceptions mentioned are grandfather clauses that permit the use of the now-prohibited types in environments that were already using them.

So for strings containing non-ASCII code points UTF8String is the only type you can use. But for ASCII-only strings, there is still a choice, and the RFC does not make a recommendation on which to use. Both are common in practice.

This poses an interesting question. Suppose two encoded DNs have the same attributes in the same order, but differ in the string encodings used. Are they the same DN?

Comparing DNs

RFC 5280 §7.1 outlines the procedure for comparing DNs. To compare strings you must convert them to Unicode, translate or drop some special-purpose characters, and perform case folding and normalisation. The resulting strings are then compared case-insensitively. According to this rule, DNs that use different string encodings but are otherwise the same are equal.
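A heavily simplified sketch of this style of comparison follows. It is an approximation only: the real procedure relies on the LDAP string preparation rules, which this code does not implement in full. Attribute values are modelled as hypothetical `(tag, content)` pairs, and only three string types are handled.

```python
import unicodedata

# Map DER tag numbers to the codec used for the content octets.
DECODERS = {
    0x0C: "utf-8",      # UTF8String
    0x13: "ascii",      # PrintableString
    0x1E: "utf-16-be",  # BMPString
}


def to_unicode(tag: int, content: bytes) -> str:
    """Convert the content octets of a string value to a Unicode string."""
    return content.decode(DECODERS[tag])


def normalise(text: str) -> str:
    """Approximate preparation: case fold, NFKC, collapse and trim whitespace."""
    folded = unicodedata.normalize("NFKC", text.casefold())
    return " ".join(folded.split())


def values_match(a: tuple, b: tuple) -> bool:
    """Compare two (tag, content) values regardless of their string type."""
    return normalise(to_unicode(*a)) == normalise(to_unicode(*b))


print(values_match((0x13, b"Example  Org"), (0x0C, b"example org")))
print(values_match((0x13, b"Example"), (0x1E, "EXAMPLE".encode("utf-16-be"))))
print(values_match((0x13, b"Example"), (0x0C, b"Sample")))
```

Under this scheme the string type is irrelevant once the value has been converted to Unicode; only the prepared text is compared.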

But the situation is more complex in practice. Earlier versions of X.509 required only binary comparison of DNs. For example, RFC 3280 states:

Conforming implementations are REQUIRED to implement the following
name comparison rules:

   (a)  attribute values encoded in different types (e.g.,
   PrintableString and BMPString) MAY be assumed to represent
   different strings;

   (b) attribute values in types other than PrintableString are case
   sensitive (this permits matching of attribute values as binary
   objects);

   (c)  attribute values in PrintableString are not case sensitive
   (e.g., "Marianne Swanson" is the same as "MARIANNE SWANSON"); and

   (d)  attribute values in PrintableString are compared after
   removing leading and trailing white space and converting internal
   substrings of one or more consecutive white space characters to a
   single space.
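Rules (a) to (d) translate almost directly into code. This sketch again models attribute values as hypothetical `(tag, content)` pairs, and it takes up the permission in rule (a) to treat values of different types as different strings:

```python
PRINTABLE_STRING = 0x13  # DER tag number for PrintableString


def squash(raw: bytes) -> bytes:
    """Rules (c) and (d): ignore case, trim, and collapse runs of white space."""
    return b" ".join(raw.split()).lower()


def rfc3280_match(a: tuple, b: tuple) -> bool:
    """Compare two (tag, content) attribute values using the older rules."""
    tag_a, raw_a = a
    tag_b, raw_b = b
    if tag_a != tag_b:
        # Rule (a): different types MAY be assumed to be different strings.
        return False
    if tag_a != PRINTABLE_STRING:
        # Rule (b): other types are case sensitive, so compare as binary.
        return raw_a == raw_b
    return squash(raw_a) == squash(raw_b)


print(rfc3280_match((0x13, b"Marianne  Swanson "), (0x13, b"MARIANNE SWANSON")))
print(rfc3280_match((0x13, b"Example"), (0x0C, b"Example")))
print(rfc3280_match((0x0C, b"Example"), (0x0C, b"example")))
```

Note how the second call fails even though both values spell the same text: the two values merely use different string types.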

Furthermore, RFC 5280 and earlier versions of X.509 state:

The X.500 series of specifications defines rules for comparing
distinguished names that require comparison of strings without regard
to case, character set, multi-character white space substring, or
leading and trailing white space.  This specification relaxes these
requirements, requiring support for binary comparison at a minimum.

This is a contradiction. The above states that binary comparison of DNs is acceptable, but other sections require a more sophisticated comparison algorithm. The combination of this contradiction, historical considerations and (no doubt) programmer laziness means that many X.509 implementations only perform binary comparison of DNs.

How CAs should handle DN attribute encoding

To ease certification path construction with clients that only perform binary matching of DNs, RFC 5280 states the following requirement:

When the subject of the certificate is a CA, the subject
field MUST be encoded in the same way as it is encoded in the
issuer field (Section 4.1.2.4) in all certificates issued by
the subject CA.  Thus, if the subject CA encodes attributes
in the issuer fields of certificates that it issues using the
TeletexString, BMPString, or UniversalString encodings, then
the subject field of certificates issued to that CA MUST use
the same encoding.

This is confusing wording, but in practical terms there are two requirements:

  1. The Issuer DN on a certificate must be byte-identical to the Subject DN of the CA that issued it.
  2. The attribute encodings in a CA’s Subject DN must not change (e.g. when the CA certificate gets renewed).

If a CA violates either of these requirements breakage will ensue. Programs that do binary DN comparison will be unable to construct a certification path to the CA.
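The failure is easy to reproduce by hand. The sketch below builds a one-attribute Name (just a CN) twice, once with PrintableString and once with UTF8String, and compares the results byte for byte, as a binary-matching client would. The helper names are hypothetical, and the encoder only supports short-form lengths.

```python
def tlv(tag: int, content: bytes) -> bytes:
    """DER tag-length-value with a short-form length (content under 128 bytes)."""
    if len(content) >= 0x80:
        raise ValueError("long-form lengths are not supported in this sketch")
    return bytes([tag, len(content)]) + content


OID_COMMON_NAME = bytes.fromhex("550403")  # content octets of OID 2.5.4.3


def name_with_cn(cn: str, string_tag: int) -> bytes:
    """Encode SEQUENCE { SET { SEQUENCE { OID, string value } } } for one CN."""
    value = tlv(string_tag, cn.encode("ascii"))
    attribute = tlv(0x30, tlv(0x06, OID_COMMON_NAME) + value)
    rdn = tlv(0x31, attribute)
    return tlv(0x30, rdn)


def binary_match(issuer_der: bytes, subject_der: bytes) -> bool:
    """What a client that only does binary DN comparison effectively checks."""
    return issuer_der == subject_der


ca_subject = name_with_cn("Example CA", 0x13)  # CA cert uses PrintableString
issuer_ok = name_with_cn("Example CA", 0x13)   # issued cert, same encoding
issuer_bad = name_with_cn("Example CA", 0x0C)  # issued cert, UTF8String

print(ca_subject.hex())
print(binary_match(issuer_ok, ca_subject))
print(binary_match(issuer_bad, ca_subject))
```

The two encodings of "Example CA" differ in a single tag byte, which is enough for such a client to fail to link the certificate to its issuer.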

For end-entity (or leaf) certificates, the subject DN is not used in any links of the certification path. Changing the subject attribute encoding when renewing an end-entity certificate will not break validation. But it could still confuse some programs that only do binary comparison of DNs (e.g. they might display two distinct subjects).

Processing certificate requests

What about when processing certificate requests—should CAs respect the attribute encodings in the CSR? In my experience, CA programs are prone to issuing certificates with the subject encoded differently from how it was encoded in the CSR. CAs may do various kinds of validation, substitution or addition of subject name attributes. Or they may enforce the use of a particular encoding regardless of the encoding in the CSR.

Is this a problem? It depends on the client program. In my experience most programs can handle this situation. Problems mainly arise when the issuer or subject encoding changes upon renewal (for the reasons discussed above).

If a CSR-versus-certificate encoding mismatch does cause a problem for you, you may have to create a new CSR with the attribute encodings you expect the CA to use for the certificate. In many programs this is not straightforward, if it is possible at all. If you control the CA you might be able to configure it to use particular encodings for string attributes, or to respect the encodings in the CSR. The options available and how to configure them vary among CA programs.


X.509 requires the use of either PrintableString or UTF8String for most DN attribute types. Strings consisting of printable 7-bit ASCII characters can be represented using either encoding. This ambiguity can lead to problems in certification path construction.

Formally, two DNs that have the same attributes and values are the same DN, regardless of the string encodings used. But there are many programs that only perform binary matching of DNs. To avoid causing problems for such programs a CA:

  • must ensure that the Issuer DN field on all certificates it issues is identical to its own Subject DN;
  • must ensure that Subject DN attribute encodings on CA certificates it issues to a given subject do not change upon renewal;
  • should ensure that Subject DN attribute encodings on end-entity certificates it issues to a given subject do not change upon renewal.

CAs will often issue certificates with values encoded differently from how they were presented in the CSR. This usually does not cause problems. But if it does cause problems, you might be able to configure the client program to produce a CSR with different attribute encodings. If you control the CA you may be able to configure it to treat attribute encodings differently. How to do these things is beyond the scope of this article.

March 15, 2018 12:00 AM
