FreeIPA Identity Management planet - technical blogs

April 04, 2026

Alexander Bokovoy

kurbu5: MIT Kerberos plugins in Rust

For a couple of years, Andreas Schneider and I have been working on a project we call the ‘local authentication hub’: an effort to use the Kerberos protocol to track authentication and authorization context for applications, regardless of whether the system they run on is enrolled into a larger organizational domain or is standalone. We aim to reuse the code and experience we got while developing Samba and FreeIPA over the past twenty years.

Local authentication hub

The local authentication hub relies on a Kerberos KDC available on demand on each system. We achieved this by allowing MIT Kerberos to communicate over UNIX domain sockets. On Linux systems, systemd allows processes to be started on demand when someone connects to a UNIX domain socket, and MIT Kerberos 1.22 has support for this mode.
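
For illustration, the socket side of such an on-demand KDC could look roughly like the following systemd unit. The socket path, unit name, and the shape of the paired service are assumptions for the sketch, not the exact configuration shipped with MIT Kerberos 1.22:

```ini
# krb5kdc.socket -- hypothetical unit; path and names are assumptions.
# systemd holds the listening UNIX domain socket and starts the matching
# krb5kdc.service only when a client actually connects.
[Unit]
Description=Kerberos KDC on a local UNIX domain socket

[Socket]
ListenStream=/run/krb5kdc/kdc.socket
# Hand all connections to a single KDC instance rather than spawning
# one process per connection.
Accept=no

[Install]
WantedBy=sockets.target
```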

A KDC accessible over a UNIX domain socket is not very useful in itself: it is only available within the context of a single machine (or a single container, or pod, if UNIX domain sockets are shared across multiple containers). Otherwise, it is a fully featured KDC with its own quirks. And we can start looking at what could be improved based on the enhanced context locality we have achieved. For example, a KDB driver can see host-specific network interfaces and thus be able to react to requests such as host/<ip.ad.dr.ess>@LOCALKDC-REALM dynamically—something that a centrally-managed KDC would only do through statically registered service principal names (SPNs), which are a pain to update as machines move across networks.

Adding support for dynamic features means new code needs to be written. MIT Kerberos is written in C, so our choices are either to continue writing in C or to integrate with whatever new language we choose. Initially, we kept the local KDC database driver written in C and decided to build the infrastructure we need in Rust. The end goal is to have most bits written in Rust.

The local KDC database isn’t supposed to handle millions of principal entries, but even for millions of them, MIT Kerberos has a pretty good default database driver built on LMDB: klmdb. We wanted to get out of the data store business and instead focus on higher-level logic. Thus, we made the same change I made in Samba around 2003 for virtual file system modules: we introduced support for stackable KDB drivers. This is also a part of the MIT Kerberos 1.22 release: a KDB driver implementation can ask the KDC to load a different KDB driver and choose to delegate some requests to it. The local KDC driver is using klmdb for that purpose.
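The stacking model can be illustrated with a small Rust sketch. All trait and type names below are invented for illustration and do not match the actual KDB or kurbu5 APIs; the point is only the delegation pattern: the outer driver answers host-specific requests itself and forwards everything else to a backing store (klmdb in the real localkdc setup).

```rust
// Illustrative sketch of a stackable KDB driver. Names are invented.
trait KdbDriver {
    fn get_principal(&self, name: &str) -> Option<String>;
}

// Stand-in for the klmdb-backed store.
struct LmdbDriver;
impl KdbDriver for LmdbDriver {
    fn get_principal(&self, name: &str) -> Option<String> {
        // A real driver would look the entry up in LMDB.
        if name == "alice@LOCAL" { Some("alice entry".into()) } else { None }
    }
}

// The local KDC driver stacks on top and handles dynamic host principals.
struct LocalKdcDriver<D: KdbDriver> {
    backing: D,
}

impl<D: KdbDriver> KdbDriver for LocalKdcDriver<D> {
    fn get_principal(&self, name: &str) -> Option<String> {
        if name.starts_with("host/") {
            // Synthesize an entry from local context (e.g. network interfaces).
            Some(format!("dynamic entry for {name}"))
        } else {
            // Delegate ordinary lookups to the backing driver.
            self.backing.get_principal(name)
        }
    }
}

fn main() {
    let kdc = LocalKdcDriver { backing: LmdbDriver };
    assert!(kdc.get_principal("host/192.0.2.1@LOCAL").is_some());
    assert_eq!(kdc.get_principal("alice@LOCAL"), Some("alice entry".to_string()));
    println!("stacked lookup ok");
}
```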

With the database handled for us by klmdb, we focused on the local KDC-specific logic. We wanted to dynamically discover user principals from the operating system so that administrators do not need to maintain separate databases for them. systemd provides a userdb API to query such information over a varlink interface (also available over a UNIX domain socket) in a structured way, using JSON format. Thus, the Kirmes project was born. Kirmes is a Rust data library backed by the userdb API. It handles varlink communication through the wonderful Zlink library and exposes both asynchronous and synchronous access to user and group information.

The local KDC database driver prototype used the Kirmes C API. We demonstrated it at FOSDEM 2025: a user lookup is done over varlink, and if a user is present on the system, their Kerberos key is then looked up in klmdb using a specially-formatted userdb:<username> principal. You still need to handle those keys somehow, but there is a way to avoid that: use RADIUS.

Pre-authentication

A bit of historical reference. In 2012, Red Hat collaborated with MIT to introduce a KDC-side implementation of RFC 6560 (the OTP pre-authentication mechanism; at that point implemented in a proprietary solution by the RSA corporation). This mechanism allowed the KDC to get a hint out of a KDB driver and ask a RADIUS server to authenticate the credentials provided by the Kerberos client. Unlike traditional Kerberos symmetric keys, in this case, the client is sending a plain-text credential over the Kerberos protocol, and this credential can be forwarded to the RADIUS server. The plain-text nature of the RADIUS credential requires the use of a secure communication channel, and a good part of RFC 6560 relies on Flexible Authentication Secure Tunneling (FAST, RFC 6113), where a pre-existing Kerberos ticket is used to encrypt the content of that tunnel.

Since ~2013, FreeIPA has used this mechanism to provide multi-factor authentication mechanisms: HOTP/TOTP tokens, RADIUS proxying to remote servers, the OAuth2 device authorization grant flow, and FIDO2 tokens. The list of mechanisms can be extended, as long as the model fits into the somewhat constrained Kerberos exchange flow. FreeIPA handles all communication from the KDC side via a local UNIX domain socket-activated daemon, ipa-otpd, which performs a user principal lookup and then decides on the details of how that user will be authenticated.

For the local KDC case, we used a similar approach but wrote a simplified version, localkdc-pam-auth, which uses PAM to authenticate user credentials. It works well and allows for a drop-in replacement: once the local KDC is set up, users defined on the system will automatically be able to receive Kerberos tickets, with no need to change any passwords or migrate their credentials into the Kerberos KDC. All we need now is the business logic to guide the KDC to use the OTP pre-authentication mechanism so that our RADIUS ‘proxy’ (localkdc-pam-auth) gets activated. This logic is implemented and will be available in the first localkdc release soon.

API bindings

But back to the KDC side. As mentioned above, our goal was to write the local KDC database driver in a modern, safe language. Interfacing Rust with the MIT Kerberos KDC means building bindings that keep the code on both sides aligned. This is what this blog post is actually about (sorry for the long prelude…): how to make an MIT Kerberos KDB driver in Rust.

Today I published Kurbu5, a project that aims to provide these API bindings to Rust. The name is a transliteration of “krb5” into Mesopotamian cuneiform phonology: Kurbu-ḫamšat-qaqqadī—”The Blessed Five-Headed One”.

Creating API bindings is tedious work: there are many interfaces, each representing multiple functions and structures. MIT Kerberos has 12 interfaces which altogether expose roughly 117 methods that plugin authors implement, backed by around 70 supporting types (data structures passed into and out of those methods). It all sounds like a Tolkien tale: nine interfaces for core Kerberos functionality (checking password quality, mapping hostnames to Kerberos realms, mapping Kerberos principals to local accounts, selecting which credential cache to use, handling pre-authentication on both the client and server side, enforcing KDC policy, authorizing PKINIT certificates, and auditing events on the KDC side), the database backend interface, and two administrative interfaces. This is something that could be automated with agentic workflows—which I did to allow a parallel porting effort. The resulting agent instructions are useful artifacts in themselves: they show how to work when porting MIT Kerberos C code to Rust.
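The general binding pattern behind such crates can be sketched in a few lines: a C plugin interface is a table of function pointers, and a Rust trait implementation is dispatched to through small extern "C" trampolines. The vtable layout below is invented purely for illustration; the real MIT Kerberos plugin ABIs are far more involved:

```rust
// Sketch of dispatching a C-style plugin vtable to a safe Rust trait.
use std::ffi::c_void;

// What the C side would see: a context pointer plus function pointers.
#[repr(C)]
struct PluginVtable {
    ctx: *mut c_void,
    check: unsafe extern "C" fn(ctx: *mut c_void, value: i32) -> i32,
}

// The safe interface a Rust plugin author implements.
trait Plugin {
    fn check(&self, value: i32) -> i32;
}

// Trampoline: recover the boxed trait object and call into safe Rust.
unsafe extern "C" fn check_trampoline(ctx: *mut c_void, value: i32) -> i32 {
    let plugin = unsafe { &*(ctx as *const Box<dyn Plugin>) };
    plugin.check(value)
}

fn make_vtable(plugin: Box<dyn Plugin>) -> PluginVtable {
    // Double-box so we have a thin pointer to hand to C.
    let ctx = Box::into_raw(Box::new(plugin)) as *mut c_void;
    PluginVtable { ctx, check: check_trampoline }
}

struct Doubler;
impl Plugin for Doubler {
    fn check(&self, value: i32) -> i32 { value * 2 }
}

fn main() {
    let vt = make_vtable(Box::new(Doubler));
    // Simulate the C side invoking the plugin through the vtable.
    let result = unsafe { (vt.check)(vt.ctx, 21) };
    assert_eq!(result, 42);
    println!("trampoline dispatch ok: {result}");
    // The leaked ctx is fine for this demo; a real binding would free it
    // in the plugin's fini entry point.
}
```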

The result is split over several Rust crates to allow targeted reuse. The bulk of the code lives in three crates. The core Kerberos plugin crate (kurbu5-rs) is the largest at around 12,600 lines. The database backend crate (kurbu5-kdb-rs) follows at 5,600 lines, and the administration crate (kurbu5-kadm5-rs) at 3,100 lines. The remaining crates—the proc-macro derives and the raw FFI sys crates—are much smaller, with the sys crates being almost trivially thin (the KDB and kadm5 ones are under 40 lines each, since they mostly just re-export bindings from the main sys crate).

All crates are available on crates.io and share the same MIT license as the original MIT Kerberos.

  • kurbu5-sys — Raw FFI bindings to the MIT Kerberos libkrb5 and KDB plugin API
  • kurbu5-derive — Proc-macro derives for kurbu5-rs non-KDB plugin interfaces
  • kurbu5-rs — Safe, idiomatic Rust API for writing MIT Kerberos non-KDB plugin modules
  • kurbu5-kdb-sys — KDB plugin API re-export — thin wrapper over kurbu5-sys adding libkdb5 linkage
  • kurbu5-kdb-derive — Proc-macro derive for kurbu5-kdb-rs KDB driver plugins
  • kurbu5-kdb-rs — Safe, idiomatic Rust API for writing MIT Kerberos KDB driver plugins
  • kurbu5-kadm5-sys — KADM5 plugin API bindings — links libkadm5srv_mit and re-exports kurbu5-sys types
  • kurbu5-kadm5-derive — Proc-macro derives for kurbu5-kadm5-rs KADM5_AUTH and KADM5_HOOK plugin interfaces
  • kurbu5-kadm5-rs — Safe, idiomatic Rust API for writing MIT Kerberos KADM5_AUTH and KADM5_HOOK plugin modules

In the localkdc project, we use kurbu5 to build a KDB driver and provide our audit plugin. We also have an experimental re-implementation of the OTP pre-authentication mechanism, both client and KDC sides, that was used to test interoperability with MIT Kerberos versions. The core of the KDB driver is ~520 lines of heavily documented Rust code, mostly handling business logic.

April 04, 2026 07:10 PM

March 23, 2026

Alexander Bokovoy

ASN.1 for legacy apps: Synta

Pretty much everything I deal with requires parsing ASN.1 encodings. ASN.1 definitions are published as part of internet RFCs: certificates are encoded using DER, LDAP exchanges use BER, and Kerberos packets use DER as well. ASN.1 use is a never-ending source of security issues in pretty much all applications. Having safer ASN.1 processing is important for any application developer.

In FreeIPA we use three separate ASN.1 libraries: pyasn1 and x509 (part of PyCA) for Python code, and the asn1c code generator for C code. In fact, we use more: LDAP server plugins also use OpenLDAP’s lber library, while Kerberos KDC plugins use MIT Kerberos’s internal parsers.

The PyCA developers noted in their State of OpenSSL statement:

[…] when pyca/cryptography migrated X.509 certificate parsing from OpenSSL to our own Rust code, we got a 10x performance improvement relative to OpenSSL 3 (n.b., some of this improvement is attributable to advantages in our own code, but much is explainable by the OpenSSL 3 regressions). Later, moving public key parsing to our own Rust code made end-to-end X.509 path validation 60% faster — just improving key loading led to a 60% end-to-end improvement, that’s how extreme the overhead of key parsing in OpenSSL was.

That’s a 16x performance improvement over OpenSSL 3. OpenSSL has improved its performance since then, but it still pays an overhead for a very flexible design that allows loading cryptographic implementations from dynamic modules (providers). Support for externally provided modules is essential for adding new primitives and for government-enforced standards (such as FIPS 140), where implementations have to be validated in advance and code changes cannot happen without an expensive and slow re-validation process.

Nevertheless, in FreeIPA we focus on integrating with Linux distributions. Fedora, CentOS Stream, and RHEL enforce crypto consolidation rules, where all packaged applications must use the same crypto primitives provided by the operating system. We can process metadata ourselves, but all cryptographic operations still have to go through OpenSSL and NSS. Paying large performance costs during metadata processing would hurt infrastructure components such as FreeIPA.

FreeIPA is a large beast. Aside from its management component, written in Python, it has more than a dozen plugins for the 389-ds LDAP server, plugins for the MIT Kerberos KDC, plugins for Samba, and tight integration with SSSD, all written in C. Its default certificate authority software, Dogtag PKI, is written in Java and relies on its own stack of Java and C dependencies. We use PyCA’s x509 module for certificate processing in Python code, but we cannot use it and its underlying ASN.1 libraries from C: those libraries either aren’t exposed to C applications or are intentionally limited to PKI-related tasks.

For 2026-2028, I’m focusing on enabling FreeIPA to handle post-quantum cryptography (PQC) as part of the Quantum-Resistant Cryptography in Practice (QARC) project. The project is funded by the European Union under the Horizon Europe framework programme (Grant Agreement No. 101225691) and supported by the European Cybersecurity Competence Centre. One of the well-publicized aspects of moving to PQC certificates is their size. The following table, from the Post-Quantum Cryptography for Engineers IETF draft (Table 5), summarizes it well:

PQ Security Level Algorithm Public key size (bytes) Private key size (bytes) Signature size (bytes)
Traditional RSA2048 256 256 256
Traditional ECDSA-P256 64 32 64
1 FN-DSA-512 897 1281 666
2 ML-DSA-44 1312 2560 2420
3 ML-DSA-65 1952 4032 3309
5 FN-DSA-1024 1793 2305 1280
5 ML-DSA-87 2592 4896 4627

Public keys for ML-DSA-65 certificates are 7.6x bigger than RSA-2048 ones. You need to handle public keys in multiple situations: when verifying certificates against known certificate authorities (CAs), when matching their properties for validation and identity derivation during authorization, and when storing them. FreeIPA uses LDAP as a backend, so storing 7.6 times more data directly affects scalability as the number of users, machines, or Kerberos services grows. And since certificates are all ASN.1 encoded, I naturally wanted to establish a performance baseline for ASN.1 parsing.

Synta, ASN.1 library

I started with a small task: creating a Rust library, synta, to decode and encode ASN.1, with the help of AI tooling. It quickly grew to include its own ASN.1 schema parser and code generation tool. With those in place, I started generating more code, this time to process X.509 certificates, handle Kerberos packet structures, and so on. Throwing different tasks at Claude Code led to iterative improvements. Over a couple of months, the project grew to more than 60K lines of Rust code.

Language files blank comment code
Rust 207 9993 17492 67284
Markdown 52 5619 153 18059
Python 41 2383 2742 7679
C 17 852 889 4333
Bourne Shell 8 319 482 1640
C/C++ Header 4 319 1957 1138
TOML 20 196 97 896
YAML 1 20 46 561
make 4 166 256 493
CMake 3 36 25 150
JSON 6 0 0 38
diff 1 6 13 29
SUM 364 19909 24152 102300

I published some of the synta crates on crates.io yesterday; the whole project is available at codeberg.org/abbra/synta. In total, there are 11 crates, though only seven are published (synta-python is also available on PyPI):

Crate Lines (src/ only)
synta 10572
synta-derive 2549
synta-codegen 17578
synta-certificate 4549
synta-python 8953
synta-ffi 7843
synta-krb5 2765
synta-mtc 7876
synta-tools 707
synta-bench 0
synta-fuzz 3551

The benchmarking, fuzzer, and tools crates aren’t published; they are only needed for development purposes.

Performance

The numbers below were obtained on a Lenovo ThinkPad P1 Gen 5 (12th Gen Intel(R) Core(TM) i7-12800H, 64 GB RAM) running Fedora 42. This is pretty much 3-4 year old hardware.

Benchmarking is what brought this project to life, so let’s look at the numbers. When dealing with certificates, ASN.1 encoding can be parsed in different ways: you can visit every structure, or you can stop at the outer shells and only visit the remaining nested structures when you really need them. The former is “parse+fields” and the latter is “parse-only” in the following table, which compares synta with various Rust crates (and with OpenSSL and NSS, accessed through their Rust FFI bindings):

Library Parse-only Parse+fields vs synta (parse-only) vs synta (parse+fields)
synta 0.48 µs 1.32 µs
cryptography-x509 1.45 µs 1.43 µs 3.0× slower 1.1× slower
x509-parser 2.01 µs 1.99 µs 4.2× slower 1.5× slower
x509-cert 3.16 µs 3.15 µs 6.6× slower 2.4× slower
NSS 7.90 µs 7.99 µs 16× slower 6.1× slower
rust-openssl 15.4 µs 15.1 µs 32× slower 11× slower
ossl 16.1 µs 15.8 µs 33× slower 12× slower

“Parse+fields” tests access every named field: serial number, issuer/subject DNs, signature algorithm OID, signature bytes, validity period, public key algorithm OID, public key bytes, and version. The “parse+fields” speedup is the fair end-to-end comparison: synta’s parse-only advantage is large because most fields are stored as zero-copy slices deferred until access, while other libraries must materialise all fields eagerly at parse time.

The dominant cost in X.509 parsing is Distinguished Name traversal: a certificate’s issuer and subject each contain a SEQUENCE OF SET OF SEQUENCE with per-attribute OID lookup. synta defers this entirely by storing the Name as a RawDer<'a> — a pointer+length into the original input with no decoding. cryptography-x509 takes a similar deferred approach. The nom-based and RustCrypto libraries decode Names eagerly. NSS goes further and formats them into C strings, which is the dominant fraction of its 16× parse overhead.
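The zero-copy, deferred style can be shown with a minimal std-only sketch: read one DER tag-length header and hand back the content as a borrowed slice, leaving the nested structure undecoded until somebody asks for it. This is not synta's API, just an illustration of the technique; long-form tags and indefinite lengths are out of scope:

```rust
// Split one DER TLV off `input`: returns (tag, content, rest).
// `content` borrows from `input` -- nothing is copied or decoded.
fn parse_tlv(input: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    let (&tag, after_tag) = input.split_first()?;
    let (&len0, after_len) = after_tag.split_first()?;
    let (len, body) = if len0 < 0x80 {
        // Short form: the octet is the length itself.
        (len0 as usize, after_len)
    } else {
        // Long form: low bits give the number of length octets.
        let n = (len0 & 0x7f) as usize;
        if n == 0 || n > 4 || after_len.len() < n {
            return None;
        }
        let mut len = 0usize;
        for &b in &after_len[..n] {
            len = (len << 8) | b as usize;
        }
        (len, &after_len[n..])
    };
    if body.len() < len {
        return None;
    }
    Some((tag, &body[..len], &body[len..]))
}

fn main() {
    // SEQUENCE { INTEGER 5 } in DER: 30 03 02 01 05
    let der = [0x30, 0x03, 0x02, 0x01, 0x05];
    let (tag, content, rest) = parse_tlv(&der).unwrap();
    assert_eq!((tag, content, rest.len()), (0x30, &[0x02, 0x01, 0x05][..], 0));
    // The inner INTEGER stays undecoded until someone parses `content`.
    let (itag, ival, _) = parse_tlv(content).unwrap();
    assert_eq!((itag, ival), (0x02, &[0x05][..]));
    println!("zero-copy TLV ok");
}
```

Storing only such (pointer, length) spans at parse time is what makes the parse-only numbers above so low: the cost of a field is paid when, and only when, it is accessed.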

For benchmarking I used certificates from the PyCA test vectors. There are a few certificates with different properties, so we parse each of them multiple times and then average the numbers:

Certificate synta cryptography-x509 x509-parser x509-cert NSS
cert_00 (NoPolicies) 1333.7 ns 1386.7 ns 1815.9 ns 2990.6 ns 7940.3 ns
cert_01 (SamePolicies-1) 1348.8 ns 1441.0 ns 2033.4 ns 3174.3 ns 7963.8 ns
cert_02 (SamePolicies-2) 1338.6 ns 1440.1 ns 2120.1 ns 3205.6 ns 8206.8 ns
cert_03 (anyPolicy) 1362.4 ns 1468.3 ns 2006.2 ns 3194.5 ns 7902.4 ns
cert_04 (AnyPolicyEE) 1232.9 ns 1424.7 ns 1968.6 ns 3168.1 ns 7913.1 ns
Average 1323 ns 1432 ns 1989 ns 3147 ns 7985 ns

The gap between synta (1.32 µs) and cryptography-x509 (1.43 µs) is tighter here than in parse-only (3.0×) because synta’s field access includes two format_dn() calls (~800 ns combined) that cryptography-x509 gets effectively for free (its offsets were computed at parse time). synta still leads by ~8% overall.

Now, when parsing PQC certificates, an interesting thing happens: ML-DSA certificates are faster to parse than traditional ones.

Certificate synta cryptography-x509 x509-parser x509-cert NSS
ML-DSA-44 1030.9 ns 1256.4 ns 1732.2 ns 2666.0 ns 7286.9 ns
ML-DSA-65 1124.9 ns 1237.5 ns 1690.5 ns 2664.2 ns 7222.1 ns
ML-DSA-87 1102.6 ns 1226.5 ns 1727.2 ns 2696.6 ns 7284.6 ns
Average 1086 ns 1240 ns 1717 ns 2675 ns 7265 ns

synta’s ML-DSA parse+fields (1.09 µs) is faster than its traditional parse+fields (1.32 µs) because ML-DSA test certificates have shorter Distinguished Names (one attribute each in issuer and subject vs multiple attributes in traditional certificates in the test above). The signature BIT STRING — which is 2,420–4,627 bytes for ML-DSA — is accessed as a zero-copy slice with no size-dependent cost.

Processing CA databases

Imagine your app needs to test whether the certificate presented by a client is known to you (e.g. belongs to a set of trusted CAs). A library like OpenSSL looks at the client’s certificate, extracts identifiers of the certificate issuer, and looks up whether that issuer is known in the CA database. That requires looking up properties of the certificates in the database; the faster we can do that, the better.
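A sketch of that lookup, in the same zero-copy spirit: index the CA database by the raw DER bytes of each certificate's subject name, then resolve a client certificate's issuer with a single hash probe. The byte strings below stand in for real DER-encoded names; since DER is canonical, byte equality is name equality, so no DN ever has to be decoded on the lookup path:

```rust
// Illustrative issuer lookup over a CA database, keyed on raw DER bytes.
use std::collections::HashMap;

struct CaCert {
    subject_der: Vec<u8>, // raw subject Name bytes, used as the index key
    label: &'static str,
}

fn build_index(cas: &[CaCert]) -> HashMap<&[u8], &CaCert> {
    // Keying on the unparsed DER avoids decoding any DN at lookup time.
    cas.iter().map(|c| (c.subject_der.as_slice(), c)).collect()
}

fn main() {
    let cas = vec![
        CaCert { subject_der: b"CN=Root CA 1".to_vec(), label: "root1" },
        CaCert { subject_der: b"CN=Root CA 2".to_vec(), label: "root2" },
    ];
    let index = build_index(&cas);
    // A client cert's issuer DN (raw bytes) selects its CA directly.
    let issuer: &[u8] = b"CN=Root CA 2";
    assert_eq!(index.get(issuer).unwrap().label, "root2");
    assert!(index.get(b"CN=Unknown CA".as_slice()).is_none());
    println!("issuer lookup ok");
}
```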

All those numbers in the previous section are for a single certificate being parsed millions of times. In a real app we often need to validate the certificate against a system-wide database of certificate authorities. The database used by Fedora and other Linux distributions comes from Firefox. It contains 180 self-signed root CA certificates for all public CAs with diverse key types (RSA 2048/4096, ECDSA P-256/P-384) and DN structures. The median cert by DER size is “Entrust.net Premium 2048 Secure Server CA” (1,070 bytes); the benchmark uses this cert for single-certificate and field-access sub-benchmarks to get stable results that are not sensitive to certificate-size outliers.

Another dataset I benchmarked against is 9,898 certificates from the Common CA Database (CCADB), covering the full multi-level hierarchy used by Mozilla, Chrome, Apple, and Microsoft:

Depth Count Description
0 919 Root CAs (self-signed)
1 6,627 Intermediates issued directly by roots
2 2,212 Two levels deep
3 137 Three levels deep
4 3 Four levels deep

Intermediate CA certificates tend to have more complex DNs and more extensions than the root CAs in the Mozilla store. The CCADB median cert is “Bayerische SSL-CA-2014-01” (10,432 bytes). The CCADB certificates cover the past 30 years of certificate issuance on the internet.

To see how those benchmarks would behave if the CA root database were built with post-quantum cryptography, I rebuilt the CCADB corpus as ML-DSA certificates. Nine CCADB certificates were skipped: OpenSSL’s x509 -x509toreq -copy_extensions copy step failed to convert them to CSR form, typically because those certs use non-standard DER encodings or critical extensions that the x509toreq pipeline cannot copy into a PKCS#10 request. (The failures are in OpenSSL’s cert-to-CSR conversion; synta parses all 9,898 original CCADB certs without error.) This leaves 9,889 of the original 9,898 certs in the synthetic database.

The median cert by DER size is “TrustCor Basic Secure Site (CA1)” (6,705 bytes). ML-DSA certs range from 5,530 B to 16,866 B; the distribution is shifted left relative to the CCADB RSA/ECDSA median (10,432 B) because the smallest CCADB certs (compact root CAs with few extensions) become the new median position after ML-DSA key replacement enlarges all certs uniformly.

Benchmark Library Dataset Time Throughput
synta_parse_all synta Mozilla (180 certs) 87.8 µs 2.0 M/sec
nss_parse_all NSS Mozilla (180 certs) 1.577 ms 114 K/sec
openssl_parse_all rust-openssl Mozilla (180 certs) 3.552 ms 50.7 K/sec
ossl_parse_all ossl Mozilla (180 certs) 3.617 ms 49.8 K/sec
synta_parse_and_access synta Mozilla (180 certs) 261 µs 690 K/sec
synta_build_trust_chain synta Mozilla (180 certs) 11.6 µs
synta_parse_all synta CCADB (9,898 certs) 5.10 ms 1.94 M/sec
nss_parse_all NSS CCADB (9,898 certs) 106 ms 93 K/sec
openssl_parse_all rust-openssl CCADB (9,898 certs) 203 ms 48.8 K/sec
ossl_parse_all ossl CCADB (9,898 certs) 214 ms 46.3 K/sec
synta_parse_and_access synta CCADB (9,898 certs) 16.1 ms 615 K/sec
synta_parse_roots synta CCADB (919 roots) 457.7 µs 2.01 M/sec
synta_parse_intermediates synta CCADB (8,979 intermediates) 4.735 ms 1.90 M/sec
synta_build_dependency_tree synta CCADB (9,898 certs) 559 µs
synta_parse_all synta ML-DSA synth (9,889 certs) 5.78 ms 1.71 M/sec
nss_parse_all NSS ML-DSA synth (9,889 certs) 103 ms 96.4 K/sec
openssl_parse_all rust-openssl ML-DSA synth (9,889 certs) 239 ms 41.4 K/sec
ossl_parse_all ossl ML-DSA synth (9,889 certs) 256 ms 38.6 K/sec
synta_parse_and_access synta ML-DSA synth (9,889 certs) 17.5 ms 566 K/sec
synta_parse_roots synta ML-DSA synth (919 roots) 463 µs 1.98 M/sec
synta_parse_intermediates synta ML-DSA synth (8,970 ints.) 5.10 ms 1.76 M/sec
synta_build_dependency_tree synta ML-DSA synth (9,889 certs) 549 µs

NSS is 18–21× slower than synta across all three datasets; rust-openssl is 40–41× slower and ossl is 41–44× slower. All three C-backed libraries successfully parse ML-DSA certificates (NSS 3.120+ and OpenSSL 3.4+ support ML-DSA natively). NSS’s absolute parse time is nearly identical across CCADB traditional certs (106 ms) and ML-DSA synthetic certs (103 ms) — confirming that NSS’s dominant cost is eager DN formatting at parse time, which depends on DN attribute count rather than the signature algorithm. The slightly lower relative slowdown for NSS on ML-DSA (18× vs 21×) is entirely because synta is slower on ML-DSA (5.78 ms vs 5.10 ms), not because NSS is faster.

synta’s throughput is consistent at ~1.7–2.0 M certs/sec across all three datasets, confirming linear O(n) scaling. Parse rate is slightly lower for the ML-DSA synthetic hierarchy (1.71 M/sec) than for the CCADB traditional hierarchy (1.94 M/sec) because the larger ML-DSA SubjectPublicKeyInfo and signature BIT STRING fields add bytes to the tag+length-header scan that synta performs at parse time. The intermediates-only sub-benchmark is slightly lower than roots-only in each dataset (1.76 M/sec vs 1.98 M/sec for ML-DSA; 1.90 M/sec vs 2.01 M/sec for CCADB) because intermediate CAs tend to have more complex DNs and extension lists.

Finally, individual property access for a pre-parsed certificate, single field read, no allocation unless noted:

Field Mozilla (1,070 B) CCADB (10,432 B) ML-DSA (6,705 B) Notes
issuer_raw / subject_raw 4.1 / 4.1 ns 4.2 / 4.1 ns 4.5 / 4.4 ns Zero-copy slice
public_key_bytes / signature_bytes 4.1 / 4.1 ns 4.2 / 4.2 ns 4.6 / 4.4 ns Zero-copy slice
signature_algorithm / public_key_algorithm 5.9 / 5.4 ns 5.9 / 5.5 ns 6.3 / 6.4 ns OID → &'static str
serial_number 10.9 ns 6.8 ns 7.5 ns Integer → i64, length-dependent
validity 180 ns 206 ns 231 ns Two time-string allocations
issuer_dn 401 ns 224 ns 246 ns format_dn()String
subject_dn 404 ns 292 ns 324 ns format_dn()String

Zero-copy fields (issuer_raw, subject_raw, public_key_bytes, signature_bytes) cost ~4–5 ns — the price of reading a pointer and length from a struct field. The slightly higher cost for CCADB and ML-DSA fields vs Mozilla is within measurement noise.

identify_signature_algorithm() and identify_public_key_algorithm() match the OID component array against a static table and return &'static str — no allocation, no string formatting. The ~5–6 ns cost is a few comparisons and a pointer return.

serial_number cost depends on the integer’s byte length: the Entrust Mozilla cert carries a 16-byte serial number (parsed via SmallVec<[u8; 16]>), while the CCADB and ML-DSA synthetic medians have shorter serials. At 10.9, 6.8, and 7.5 ns respectively, all are negligible.

validity (~180–231 ns) allocates two strings: UTCTime and GeneralizedTime are formatted from their raw DER bytes into owned Strings. The two calls account for essentially all of the cost; the YYMMDDHHMMSSZ to RFC 3339 formatting is the dominant work.
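That conversion is small but allocation-bound, which a sketch makes obvious. This is not synta's code, just the shape of the work: reformatting a 13-byte UTCTime into an owned RFC 3339 string, with the two-digit-year window (00-49 mapping to 20xx, 50-99 to 19xx) as specified by RFC 5280. GeneralizedTime and field validation are omitted:

```rust
// Reformat a DER UTCTime (YYMMDDHHMMSSZ) into an owned RFC 3339 string.
fn utctime_to_rfc3339(utc: &str) -> Option<String> {
    let b = utc.as_bytes();
    if b.len() != 13 || b[12] != b'Z' {
        return None;
    }
    let yy: u32 = utc[0..2].parse().ok()?;
    let century = if yy < 50 { "20" } else { "19" }; // RFC 5280 window
    Some(format!(
        "{century}{}-{}-{}T{}:{}:{}Z",
        &utc[0..2], &utc[2..4], &utc[4..6], &utc[6..8], &utc[8..10], &utc[10..12]
    ))
}

fn main() {
    assert_eq!(
        utctime_to_rfc3339("260404191000Z").as_deref(),
        Some("2026-04-04T19:10:00Z")
    );
    assert_eq!(
        utctime_to_rfc3339("991231235959Z").as_deref(),
        Some("1999-12-31T23:59:59Z")
    );
    println!("validity formatting ok");
}
```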

format_dn() is the most variable field: it walks the Name DER bytes, decodes each SEQUENCE OF SET OF SEQUENCE, looks up each attribute OID by name, and formats the result into an owned String. The Mozilla cert’s issuer DN is more complex (multiple attributes, longer values: 401 ns) than the CCADB median (224 ns) or the ML-DSA synthetic median (246 ns). The ML-DSA synthetic median’s subject DN (324 ns) is slightly more expensive than the CCADB median (292 ns) because a different cert occupies the median position after key replacement. format_dn() cost is proportional to the DN’s attribute count and string lengths.

Why C Libraries Are Slower

CERT_NewTempCertificate (NSS) and OpenSSL’s d2i_X509 perform significantly more work per certificate than synta:

  1. Eager DN formatting — NSS formats the issuer and subject Distinguished Names into internal C strings during CERT_NewTempCertificate, even when the caller never reads them. Distinguished Name formatting is the single most expensive operation in certificate parsing; doing it unconditionally at parse time accounts for roughly 80% of NSS’s total parse cost. OpenSSL decodes DN structure eagerly as well.

  2. Arena and heap allocation — each NSS certificate allocates a PLArena block and copies the full DER buffer into it (copyDER = 1). OpenSSL allocates from the C heap. These allocations are additional work beyond decoding.

  3. Library state and locking — NSS acquires internal locks on every CERT_NewTempCertificate call to update the certificate cache, even when the resulting certificate is marked as temporary. This serialises concurrent parsing in multi-threaded applications.

  4. FFI boundary costs — the rust-openssl and ossl measurements include the overhead of crossing from Rust into the C library via extern "C" calls and pointer marshalling.

synta defers all of (1): issuer and subject are stored as RawDer<'a> (borrowed byte spans) and decoded only when the caller calls format_dn(). There is no locking, no arena, and no FFI boundary.

In these tests I also found out that PyCA’s cryptography-x509 doesn’t optimize repeated accesses to the same fields. That is typically not a problem if you just load a certificate and use it once, but if you have to come back to it multiple times, the cost becomes visible and hurts performance. So I submitted a pull request to apply some of the optimizations I found with synta. The pull request had to be split into smaller ones, and a few of them have already been merged, so access to the issuer, subject, and public key in certificates, and to some attributes in CSRs, is now up to 100x faster. The rest waits for improvements in PyO3 to save some memory use.
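The deferred-plus-cached access pattern described throughout this post can be sketched in a few lines of std-only Rust: keep the raw DN bytes as a borrowed span at parse time, and compute the formatted string at most once, on first access. The formatting body here is a placeholder, not a real format_dn():

```rust
// Deferred, memoized field access: format once, reuse on every later read.
use std::cell::OnceCell;

struct ParsedName<'a> {
    raw_der: &'a [u8],           // zero-copy span into the input buffer
    formatted: OnceCell<String>, // filled lazily, then reused
}

impl<'a> ParsedName<'a> {
    fn new(raw_der: &'a [u8]) -> Self {
        Self { raw_der, formatted: OnceCell::new() }
    }

    /// First call pays the formatting cost; later calls are a field read.
    fn format_dn(&self) -> &str {
        self.formatted.get_or_init(|| {
            // Placeholder for real DN decoding and attribute OID lookup.
            format!("CN={}", String::from_utf8_lossy(self.raw_der))
        })
    }
}

fn main() {
    let input = b"Example Root";
    let name = ParsedName::new(input);
    let first = name.format_dn().to_string();
    // Second access returns the cached string; no re-formatting happens.
    assert_eq!(name.format_dn(), first);
    assert_eq!(first, "CN=Example Root");
    println!("lazy DN formatting ok");
}
```

OnceCell (rather than a lock-based Lazy) keeps the memoization free of synchronization for the common single-threaded access path.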

March 23, 2026 08:33 AM
