By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 24, 2010.
This memo proposes a standard method for forming hashed User IDs in OpenPGP Certificates for users who want to take advantage of public OpenPGP infrastructure without exposing their User IDs to public enumeration. It also discusses implementation considerations to simplify the use of these User IDs in the existing OpenPGP Web of Trust.
Table of Contents
1. Introduction
  1.1. Requirements Language
2. Hashed User ID Format
3. Choice of Hash Algorithm
4. Implementation Considerations
  4.1. User ID Canonicalization
    4.1.1. Domain Name and URL Scheme Case-insensitivity
    4.1.2. IP Addresses
      4.1.2.1. IPv4 Addresses
      4.1.2.2. IPv6 Addresses
    4.1.3. Human Names
    4.1.4. Other Case-insensitivity
  4.2. Avoiding Loops
  4.3. Unusual Hash Algorithms
  4.4. User Interaction
  4.5. Local Storage
  4.6. Interaction with Trust Signature Regular Expressions
5. Rationales for decisions
6. Acknowledgements
7. IANA Considerations
8. Security Considerations
9. References
  9.1. Normative References
  9.2. Informative References
§ Authors' Addresses
§ Intellectual Property and Copyright Statements
1. Introduction
OpenPGP [RFC4880] certificates are traditionally available through a publicly accessible lookup (using keyservers or other transports). This public lookup mechanism provides a number of useful features, including certificate and signature revocation, metadata updates (including expiry), and third-party certification. However, public certificate retrieval combined with the multilateral nature of the Web of Trust (WoT) allows for trivial enumeration of all OpenPGP certificates in the well-connected set. Since OpenPGP certificates are easily mapped to their corresponding real-world entities by their User IDs, the real-world identities of keyholders in the well-connected set are also trivially enumerable.
Many established best practices discourage public enumeration of real-world entities. As OpenPGP User IDs grow to encompass a wide range of real-world entities, some users will be reluctant to adopt OpenPGP certificates because of these concerns. Allowing the binding between certificates and real-world entities to be a one-way binding enables users to take advantage of the benefits of OpenPGP infrastructure without exposing real-world identities to trivial public enumeration.
This document describes a standard method for forming hashed User IDs in OpenPGP Certificates to address these concerns. The specific form of User ID described is entirely compatible with existing OpenPGP implementations, though implementations aware of this specification may want to make modifications to offer a superior user experience when encountering hashed User IDs.
Crucial to this standard is the existence of a predictable one-way mapping from cleartext User IDs to certificates. But when this mechanism is in use, the reverse mapping (from certificates to cleartext User IDs) should be computationally infeasible. This is accomplished by passing a canonicalized version of the User ID through a standard digest algorithm and formatting the result in an unambiguous way.
Note that User IDs hashed by this mechanism can be searched for neither by substring nor by regular expression. They may only be found by direct lookup with an exact, full-text match on the User ID. Thus, if Bob uses this mechanism to hash his User ID, and if Alice already knows Bob's full identity, she can trivially find his key on the public keyservers or from a set of employee keys published elsewhere by Bob's employer. But if Mallory wants to get a list of everyone employed at Bob's place of work, she will be unable to retrieve Bob's information by the same means of retrieval. Note that this mechanism does not prevent Mallory from finding Bob's certificate if Mallory already knows Bob's identity.
New uses of OpenPGP certificates such as [RFC5081] suggest other uses for OpenPGP User IDs beyond simple [RFC5322] e-mail addresses. Section 5.11 of [RFC4880] explicitly states that there are no restrictions on the content of the User ID field as long as it is a UTF-8 [RFC3629] string. These novel uses raise the same concerns about real-world entity publication and enumeration as traditional User IDs do. This mechanism affords the same protections to all compliant forms of OpenPGP User IDs.
The OpenPGP standard [RFC4880] allows for the use of arbitrary UTF-8 text strings as User IDs. These User IDs are useful for the public Web of Trust (WoT) because they allow easy human-readable certification, revocation, and expiration of associated certificates on the public keyservers. However, there may be reluctance to take advantage of this infrastructure because the association between OpenPGP User IDs and their real-world counterparts can be enumerated on the public keyservers.
OpenPGP and the WoT provide powerful tools for cryptographic purposes, but can also be used to reveal otherwise-hidden information. Some potential consequences include: exposing the real-world identity associated with an e-mail address, exposing the e-mail address associated with a real-world identity, offline derivation of social and business relationship maps, and service or host enumeration. The WoT is not typically considered to be a repository for hidden information, since the User IDs themselves are generally not obscured in any way. As a result of this information being available and un-obscured (on OpenPGP keyservers or elsewhere) it can be used to trivially expose a comprehensive listing of all cryptographically authenticated identity information related to individuals, their social relationships, organizations, services, and hosts.
Individuals wish to use the cryptographic identity verification mechanisms provided by the OpenPGP WoT. However, some are uncomfortable publishing their identity because the WoT makes that information readily available. In response to those concerns, these individuals tend to use various problematic mechanisms to obscure this information or, more worrisome, simply do not participate in the WoT in any meaningful way.
Many are deeply troubled by publishing a complicated dossier detailing social connections. The process of exchanging OpenPGP signatures with individuals provides to the public not only the identities of those individuals, but also the social relationship between the two parties doing the exchange. These concerns are amplified by how trivial it is to build a relationship map with the readily available, unobscured data in the WoT. In the context of building a public key infrastructure, this mapping has functional use, but for many it raises privacy concerns. These concerns have been likened to the difference between letting a random person call your company and ask for Jenny Smith's phone number versus sending them a copy of your entire corporate phone directory.
The practice of using the User ID fields in OpenPGP keys for service and host authentication results in similar concerns around publishing easily enumerated information about internals of a network related to a given domain.
Enumeration concerns are not unique to OpenPGP. In fact, two common best practices in DNS configuration, namely "Split Horizon" and restricted zone transfers, are undertaken explicitly to limit enumeration of the entire list of names or other information contained in zones. Because DNS stores a wealth of information regarding the configuration of a network, the ability to enumerate such information is an invaluable resource for would-be attackers: the list can give important and detailed information about internal infrastructure that may not be otherwise published. Separating DNS into external and internal views ("Split Horizon") and ensuring that only approved slave servers can transfer zones from the primary name server are important mechanisms for restricting remote users to looking up records for domain names they already know, one at a time.
These best practices have evolved over time into legal requirements. European countries are bound by the EU Data Protection Directive, and as a result DENIC (http://en.wikipedia.org/wiki/DENIC), manager of the .de country-code top-level domain for Germany, has stated that zone enumeration violates Germany's Federal Data Protection Act. Additionally, the information obtained through zone enumeration can be used as a key for multiple WHOIS queries, which can reveal registrant data that many registrars are under strict legal obligations to protect under various contracts.
The impact of these legal considerations forced DNSSEC, as defined in [RFC4033] through [RFC4035], to address this issue. DNSSEC has the goal of increasing security; however, contrary to these best practices, it forces exposure of zone information. Although the IETF DNS Extensions working group originally considered zone enumeration to be a non-issue, arguing that DNS data is public, the significant concerns raised to the working group by large registrars about the legality of zone enumerability resulted in the creation of [RFC5155], "DNS Security (DNSSEC) Hashed Authenticated Denial of Existence", to specifically address this issue. In this scheme, instead of including the name directly (which would enable zone enumeration), the record includes a cryptographically hashed value of the name.
The OpenPGP WoT, as typically implemented in keyservers, is not the only mechanism available for worldwide public key infrastructure. However, contrary to the typical OpenPGP WoT implementation, other implementations have addressed enumeration concerns. For example, [RFC4398] describes how to distribute email certificates that DNSSEC can validate, making it possible to use DNSSEC as a worldwide public key infrastructure for email addresses. However, [RFC4398] acknowledges that most organizations are unlikely to implement this configuration due to enumerability concerns: "If an organization chooses to issue certificates for its employees, placing CERT RRs in the DNS by owner name, and if DNSSEC (with NSEC) is in use, it is possible for someone to enumerate all employees of the organization. This is usually not considered desirable, for the same reason that enterprise phone listings are not often publicly published and are even marked confidential."
This memo suggests a standard way to obscure the OpenPGP User ID such that entities who know the User ID they are looking for can use the cryptographic infrastructure, but entities without knowledge of the User ID in question can't enumerate the User IDs in use through the WoT alone.
This is accomplished by establishing a standard hashed format for User IDs, which can be used by compliant OpenPGP clients willing to offer this feature. As the hashed User ID is itself an [RFC4880]-conformant UTF-8 text string, tools can also make use of older, compliant clients by specifying the hashed User ID directly.
FIXME: describe threat model of snooping keyserver operator? Does this defend against such an attacker?
FIXME: describe threat model of snooping on the wire between keyserver and OpenPGP client. Is this the best defense against such an attacker?
FIXME: discuss what it means to sign a hashed User ID
FIXME: discuss querying in cleartext if the hashed UID fails to find anything
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
2. Hashed User ID Format
The standard hashed User ID MUST be a single line of ASCII text with three fields, delimited by the '#' character:
For example, the User ID "Mary User <mary@example.net>" would instead be represented (stored, transmitted, etc.) as "hash#SHA256#41992fd90a113fdbb700ee3f7b7c4e8ba6ee14ace7a2cc6f5c26c1e702138647".
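A minimal sketch of how such a User ID might be formed. It assumes, based only on the example above, that the three fields are the fixed string "hash", the name of the hash algorithm, and the lower-case hexadecimal digest of the UTF-8 encoded cleartext User ID. The function names are hypothetical, and the input is assumed to be already canonicalized as described under User ID Canonicalization.

```python
import hashlib

PREFIX = "hash"        # fixed first field, as seen in the example above
ALGORITHM = "SHA256"   # second field: name of the hash algorithm


def hash_user_id(cleartext: str) -> str:
    """Return a hashed User ID string for an already-canonicalized User ID.

    Assumption: the digest is computed over the UTF-8 bytes of the
    cleartext and rendered as lower-case hexadecimal.
    """
    digest = hashlib.sha256(cleartext.encode("utf-8")).hexdigest()
    return "#".join((PREFIX, ALGORITHM, digest))


def split_hashed_user_id(user_id: str):
    """Split a hashed User ID into its three '#'-delimited fields."""
    prefix, algorithm, digest = user_id.split("#")
    return prefix, algorithm, digest


if __name__ == "__main__":
    print(hash_user_id("Mary User <mary@example.net>"))
```

Because the result contains only ASCII letters, digits, and '#', it remains a valid UTF-8 User ID string for implementations that know nothing about this format.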
3. Choice of Hash Algorithm
OpenPGP implementations creating new hashed User IDs have a choice of which hash algorithm to use. Based on current understanding of the available hash algorithms and the specific requirements of this application, implementations SHOULD use the SHA256 algorithm (as specified in [FIPS180]) to generate new hashed User IDs.
FIXME: discuss clients querying keyservers
FIXME: discuss clients searching local keyring storage
FIXME: discuss migration to new hash algorithms
4. Implementation Considerations
4.1. User ID Canonicalization
Some types of User ID (such as those containing domain names inside of [RFC5322] e-mail addresses) have components that can be represented in various ways with the same semantic content. For a hashed User ID to be retrievable, a canonical form of the User ID SHOULD be used when creating and looking up the hashed User ID. This section attempts to establish reasonable canonical forms for relatively common types of User ID.
4.1.1. Domain Name and URL Scheme Case-insensitivity
User IDs may include DNS names internally, for example in [RFC5322] e-mail addresses or [RFC3986] URLs. [RFC4343] indicates that DNS names are case-insensitive. Any substring within a User ID representing a DNS name MUST be canonicalized to its lower-case representation before hashing.
Section 3.1 of [RFC3986] indicates that the scheme part of a Uniform Resource Locator (URL) is also case-insensitive, and that the canonical form is lower-case. Any substring within a User ID representing a URL scheme MUST be canonicalized to its lower-case representation before hashing.
For example:
The [RFC5322] e-mail address "Mary User <Mary@EXAMPLE.NET>" would be canonicalized to "Mary User <Mary@example.net>" before hashing.
The [RFC3986] URL "HTTPS://FOO.Example.NET" would be canonicalized to "https://foo.example.net" before hashing.
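The two rules above could be sketched as follows. This is an illustration under simplifying assumptions, not a complete parser: the e-mail address is assumed to appear inside angle brackets, the URL is assumed to use the "scheme://authority" form, and both function names are hypothetical. The local part of the e-mail address and any URL userinfo, path, or query are left untouched, since the text above only requires lower-casing of DNS names and URL schemes.

```python
import re

# "<local@domain>" as it typically appears in an e-mail style User ID.
_EMAIL = re.compile(r"<([^<>@\s]+)@([^<>\s]+)>")
# "scheme://authority" followed by the rest of the URL.
_URL = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*)://([^/?#]*)(.*)$", re.DOTALL)


def lowercase_email_domain(user_id: str) -> str:
    """Lower-case only the domain part of a bracketed e-mail address."""
    return _EMAIL.sub(
        lambda m: "<%s@%s>" % (m.group(1), m.group(2).lower()), user_id
    )


def lowercase_url_scheme_and_host(url: str) -> str:
    """Lower-case the scheme and host of a URL, leaving the rest as-is."""
    m = _URL.match(url)
    if m is None:
        return url  # not in scheme://authority form; leave unchanged
    scheme, authority, rest = m.groups()
    # Keep any userinfo ("user:password@") exactly as written.
    userinfo, at, hostport = authority.rpartition("@")
    return scheme.lower() + "://" + userinfo + at + hostport.lower() + rest


if __name__ == "__main__":
    print(lowercase_email_domain("Mary User <Mary@EXAMPLE.NET>"))
    print(lowercase_url_scheme_and_host("HTTPS://FOO.Example.NET"))
```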
4.1.2. IP Addresses
Some User IDs (for example, those containing URLs) may include a host's IP address. IP addresses MUST be canonicalized before hashing.
4.1.2.1. IPv4 Addresses
Canonicalized IPv4 addresses MUST be represented as 4 dot-separated decimal numbers, without any leading zeros.
For example:
An IPv4 address with leading zeros such as "192.000.002.123" MUST be canonicalized to "192.0.2.123".
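A short sketch of this rule, assuming (as the example does) that each dot-separated field is a decimal number even when it carries leading zeros. The function name is hypothetical. Python's own `ipaddress` module is not used here because recent versions reject leading zeros outright.

```python
import re


def canonicalize_ipv4(address: str) -> str:
    """Strip leading zeros from each of the four decimal fields."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected 4 dot-separated fields: %r" % address)
    octets = []
    for part in parts:
        if not re.fullmatch(r"[0-9]+", part):
            raise ValueError("non-decimal field in %r" % address)
        value = int(part, 10)  # fields are read as decimal, never octal
        if value > 255:
            raise ValueError("field out of range in %r" % address)
        octets.append(str(value))
    return ".".join(octets)


if __name__ == "__main__":
    print(canonicalize_ipv4("192.000.002.123"))
```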
4.1.2.2. IPv6 Addresses
IPv6 addresses MUST be canonicalized to the shortest representation that does not contain an elision of a group of zeros. Point 1 of Section 2.2 of [RFC4291] provides an example of an address in its shortest representation without dropping a group of zeros.
For example:
The address with the full hexadecimal representation "2001:0db8:0000:0000:0000:0000:0000:0001" MUST be canonicalized as "2001:db8:0:0:0:0:0:1".
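One way to sketch this rule is to expand the address to its full eight groups and then drop the leading zeros within each group, never using "::". The function name is hypothetical, and emitting lower-case hexadecimal digits is an assumption taken from the example above rather than a stated requirement.

```python
import ipaddress


def canonicalize_ipv6(address: str) -> str:
    """Return all eight groups, each without leading zeros, no '::'."""
    # .exploded always yields eight 4-digit lower-case groups.
    exploded = ipaddress.IPv6Address(address).exploded
    return ":".join(format(int(group, 16), "x") for group in exploded.split(":"))


if __name__ == "__main__":
    print(canonicalize_ipv6("2001:0db8:0000:0000:0000:0000:0000:0001"))
    print(canonicalize_ipv6("2001:db8::1"))
```

Both calls print the same string, so an address written with "::" hashes to the same value as its fully written-out form.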
4.1.3. Human Names
Human names are, for obvious reasons, hard to canonicalize. Therefore, this document makes no specific suggestions for a "standard" way to canonicalize human names.
4.1.4. Other Case-insensitivity
FIXME: discuss other canonicalization (IDN?)
4.2. Avoiding Loops
Client tools that handle hashed User IDs should be able to recognize that a User ID is already hashed. If the client tool recognizes that a given User ID matches the hashed User ID format specified in this document, it should not re-hash the User ID when creating, looking up, signing, or otherwise handling such User IDs.
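A possible sketch of such a check. The pattern is an assumption inferred from the single example in this document (the literal prefix "hash", an alphanumeric algorithm name, and a lower-case hexadecimal digest, separated by '#'); the function names are hypothetical.

```python
import hashlib
import re

# Assumed shape of a hashed User ID: hash#<ALGORITHM>#<hex digest>
_HASHED_UID = re.compile(r"hash#[A-Za-z0-9]+#[0-9a-f]+")


def is_hashed_user_id(user_id: str) -> bool:
    """True if the whole string already looks like a hashed User ID."""
    return _HASHED_UID.fullmatch(user_id) is not None


def ensure_hashed(user_id: str) -> str:
    """Hash a cleartext User ID, but pass an already-hashed one through."""
    if is_hashed_user_id(user_id):
        return user_id  # do not hash a hash
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return "hash#SHA256#" + digest


if __name__ == "__main__":
    once = ensure_hashed("Mary User <mary@example.net>")
    twice = ensure_hashed(once)
    print(once == twice)
```

With this guard, applying the operation repeatedly yields the same string as applying it once, which is the property this section asks for.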
4.3. Unusual Hash Algorithms
FIXME: how many queries are worth doing?
4.4. User Interaction
FIXME: suggest ways to cleanly interact with users -- display unhashed User IDs?
4.5. Local Storage
FIXME: should compliant implementations store local copies of the unhashed User IDs for future convenience?
4.6. Interaction with Trust Signature Regular Expressions
Section 5.2.3.14 of [RFC4880] describes the use of regular expressions in a trust signature. When interpreting a hashed User ID where the cleartext of the User ID is known, trust signatures should be considered to be applied to the cleartext User ID, not to the hashed User ID.
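A rough sketch of that behaviour follows. Several things here are assumptions made for illustration: the function names are hypothetical, the hashed format is taken from the example in this document, Python's `re` module stands in for the regular expression dialect of [RFC4880], and a hashed User ID whose cleartext is unknown is simply treated as not matching, a case this document does not address.

```python
import hashlib
import re


def hash_user_id(cleartext: str) -> str:
    """Hashed form assumed from the example: hash#SHA256#<hex digest>."""
    digest = hashlib.sha256(cleartext.encode("utf-8")).hexdigest()
    return "hash#SHA256#" + digest


def trust_regex_applies(regex: str, hashed_uid: str, known_cleartexts) -> bool:
    """Apply a trust-signature regex to the cleartext behind a hashed UID.

    known_cleartexts is any iterable of cleartext User IDs the client
    already knows (for example, ones the user has searched for).
    """
    for cleartext in known_cleartexts:
        # Only use a cleartext that really corresponds to this hashed UID.
        if hash_user_id(cleartext) == hashed_uid:
            return re.search(regex, cleartext) is not None
    return False  # cleartext unknown: treated as no match in this sketch


if __name__ == "__main__":
    uid = hash_user_id("Mary User <mary@example.net>")
    pattern = r"<[^>]+[@.]example\.net>$"
    print(trust_regex_applies(pattern, uid, ["Mary User <mary@example.net>"]))
```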
5. Rationales for decisions
Why a User ID instead of a new User Attribute Type?
Why a text string instead of numeric representation of hash algo?
Why the fixed-string prefix?
Why hex instead of Base64?
6. Acknowledgements
Thanks for significant discussion: FIXME.
7. IANA Considerations
This memo includes no request to IANA.
8. Security Considerations
FIXME: If there is no fall-back to cleartext UIDs, there is potential for a denial-of-service attack against users who do *not* publish hashed UIDs. Attackers can publish hashed versions of the original user's UID, which would prevent the original user's key from ever being found. The original user could get around this by publishing a hashed UID alongside the non-hashed UID.
FIXME: Does hashing User IDs protect against a keyserver operator snooping traffic?
FIXME: Are there better, or additional defenses that one can take against an attacker who is snooping on the wire between the keyserver and the OpenPGP client?
FIXME: Discuss inevitable relative hash algorithm strength obsolescence as cryptographic research advances
FIXME: discuss signing weakly-hashed User IDs with stronger hashes
FIXME: discuss local storage of non-hashed User IDs.
9. References
9.1. Normative References
| [FIPS180] | National Institute of Standards and Technology, “Secure Hash Signature Standard (SHS) (FIPS PUB 180-2),” 2002. |
| [RFC2119] | Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997. |
| [RFC4880] | Callas, J., Donnerhacke, L., Finney, H., Shaw, D., and R. Thayer, “OpenPGP Message Format,” RFC 4880, November 2007. |
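9.2. Informative References
| [RFC3629] | Yergeau, F., “UTF-8, a transformation format of ISO 10646,” RFC 3629, November 2003. |
| [RFC3986] | Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” RFC 3986, January 2005. |
| [RFC4033] | Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, “DNS Security Introduction and Requirements,” RFC 4033, March 2005. |
| [RFC4035] | Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, “Protocol Modifications for the DNS Security Extensions,” RFC 4035, March 2005. |
| [RFC4291] | Hinden, R. and S. Deering, “IP Version 6 Addressing Architecture,” RFC 4291, February 2006. |
| [RFC4343] | Eastlake, D., “Domain Name System (DNS) Case Insensitivity Clarification,” RFC 4343, January 2006. |
| [RFC4398] | Josefsson, S., “Storing Certificates in the Domain Name System (DNS),” RFC 4398, March 2006. |
| [RFC5081] | Mavrogiannopoulos, N., “Using OpenPGP Keys for Transport Layer Security (TLS) Authentication,” RFC 5081, November 2007. |
| [RFC5155] | Laurie, B., Sisson, G., Arends, R., and D. Blacka, “DNS Security (DNSSEC) Hashed Authenticated Denial of Existence,” RFC 5155, March 2008. |
| [RFC5322] | Resnick, P., Ed., “Internet Message Format,” RFC 5322, October 2008. |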
Authors' Addresses

Daniel Kahn Gillmor
Independent
XXXXX XXXXX St.
Brooklyn, NY XXXXX
USA
Phone: +1 718 XXX XXXX
Email: dkg@fifthhorseman.net

Jameson Graef Rollins
Independent
XXXXX XXXXX St.
Brooklyn, NY XXXXX
USA
Phone: +1 718 XXX XXXX
Email: jrollins@finestructure.net

Micah Anderson
Riseup Networks
PO Box 4282
Seattle, WA 98194
USA
Phone: +1 206 279 5902
Email: micah@riseup.net

Matthew James Goins
Openflows Community Technology Lab
XXXXX XXXXX St.
Brooklyn, NY XXXXX
USA
Phone: +1 612 XXX XXXX
Email: mjgoins@openflows.com

Intellectual Property and Copyright Statements
Copyright © The IETF Trust (2009).
This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an “AS IS” basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.