Why unique identifiers?

Unique identifiers serve as a primary key or “social security number” for identifying a given research object, and making them parsable by search engines is paramount. For search engines, unique identifiers are a simple means of disambiguating entities with similar names. To function in this way, identifiers need to be unique, that is, the same ID should not point to two different entities, and they need to be persistent, that is, they need to outlive the entity itself. They also need to be at least minimally machine-processable. While many authors supplied identifying information, such as the catalog number supplied by the vendor for an antibody or the official strain nomenclature supplied by the IMSR for a mouse, neither of these served the required functions. A catalog number is not a unique identifier, but rather a useful way for vendors to identify their products. If the same antibody is sold by different vendors, it will have different catalog numbers. If the same antibody is sold in different aliquots, it may have different catalog numbers. When the antibody is no longer available, the catalog number may disappear or, in some cases, be recycled for use with another antibody. All of these features are undesirable in an identifier system. The Antibody Registry, in contrast, was specifically designed to supply useful and stable identifiers for antibodies and not to serve as a commercial source of antibodies. Similarly, the strain nomenclature developed by the Jackson Laboratory, with its superscripts and special characters, is useful for human curators to identify a particular strain, but causes hiccups in most search engines because of those characters. We believe that a well-curated registry is essential to the success of such a system, because uniqueness and persistence are both necessary and currently cannot be guaranteed by a simple uncurated registration service.
For example, in the registries we maintain, for both software and antibodies, we found that authors sometimes register an entity that is found by a curator to be a duplicate.
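
The properties discussed above (uniqueness, persistence, and curator handling of duplicates) can be illustrated with a minimal sketch. This is not the implementation of the Antibody Registry or of any existing service; the class name, method names, and the "EX" identifier prefix are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy curated registry. All names here are hypothetical."""
    prefix: str
    next_number: int = 1
    records: dict = field(default_factory=dict)  # id -> description
    retired: set = field(default_factory=set)    # ids whose entity is gone
    merged: dict = field(default_factory=dict)   # duplicate id -> canonical id

    def register(self, description: str) -> str:
        # Uniqueness: the counter only moves forward, so an ID is never
        # issued twice and never points to two different entities.
        ident = f"{self.prefix}_{self.next_number:06d}"
        self.next_number += 1
        self.records[ident] = description
        return ident

    def retire(self, ident: str) -> None:
        # Persistence: the record is flagged, not deleted, so the ID
        # outlives the entity and still resolves.
        self.retired.add(ident)

    def merge_duplicate(self, duplicate: str, canonical: str) -> None:
        # Curation: a curator marks one ID as a duplicate of another.
        canonical = self.resolve(canonical)[0]
        if canonical == duplicate:
            raise ValueError("an identifier cannot be merged into itself")
        self.merged[duplicate] = canonical

    def resolve(self, ident: str) -> tuple:
        # Follow curator merges until the canonical record is reached.
        while ident in self.merged:
            ident = self.merged[ident]
        return ident, self.records[ident]


registry = Registry("EX")
first = registry.register("anti-X antibody, vendor A")
second = registry.register("anti-X antibody, vendor B")  # later found to be a duplicate
registry.merge_duplicate(second, first)
registry.retire(first)
print(registry.resolve(second))
```

In this sketch, both identifiers keep resolving to the same record after the merge, and a retired identifier is never handed out again, in contrast to a catalog number that may be recycled.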