Thinking Deeply about IDs, Names, and Renaming

Published on August 20, 2024

A guide to handling name and ID changes

The Problem

As we have discussed before, when designing systems, we should carefully consider aspects that are difficult to change later. Names are one such consideration.

Names play a crucial role in our architecture, and there are two related concepts to keep in mind:

Names: These are intended for human use. When users interact with the system and create entities (such as a service, a reservation, or a document), they want to name them to make them easier to find. However, users often need to rename entities because initial names may be incorrect or not thoughtfully chosen.
IDs: Systems also need to keep track of entities, which is where IDs come in. Unlike names, IDs are not meant to be human-readable; they need to be unique identifiers that the system uses across the entire architecture.

IDs must be unique. Fixing duplicate or inconsistent IDs (e.g., IDs that are not associated with anything or associated with more than one thing) later is difficult or impossible without manual intervention. We must prevent such issues from occurring, as they can affect system correctness and lead to data loss or corruption during migrations or backups. Ensuring the uniqueness and consistency of IDs is a key design goal of any architecture.

Many designs use names as IDs, which introduces several complexities:

1. Personally Identifiable Information (PII): Users may include PII (subject to regulations such as GDPR) in the names they choose. If this happens without our knowledge, we cannot manage that information correctly.
2. Name Changes: If names serve as IDs and can change, this creates complexity. For instance, when an ID changes, the system must search for and update all instances of that ID.
3. Log and Audit Management: When IDs change, logs and audit entries may still reference the old ID, which makes it harder to map these records to the new ID.
4. Namespaces and Collisions: A collection of names is called a namespace. Some namespaces are shared among multiple users, such as URL paths in REST design. Shared namespaces can lead to name collisions when two users use the same name.
5. API Dependencies: Names or IDs are often used in APIs. If these names change, any application relying on those APIs can break. Similarly, if names are part of the hierarchy used to implement permissions, changes can disrupt dependent applications.

If we do not consider these cases, the challenges mentioned above will manifest as user feedback and bug reports. Addressing these issues in an ad hoc manner will degrade the architecture. Furthermore, the current architecture may limit the potential fixes, forcing us to cut corners and create more edge cases. This will result in additional bug reports, perpetuating a vicious and hard-to-escape cycle. Even when aware of this cycle, escaping it can be difficult due to the complexity of redesign, data migration, and the need to maintain a good user experience.

It is far better to avoid such situations through proactive design decisions.

Based on my experience so far, the following approaches provide a names/ID model that can address the five challenges listed above. If you have better solutions or suggestions for improvements, please add them as comments, and I will update the draft accordingly.

The Proposed Solution without Shared Namespaces

The main idea is to use unique, system-generated, immutable IDs as identifiers for system entities. All references within the system and API calls related to CRUD operations should use only these IDs. Names can be provided as needed and mapped to the corresponding IDs. Since users will primarily remember names, we must use names when interacting with users, while the internal system should always use IDs. However, search API calls must use names.
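A minimal sketch of this model, assuming an in-memory store. `Entity`, `EntityStore`, and the choice of random UUIDs as IDs are illustrative assumptions, not something the model prescribes.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Entity:
    id: str    # system-generated; never changes after creation
    name: str  # user-chosen; may change at any time


class EntityStore:
    """Hypothetical in-memory store keyed by immutable IDs."""

    def __init__(self) -> None:
        self._by_id: dict[str, Entity] = {}

    def create(self, name: str) -> Entity:
        # The ID comes from the system, never from the user-supplied name.
        entity = Entity(id=uuid.uuid4().hex, name=name)
        self._by_id[entity.id] = entity
        return entity

    def get(self, entity_id: str) -> Entity:
        return self._by_id[entity_id]

    def rename(self, entity_id: str, new_name: str) -> Entity:
        # Only the name field changes; anything holding the ID stays valid.
        entity = self._by_id[entity_id]
        entity.name = new_name
        return entity

    def search(self, text: str) -> list[Entity]:
        # Search is the one operation that takes (part of) a name as input.
        needle = text.lower()
        return [e for e in self._by_id.values() if needle in e.name.lower()]
```

Because nothing is keyed by name, a rename touches a single field, and two entities may even share a name without affecting correctness.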

To apply this model effectively, it is crucial to design the API correctly. Most systems have APIs, and one or more applications interact with these APIs to deliver the end-user experience. A common mistake is to accept names in some places in the API and IDs in others. The API should accept only IDs as inputs, except for search functions, where users search by name. UIs often need to display the name, which would otherwise require another API call to retrieve the name for a given ID. To avoid this, every API result should include both the name and the ID. The application can then display the name but use the ID for any downstream API calls. This slightly increases the payload size, but because network IO is buffered, the extra bytes add minimal overhead, if any.
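As a rough illustration of a response that carries both fields, here is a sketch of a read endpoint. The tables, the IDs, and the `get_reservation` function are hypothetical; a real service would sit behind a web framework and a database.

```python
import json

# Hypothetical backend records, keyed by ID.
RESERVATIONS = {"r-7f3a": {"name": "Team offsite", "owner_id": "u-19c2"}}
USERS = {"u-19c2": {"name": "Priya"}}


def ref(table: dict[str, dict[str, str]], entity_id: str) -> dict[str, str]:
    """Build an {id, name} pair so the client never needs a second lookup."""
    return {"id": entity_id, "name": table[entity_id]["name"]}


def get_reservation(reservation_id: str) -> str:
    # Input is an ID only; output includes names for display purposes.
    record = RESERVATIONS[reservation_id]
    body = {
        **ref(RESERVATIONS, reservation_id),
        "owner": ref(USERS, record["owner_id"]),
    }
    return json.dumps(body)
```

The UI can show "Team offsite" and "Priya" directly, and pass `r-7f3a` or `u-19c2` to any follow-up call.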

Google Docs provides an excellent example of this model: each document has an ID, which appears in its URL, while its title can be changed freely.

For logs and audits, we need to use IDs in the logs but display the names when presenting results to the user. If users view logs directly through log files, they will see the IDs, but this should only occur during complex debugging sessions, which should be rare.
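One way to picture this is to store only IDs in each audit entry and resolve names at display time. The `AuditEntry` shape and the `render` helper below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEntry:
    actor_id: str
    action: str
    entity_id: str


def render(entry: AuditEntry, names: dict[str, str]) -> str:
    """Format an entry for users, looking up current names by ID."""
    # Fall back to the raw ID when no name is known (e.g. the entity was deleted).
    actor = names.get(entry.actor_id, entry.actor_id)
    target = names.get(entry.entity_id, entry.entity_id)
    return f"{actor} {entry.action} {target}"
```

Since the stored entry never contains a name, a rename needs no log rewrite; the next time the entry is rendered it simply shows the current name.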

This approach can address problems 1, 2, and 3. It is worth noting that for some applications, problems 1–3 may not be concerns (e.g., names would never change), and in such cases, this section can be disregarded.

The Proposed Solution with Shared Namespaces

Many namespaces, such as APIs, URL paths, and shared file structures, are shared by definition. When a namespace is shared and users refer to entities by name (e.g., URL paths), names chosen by different users can clash. In such cases, the namespace has meaning, and users must use names. Unlike the previous case, users almost always refer to entities by their names, not IDs.

Sometimes, we avoid this complexity by prefixing names to make them unique, making this scenario similar to the earlier case without shared namespaces. However, this solution is not always acceptable.

To handle this scenario, we can do two things:

Support Names for API Operations: We need to support names for both CRUD operations and search within APIs. In contrast, when using IDs without shared namespaces, we only needed to support names for search API calls.
Enforce Uniqueness: We must enforce uniqueness when entities are created by notifying the user about the clash and requiring them to resolve duplications. The same checks must be performed with symbolic links/URL rewrites that map names to entities.

Even in this case, to support renaming/editing, we should use both the underlying ID and the name. We must not use the name as the ID, as the same complexities we discussed earlier will arise again.
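The two points above, together with keeping a separate ID, can be sketched as follows. `SharedNamespace` and `NameTakenError` are hypothetical names, and the store is in-memory for brevity.

```python
import uuid


class NameTakenError(Exception):
    """Raised so the caller can tell the user to pick a different name."""


class SharedNamespace:
    """Hypothetical shared namespace: unique names on top of immutable IDs."""

    def __init__(self) -> None:
        self._id_by_name: dict[str, str] = {}
        self._name_by_id: dict[str, str] = {}

    def create(self, name: str) -> str:
        if name in self._id_by_name:
            raise NameTakenError(name)
        entity_id = uuid.uuid4().hex
        self._id_by_name[name] = entity_id
        self._name_by_id[entity_id] = name
        return entity_id

    def resolve(self, name: str) -> str:
        # Name-based CRUD calls resolve to the ID first, then proceed by ID.
        return self._id_by_name[name]

    def rename(self, entity_id: str, new_name: str) -> None:
        owner = self._id_by_name.get(new_name)
        if owner == entity_id:
            return  # renaming to the current name is a no-op
        if owner is not None:
            raise NameTakenError(new_name)
        old_name = self._name_by_id[entity_id]
        del self._id_by_name[old_name]
        self._id_by_name[new_name] = entity_id
        self._name_by_id[entity_id] = new_name
```

The same uniqueness check runs on both create and rename, and the entity's ID is untouched by either.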

This approach will address problem 4.

The Proposed Solution with API Clients

When there are API clients for your application, the situation becomes more complex.

If a user changes the names of entities, any API call that depends on those changed names can fail. There are several scenarios to consider:

No Shared Namespaces: Only search API calls can fail in this case. However, since search is human-driven, users can often understand and recover from name changes.
Controlled Clients: If the same organization controls all clients making API calls, we can design the clients so that interactions, other than search, happen with IDs.
External Clients: In other cases, clients can break if the entities are renamed.

If our use case falls under the first or second scenario, there is nothing additional to do. In the third scenario, warning the user about potential issues before a rename may be sufficient in some instances.

If none of the above are applicable, we must keep old names intact while also enabling new names. For example, if an entity had the name /a/foo and we renamed it to /a/bar, we need to map both names to the same entity, allowing both old and new clients to continue working. APIs and services can handle this with versions, where the older version responds to the older path, and the new version responds to the new path. Older versions can be deprecated after some time, after which support for older paths can be dropped.
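The /a/foo to /a/bar example can be sketched as a name table in which a rename adds a name instead of replacing one. `AliasedNamespace` and its method names are assumptions for illustration, and the deprecation policy is left to the caller.

```python
class AliasedNamespace:
    """Hypothetical name table in which several names may point at one entity ID."""

    def __init__(self) -> None:
        self._id_by_name: dict[str, str] = {}

    def bind(self, name: str, entity_id: str) -> None:
        # A name may be bound only once, unless it already points at this entity.
        owner = self._id_by_name.setdefault(name, entity_id)
        if owner != entity_id:
            raise ValueError(f"{name!r} is already bound to another entity")

    def rename(self, old_name: str, new_name: str) -> None:
        # Add the new name but keep the old one so existing clients keep working.
        self.bind(new_name, self._id_by_name[old_name])

    def retire(self, name: str) -> None:
        # Called once the deprecation window for an old name has passed.
        del self._id_by_name[name]

    def resolve(self, name: str) -> str:
        return self._id_by_name[name]
```

After `rename("/a/foo", "/a/bar")`, both paths resolve to the same ID until `retire("/a/foo")` is called.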

However, versioning does not apply cleanly to shared file structures or their equivalents. In such cases, it is best to let both the old and new names resolve to the same entity.

If the underlying entity is deleted, we need to notify users about all the associated names and warn them about the possibility that dependent applications may break.
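A small sketch of that deletion step, assuming names are kept in a plain name-to-ID mapping; the `delete_entity` helper is hypothetical.

```python
def delete_entity(id_by_name: dict[str, str], entity_id: str) -> list[str]:
    """Unbind every name that points at entity_id and return those names.

    The returned list is what the UI would show in its warning, since any
    client still using one of these names will stop working.
    """
    affected = sorted(name for name, eid in id_by_name.items() if eid == entity_id)
    for name in affected:
        del id_by_name[name]
    return affected
```

In practice the list would be shown to the user for confirmation before the deletion is carried out, not after.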

Finally, if your resource hierarchy is interwoven with authorization, for example, as in Zanzibar-based models, renaming entities can change access permissions. An example of such a model is given in the article from Aserto on the Google Zanzibar ReBAC authorization model. These changes can break dependent applications. Keeping old and new paths may or may not meet the needs of the use case. If you use such an authorization model, take extra care to ensure users choose the right paths initially, as changing them later is complicated.
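To see why a rename can change permissions, consider a deliberately simplified check based on path prefixes. This is not Zanzibar or any real authorization API; the `GRANTS` table and `can_read` function are assumptions made only to show the effect.

```python
# Hypothetical grants: user -> path prefixes they are allowed to read.
GRANTS = {"alice": ["/finance/"]}


def can_read(user: str, path: str, grants: dict[str, list[str]]) -> bool:
    # Access is derived from where the entity sits in the hierarchy.
    return any(path.startswith(prefix) for prefix in grants.get(user, []))
```

With these grants, `can_read("alice", "/finance/report", GRANTS)` is true, but if the entity is renamed to `/archive/report` the same call with the new path is false, even though the entity itself has not changed.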

Conclusion

Names are often deeply integrated into most architectures, making changes to their logic later difficult, expensive, or nearly impossible. Carefully considering names from the outset can help avoid many future challenges.

We discussed some of my findings that can assist in this process. Here is a summary of our discussion:

Unchanging Names: No special handling is needed if names do not change.
Changing Names Without Shared Namespaces: We can use IDs; names are only required in search API operations.
Shared Namespaces: We need to support names in both CRUD and search API operations and detect clashes.
API Stability: We need special handling, such as keeping old names alongside new ones, to prevent API clients from breaking.

I hope this was useful. If you know a better way to handle these problems, please comment, and I will update the draft accordingly.

