APEX: Our journey in scaling User IAM across multiple interconnected products and networks

Published on January 6, 2025

Hello there! In this article, I will share how our APEX API Gateway Platform has evolved from handling user accounts and subscriptions manually in silos to using a centralised system for user identity and access management (IAM). This evolution has made it possible for us to integrate our interconnected products and support multi-tenant user self-service. Throughout this article, I will provide detailed context to help you understand our thought process.

For those unfamiliar with APEX, it is a platform designed for developers, product teams, agencies, and businesses to easily host and collaborate via API publishing and sharing. Right now, we offer our users four data planes to manage their APIs.

To support our tenants’ API monitoring, we (APEX) have integrated our traffic logs with another SGTS product called StackOps. This setup enables tenants to log in and monitor the health and traffic of their live APIs.

Special shoutouts to these SGTS products in the article for making the evolution possible!

Problem

Since our initial launch, APEX has accrued significant user management debt. We have had to manage multiple siloed data planes, systems, and portals, requiring users to manually request access to each one.

Over time, user accounts have become stale, the intended state of user access has become fragmented, and key administrative users have become inactive. These issues obstruct the completion of many essential functional steps while perpetuating a suboptimal user experience. As a result, it took up to 10 business days for our users to fully onboard and operate within APEX to get any of their work done. Let’s see how we got here in the first place.

Our MVP gateway platform required users to register separately for each of our portals and systems. In the very beginning, our users needed to log into at most 2 portals to perform all their activities.

Figure P1 — The 2 portals we started with

With only 2 portals to handle, we chose to settle our user management process by requiring users to submit service requests to our team, figuring that this approach would be the path of least resistance for us (…or so we thought).

As part of the service request process, our users must obtain the necessary permissions to join an APEX organisation, typically from a higher-ranking officer in their department. The same approvals are needed for any user configuration changes that come afterwards. Permissions are managed individually for each system or portal, and without documented authorisation, access to APEX is denied.

The diagram below (Figure P2) illustrates the request-for-access process:

Figure P2 — Sequence diagram of our previous manual approval flow (APEX’s Service Request flow)

Over time, and in the spirit of being agile, our product offering grew! We gradually introduced more and more integrated portals to provide our users with a more comprehensive experience for navigating their API lifecycles.

The diagram below illustrates how our user activities have grown and how they are distributed across various product and network spaces. It also highlights how users may need to access multiple portals to complete their tasks and assignments:

Figure P3 — How our portals and systems operate together as they grow
Figure P4 — User access after being approved based on their role in today’s setup

Typical user stories involved

To gain a clearer understanding of how our users interact with our systems across the different portals and network zones, here are some common (and simplified) examples of our tenants’ activities:

  • As an API Provider
    I want to publish my APIs in both the Internet and Intranet zones
    So that I can serve both API consumer types regardless of where they are
  • As an API Provider
    I want to be able to monitor the health and usage rate of my APIs
    So that I can debug problematic API requests
  • As an API Provider
    I want to publish my APIs onto the Marketplace
    So that I can let non-government users create meaningful applications with my APIs
  • As an API Provider
    I want to promote my APIs from staging to production
    So that I can let my users consume the latest validated version of my API

Before going live and settling on the manual user management process above, we did not anticipate encountering many issues. At most, we expected the process to be somewhat inconvenient for users during the initial approval stages, with the assurance that it would eventually guide them to where they needed to be.

However, our first eight months of operations revealed a different reality. We faced a consistent stream of operational challenges related to our user management process.

Here are some of my favourites amongst the many we have received:

“My API certificates are expiring, so I need to onboard to your system to change them ASAP. My supervisor has left a while ago, how do I get access ??”
“You had a maintenance window? We did not receive any notification on that. Oh, right, we did not update our point-of-contact email address. That person had left too… What else did we miss??”
“Wait, there is a staging environment that we could have tested with? You mean we did not need to go through the full production setup just to test out our proof-of-concept?”
“Can you help? I’m taking over my team’s API, but there is this consumer organisation that has been receiving 4XX errors for the past 2 months. The organisation name is `migrated-batch-7–123456` if that helps! I will terminate it if there’s no response by 3 days.”

Patterns (functional and behavioural) identified

Going through these requests proved to be quite enlightening, as we discovered that the support inquiries about our user management process often follow common patterns:

  • Inactive (Zombie) User Accounts
    These accounts remain unmanaged and forgotten about, hindering the effective dissemination of critical information and activities.
  • Dependence on Supervisors
    Users heavily rely on their supervisors to understand required actions, resulting in repeated (and oftentimes delayed) communications to obtain necessary approvals.
  • Lack of Verified Information for User Identification
    The lack of proper verification makes it challenging to accurately identify users within producer-consumer relationships, potentially leading to the wrongful termination of subscribed applications.
  • Underestimation of User Management Time
    Misjudging the time needed to get access to a service before initiating any tasks leads to coordination lapses and missed deadlines.
  • Incomplete Onboarding of Accounts
    The complexity of onboarding and user management may cause users to overlook the full scope of services and systems necessary to perform their activities effectively, leaving them with half-onboarded accounts. They may then need to reinitiate the user management process if approval for the missed service is required.

Insights to our problem

These patterns nudged us to pay closer attention to how the entire process impacts our users. We wanted to create a good reference point to improve the entire user management experience, starting with onboarding. By reviewing our support tickets and having conversations with our newly onboarded tenants, we developed the following insights:

For new users who intend to onboard to APEX:

Time of first onboarding ticket submission → 
Time of first login to any portal system
≈ 3–4 business days

For new users who intend to use APEX for API-related activities:

Time of first onboarding ticket submission (inclusive) → 
Time of first activity in APEX API Portal (Application/API creation)
≈ 6–10 business days
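
The article does not describe how these durations were calculated. As one hedged illustration of how a "business days from ticket submission to first activity" figure could be derived, the sketch below counts weekdays between two dates. The function name `business_days_between` is hypothetical, and public holidays are deliberately ignored for simplicity.

```python
from datetime import date, timedelta


def business_days_between(start: date, end: date, inclusive: bool = False) -> int:
    """Count Monday-to-Friday days from `start` to `end`.

    By default the start day itself is not counted; pass inclusive=True
    to count it as well (matching an "(inclusive)" style of measurement).
    Public holidays are NOT handled here; that is a simplifying assumption.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")

    day = start if inclusive else start + timedelta(days=1)
    count = 0
    while day <= end:
        if day.weekday() < 5:  # 0-4 are Monday to Friday
            count += 1
        day += timedelta(days=1)
    return count


# Hypothetical example: ticket submitted on a Friday, first login the next Monday.
submitted = date(2025, 1, 3)
first_login = date(2025, 1, 6)
elapsed = business_days_between(submitted, first_login)
```

Measuring in business days rather than calendar days keeps weekends from inflating the onboarding time for tickets submitted late in the week.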

Solution

With a clearer understanding of our users’ user-management experiences and ample data to analyse, it was time to redesign and enhance the process. Within the APEX team, we took the chance to ideate and settle on key guiding principles to scope out our next steps to address the issues we had previously uncovered:

  • Centralised Management of User Account Information
    Staleness and maintenance of user account states should be handled outside of APEX, preferably through a centralised Identity Provider (IDP).
  • Shift-left Approach for User Management
    Users and their appropriate authorisers should have maximum control over their own user management activities as early as possible, eliminating the need for back-and-forth interactions with our Level 1 support team.
  • Enhancing User Awareness of Services
    Users should have comprehensive awareness of the services they can onboard upfront as part of their self-service capabilities.
  • Leveraging Existing Solutions
    Avoid reinventing the wheel by not creating our own dedicated user management system. Preferably, utilise solutions within our internal teams or seek ready-built, open-source alternatives.

Introducing TechPass and TechBiz: Our Collaborative Partners

Fortunately for us, we have 2 sister products within the Singapore Government Tech Stack (SGTS), TechPass and TechBiz, dedicated to solving the very challenges we were facing.

The combination of TechPass and TechBiz forms a “centralisation hub” of sorts when it comes to accessing SGTS products, APEX being one of the recent additions. Let’s have a closer look at both of these sibling systems.

TechBiz

TechBiz is a service management tool that helps government officers discover, subscribe to, and manage SGTS Products on a single console.
By integrating all SGTS Products into a single console and offering a consistent and streamlined interface for users, TechBiz overcomes the complexity of reconciling subscription plans, resource usage and invoicing data across multiple Products as more SGTS Products are rolled out across the government.

Let’s first explore TechBiz, where our users can log into the TechBiz portal to manage the entire lifecycle of their APEX subscription centrally by:

  1. Subscribing to APEX as an SGTS product
    Browsing a list of available products, our users can search for APEX and begin the subscription process.
  2. Getting Approval via an auto-trigger to the assigned approver
    The approval steps are automatically forwarded to the assigned approver of their department/team with all the necessary steps and documentation readily available for them.
  3. Resource/organisation creation for their subscribed product
    Subsequently, users create a Resource, which aligns with APEX’s concept of an organisation. TechBiz ensures that every resource created is tied back to a unique system captured in the Digital Governance Platform (DGP). Through DGP, users (and our product team as TechBiz tenants) can derive the agency the resource belongs to, the respective system owners, and the intended purpose of the created system.
  4. Assignment of users
    Finally, users can assign their team members the appropriate roles they need to pilot APEX. These actions are auditable and fully self-service.
Figure S1 — UI for users to create a resource for an APEX subscription, with the option to choose the available data plane
Figure S2 — TechBiz’s collection of entities that forms a subscription

TechPass

TechPass is an Identity & Access Management solution coupled with Single Sign-On, enabling users to access developer services securely and seamlessly.

On the other side of the partnership, we have TechPass, serving as the core identity provider to achieve our desired user identity federation:

  • Comes built in with user account staleness checks to ensure the registered identities are indeed actively using their accounts (and not zombies).
  • For every user assignment made in TechBiz, a corresponding group claim is added to the assigned user's TechPass identity. This claim uniquely identifies both their role and assigned resource, enabling APEX to interpret it across each portal and system.
    e.g. APEX:TB_R_01234567890_Org-Admin
  • If a user account is terminated or deactivated within TechPass, TechBiz receives that event and propagates the information to all associated resources containing that user. This effectively cleans up user data across all subscribed products.
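
To illustrate how a portal might interpret such a group claim, here is a minimal sketch. The claim layout `<product>:<resource id>_<role>` is an assumption inferred only from the single example above, and the names `CLAIM_PATTERN`, `GroupClaim`, `parse_group_claim`, and `roles_by_resource` are hypothetical rather than part of TechPass, TechBiz, or APEX.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional, Set

# Assumed layout, based solely on the example APEX:TB_R_01234567890_Org-Admin.
# The real claim grammar may differ.
CLAIM_PATTERN = re.compile(
    r"^(?P<product>[A-Za-z]+):(?P<resource>TB_R_\d+)_(?P<role>[A-Za-z-]+)$"
)


class GroupClaim(NamedTuple):
    product: str
    resource_id: str
    role: str


def parse_group_claim(claim: str) -> Optional[GroupClaim]:
    """Split one claim string into product, resource and role, or return None."""
    match = CLAIM_PATTERN.match(claim)
    if match is None:
        return None
    return GroupClaim(match["product"], match["resource"], match["role"])


def roles_by_resource(claims: Iterable[str], product: str = "APEX") -> Dict[str, Set[str]]:
    """Group a user's roles per resource, ignoring claims for other products."""
    roles: Dict[str, Set[str]] = {}
    for raw in claims:
        parsed = parse_group_claim(raw)
        if parsed is not None and parsed.product == product:
            roles.setdefault(parsed.resource_id, set()).add(parsed.role)
    return roles


parsed_example = parse_group_claim("APEX:TB_R_01234567890_Org-Admin")
```

Returning `None` for unrecognised claims, instead of raising, lets a portal skip claims that belong to other products or use a format it does not understand.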

The final piece to solving our user management problem

Leveraging the capabilities of the TechPass and TechBiz solutions, we capitalised on the events emitted by user interactions within these portals. By developing a webhook to funnel these changes into our different platforms and applications, we maintain synchronised and accurate user and organisational data, ensuring alignment with the users’ intended states.

Figure S3 — Webhook to receive IAM data changes

Thanks to the expertise of both the TechPass and TechBiz teams, these webhook integrations are designed with resiliency in mind. For any missed events, TechPass maintains an event history that allows us to diagnose, debug, and retry them at a later time.

Figure S4 — Retry-able events from the TechPass UI
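
Because events can be retried, a receiving webhook benefits from being idempotent, so that a redelivered event does not change state twice. The sketch below shows one way this could look. The event fields (`id`, `type`, `user`, `resource`, `role`), the event type names, and the `IamSync` class are all hypothetical and do not reflect the actual TechPass or TechBiz payload schema. An in-memory dictionary stands in for APEX's real user stores.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class IamSync:
    # (user, resource) -> roles; a stand-in for the real downstream systems
    assignments: Dict[Tuple[str, str], Set[str]] = field(default_factory=dict)
    # IDs of events that were already applied successfully
    seen_event_ids: Set[str] = field(default_factory=set)

    def handle(self, event: dict) -> bool:
        """Apply one event; return False if it is a duplicate delivery."""
        event_id = event["id"]
        if event_id in self.seen_event_ids:
            return False  # already applied, so a retry is a no-op

        kind = event["type"]
        if kind == "role.assigned":
            key = (event["user"], event["resource"])
            self.assignments.setdefault(key, set()).add(event["role"])
        elif kind == "role.removed":
            key = (event["user"], event["resource"])
            roles = self.assignments.get(key, set())
            roles.discard(event["role"])
            if not roles:
                self.assignments.pop(key, None)
        elif kind == "user.deactivated":
            # Remove the user from every resource they were assigned to.
            for key in [k for k in self.assignments if k[0] == event["user"]]:
                del self.assignments[key]
        else:
            raise ValueError(f"unknown event type: {kind}")

        # Record the ID only after success, so a failed event can be retried.
        self.seen_event_ids.add(event_id)
        return True


sync = IamSync()
assign = {"id": "evt-1", "type": "role.assigned",
          "user": "alice", "resource": "TB_R_01234567890", "role": "Org-Admin"}
first_delivery = sync.handle(assign)
retried_delivery = sync.handle(assign)
```

Recording the event ID only after the change succeeds means an event that fails midway can still be retried, while a duplicate of an applied event is safely ignored.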

Today, our users are empowered to maintain their access and roles across all our subsystems independently (Figure S5). Because user and organisational data are standardised at the TechPass and TechBiz level, we are better prepared for future integrations with minimal disruption to the users’ intended management configurations. Stay tuned for those upcoming enhancements!

Figure S5 — Stackable user roles assigned via a Product-Resource-Role matrix in TechBiz

Insights to our integration

For new users who intend to use APEX when onboarding via TechBiz:

Time of first subscription request → 
Time to first Portal login
≈ Same day

For new users who intend to use APEX for API-related activities after onboarding via TechBiz:

Time of first subscription request (inclusive) → 
Time of first activity in APEX API Portal (Application/API creation)
< 2 business days

One noteworthy observation following the implementation of the new onboarding experience is that the time until the first activity within our APEX portals has been significantly reduced. However, this should not be taken as definitive evidence of improvement, as we recognise that correlation does not imply causation.

We were curious about the impact of the new onboarding experience, so we had a chat with some of our users (both new and old), and this is what we learnt:

Significant gaps and ambiguity in the onboarding process can subconsciously cause users to deprioritise their work on your platform. In contrast, when processes are straightforward and easily accessible, they gently encourage users to complete their tasks more promptly by building positive momentum with your platform or product.
TL;DR: strive to get your users to their aha moment quicker!

Key takeaways and conclusion

You don’t have to solve all the problems by yourself
It is likely that you and your team are subject matter experts in a particular domain, such as our focus on API gateway products. There are others who specialise in areas where you may need assistance. It is important to learn how to communicate these needs upstream (early), and create shared goals that everyone collaborating can benefit from.

Put the users first when designing experiences for them
Although it may seem obvious, we can sometimes become distracted by short-term conveniences. Ideally, you should validate your designs and systems early with your users to obtain necessary feedback. This gives you the best shot at achieving the best possible outcomes without expending too much effort.

Be data-driven when building products/features
By establishing measurable data points, we gain the ability to determine when improvements or iterations should be prioritised. It is beneficial to share these data points with the larger team, regardless of whether the results are positive or negative. This approach provides other teams within your organisation with valuable learning opportunities when they are ready to champion their own changes.

Thank you for reading this article! As we continue to strive to improve our products, we do hope you look forward to more articles like this one!


APEX: Our journey in scaling User IAM across multiple interconnected products and networks was originally published in Government Digital Products, Singapore on Medium.