wip: Multiple principals for a Subject #1317

Open
wants to merge 4 commits into
base: main

Conversation

dadrus
Owner

@dadrus dadrus commented Apr 4, 2024

Related issue(s)

closes #921

Checklist

  • I agree to follow this project's Code of Conduct.
  • I have read, and I am following this repository's Contributing Guidelines.
  • I have read the Security Policy.
  • I have referenced an issue describing the bug/feature request.
  • I have added tests that prove the correctness of my implementation.
  • I have updated the documentation.

Background

In heimdall, the term Subject denotes the source of a request and is created upon successful authentication. A Subject may therefore be any entity, such as a person, a service, or something else. Until now, a Subject was represented by the following JSON schema:

{
    "type": "object",
    "additionalProperties": false,
    "required": [ "ID" ],
    "properties": {
        "ID": {
            "description": "The unique identifier of the subject",
            "type": "string"
        },
        "Attributes": {
            "description": "Optional attributes describing the data used during the authentication of the subject",
            "type": "object",
            "uniqueItems": true
        }
    }
}

with ID being the unique identifier of the subject and Attributes being a dictionary of attributes related to the authenticated subject. These attributes could be, for example, the claims of a JWT used to authenticate the subject.

This abstraction was sufficient for a long time, but it has its drawbacks. In real life, a user who wants to access an API may, for example, use a laptop equipped with a client certificate, from which the actual request is sent to that API. The request may also originate from an IoT device, such as a heating system, which an end customer is using. There may even be an environment to which a user has to authenticate first. In all these cases, we are dealing with different, complementary authentication aspects that relate to the same request but represent different entities (like a user and a device).

Reasoning: why not go for new authorizer types instead

It would simply lead to code bloat and potentially to a lot of duplication, because each authenticator would need a corresponding authorizer doing the same thing.

Description

NOTE: This PR is in a very early stage. The text below describes the current ideas which might change during the implementation.

For the reasons given above, this PR introduces the following changes:

  • It refactors the Subject object to support multiple Principals (the different entities mentioned above). That way, the Subject becomes an object holding the different authenticated principals, each having at least an ID and a Data attribute.
  • Unlike the old Subject object, the Principal objects are immutable, which means they cannot store data collected during the execution of the authentication & authorization pipeline.
  • Since such data still has to be stored somewhere and cannot go into the Data property of the Principal objects, a new object named Outputs has been introduced.

This way, a Principal is very similar to the old Subject and can be represented by the following JSON schema:

{
    "type": "object",
    "additionalProperties": false,
    "required": [ "ID" ],
    "properties": {
        "ID": {
            "description": "The unique identifier of the subject",
            "type": "string"
        },
        "Data": {
            "description": "Optional data used during the authentication of the subject",
            "type": "object",
            "uniqueItems": true
        }
    }
}

The Data property can hold the JWT used for authentication, the response from the introspection endpoint, an X.509 certificate (in the future), and more.

The new Subject is then just an object in the sense of a JSON object:

{
   "<principal name 1>": {
      "ID": "some identifier",
      "Data": {}
   },
   "<principal name 2>": {
      "ID": "some identifier",
      "Data": {}
   },
   ...
}

Here, each <principal name ...> key names one of the Principal entries that together represent the Subject. Even though the Subject does not have an ID property any more, it has a new Primary() function, which returns the primary principal. That is the principal created by the authentication step marked with primary: true if multiple authenticators are specified in the pipeline, or, if there is only one authenticator, the principal that authenticator creates.
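To make the shape of the new Subject more tangible, here is a minimal Go sketch of a map of principals with a Primary() lookup. It is an illustration only, not the code from this PR: the unexported primary flag and the way it is set are assumptions made for the example.

```go
package main

import "fmt"

// Principal is a hypothetical, read-only representation of one authenticated entity.
type Principal struct {
	ID      string
	Data    map[string]any
	primary bool // assumption: set for the step marked with primary: true
}

// Subject maps principal names to the principals created by the authenticators.
type Subject map[string]*Principal

// Primary returns the only principal if there is just one, otherwise the
// principal flagged as primary. It returns nil if no such principal exists.
func (s Subject) Primary() *Principal {
	if len(s) == 1 {
		for _, p := range s {
			return p
		}
	}

	for _, p := range s {
		if p.primary {
			return p
		}
	}

	return nil
}

func main() {
	subject := Subject{
		"env":  {ID: "staging-gateway"},
		"user": {ID: "alice", primary: true},
	}

	fmt.Println(subject.Primary().ID)
}
```

Keeping the flag unexported means it shows up neither in the JSON representation nor in templates, so the Principal still matches the schema above.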

The new Outputs object can be represented using the following JSON schema:

{
    "description": "Holds results from particular mechanisms executed in the pipeline",
    "type": "object",
    "uniqueItems": true
}

Changes in Detail

Mechanisms

  • Since authenticators now create principals and populate a Subject with them, the subject property of all authenticators has been renamed to principal.
  • The semantics of the authentication stage have changed. Previously, if multiple authenticators were specified, a subsequent one was only executed if the previous one either failed and allowed a fallback, or was not responsible for the authentication data available in the request. Now, all authenticators are executed.
  • The anonymous authenticator has been updated and no longer allows any configuration. That is, the ID of the principal it creates is always "anonymous". It has become a "special" authenticator and cannot be combined with other authenticators any more. See the next section for details and reasoning.

Authentication & Authorization Pipeline

It is important to note that there are two types of fallbacks for authenticators:

  1. The authentication data the particular authenticator is supposed to work on is not present. E.g. the authenticator expects a bearer token in JWT format, but there is either no token present, or it is not in JWT format.
  2. The validation of the present authentication data has failed. E.g. the required JWT is present, but the authenticator failed to verify its signature.

Before this PR, the first type was addressed by simply executing the next authenticator specified in the pipeline, and the second type by doing the same if the allow_fallback_on_error property of the failed authenticator was set to true.

In terms of the authentication stage configuration, the requirement "verify a JWT if it is present and consider the request to be anonymous otherwise" (which basically means that the authentication step is optional) could be implemented as

- authenticator: jwt_authenticator
- authenticator: anonymous

and the requirement "verify the JWT using one JWKS endpoint if it comes from an idp1 and using another JWKS, if it comes from an idp2" could be implemented as

- authenticator: jwt_idp1
  config:
    allow_fallback_on_error: true
- authenticator: jwt_idp2

Since, as written above, all authenticators specified in the pipeline are always executed, the following authentication stage

- authenticator: jwt_authenticator
- authenticator: anonymous

would no longer describe a fallback, and the question arises how to deal with fallbacks now. This is addressed by introducing a new property, optional, of type boolean at the authentication stage step level. So, the above example can now be rewritten to

- authenticator: jwt_authenticator
  optional: true

The usage of the anonymous authenticator becomes implicit. That is also the reason why the updated authenticator of type anonymous does not allow any configuration any more. The ID of the principal it creates always has the value anonymous.

Since the allow_fallback_on_error property addresses the behavior of the pipeline and not the actual behavior of the authenticator, this property has been dropped from the authenticators that previously defined it. A new property, continue_on_error, of type boolean has been introduced at the step level. So, the requirement "verify the JWT using one JWKS endpoint if it comes from an idp1 and using another JWKS, if it comes from an idp2" from above would now be implemented as

- authenticator: jwt_idp1
  continue_on_error: true
- authenticator: jwt_idp2
  primary: true

With that in place, one can easily combine and chain any number of authenticators and specify which of them are allowed to fail and which are optional, in order to implement different requirements. That has, however, an implication: combining the anonymous authenticator with other authenticator types is not allowed. It can only be used standalone.

Please note that the three new properties, optional, continue_on_error, and primary introduced by this PR for authentication steps represent different requirements:

  • optional: true means that the execution of the authenticator is optional. If the authentication data the authenticator is supposed to work on is not present, the step is skipped and the pipeline continues. If, however, the authentication data is present and the authenticator fails to validate it, the execution of the pipeline is terminated with the corresponding error.
  • continue_on_error: true goes beyond this and allows the pipeline to proceed even if the authenticator fails to validate the available authentication data. However, there must be at least one further authenticator in the given stage; otherwise the execution of the pipeline is terminated with an error as well.
  • primary: true means that this particular authenticator creates the primary principal, that is, the principal identifying the primary entity, whose ID is also used for the sub claim by the JWT finalizer. If multiple authenticators are specified in the authentication stage, one of them must have the primary property set to true. Otherwise, it is an error and heimdall will refuse to load the corresponding rule or rule set.
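The interplay of optional and continue_on_error can be sketched in Go as follows. This is a literal reading of the bullet points above, not the implementation in this PR: the Step type, the two sentinel errors, and the implicit anonymous principal for an otherwise empty result are assumptions made for the example. In particular, the sketch treats continue_on_error as also covering missing authentication data, and it does not try to solve the "either one or the other" case discussed under Open Topics.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for the two fallback types.
var (
	ErrNoAuthData = errors.New("no authentication data present")
	ErrInvalid    = errors.New("authentication data invalid")
)

// Principal is reduced to its ID for this sketch.
type Principal struct {
	ID string
}

// Step is a hypothetical authentication stage step.
type Step struct {
	ID              string
	Optional        bool
	ContinueOnError bool
	Authenticate    func() (*Principal, error)
}

// RunAuthenticationStage executes all steps and collects the created principals.
func RunAuthenticationStage(steps []Step) (map[string]*Principal, error) {
	subject := make(map[string]*Principal)

	for i, step := range steps {
		principal, err := step.Authenticate()

		switch {
		case err == nil:
			subject[step.ID] = principal
		case errors.Is(err, ErrNoAuthData) && step.Optional:
			// optional: missing authentication data is ignored
			continue
		case step.ContinueOnError && i < len(steps)-1:
			// continue_on_error: tolerated only if a further authenticator follows
			continue
		default:
			return nil, fmt.Errorf("step %q: %w", step.ID, err)
		}
	}

	if len(subject) == 0 {
		// assumption: the anonymous authenticator is applied implicitly
		subject["anonymous"] = &Principal{ID: "anonymous"}
	}

	return subject, nil
}

func main() {
	noToken := func() (*Principal, error) { return nil, ErrNoAuthData }

	subject, err := RunAuthenticationStage([]Step{
		{ID: "jwt_authenticator", Optional: true, Authenticate: noToken},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	for name, principal := range subject {
		fmt.Println(name, principal.ID)
	}
}
```

With a single optional JWT step and no token in the request, the sketch ends up with just the anonymous principal, which mirrors the rewritten "verify a JWT if it is present" example.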

There is one additional notable change: Each step in any stage has received an id, which is logged when the step is executed and has the following meaning:

  • For authentication steps in the authentication stage, this id defines the name of the principal, created by the particular authenticator (the <principal name ...> in the definition of the Subject above). If not set, the principal name is set to the id of the authenticator used in the step.
  • For authorizers and contextualizers it defines the key under which the results of the particular step are written to the Outputs object and which then can be used in templates and expressions to access the corresponding data. If not set, the id of the particular mechanism used in the step is used instead.
  • For finalizers, it does not have any special functionality beyond being logged.

All ids in the particular rule pipeline must be unique.
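The rule-load checks described above (unique ids, default id taken from the mechanism, and a primary authenticator when several are configured) could look roughly like the following Go sketch. StepConfig and ValidateAuthenticationStage are hypothetical names, the sketch only covers authentication steps rather than the whole rule pipeline, and it assumes that exactly one step may be marked as primary.

```go
package main

import "fmt"

// StepConfig is a hypothetical representation of one authentication step in a rule.
type StepConfig struct {
	Authenticator string // id of the referenced authenticator mechanism
	ID            string // optional step id, used as the principal name
	Primary       bool
}

// EffectiveID returns the step id, falling back to the authenticator id.
func (s StepConfig) EffectiveID() string {
	if s.ID != "" {
		return s.ID
	}

	return s.Authenticator
}

// ValidateAuthenticationStage rejects duplicate ids and, if multiple
// authenticators are configured, requires exactly one primary step.
func ValidateAuthenticationStage(steps []StepConfig) error {
	seen := make(map[string]struct{}, len(steps))
	primaries := 0

	for _, step := range steps {
		id := step.EffectiveID()
		if _, duplicate := seen[id]; duplicate {
			return fmt.Errorf("step id %q is not unique", id)
		}

		seen[id] = struct{}{}

		if step.Primary {
			primaries++
		}
	}

	if len(steps) > 1 && primaries != 1 {
		return fmt.Errorf("expected exactly one primary authenticator, got %d", primaries)
	}

	return nil
}

func main() {
	err := ValidateAuthenticationStage([]StepConfig{
		{Authenticator: "jwt_idp1"},
		{Authenticator: "jwt_idp2", Primary: true},
	})
	fmt.Println("valid:", err == nil)
}
```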

Templates & Expressions

There is still a Subject object available as before but, as written above, with a different structure. And, as also written above, there is a new Outputs object holding the results from the executed mechanisms. That affects how data can be accessed. Here are two examples highlighting the differences.

Old

some_property: |
  {
    # accessing the ID of the Subject
    "user": {{ .Subject.ID | quote }},

    # accessing the response from a contextualizer with the id "foo_contextualizer"
    "some_contextualizer_response": {{ .Subject.Attributes.foo_contextualizer | quote }},

    # accessing iss claim from the JWT used to authenticate the subject
    "jwt_claim": {{ .Subject.Attributes.iss | quote }}
  }

New

some_property: |
  {
    # accessing the ID of the Principal created by the authenticator creating the primary principal
    "user": {{ .Subject.Primary.ID | quote }},

    # the following was not possible before
    # accessing the ID of the Principal created by the authenticator with principal name set to "device"
    "device": {{ .Subject.device.ID | quote }},

    # accessing the response from a contextualizer step with the id "foo"
    "some_contextualizer_response": {{ .Outputs.foo | quote }},

    # accessing iss claim from the JWT used to authenticate the primary principal
    "jwt_claim": {{ .Subject.Primary.Data.iss | quote }}
  }

As can be seen, the main differences are:

  • Instead of using Subject.ID, you now have to use Subject.<some principal name>.ID, or Subject.Primary.ID.
  • Instead of using Subject.Attributes.iss to access the iss property from the authentication data used for the actual authentication, you now have to use Subject.<some principal name>.Data.iss, or Subject.Primary.Data.iss.
  • Data collected during the execution of the previously run mechanisms is not available under Subject.Attributes anymore. Instead, it is available on the Outputs object. So instead of using Subject.Attributes.something, you now have to use Outputs.something.
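The new access paths can be tried out with Go's text/template package, as in the following sketch. It assumes that the Subject is a map type with a Primary method, so that .Subject.Primary resolves to the method call and .Subject.device to a map lookup. The quote function here is a simple stand-in built on strconv.Quote, and all type names and values are made up for the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"text/template"
)

// Principal and Subject are hypothetical stand-ins for the types in this PR.
type Principal struct {
	ID      string
	Data    map[string]any
	primary bool
}

type Subject map[string]*Principal

// Primary returns the principal flagged as primary, or nil if there is none.
func (s Subject) Primary() *Principal {
	for _, p := range s {
		if p.primary {
			return p
		}
	}

	return nil
}

// Render executes tpl with the given Subject and Outputs as template data.
func Render(tpl string, subject Subject, outputs map[string]any) (string, error) {
	funcs := template.FuncMap{
		// stand-in for the quote function used in the examples above
		"quote": func(v any) string { return strconv.Quote(fmt.Sprint(v)) },
	}

	parsed, err := template.New("example").Funcs(funcs).Parse(tpl)
	if err != nil {
		return "", err
	}

	var sb strings.Builder

	err = parsed.Execute(&sb, map[string]any{"Subject": subject, "Outputs": outputs})

	return sb.String(), err
}

func main() {
	subject := Subject{
		"user":   {ID: "alice", Data: map[string]any{"iss": "https://idp.example.com"}, primary: true},
		"device": {ID: "laptop-1"},
	}
	outputs := map[string]any{"foo": "bar"}

	tpl := `{"user": {{ .Subject.Primary.ID | quote }}, "device": {{ .Subject.device.ID | quote }}, ` +
		`"ctx": {{ .Outputs.foo | quote }}, "iss": {{ .Subject.Primary.Data.iss | quote }}}`

	result, err := Render(tpl, subject, outputs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println(result)
}
```

Because text/template checks for a method before it falls back to a map key, a principal that is itself named "Primary" would be shadowed by the method in this sketch.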

Examples

With that in place, one can now chain multiple authentication mechanisms. Here are examples for the implementation of the requirements described in #921.

  • Access to a staging environment. Only project members should be able to access the services (e.g. via a browser) to see and test the newly deployed features. There is also an IAM in the staging environment itself, which manages the "customers". So, the first IAM manages the access to the environment. The X-Env-JWT header certifies that the request has been routed through an authorized gateway (so access to the environment was legitimate). The second IAM represents the actual users of the deployed services. Here, the Authorization header represents the user and describes their permissions through the scope claim.

    - authenticator: jwt_env_authenticator
      id: env
    - authenticator: jwt_user_authenticator
      id: user
      primary: true
      config:
        assertions:
          scopes:
            - whatever

    with the jwt_env_authenticator being configured to extract the token from the X-Env-JWT header and the jwt_user_authenticator being configured to extract the token from the Authorization header.

    This example shows that the two mechanisms referenced in the above steps are pretty much the same. The only difference is the configuration of the jwt_source, which extracts the token from different headers. This duplication is a tradeoff between simplicity in the rules and duplication in the config. Reconfiguring the jwt_source in a pipeline step was, however, never possible before. Opening it up to the pipeline steps is possible, but would introduce a source of errors.

  • Verification that the request came over a specific intermediary. Depending on the path the request took, the gateway issues an additional token, e.g. X-Caller-ID, which is then present in addition to the token in the Authorization header.

    - authenticator: jwt_authenticator
      id: request_source
    - authenticator: oauth2_auth
      id: user
      primary: true

Open Topics

  • The current implementation makes it possible to have a kind of xor combination of authenticators: either one, or another, or yet another authenticator must succeed. In the end, there is just one subject, created by one of them. This allows the implementation of use cases like "An organization has multiple country organizations, whose employees should be able to use their country-specific IdP for authentication and to access a particular endpoint using the corresponding authentication data, e.g. tokens issued by these IdPs". It is unclear how that should be made possible with the approach described above. There is a need to define a kind of authenticator group that creates one principal, regardless of the number of authenticators in that group. In the naming used by this PR, all authenticators in that group would be optional and all would be primary, which is kind of weird.
    Maybe it would be better to organize everything in stages and make them explicit? That way, the old behavior would be preserved in terms of the semantics and also regarding when the anonymous authenticator can be used. It would then be the different authentication stages that each create a principal. That would mean fewer breaking changes, but
    • how can we define a primary principal then?
    • how to ensure, there is a unique name for the principal?
    • how to address defaults?
    • ...

Current PR Status

In a very early stage. The changes implemented so far are the update of the Subject to make it a map of Principal objects and getting the code to compile.

codecov bot commented Apr 5, 2024

Codecov Report

Attention: Patch coverage is 97.72727%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 89.30%. Comparing base (32c53c8) to head (7522727).

Files                                                   Patch %   Lines
internal/accesscontext/access_context.go                75.00%    1 Missing ⚠️
...l/handler/middleware/grpc/accesslog/interceptor.go   85.71%    1 Missing ⚠️
internal/rules/default_execution_condition.go           0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1317      +/-   ##
==========================================
+ Coverage   89.25%   89.30%   +0.04%     
==========================================
  Files         270      271       +1     
  Lines        8870     8880      +10     
==========================================
+ Hits         7917     7930      +13     
+ Misses        704      703       -1     
+ Partials      249      247       -2     

Successfully merging this pull request may close these issues.

Authorizer for access token verification
1 participant