Authentication & Authorisation Mechanisms
For Data Scientists
Authentication
Each time a Data Scientist wishes to perform a task, Bitfount authenticates them to prove their identity to the Pod(s) they are trying to access. User authentication can be carried out using one of three authentication methods:
- OpenID Connect (OIDC) Device Authorisation Flow (Default Authentication Method)
- OpenID Connect (OIDC) Authorisation Code Flow
- Security Assertion Markup Language (SAML)
All Pods support all authentication methods. To authenticate with a method other than the default Device Authorisation Flow, a Data Scientist can pass the alternative method to the identity_verification_method parameter.
Methods
OIDC Device Authorisation Flow
OpenID Connect (OIDC) is an identity layer built on top of the OAuth 2.0 framework. It allows third-party applications to verify the identity of the end user and to obtain basic user profile information. OIDC defines several flows, which are described in the OpenID Connect specification. The device authorisation flow is Bitfount's default authentication method because it does not require the Data Scientist to set up a temporary web server in order to authenticate. Instead, the Data Scientist receives a code from the Pod, which they must match against the code displayed to them in their browser. This flow is required if you are running code remotely. The trade-off is that you must confirm your device in the browser each time you connect to a Pod and execute a task against it.
identity_verification_method = "oidc-device-code"
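To make the confirm-in-browser step more concrete, the polling half of a device authorisation flow can be sketched as follows. This follows the generic OAuth 2.0 device grant (RFC 8628) rather than Bitfount's own client code. The function poll_for_token, the fetch_token callable and all the example values are hypothetical, and the canned responses stand in for a real identity provider.

```python
import time
from typing import Callable, Dict

# Grant type string defined by RFC 8628 for the token request.
DEVICE_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(
    fetch_token: Callable[[Dict[str, str]], Dict[str, str]],
    device_code: str,
    client_id: str,
    interval: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a token endpoint until the user approves the device in the browser.

    `fetch_token` is a hypothetical callable that posts the form fields to the
    identity provider's token endpoint and returns the decoded JSON response.
    """
    payload = {
        "grant_type": DEVICE_GRANT_TYPE,
        "device_code": device_code,
        "client_id": client_id,
    }
    for _ in range(max_attempts):
        response = fetch_token(payload)
        if "access_token" in response:
            return response["access_token"]
        error = response.get("error")
        if error == "authorization_pending":
            pass  # The user has not confirmed yet; keep polling.
        elif error == "slow_down":
            interval += 5  # RFC 8628: increase the polling interval by 5 seconds.
        else:
            raise RuntimeError(f"Device flow failed: {error}")
        sleep(interval)
    raise TimeoutError("User did not confirm the device in time")


# Simulated identity provider: pending, then slow_down, then approved.
_responses = iter([
    {"error": "authorization_pending"},
    {"error": "slow_down"},
    {"access_token": "example-token"},
])
waits = []
token = poll_for_token(
    lambda payload: next(_responses),
    device_code="example-device-code",
    client_id="example-client",
    sleep=waits.append,  # Record the waits instead of actually sleeping.
)
```

Because the client only polls, nothing has to listen for incoming connections on the Data Scientist's machine, which is why this flow suits remote execution.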
OIDC Authorisation Code Flow
The authorisation code flow is an alternative to the device authorisation flow that does not require click-through verification each time a Data Scientist wishes to act on a Pod. Instead, this method connects to a specified local web server to verify authentication. See below for details on how to switch to this authentication method.
identity_verification_method = "oidc-auth-code"
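The role of the local web server in an authorisation code flow can be illustrated with a minimal sketch. It assumes the generic OIDC pattern in which the identity provider redirects the browser back to a local URL carrying code and state query parameters. CallbackHandler, start_callback_server, the /callback path and the example values are all hypothetical, not Bitfount's implementation.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class CallbackHandler(BaseHTTPRequestHandler):
    """Handles the single browser redirect that carries the authorisation code."""

    def do_GET(self) -> None:
        query = parse_qs(urlparse(self.path).query)
        # Store the values on the server object so the caller can read them.
        self.server.auth_code = query.get("code", [None])[0]
        self.server.state = query.get("state", [None])[0]
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Authentication complete. You can close this tab.")

    def log_message(self, format, *args) -> None:
        pass  # Silence per-request logging.


def start_callback_server() -> tuple:
    """Start a temporary server on a free local port that serves one request."""
    server = HTTPServer(("127.0.0.1", 0), CallbackHandler)  # Port 0: OS picks a port.
    server.auth_code = None
    server.state = None
    thread = threading.Thread(target=server.handle_request)  # One request, then stop.
    thread.start()
    return server, thread


server, thread = start_callback_server()
port = server.server_address[1]

# Stand-in for the browser being redirected back by the identity provider.
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/callback?code=example-code&state=example-state")
conn.getresponse().read()
conn.close()

thread.join()
server.server_close()
# A real client would compare the state with the value it sent, then exchange
# the code for tokens at the identity provider's token endpoint.
received_code, received_state = server.auth_code, server.state
```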
SAML
SAML is an older authentication protocol and can be thought of as a predecessor to OIDC. SAML differs from OIDC in a number of ways: for instance, it is not based on OAuth 2.0, and it uses XML documents rather than JSON Web Tokens to structure data. As far as user experience is concerned, it is the same as the OIDC Authorisation Code Flow and does not require click-through verification for each authentication event. Both require the Data Scientist to set up a temporary web server as a challenge handler, which responds to challenges from the Pods in order to prove the Data Scientist's identity. In most cases, this authentication protocol is not recommended over the more recent OIDC.
identity_verification_method = "saml"
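The difference in data format between SAML and OIDC can be seen by reading the same user identity out of each. The sketch below is illustrative only: the assertion and token are made-up, unsigned examples, the helper names are hypothetical, and neither function verifies a signature, which real code must do before trusting the contents.

```python
import base64
import json
import xml.etree.ElementTree as ET

# Made-up identity data, used only to contrast the two formats.
SAML_ASSERTION = """
<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:Subject>
    <saml:NameID>my-username</saml:NameID>
  </saml:Subject>
</saml:Assertion>
"""


def subject_from_saml(assertion_xml: str) -> str:
    """Read the subject from a SAML assertion, which is an XML document."""
    ns = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}
    root = ET.fromstring(assertion_xml.strip())
    return root.find("saml:Subject/saml:NameID", ns).text


def subject_from_jwt(token: str) -> str:
    """Read the subject from a JWT, whose payload is base64url-encoded JSON."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # Restore stripped padding.
    return json.loads(base64.urlsafe_b64decode(payload_b64))["sub"]


def _b64url(data: dict) -> str:
    """Encode a dict as unpadded base64url JSON, as used in JWT segments."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Unsigned example token: header.payload.(empty signature)
ID_TOKEN = ".".join([_b64url({"alg": "none"}), _b64url({"sub": "my-username"}), ""])

saml_subject = subject_from_saml(SAML_ASSERTION)
jwt_subject = subject_from_jwt(ID_TOKEN)
```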
How to Authenticate Manually
Python API
To change authentication methods, pass your identity verification method of choice as the identity_verification_method parameter to the appropriate function or method. The string names for these methods can be found in the IdentityVerificationMethod class. In the example below, identity_verification_method is passed to the model's fit method, but it is accepted by any function or method that is used to submit task requests to Pods.
model.fit(
    pod_identifiers=["pod-identifier-1", "pod-identifier-2"],
    identity_verification_method="oidc-auth-code",
)
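Since the method is passed as a plain string, a typo would otherwise only surface when the task is submitted. One possible safeguard is to validate the string first. The sketch below is a standalone stand-in: VerificationMethod and task_kwargs are hypothetical names that mirror only the three strings listed in this document, and they are not Bitfount's IdentityVerificationMethod class.

```python
from enum import Enum
from typing import Any, Dict, Iterable


class VerificationMethod(Enum):
    """Hypothetical mirror of the three method strings listed in this document."""

    OIDC_DEVICE_CODE = "oidc-device-code"
    OIDC_AUTH_CODE = "oidc-auth-code"
    SAML = "saml"


def task_kwargs(
    pod_identifiers: Iterable[str],
    method: str = "oidc-device-code",
) -> Dict[str, Any]:
    """Build keyword arguments for a task request, rejecting unknown methods."""
    validated = VerificationMethod(method)  # Raises ValueError for unknown strings.
    return {
        "pod_identifiers": list(pod_identifiers),
        "identity_verification_method": validated.value,
    }


kwargs = task_kwargs(["pod-identifier-1", "pod-identifier-2"], "oidc-auth-code")
# With a Bitfount model in scope, the result could then be unpacked:
# model.fit(**kwargs)
```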
YAML Config
The Data Scientist can also specify their identity verification method of choice in the YAML config if using the Bitfount CLI:
modeller:
  username: my-username
  identity_verification_method: saml
pods:
  identifiers:
    - username/my-pod
task:
  protocol:
    name: FederatedAveraging
    arguments:
      steps_between_parameter_updates: 100
  algorithm:
    name: FederatedModelTraining
  aggregator:
    secure: False
  model:
    name: PyTorchTabularClassifier
    hyperparameters:
      steps: 100
      batch_size: 32
      optimizer:
        name: RAdam
        params:
          lr: 0.0001