
Introduction to Bitfount

Welcome to Bitfount! Our mission is to make the world’s intractable data interactable. Our platform can help you tackle your most challenging problems at the intersection of data science, data collaboration, and data privacy.

Collected data often goes vastly underutilised. Collaboration is held back by the privacy, security, or commercial sensitivity of datasets, and by the technical difficulty of creating and maintaining multiple up-to-date copies of a dataset.

Bitfount remedies this by providing a flexible, easy-to-use platform for privacy-preserving data collaboration. It enables data teams, analysts, and researchers to gather statistical insights as well as train and evaluate machine learning models on data which they don't have access to in raw form, whether internal to their organisation or external. It also enables data custodians to share the benefits of their data without giving up control or privacy.

How it works

Bitfount is built on a concept we call “Distributed Data Science”, which enables a data scientist or analyst to send encrypted code to data where it already lives and retrieve the results of analysis queries or ML model training without the need to access raw data. This means data owners no longer need to transfer data or take on unnecessary risk to collaborate with internal or external data science talent.
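The idea of sending code to the data can be illustrated with a minimal Python sketch. This is not Bitfount's API: `LocalDataSite`, `run`, and `mean_age` are hypothetical names invented for illustration, and the encryption of the task in transit is left out. It only shows that the task executes where the rows live and that the caller receives an aggregate result rather than the rows themselves.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

Row = Dict[str, float]
Task = Callable[[List[Row]], Dict[str, float]]


@dataclass
class LocalDataSite:
    """Hypothetical stand-in for the environment where the data already lives."""

    _rows: List[Row]

    def run(self, task: Task) -> Dict[str, float]:
        # The task executes next to the data; only its return value leaves the site.
        return task(self._rows)


def mean_age(rows: List[Row]) -> Dict[str, float]:
    # An analysis task that returns aggregates only, never individual rows.
    return {"count": float(len(rows)), "mean_age": mean(r["age"] for r in rows)}


# The data owner holds the rows; the analyst only ever sees `result`.
site = LocalDataSite([{"age": 34.0}, {"age": 51.0}, {"age": 47.0}])
result = site.run(mean_age)
print(result)
```

In this sketch nothing stops a task from returning raw rows; controlling which tasks may run is the job of the access control layer described next.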

You can think of Bitfount as a switchboard between people who want to run an analysis and the data on which that analysis runs. The switchboard not only connects analysts to data at its source, it also contains a usage-based access control layer which ensures that data scientists can run only the analyses they have been authorised to execute.

There are three key user types in the Bitfount platform (see diagram below):

  • Data Scientist: Someone who wants to perform an analysis, or to train or evaluate a model, against a given dataset. This person is typically a data scientist, BI analyst, machine learning engineer, or researcher.
  • Data Custodian: Someone who makes data and computational resources available via Bitfount.
  • Authoriser: A person who decides whether a Data Scientist can interact with datasets that have been made available by a Data Custodian. This person also determines exactly which analyses the Data Scientist is allowed to run.

[Figure: Bitfount architecture diagram]

On Bitfount, the application which runs locally, connects to the dataset, and handles the execution of analysis tasks is called the Processor of Data (Pod). When an analysis is requested, the Pod communicates with the Access Manager to ensure that the requesting user is authenticated, and only allows analyses to run if the user has been authorised.
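The check described above can be sketched as follows. This is an illustrative model only, not Bitfount's implementation: the `AccessManager` and `Pod` classes, their methods, and the task-type strings are all assumptions made for the example, and authentication is reduced to a simple set lookup.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set, Tuple

Row = Dict[str, float]


@dataclass
class AccessManager:
    """Hypothetical: records which (user, task type) pairs an Authoriser approved."""

    known_users: Set[str] = field(default_factory=set)
    grants: Set[Tuple[str, str]] = field(default_factory=set)

    def is_authenticated(self, user: str) -> bool:
        return user in self.known_users

    def is_authorised(self, user: str, task_type: str) -> bool:
        return (user, task_type) in self.grants


@dataclass
class Pod:
    """Hypothetical: runs locally next to the dataset and gates every request."""

    access_manager: AccessManager
    rows: List[Row]

    def handle(self, user: str, task_type: str, task: Callable[[List[Row]], Any]) -> Any:
        # Check identity first, then whether this kind of analysis was approved.
        if not self.access_manager.is_authenticated(user):
            raise PermissionError(f"{user} is not authenticated")
        if not self.access_manager.is_authorised(user, task_type):
            raise PermissionError(f"{user} is not authorised for {task_type}")
        return task(self.rows)


manager = AccessManager(known_users={"alice"}, grants={("alice", "summary_stats")})
pod = Pod(manager, rows=[{"age": 34.0}, {"age": 51.0}])

row_count = pod.handle("alice", "summary_stats", len)  # approved, so the task runs

try:
    pod.handle("alice", "model_training", len)  # never approved by an Authoriser
    denied = False
except PermissionError:
    denied = True
```

Keying the grants on both the user and the task type mirrors the usage-based control described earlier: approval covers specific analyses, not the dataset as a whole.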

Uses of the Bitfount platform

The Bitfount platform can be used in multiple ways:

  • Consortium: A group of organisations mutually collaborating on each partner’s data. For example, research hospitals that wish to safely contribute imaging data to a broader data pool informing better diagnostic models.
  • Internal use: Secure data collaboration across jurisdictions or departmental silos. For example, retailers may wish to enforce privacy-preserving techniques against loyalty card data when their internal data scientists query the data.
  • 3rd-party data evaluation: Evaluating 3rd-party data while maintaining privacy and control. For example, a buyer may wish to compute summary statistics on a 3rd-party dataset prior to purchasing it for model training.
  • Validation of AI-based models: Evaluating models on distributed datasets without centralising data. For example, a regulator may wish to audit the effectiveness of an online trust and safety model without physically transferring harmful content.
  • Modeller consulting: Outsource ML model development without granting access to raw data. For example, a financial services institution may wish to improve its automated fraud detection capabilities without needing to hire an in-house ML team.
  • SDK: Improve your products by training on your customers’ on-premises data. For example, a clinical research organisation may wish to embed Bitfount in its clinical trial recruitment software as part of its patient recruitment suite of products.

Community Support

For general community support, please refer to Bitfount’s official guides, tutorials, and API reference docs. For additional support, we provide the following resources: