
Simian Wrapper: your shortcut to computational reproducibility.

Introduction

Simian Wrapper makes reproducible research and operation “too easy not to do”. By securing a collaborative, reusable and transparent workflow, it provides building blocks for data integration and governance, the cornerstones of computational reproducibility.

Use cases for Simian Wrapper are found not only in environments where auditability and case-study tracking are paramount, driven by regulatory demands or the need to cover liabilities, but also in research environments where methodologies for innovative analytics are being explored and developed, which are iterative processes by definition.

In line with The Turing Way handbook to reproducible, ethical, and collaborative data science (*), reproducible work is work that can be independently recreated from the same data and the same code originally used in an analytics case study. By design, Simian Wrapper facilitates the four dimensions of reproducible operation described by The Turing Way:

Reproducible: when the same analysis steps performed on the same dataset consistently produce the same answer.
Replicable: when the same analysis, performed on different datasets, produces qualitatively similar answers.
Robust: when the same dataset is subjected to different analyses or workflows that perform the same computational operation, for example across model versions or coding languages, and a qualitatively similar or (near-)identical answer is produced.
Generalisable: combining replicable and robust findings helps to derive generalisable results. Generalisation is a first but important step towards achieving qualitatively similar results, relatively independent of datasets and model versions.

(*) https://the-turing-way.netlify.app, The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807.


Benefits

Harnessing your models in Simian Wrapper offers benefits beyond the primary goal of computational reproducibility, some of which may not be evident at first glance:

Track case study history

Track software versions and ingested data. Additionally, capture the history and narrative of your case studies.

Collaborate & review

Create shared understanding through inherent coding consistency, facilitating meaningful collaboration.

Avoid misinformation

Operate from a single source of truth by design, and inherently prevent data turmoil and ambiguity.

Efficient reporting

Gain guaranteed, seamless access to all generated results and underlying data, which supports ever more elaborate reporting.

Audit & reproduction

Be truly auditable through consistent storage of metadata, data, and results, and be prepared for ever stricter scrutiny.

Ensure continuity

Communicate your work to different stakeholders and enjoy easy handovers that let others continue seamlessly.

Solution

The Simian Wrapper components collectively address all four dimensions of reproducibility, ultimately providing a pathway to generalized results.


Creation

  • Project definition
  • Lib of backends
  • Lib of frontends

Data integration

  • Data dictionary
  • Data ingestion
  • Preprocessing

Governance

  • DataStore
  • Job files
  • Computational reproducibility

Analysis & publication

  • Results
  • Reporting
  • Data lineage


How it all works

Build analytics apps
At creation, populate libraries for backends and frontends:

  • Host multiple versions of a specific model.
  • Host various methodological variants of a specific model.
  • Host multiple models within a specific domain.
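
The idea of a backend library that hosts several versions of several models can be pictured as a registry keyed by model name and version. The sketch below is purely illustrative and is not Simian Wrapper's API: the names `register` and `versions`, and the toy "growth" model, are assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry type: (model name, version) -> callable backend.
Registry = Dict[Tuple[str, str], Callable[[float], float]]


def register(registry: Registry, name: str, version: str,
             backend: Callable[[float], float]) -> None:
    """Add a backend; refuse to overwrite an already registered version."""
    key = (name, version)
    if key in registry:
        raise ValueError(f"{name} {version} is already registered")
    registry[key] = backend


def versions(registry: Registry, name: str) -> List[str]:
    """List the registered versions of one model (plain string sort)."""
    return sorted(version for (model, version) in registry if model == name)


if __name__ == "__main__":
    library: Registry = {}
    register(library, "growth", "1.0", lambda x: x * 2)
    register(library, "growth", "2.0", lambda x: x * 3)
    print(versions(library, "growth"))
```

Refusing to overwrite an existing key keeps an earlier version callable after a newer one is added, which is what allows an old case study to be rerun against the code it originally used.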
Onboard your data
At ingestion, define input data in a structured manner:

  • Define dictionaries to source data from multiple origins.
  • Fetch, preprocess and save the captured data in an immutable store.
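
As a rough illustration of the fetch, preprocess and save steps above, the sketch below uses a content-addressed folder as a stand-in for an immutable store. It is a minimal example under stated assumptions, not Simian Wrapper's implementation: the dictionary layout (logical name mapped to a local file path), the `preprocess` rule and the function names are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict


def preprocess(raw: str) -> str:
    """Hypothetical preprocessing step: trim whitespace, drop blank lines."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def save_immutable(store: Path, name: str, content: str) -> Path:
    """Save content under a file name derived from its SHA-256 digest.

    Identical content maps to the same file, and existing files are
    never rewritten.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    target = store / f"{name}-{digest[:16]}.txt"
    if not target.exists():
        target.write_text(content, encoding="utf-8")
    return target


def ingest(store: Path, dictionary: Dict[str, str]) -> Dict[str, Path]:
    """Fetch each dictionary entry, preprocess it and save it in the store."""
    saved = {}
    for name, origin in dictionary.items():
        raw = Path(origin).read_text(encoding="utf-8")  # "fetch" from a file
        saved[name] = save_immutable(store, name, preprocess(raw))
    return saved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "prices.csv"
        source.write_text(" a,1 \n\n b,2\n", encoding="utf-8")
        store = Path(tmp) / "store"
        store.mkdir()
        print(ingest(store, {"prices": str(source)}))
```

Because the file name depends on the content, ingesting unchanged data twice yields the same stored file, while changed data gets a new file and leaves the old one intact.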
Store work in jobs
At operation, intrinsically secure your governance:

  • Code confidently with a consistent coding interface.
  • Capture metadata, input data and computational results in job files.
  • Enjoy reproducibility, with jobs and managed code repos.
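
To make the notion of a job file more concrete, the sketch below bundles metadata, input data and results into one JSON-serialisable record and then reruns a model on the stored inputs. The record layout and the names `build_job` and `reproduces` are assumptions for illustration only; the actual Simian Wrapper job file format is not described here.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone
from typing import Callable


def build_job(model_version: str, inputs: dict, results: dict) -> dict:
    """Bundle metadata, inputs and results into one JSON-serialisable record."""
    payload = {"inputs": inputs, "results": results}
    # A digest over the canonical payload makes later tampering detectable.
    canonical = json.dumps(payload, sort_keys=True)
    return {
        "metadata": {
            "model_version": model_version,
            "python_version": platform.python_version(),
            "created_at": datetime.now(timezone.utc).isoformat(),
            "payload_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        },
        "inputs": inputs,
        "results": results,
    }


def reproduces(job: dict, model: Callable[..., dict]) -> bool:
    """Rerun the model on the stored inputs and compare with stored results."""
    return model(**job["inputs"]) == job["results"]


if __name__ == "__main__":
    def add(a: int, b: int) -> dict:
        return {"sum": a + b}

    job = build_job("1.0.0", {"a": 2, "b": 3}, add(a=2, b=3))
    restored = json.loads(json.dumps(job))  # round trip through JSON text
    print(reproduces(restored, add))
```

The exact equality check suits this toy model; for floating-point results a tolerance-based comparison would usually be the more realistic choice.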
Report your results
At evaluation, efficient review and reporting are ensured:

  • Obtain data usage insights with data lineage analysis.
  • Effortlessly share job files, ensuring smooth teamwork all around.
  • Report efficiently, with the help of consistent interfaces.
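
A simple form of data lineage analysis is to ask which jobs consumed which dataset. The sketch below assumes, hypothetically, that each job record carries an `id` and a list of `datasets`; these field names are made up for the example and do not reflect Simian Wrapper's job file layout.

```python
from collections import defaultdict
from typing import Dict, List


def lineage(jobs: List[dict]) -> Dict[str, List[str]]:
    """Map each dataset identifier to the sorted ids of jobs that used it."""
    usage = defaultdict(set)
    for job in jobs:
        for dataset in job["datasets"]:
            usage[dataset].add(job["id"])
    return {dataset: sorted(ids) for dataset, ids in sorted(usage.items())}


if __name__ == "__main__":
    jobs = [
        {"id": "job-1", "datasets": ["prices-v1", "rates-v1"]},
        {"id": "job-2", "datasets": ["prices-v1"]},
    ]
    for dataset, job_ids in lineage(jobs).items():
        print(dataset, job_ids)
```

Such a mapping shows at a glance which results need to be revisited when a dataset turns out to be flawed.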