SciFlow
Early access

Launch research GPU workspaces without fighting the cluster

SciFlow gives labs and AI teams a self-service control plane for template-backed GPU instances, org quotas, image commits, and usage visibility on top of Kubernetes.

Built on: Kubernetes
Control plane: Rust services
Identity: OIDC / Authentik

Launch instance: PyTorch + Jupyter (starts now)

GPU types: H100 · A100 · L40S
Template: pytorch:2.4 + jupyter
Org charged: Vision Lab
GPU shape: 1 × H100 (1/1)
Auto-stop: 12h
The problem

Research compute should not depend on Slack messages and manual kubectl

Most teams build the same control plane twice — once in scripts, once in tickets. SciFlow replaces both.

GPU access is hard to share fairly

Without a real admission layer, the loudest user wins and queues become Slack threads.

Users want environments, not raw pods

Researchers need long-lived SSH or Jupyter workspaces, not short-lived containers tied to a pod spec.

Image save and reuse is manual

Saving a working environment for the next experiment usually means scripts, registries, and copy-pasted commands.

Admins lack quota and usage visibility

Org-level quotas, member quotas, and GPU usage rollups end up scattered across spreadsheets and dashboards.

Product

A control plane for interactive GPU research

Six product surfaces that together replace the patchwork of scripts, dashboards, and manual approvals most clusters rely on today.

Template-backed instances

Launch long-lived SSH, Jupyter, or custom entrypoint environments from reusable templates with versioned defaults.
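To make "template-backed" concrete, here is a hypothetical shape such a versioned template record could take; all field and type names are assumptions for illustration, not SciFlow's schema:

```rust
// Hypothetical template record: a versioned bundle of image, launch
// mode, startup script, ports, and env vars. Names are illustrative.
#[derive(Debug, Clone)]
enum LaunchMode {
    Ssh,
    Jupyter,
    Custom { entrypoint: String },
}

#[derive(Debug, Clone)]
struct Template {
    name: String,
    version: u32,                   // bump on every change to defaults
    image: String,                  // e.g. "pytorch:2.4"
    launch_mode: LaunchMode,
    startup_script: Option<String>, // runs once at instance start
    ports: Vec<u16>,
    env: Vec<(String, String)>,
}

fn main() {
    let t = Template {
        name: "pytorch-jupyter".into(),
        version: 3,
        image: "pytorch:2.4".into(),
        launch_mode: LaunchMode::Jupyter,
        startup_script: None,
        ports: vec![8888],
        env: vec![("JUPYTER_TOKEN".into(), "generated-at-launch".into())],
    };
    println!("{} v{} -> {}", t.name, t.version, t.image);
}
```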

Org-aware quota

Users pick which org to charge, while admins manage member quotas and integer GPU budgets per GPU type.

Queue-first admission

Invalid requests are rejected outright; valid ones are queued when capacity is temporarily unavailable. No empty pods sit around reserving GPUs.
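A minimal sketch of that admission order, assuming whole-GPU counts for simplicity; the type and function names here are illustrative placeholders, not SciFlow's actual API:

```rust
// Illustrative admission sketch (not SciFlow's actual API): invalid
// requests are rejected outright, valid ones start if capacity exists
// and are queued otherwise, so no placeholder pod ever holds a GPU.
#[derive(Debug, PartialEq)]
enum Admission {
    Start,           // capacity available right now
    Queue,           // valid request, capacity temporarily exhausted
    Reject(String),  // invalid request, never enters the queue
}

fn admit(requested: u32, remaining_quota: u32, free_gpus: u32) -> Admission {
    if requested == 0 {
        Admission::Reject("requested zero GPUs".into())
    } else if requested > remaining_quota {
        Admission::Reject("exceeds org quota".into())
    } else if requested <= free_gpus {
        Admission::Start
    } else {
        Admission::Queue
    }
}

fn main() {
    assert_eq!(admit(1, 2, 3), Admission::Start);  // fits: run now
    assert_eq!(admit(2, 4, 1), Admission::Queue);  // wait for capacity
    assert_eq!(admit(5, 2, 8), Admission::Reject("exceeds org quota".into()));
    println!("admission sketch ok");
}
```

The ordering is the point: validity is checked before capacity, so a bad request never occupies a queue slot.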

Commit running work into images

Save a configured workload as a reusable image to seed future templates and reproducible experiments.

Account-scoped keys

Keep SSH keys and API keys attached to accounts and injected at launch time — not buried inside templates.

Usage and reporting

Usage records, billing summaries, GPU accounting, and scheduled rollups for cluster admins and finance.
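A toy rollup of usage records into per-org GPU-hours gives a feel for what such a summary computes; the record fields and org names here are assumptions for illustration, not SciFlow's reporting schema:

```rust
// Toy rollup: fold raw usage records into per-org GPU-hour totals.
// Field names and sample data are illustrative, not SciFlow's schema.
use std::collections::HashMap;

struct UsageRecord {
    org: String,
    gpu_count: f64, // e.g. 0.5 for a half-GPU share
    hours: f64,
}

fn rollup(records: &[UsageRecord]) -> HashMap<String, f64> {
    let mut totals = HashMap::new();
    for r in records {
        *totals.entry(r.org.clone()).or_insert(0.0) += r.gpu_count * r.hours;
    }
    totals
}

fn main() {
    let recs = vec![
        UsageRecord { org: "Vision Lab".into(), gpu_count: 1.0, hours: 12.0 },
        UsageRecord { org: "Vision Lab".into(), gpu_count: 0.5, hours: 8.0 },
        UsageRecord { org: "NLP Group".into(), gpu_count: 2.0, hours: 3.0 },
    ];
    let totals = rollup(&recs);
    println!("Vision Lab: {} GPU-hours", totals["Vision Lab"]); // 12 + 4
    println!("NLP Group: {} GPU-hours", totals["NLP Group"]);
}
```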

Workflow

From idea to running GPU workspace in minutes

A consistent path from template selection to a saved, reusable image — the same loop researchers already follow informally.

  1. Pick a template

     Choose a versioned template: image, launch mode, startup script, ports, env vars.

  2. Choose GPU shape and org

     Pick the GPU type and fraction, and select which org membership to charge.

  3. Launch or queue automatically

     Admission runs the quota math: start now if capacity exists, otherwise queue fairly.

  4. Save the environment as an image

     Commit your running workload back into a reusable image for the next experiment.
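The tail of that loop, from launch through commit, can be sketched as two transitions; the states and the image tag are invented placeholders, not SciFlow's lifecycle model:

```rust
// Sketch of the launch-and-commit tail of the workflow above.
// States, names, and the image tag are illustrative placeholders.
#[derive(Debug, PartialEq)]
enum InstanceState { Queued, Running }

fn launch(capacity_free: bool) -> InstanceState {
    // Step 3: admission either starts the instance or queues it fairly.
    if capacity_free { InstanceState::Running } else { InstanceState::Queued }
}

fn commit(state: &InstanceState) -> Result<String, String> {
    // Step 4: only a running workload can be committed into an image.
    match state {
        InstanceState::Running => Ok("vision-lab/pytorch-jupyter:v4".into()),
        other => Err(format!("cannot commit from {:?}", other)),
    }
}

fn main() {
    let state = launch(true);
    println!("launched -> {:?}", state);
    println!("commit -> {:?}", commit(&state));
}
```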

Admin · Quotas

Cluster overview (live)

GPUs total: 24 · In use: 11 · Queued: 3

Org          H100     A100     Queue
Vision Lab   4 / 6    2 / 4    1
NLP Group    1 / 2    3 / 3    0
Robotics     0 / 1    1 / 2    2
For admins

Built for platform teams, not just individual users

SciFlow gives platform admins the controls and visibility they need without forcing the underlying cluster into a rigid layout.

  • Create orgs and appoint org admins
  • Set org GPU quotas per gpu_type
  • Set member-level quotas in 1/8 GPU steps
  • See queue state and runtime status
  • Keep Kubernetes as a generic execution pool
  • Avoid hard-binding physical nodes to groups
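Quotas in 1/8 GPU steps stay exact if shares are stored as integer eighths rather than floats. A small conversion sketch, under that assumed representation (the helper name is invented, not SciFlow's API):

```rust
// Assumed representation: member quotas stored as integer eighths of
// a GPU, so 1/8-step math is exact with no floating point. The helper
// name is illustrative, not SciFlow's API.
fn eighths(whole: u32, num: u32, den: u32) -> Option<u32> {
    // The denominator must divide 8 so the fraction lands on a 1/8 step.
    if den == 0 || 8 % den != 0 || num >= den {
        return None;
    }
    Some(whole * 8 + num * (8 / den))
}

fn main() {
    println!("{:?}", eighths(0, 1, 2)); // half a GPU -> 4 eighths
    println!("{:?}", eighths(1, 1, 4)); // 1 + 1/4 GPUs -> 10 eighths
    println!("{:?}", eighths(0, 1, 3)); // 1/3 is not on a 1/8 step
}
```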
Architecture

Designed as a Rust control plane for Kubernetes

Five focused services with clear ownership boundaries. Authentik handles login at the edge; SciFlow handles authorization, quota, and runtime.

Policy

Local user projection, quota ledgers, and admission decisions live here.

Orchestrator

Templates, instance lifecycle, queue state, and runtime status.

Images

Image metadata, commit operations, registry ownership.

Operations

Durable workers, retries, reconciliation, node-local execution.

Reporting

Usage records, billing summaries, GPU accounting rollups.

Outside SciFlow

Identity, ingress, Postgres

Authentik, oauth2-proxy, and cluster infra are managed in a separate FluxCD repository — SciFlow stays application-only.

Who it's for

For teams that run serious research infrastructure

University AI labs

Shared GPUs across students and projects with sane quotas.

Internal ML platform teams

A self-service surface that replaces ticket queues.

GPU cluster administrators

Org and member quota math without empty placeholder pods.

Startup research teams

Reproducible templates and saved images per experiment.

Shared compute environments

Org-aware fairness with explicit queueing and leases.

Bring self-service GPU compute to your research cluster

SciFlow helps teams launch, govern, and reuse interactive GPU environments without turning Kubernetes into the user interface.