Data Model

SeqDesk uses Prisma ORM with PostgreSQL. This page describes the core entities and their relationships.

Entity Relationship Overview


User ──< Order ──< Sample >── Study
  │                  │           │
  │                  ├── Read    ├── PipelineRun ──< PipelineRunStep
  │                  │                │
  │                  ├── Assembly     ├── PipelineRunEvent
  │                  │                │
  │                  └── Bin          └── PipelineArtifact
  │
  ├── Department
  └── AdminInvite

Core Models

User

Field	Type	Description
id	String	Primary key (CUID)
email	String	Unique login identifier
password	String	bcrypt hash
firstName, lastName	String	Display name
role	String	`RESEARCHER` or `FACILITY_ADMIN`
researcherRole	String?	PI, POSTDOC, PHD_STUDENT, MASTER_STUDENT, TECHNICIAN, OTHER
institution	String?	Research institution
departmentId	String?	Foreign key to Department

Order

The Order model is the schema-level name for what the UI labels a Sequencing Order. The database model, table, and field names below are unchanged — only the user-facing label differs.

Field	Type	Description
id	String	Primary key (CUID)
orderNumber	String	Unique, format: `ORD-YYYYMMDD-XXXX`
name	String?	Descriptive name
status	String	`DRAFT`, `SUBMITTED`, or `COMPLETED`
platform	String?	Deprecated compatibility fallback for old/imported records
libraryStrategy	String?	WGS, RNA-Seq, AMPLICON, etc.
librarySource	String?	GENOMIC, METAGENOMIC, etc.
customFields	String?	JSON for form builder fields, including `_sequencing_tech`
userId	String	Foreign key to User
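
The interplay between `customFields` and the deprecated `platform` column can be sketched as follows. This is a minimal illustration, not SeqDesk's actual resolver: the function name `resolveSequencingTech` is hypothetical, and it assumes `_sequencing_tech` holds a plain string and takes precedence over `platform`.

```typescript
// Minimal shape of the Order columns used here (names from the table above).
interface OrderLike {
  platform: string | null;
  customFields: string | null; // JSON string
}

// Returns the sequencing technology recorded for an order, or null if none.
// Assumption: `_sequencing_tech` is a plain string and wins over `platform`.
function resolveSequencingTech(order: OrderLike): string | null {
  if (order.customFields) {
    try {
      const parsed: unknown = JSON.parse(order.customFields);
      if (parsed !== null && typeof parsed === "object") {
        const tech = (parsed as Record<string, unknown>)["_sequencing_tech"];
        if (typeof tech === "string" && tech.length > 0) return tech;
      }
    } catch {
      // Malformed JSON: fall through to the legacy column.
    }
  }
  return order.platform ?? null;
}
```

Old or imported records without the custom field would then still resolve through `platform`.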

Sample

Field	Type	Description
id	String	Primary key (CUID)
sampleId	String	Internal ID (format: `S-{timestamp}-{random}`)
sampleAlias	String?	User-friendly alias
scientificName	String?	Taxonomic name
taxId	String?	NCBI taxonomy ID
checklistData	String?	JSON — MIxS metadata
checklistUnits	String?	JSON — units for metadata
customFields	String?	JSON — custom form fields
orderId	String	Foreign key to Order
studyId	String?	Foreign key to Study (optional)

Study

Field	Type	Description
id	String	Primary key (CUID)
title	String	Study title
alias	String?	Unique alias for ENA
checklistType	String?	MIxS checklist type
studyAccessionId	String?	ENA accession (PRJEB…)
submitted	Boolean	Whether submitted to ENA
readyForSubmission	Boolean	Marked ready by user
userId	String	Foreign key to User (owner)

Read

Field	Type	Description
id	String	Primary key (CUID)
file1	String?	Forward reads path (R1)
file2	String?	Reverse reads path (R2)
checksum1, checksum2	String?	MD5 checksums
experimentAccessionNumber	String?	ENA experiment (ERX…)
runAccessionNumber	String?	ENA run (ERR…)
sampleId	String	Foreign key to Sample
sequencingRunId	String?	Foreign key to SequencingRun
pipelineRunId	String?	Foreign key to PipelineRun that produced this read (set null on delete)
pipelineSources	String?	JSON describing the upstream reads/pipeline a generated read came from

Classification and QC

A Read carries an explicit data class so the app can protect raw inputs and supersede them when a cleaning pipeline produces a derived set.

Field	Type	Description
dataClass	String	`cleaned` (default), `raw`, or `unknown`
dataClassSource	String	How the class was set: `legacy_assumed_cleaned` (default), `associate`, `upload`, `sequencer_ingest`, `pipeline`, `manual`
isActive	Boolean	`true` (default) for the live read; cleared when superseded
supersededByReadId	String?	Points at the read that replaced this one (e.g. a cleaned read superseding the raw input)
readCount1, readCount2	Int?	Read counts per file
avgQuality1, avgQuality2	Float?	Average base quality per file
fastqcReport1, fastqcReport2	String?	Paths to per-file FastQC reports
classifiedAt	DateTime?	When the class was last set
classifiedById	String?	User who set the class, when manual
classificationNote	String?	Free-text note explaining the classification

Indexes on (sampleId, isActive), dataClass, and supersededByReadId support the per-sample active-read lookups and the supersede chain.
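
The active-read lookup and the supersede chain can be illustrated with an in-memory sketch. The helper names `resolveLatestRead` and `activeReadsForSample` are hypothetical; only the field names come from the tables above.

```typescript
interface ReadLike {
  id: string;
  sampleId: string;
  dataClass: "cleaned" | "raw" | "unknown";
  isActive: boolean;
  supersededByReadId: string | null;
}

// Follow supersededByReadId links until reaching a read with no successor.
// The visited set guards against accidental cycles in the data.
function resolveLatestRead(start: ReadLike, byId: Map<string, ReadLike>): ReadLike {
  const seen = new Set<string>();
  let current = start;
  while (current.supersededByReadId !== null && !seen.has(current.id)) {
    seen.add(current.id);
    const next = byId.get(current.supersededByReadId);
    if (!next) break; // dangling pointer: stop at the last known read
    current = next;
  }
  return current;
}

// Active reads for one sample, mirroring the (sampleId, isActive) index.
function activeReadsForSample(reads: ReadLike[], sampleId: string): ReadLike[] {
  return reads.filter((r) => r.sampleId === sampleId && r.isActive);
}
```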

PipelineRun

Field	Type	Description
id	String	Primary key (CUID)
runNumber	String	Unique, format: `{PIPELINE}-{DATE}-{NNN}`
pipelineId	String	Pipeline identifier (e.g., `mag`)
status	String	pending, queued, running, completed, failed, cancelled
targetType	String	`study` (default) or `order` — which entity the run targets
progress	Int?	0–100
studyId	String?	Foreign key to Study (set when `targetType` is `study`)
orderId	String?	Foreign key to Order (set when `targetType` is `order`)
userId	String	Foreign key to User (who started it)
runFolder	String?	Path to run directory (set once execution is staged)
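
Because `targetType` decides which foreign key is populated, code that consumes a PipelineRun typically resolves the target in one place. A minimal sketch, with the hypothetical helper name `resolveTargetId`:

```typescript
interface PipelineRunTarget {
  targetType: "study" | "order";
  studyId: string | null;
  orderId: string | null;
}

// Returns the id of the targeted entity, or throws if the row is inconsistent
// (for example, targetType is "order" but orderId is null).
function resolveTargetId(run: PipelineRunTarget): string {
  const id = run.targetType === "study" ? run.studyId : run.orderId;
  if (!id) {
    throw new Error(`PipelineRun targets a ${run.targetType} but its foreign key is not set`);
  }
  return id;
}
```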

Assembly

Field	Type	Description
id	String	Primary key (CUID)
assemblyName	String?	Assembly identifier
assemblyFile	String?	Path to FASTA file
assemblyAccession	String?	ENA accession
sampleId	String	Foreign key to Sample
createdByPipelineRunId	String?	Foreign key to PipelineRun

Bin

Field	Type	Description
id	String	Primary key (CUID)
binName	String?	Bin identifier
binAccession	String?	ENA accession
binFile	String?	Path to bin FASTA
completeness	Float?	CheckM completeness (0–100)
contamination	Float?	CheckM contamination (0–100)
sampleId	String	Foreign key to Sample
createdByPipelineRunId	String?	Foreign key to PipelineRun

Supporting Models

Model	Purpose
Department	User grouping with name, description, isActive
AdminInvite	Invite codes for admin registration
StatusNote	Audit trail on sequencing orders (STATUS_CHANGE, SAMPLES_SENT, INTERNAL)
SequencingRun	Run-level metadata and QC metrics
PipelineConfig	Per-pipeline enabled flag and settings
PipelineRunStep	Individual process status within a run

The remaining supporting models are documented in detail below. Each is either touched by automation, surfaced in a dashboard, or has structure that the summary table above does not capture.

Sampleset

Per-sequencing-order metadata form configuration. One Sampleset per Order (orderId is unique).

Field	Type	Description
`id`	String	Primary key (CUID)
`checklists`	JSON string	Array of enabled MIxS checklist accessions, e.g. `["ERC000022"]`
`selectedFields`	JSON string?	Subset of checklist fields the order surfaces, when narrower than the full checklist
`fieldOverrides`	JSON string?	Per-field overrides keyed by field id — typically `{ label, required, helpText }` to localize defaults
`sampleType`	Int	Discriminator (default `1` for standard biological samples)
`orderId`	String, unique	Foreign key to Order

The combination of checklists, selectedFields, and fieldOverrides is how a facility tailors the MIxS metadata form per sequencing order without forking the underlying checklist definition.
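
As a rough illustration of how these columns might combine, the sketch below narrows a checklist's fields to `selectedFields` and then applies `fieldOverrides`. It assumes `selectedFields` is a JSON array of field ids; the `ChecklistField` shape and the function name `buildFormFields` are hypothetical.

```typescript
interface ChecklistField {
  id: string;
  label: string;
  required: boolean;
  helpText?: string;
}
type FieldOverride = Partial<Pick<ChecklistField, "label" | "required" | "helpText">>;

interface SamplesetLike {
  selectedFields: string | null; // JSON array of field ids (assumed shape)
  fieldOverrides: string | null; // JSON object keyed by field id
}

// Produce the per-order form fields without mutating the checklist definition.
function buildFormFields(checklistFields: ChecklistField[], sampleset: SamplesetLike): ChecklistField[] {
  const selected: string[] | null = sampleset.selectedFields
    ? JSON.parse(sampleset.selectedFields)
    : null; // null means "use the full checklist"
  const overrides: Record<string, FieldOverride> = sampleset.fieldOverrides
    ? JSON.parse(sampleset.fieldOverrides)
    : {};
  return checklistFields
    .filter((f) => selected === null || selected.indexOf(f.id) !== -1)
    .map((f) => ({ ...f, ...overrides[f.id] }));
}
```

The checklist array is copied rather than edited, which matches the stated goal of tailoring the form without forking the underlying definition.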

Submission

ENA submission tracking per entity. Drives the Submissions Dashboard.

Field	Type	Description
`id`	String	Primary key (CUID)
`submissionType`	String	`STUDY`, `SAMPLE`, `READ`, `ASSEMBLY`, `BIN`
`status`	String	One of `PENDING`, `SUBMITTED`, `PARTIAL`, `ACCEPTED`, `REJECTED`, `ERROR`, `CANCELLED`
`xmlContent`	String?	The generated XML that SeqDesk sent to ENA (kept for debugging)
`response`	JSON string?	ENA response plus a `steps[]` timeline rendered by the dashboard
`accessionNumbers`	JSON string?	Map of returned accessions, e.g. `{ "studyAccession": "PRJEB12345" }`
`entityType`	String	`study` or `sample`
`entityId`	String	ID of the Study or Sample named by `entityType`
`createdAt`, `updatedAt`	DateTime	Standard timestamps

SequencingUpload

Tracks an in-progress chunked upload session for a sequencing file (reads or another artifact). A session is created by POST /api/orders/[id]/sequencing/uploads and finalized by the matching complete endpoint.

Field	Type	Description
`id`	String	Primary key (CUID)
`orderId`	String	Foreign key to Order
`sampleId`	String?	Foreign key to Sample; set when the upload belongs to a single sample
`targetKind`	String	`read` or `artifact`
`targetRole`	String	For `read` uploads, `R1` or `R2`
`originalName`	String	Filename as submitted
`tempPath`	String	Server path receiving chunks
`finalPath`	String?	Promoted path once `complete` succeeds
`expectedSize`	BigInt	Total bytes declared when the upload was initiated
`receivedSize`	BigInt	Bytes received so far
`status`	String	`PENDING`, `COMPLETED`, `ABORTED`
`checksumProvided`, `checksumComputed`	String?	Optional MD5 verification
`mimeType`	String?	MIME type if known
`metadata`	JSON string?	Caller-supplied metadata
`createdById`	String	User who initiated the upload
`createdAt`, `updatedAt`	DateTime	Standard timestamps

Indexes on (orderId, status) and (sampleId, status) keep the upload list queries fast.
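
Since `expectedSize` and `receivedSize` are BigInt columns, Prisma returns them as JavaScript `bigint` values, which cannot be mixed with `number` arithmetic. A small sketch of a progress calculation under that constraint (the helper name `uploadPercent` is hypothetical):

```typescript
interface UploadLike {
  expectedSize: bigint;
  receivedSize: bigint;
  status: "PENDING" | "COMPLETED" | "ABORTED";
}

// Percentage of bytes received, as an integer from 0 to 100.
// The division is done in bigint space (truncating), then converted.
function uploadPercent(upload: UploadLike): number {
  if (upload.expectedSize <= BigInt(0)) return 0;
  const pct = (upload.receivedSize * BigInt(100)) / upload.expectedSize;
  return Math.min(100, Number(pct));
}
```

Note that `JSON.stringify` throws on `bigint`, so API handlers usually convert these columns to strings or numbers before responding.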

PipelineRunEvent

Event-feed entry from a pipeline run, populated by the weblog endpoint and internal step transitions.

Field	Type	Description
`id`	String	Primary key (CUID)
`pipelineRunId`	String	Foreign key to PipelineRun
`eventType`	String	`run_started`, `process_completed`, `process_failed`, etc.; the set of values is open-ended rather than restricted to a fixed list
`processName`	String?	Nextflow process or SLURM step identifier
`stepId`	String?	SeqDesk-side step id
`status`	String?	`RUNNING`, `COMPLETED`, `FAILED`, …
`message`	String?	Human-readable detail
`payload`	JSON string?	Full event body, trimmed to a reasonable size
`source`	String?	`weblog`, `trace`, `queue`, `process`
`occurredAt`	DateTime	Indexed alongside `pipelineRunId` for timeline queries

PipelineArtifact

Files produced by a pipeline run. Used by the assembly browser, bin viewer, and QC report previewer.

Field	Type	Description
`id`	String	Primary key (CUID)
`type`	String	`reads`, `assembly`, `bins`, `qc_report`, `alignment`
`name`	String?	Display name
`path`	String	Path relative to the configured base path for data files
`checksum`	String?	MD5 if computed
`size`	BigInt?	Bytes if known
`outputId`	String?	Manifest output id, when the artifact was resolved from a named package output (indexed)
`studyId`, `sampleId`	String?	Optional lineage links
`pipelineRunId`	String?	Foreign key to PipelineRun
`producedByStepId`	String?	Step that wrote the artifact
`metadata`	JSON string?	Tool-specific info (e.g., MultiQC summary stats)
`createdAt`	DateTime	Standard timestamp

DemoWorkspace

Disposable demo session record. See Demo mode for the full lifecycle.

Field	Type	Description
`id`	String	Primary key (CUID)
`tokenHash`	String, unique	SHA-256 of the bootstrap token; raw token only lives in the browser cookie
`userId`	String, unique	Foreign key to User (the demo researcher)
`adminUserId`	String?, unique	Foreign key to User (demo facility admin)
`seedVersion`	Int	Bumped when the seed schema changes
`lastSeenAt`	DateTime	Updated on each request
`expiresAt`	DateTime, indexed	Cleanup query target
`createdAt`, `updatedAt`	DateTime	Standard timestamps

SiteSettings

Singleton configuration row (id is always "singleton"). Stores the bulk of facility configuration; large or nested config is held in two JSON columns:

Column	Shape
`modulesConfig`	`{ "modules": { "sequencing-tech": true, … } }` — module enable map
`extraSettings`	Free-form JSON for everything else: `studyFormFields`, `studyFormGroups`, `pipelineExecution`, `telemetry`, `ena`, `sequencingTechConfig`, etc.
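
Reading the module enable map could look like the sketch below. It assumes `modulesConfig` is stored as a JSON string and treats a missing or malformed entry as disabled; the helper name `isModuleEnabled` is hypothetical.

```typescript
interface ModulesConfig {
  modules: Record<string, boolean>;
}

// Returns true only when the module is explicitly enabled in the map.
function isModuleEnabled(modulesConfigJson: string | null, moduleId: string): boolean {
  if (!modulesConfigJson) return false;
  try {
    const parsed = JSON.parse(modulesConfigJson) as Partial<ModulesConfig>;
    return parsed.modules?.[moduleId] === true;
  } catch {
    return false; // malformed or non-object JSON: treat as disabled
  }
}
```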

OrderFormConfig

Single-row table holding the order form schema as configured in the Order Form Builder.

Column	Shape
`schema`	JSON `{ fields: [...], groups: [...] }` of the order form, surfaced by `GET /api/form-schema`
`version`	Int (default `1`) — bumped on each save; returned verbatim as the schema `version` from the form-schema API

When a configuration is applied, the schema is merged into the existing one rather than replacing it, so a default form can be shipped without overwriting facility-local additions.

TicketMessage

Per-message row inside a Ticket thread. Tickets back the in-app messaging UI between researchers and facility admins.

Field	Type	Description
`id`	String	Primary key (CUID)
`ticketId`	String	Foreign key to Ticket
`userId`	String	Foreign key to User (the author)
`content`	String	Message body
`createdAt`	DateTime	Standard timestamp

Read state is tracked on the parent Ticket, not per message: Ticket carries userReadAt / adminReadAt (plus lastUserMessageAt / lastAdminMessageAt) to drive unread indicators for each side.
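
One plausible way to derive the unread indicator from these four timestamps is sketched below. The comparison rule (a side has unread messages when the other side's last message is newer than its own read marker) and the nullability of the columns are assumptions, and `hasUnread` is a hypothetical name.

```typescript
interface TicketReadState {
  userReadAt: Date | null;
  adminReadAt: Date | null;
  lastUserMessageAt: Date | null;
  lastAdminMessageAt: Date | null;
}

// True when the other side has posted since this side last read the thread.
function hasUnread(ticket: TicketReadState, side: "user" | "admin"): boolean {
  const lastIncoming = side === "user" ? ticket.lastAdminMessageAt : ticket.lastUserMessageAt;
  const readAt = side === "user" ? ticket.userReadAt : ticket.adminReadAt;
  if (lastIncoming === null) return false; // nothing to read yet
  return readAt === null || lastIncoming.getTime() > readAt.getTime();
}
```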

PipelineResultSelection

Records which pipeline run a facility admin explicitly selected as the “final result” for one study or sequencing-order target. Backs the run-selection endpoints.

Field	Type	Description
`id`	String	Primary key (CUID)
`pipelineId`	String	Pipeline whose runs are being selected from
`targetKey`	String	`study:<id>` or `order:<id>` — unique together with `pipelineId`
`studyId`, `orderId`	String?	Whichever target applies
`selectedRunId`	String, unique	The chosen PipelineRun
`selectedById`	String?	User who made the selection
`selectedAt`	DateTime	When it was selected
`createdAt`, `updatedAt`	DateTime	Standard timestamps
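
The `targetKey` format lends itself to a pair of small helpers. The sketch below builds and parses keys of the form `study:<id>` or `order:<id>`; the function names are hypothetical.

```typescript
type TargetKind = "study" | "order";

// Build the key stored in PipelineResultSelection.targetKey.
function buildTargetKey(kind: TargetKind, id: string): string {
  return `${kind}:${id}`;
}

// Parse a key back into its parts; returns null for anything malformed.
function parseTargetKey(targetKey: string): { kind: TargetKind; id: string } | null {
  const sep = targetKey.indexOf(":");
  if (sep <= 0) return null;
  const kind = targetKey.slice(0, sep);
  const id = targetKey.slice(sep + 1);
  if (id.length === 0) return null;
  if (kind === "study" || kind === "order") return { kind, id };
  return null;
}
```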

SequencingArtifact

Facility-managed file produced or received during sequencing (raw reads, QC outputs, delivery bundles, etc.), distinct from PipelineArtifact.

Field	Type	Description
`id`	String	Primary key (CUID)
`orderId`	String	Foreign key to Order
`sampleId`	String?	Optional per-sample link
`sequencingRunId`	String?	Optional run link (set null on run delete)
`stage`	String	`sample_receipt`, `sequencing`, `raw_reads`, `qc`, `delivery`
`artifactType`	String	Caller-defined artifact type
`source`	String	How it was produced/received
`visibility`	String	`facility` (default) — gates researcher access
`path`	String	Stored path
`originalName`	String	Filename as received
`size`	BigInt?	Bytes
`checksum`	String?	MD5 if computed
`mimeType`	String?	MIME type if known
`metadata`	JSON string?	Caller-supplied metadata
`createdById`	String?	Uploader
`createdAt`, `updatedAt`	DateTime	Standard timestamps

SequencingRunSample

Join row assigning a sample (and barcode) to a SequencingRun. Written by the run-plan import and barcode assignment flows.

Field	Type	Description
`id`	String	Primary key (CUID)
`sequencingRunId`	String	Foreign key to SequencingRun
`sampleId`	String	Foreign key to Sample
`barcode`	String?	Demux barcode; unique within a run
`customFields`	JSON string?	Run-assignment custom fields
`notes`	String?	Free-text note
`createdAt`, `updatedAt`	DateTime	Standard timestamps

Unique on (sequencingRunId, sampleId) and (sequencingRunId, barcode).
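
A run-plan import can check for violations of these two constraints before writing, so that users get a readable error instead of a database exception. A minimal sketch with the hypothetical helper `findAssignmentConflicts`; rows with a null barcode are skipped in the barcode check, matching PostgreSQL's default treatment of NULLs as distinct in unique constraints.

```typescript
interface RunAssignment {
  sequencingRunId: string;
  sampleId: string;
  barcode: string | null;
}

// Returns one message per row that would violate a unique constraint.
function findAssignmentConflicts(rows: RunAssignment[]): string[] {
  const seenSamples = new Set<string>();
  const seenBarcodes = new Set<string>();
  const conflicts: string[] = [];
  for (const row of rows) {
    const sampleKey = `${row.sequencingRunId}\u0000${row.sampleId}`;
    if (seenSamples.has(sampleKey)) {
      conflicts.push(`sample ${row.sampleId} assigned twice in run ${row.sequencingRunId}`);
    }
    seenSamples.add(sampleKey);
    if (row.barcode !== null) {
      const barcodeKey = `${row.sequencingRunId}\u0000${row.barcode}`;
      if (seenBarcodes.has(barcodeKey)) {
        conflicts.push(`barcode ${row.barcode} used twice in run ${row.sequencingRunId}`);
      }
      seenBarcodes.add(barcodeKey);
    }
  }
  return conflicts;
}
```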

InAppNotification

Per-user in-app notification (e.g. a pipeline run reaching a terminal state).

Field	Type	Description
`id`	String	Primary key (CUID)
`userId`	String	Recipient
`eventType`	String	Notification event type
`severity`	String	`info` (default), etc.
`title`	String	Headline
`body`	String?	Detail text
`linkPath`	String?	In-app link target
`sourceType`	String	Originating entity type
`sourceId`	String?	Originating entity id
`dedupeKey`	String, unique	Prevents duplicate notifications
`readAt`	DateTime?	When the user read it
`archivedAt`	DateTime?	When archived
`createdAt`, `updatedAt`	DateTime	Standard timestamps

BackgroundWorkerProcess

Tracks a long-running background worker process (e.g. the stream monitor) for the Admin → Background Workers page.

Field	Type	Description
`id`	String	Primary key (CUID)
`name`	String	Worker name
`pid`	Int	OS process id
`status`	String	`RUNNING` (default), then a terminal state once the process stops
`startedAt`	DateTime	Process start
`startedById`	String?	User who started it
`stoppedAt`	DateTime?	Process stop
`exitCode`	Int?	Exit code when stopped
`logPath`	String	Path to the worker log file
`lastErrorMsg`	String?	Last recorded error

Live stream (MinKNOW ingest)

These models back the live ONT stream ingest subsystem, where the stream-monitor watches a MinKNOW output directory and ingests FASTQ files in real time. See the stream endpoints.

StreamRun

One live ingest session bound to a sequencing order and a watched output directory. Only one ACTIVE run may watch a given directory at a time.

Field	Type	Description
`id`	String	Primary key (CUID)
`orderId`	String	Foreign key to Order
`minknowRunId`	String?	MinKNOW run identifier
`flowCellId`, `deviceId`	String?	Device metadata
`outputDir`	String	Watched directory (validated under the configured root)
`status`	String	`ACTIVE` (default), then stopping/stopped states
`totalBases`	BigInt	Running total, default `0`
`totalReads`	Int	Running total, default `0`
`barcodeMap`	JSON string?	Barcode → sample mapping
`startedAt`	DateTime	Session start
`lastSeenAt`	DateTime	Updated on each ingest
`stoppedAt`	DateTime?	When stopped
`monitorId`	String?	Owning monitor process id
`heartbeatAt`	DateTime?	Monitor liveness timestamp

StreamIngestedFile

One FASTQ file ingested into a stream run. The unique (streamRunId, filePath) constraint makes ingest idempotent: re-emitted filesystem events cannot double-count reads (the monitor upserts here before incrementing run totals).

Field	Type	Description
`id`	String	Primary key (CUID)
`streamRunId`	String	Foreign key to StreamRun
`sampleId`	String?	Resolved sample (set null on delete)
`filePath`	String	Ingested file path; unique within a run
`barcode`	String?	Demux barcode
`size`	Int	Bytes, default `0`
`reads`	Int	Reads counted, default `0`
`bases`	BigInt	Bases counted, default `0`
`ingestedAt`	DateTime	When ingested
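
The idempotency guarantee can be illustrated with an in-memory stand-in for the unique (streamRunId, filePath) constraint. This is a sketch of the idea only, not the monitor's implementation; the class name `StreamTotals` is hypothetical, and a Map plays the role of the database table.

```typescript
interface IngestedFile {
  streamRunId: string;
  filePath: string;
  reads: number;
  bases: bigint;
}

class StreamTotals {
  private files = new Map<string, IngestedFile>();
  totalReads = 0;
  totalBases = BigInt(0);

  // Returns true if the file was newly recorded, false for a repeated event.
  // Totals are incremented only on the first sighting of a file.
  ingest(file: IngestedFile): boolean {
    const key = `${file.streamRunId}\u0000${file.filePath}`;
    if (this.files.has(key)) return false;
    this.files.set(key, file);
    this.totalReads += file.reads;
    this.totalBases += file.bases;
    return true;
  }
}
```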

StreamRunEvent

Append-only event log for a stream run, served via cursor pagination on the monotonic seq.

Field	Type	Description
`id`	String	Primary key (CUID)
`streamRunId`	String	Foreign key to StreamRun
`seq`	Int	Auto-increment sequence (pagination cursor)
`ts`	DateTime	Event time
`kind`	String	e.g. `RUN_STARTED`, `FILE_INGESTED`
`payload`	JSON string?	Event detail
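
Cursor pagination on `seq` works by asking for events strictly after the last sequence number the client has seen. The sketch below shows the logic over an in-memory array; the function name `pageAfter` and the response shape are hypothetical, and a real query would apply the same filter, ordering, and limit in the database.

```typescript
interface StreamEvent {
  seq: number;
  kind: string;
}

// Return up to `limit` events with seq greater than `afterSeq`, oldest first,
// plus the cursor to pass on the next call.
function pageAfter(
  events: StreamEvent[],
  afterSeq: number,
  limit: number
): { items: StreamEvent[]; nextCursor: number } {
  const items = events
    .filter((e) => e.seq > afterSeq)
    .sort((a, b) => a.seq - b.seq)
    .slice(0, limit);
  // When the page is empty the cursor does not move.
  const nextCursor = items.length > 0 ? items[items.length - 1].seq : afterSeq;
  return { items, nextCursor };
}
```

Because `seq` is monotonic and the log is append-only, a client that polls with its last cursor neither misses nor repeats events.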

Workbench (data imports)

The Workbench subsystem lets a researcher import reference datasets into a private workspace and arrange them on an analysis canvas.

WorkbenchWorkspace

One private workspace per owner (ownerId is unique).

Field	Type	Description
`id`	String	Primary key (CUID)
`name`	String	Default `"Private Workbench"`
`ownerId`	String, unique	Foreign key to User
`isDefault`	Boolean	Default `true`
`createdAt`, `updatedAt`	DateTime	Standard timestamps

WorkbenchAnalysis

A canvas of imported datasets inside a workspace.

Field	Type	Description
`id`	String	Primary key (CUID)
`workspaceId`	String	Foreign key to WorkbenchWorkspace
`name`	String	Default `"Untitled analysis"`
`description`	String?	Optional
`canvas`	String	Serialized canvas state
`revision`	Int	Default `1`
`isDefault`	Boolean	Default `false`
`createdAt`, `updatedAt`	DateTime	Standard timestamps

WorkbenchDataset

A cached reference dataset, deduplicated by cacheKey.

Field	Type	Description
`id`	String	Primary key (CUID)
`providerId`	String	Source provider
`cacheKey`	String, unique	Dedupe key
`name`	String	Display name
`description`	String?	Optional
`sourceType`	String	Provider source type
`sourceMetadata`	JSON string?	Provider metadata
`storagePath`	String?	Local storage path
`sizeBytes`	BigInt?	Bytes
`checksumSha256`	String?	SHA-256
`genomeCount`	Int?	Number of genomes
`status`	String	`ready` (default)
`createdAt`, `updatedAt`	DateTime	Standard timestamps

WorkbenchWorkspaceDataset links datasets into workspaces, with a unique constraint on (workspaceId, datasetId), and optionally records the import job that created the link.

WorkbenchImportJob

Tracks an asynchronous dataset import.

Field	Type	Description
`id`	String	Primary key (CUID)
`workspaceId`	String	Foreign key to WorkbenchWorkspace
`providerId`	String	Source provider
`status`	String	`queued` (default), then running and terminal states
`phase`	String?	Sub-phase label
`request`	String	Serialized import request
`preview`	String?	Serialized preview
`progress`	Int?	0–100
`logPath`, `targetPath`	String?	Worker log / output paths
`error`	String?	Failure detail
`createdById`	String	Requesting user
`resultDatasetId`	String?	Produced dataset
`analysisId`, `analysisNodeId`	String?	Canvas placement target
`startedAt`, `finishedAt`	DateTime?	Execution window
`createdAt`, `updatedAt`	DateTime	Standard timestamps