Data Model

SeqDesk uses Prisma ORM with PostgreSQL. This page describes the core entities and their relationships.

Entity Relationship Overview

User ──< Order ──< Sample >── Study
 │                  │            │
 │                  ├── Read     ├── PipelineRun ──< PipelineRunStep
 │                  │            │
 │                  ├── Assembly ├── PipelineRunEvent
 │                  │            │
 │                  └── Bin      └── PipelineArtifact
 ├── Department
 └── AdminInvite

Core Models

User

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| email | String | Unique login identifier |
| password | String | bcrypt hash |
| firstName, lastName | String | Display name |
| role | String | RESEARCHER or FACILITY_ADMIN |
| researcherRole | String? | PI, POSTDOC, PHD_STUDENT, MASTER_STUDENT, TECHNICIAN, OTHER |
| institution | String? | Research institution |
| departmentId | String? | Foreign key to Department |

Order

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| orderNumber | String | Unique, format: ORD-YYYYMMDD-XXXX |
| name | String? | Descriptive name |
| status | String | DRAFT, SUBMITTED, or COMPLETED |
| platform | String? | Deprecated compatibility fallback for old/imported records |
| libraryStrategy | String? | WGS, RNA-Seq, AMPLICON, etc. |
| librarySource | String? | GENOMIC, METAGENOMIC, etc. |
| customFields | String? | JSON for form builder fields, including _sequencing_tech |
| userId | String | Foreign key to User |
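
The orderNumber format and the JSON-in-a-string customFields column can be illustrated with a minimal sketch. The helper names below are hypothetical and not part of SeqDesk. The sketch assumes the XXXX suffix is four uppercase letters or digits and that _sequencing_tech holds a plain string; neither detail is confirmed by this page.

```typescript
// Hypothetical helpers for illustration; not part of SeqDesk's API.

// Assumption: the XXXX suffix is four uppercase letters or digits.
const ORDER_NUMBER_PATTERN = /^ORD-(\d{4})(\d{2})(\d{2})-([A-Z0-9]{4})$/;

interface ParsedOrderNumber {
  year: number;
  month: number;
  day: number;
  suffix: string;
}

// Split an order number into its date parts and suffix, or return null
// when the string does not match the documented shape.
function parseOrderNumber(orderNumber: string): ParsedOrderNumber | null {
  const match = ORDER_NUMBER_PATTERN.exec(orderNumber);
  if (match === null) return null;
  return {
    year: Number(match[1]),
    month: Number(match[2]),
    day: Number(match[3]),
    suffix: match[4] ?? "",
  };
}

// Read _sequencing_tech out of the customFields JSON string.
// Assumption: the value is stored as a plain string.
function readSequencingTech(customFields: string | null): string | null {
  if (customFields === null) return null;
  try {
    const parsed: unknown = JSON.parse(customFields);
    if (typeof parsed !== "object" || parsed === null) return null;
    const value = (parsed as Record<string, unknown>)["_sequencing_tech"];
    return typeof value === "string" ? value : null;
  } catch {
    return null; // malformed JSON is treated as "no value"
  }
}
```

Because customFields is stored as a string rather than a native JSON column, callers have to parse it defensively, as the second helper does.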

Sample

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| sampleId | String | Internal ID (format: S-{timestamp}-{random}) |
| sampleAlias | String? | User-friendly alias |
| scientificName | String? | Taxonomic name |
| taxId | String? | NCBI taxonomy ID |
| checklistData | String? | JSON: MIxS metadata |
| checklistUnits | String? | JSON: units for metadata |
| customFields | String? | JSON: custom form fields |
| orderId | String | Foreign key to Order |
| studyId | String? | Foreign key to Study (optional) |

Study

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| title | String | Study title |
| alias | String? | Unique alias for ENA |
| checklistType | String? | MIxS checklist type |
| studyAccessionId | String? | ENA accession (PRJEB…) |
| submitted | Boolean | Whether submitted to ENA |
| readyForSubmission | Boolean | Marked ready by user |
| userId | String | Foreign key to User (owner) |

Read

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| file1 | String? | Forward reads path (R1) |
| file2 | String? | Reverse reads path (R2) |
| checksum1, checksum2 | String? | MD5 checksums |
| experimentAccessionNumber | String? | ENA experiment (ERX…) |
| runAccessionNumber | String? | ENA run (ERR…) |
| sampleId | String | Foreign key to Sample |
| sequencingRunId | String? | Foreign key to SequencingRun |
| pipelineRunId | String? | Foreign key to PipelineRun that produced this read (set null on delete) |
| pipelineSources | String? | JSON describing the upstream reads/pipeline a generated read came from |

Classification and QC

A Read carries an explicit data-class so the app can protect raw inputs and supersede them when a cleaning pipeline produces a derived set.

| Field | Type | Description |
| --- | --- | --- |
| dataClass | String | cleaned (default), raw, or unknown |
| dataClassSource | String | How the class was set: legacy_assumed_cleaned (default), associate, upload, sequencer_ingest, pipeline, manual |
| isActive | Boolean | true (default) for the live read; cleared when superseded |
| supersededByReadId | String? | Points at the read that replaced this one (e.g. a cleaned read superseding the raw input) |
| readCount1, readCount2 | Int? | Read counts per file |
| avgQuality1, avgQuality2 | Float? | Average base quality per file |
| fastqcReport1, fastqcReport2 | String? | Paths to per-file FastQC reports |
| classifiedAt | DateTime? | When the class was last set |
| classifiedById | String? | User who set the class, when manual |
| classificationNote | String? | Free-text note explaining the classification |

Indexes on (sampleId, isActive), dataClass, and supersededByReadId support the per-sample active-read lookups and the supersede chain.
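
The supersede chain can be pictured as a linked list walked through supersededByReadId. The sketch below is a hypothetical in-memory illustration, not SeqDesk code: ReadRow is a trimmed stand-in for the Read model, and resolveLiveRead is an invented name. It assumes the end of the chain is the read whose supersededByReadId is null.

```typescript
// Trimmed, hypothetical stand-in for the Read model.
interface ReadRow {
  id: string;
  dataClass: "cleaned" | "raw" | "unknown";
  isActive: boolean;
  supersededByReadId: string | null;
}

// Follow supersededByReadId from a starting read to the end of the chain.
// Returns null when the start id is unknown, a pointer dangles, or the
// chain loops (the step bound stops a cycle from spinning forever).
function resolveLiveRead(
  startId: string,
  readsById: Record<string, ReadRow>,
): ReadRow | null {
  const maxSteps = Object.keys(readsById).length;
  let current: ReadRow | undefined = readsById[startId];
  for (let step = 0; step < maxSteps && current !== undefined; step++) {
    if (current.supersededByReadId === null) return current;
    current = readsById[current.supersededByReadId];
  }
  return null;
}
```

In a real query the (sampleId, isActive) index would usually make this walk unnecessary, since the live read for a sample can be fetched directly; the walk is mainly useful for showing a read's lineage.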

PipelineRun

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| runNumber | String | Unique, format: {PIPELINE}-{DATE}-{NNN} |
| pipelineId | String | Pipeline identifier (e.g., mag) |
| status | String | pending, queued, running, completed, failed, cancelled |
| targetType | String | study (default) or order: which entity the run targets |
| progress | Int? | 0–100 |
| studyId | String? | Foreign key to Study (set when targetType is study) |
| orderId | String? | Foreign key to Order (set when targetType is order) |
| userId | String | Foreign key to User (who started it) |
| runFolder | String? | Path to run directory (set once execution is staged) |

Assembly

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| assemblyName | String? | Assembly identifier |
| assemblyFile | String? | Path to FASTA file |
| assemblyAccession | String? | ENA accession |
| sampleId | String | Foreign key to Sample |
| createdByPipelineRunId | String? | Foreign key to PipelineRun |

Bin

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| binName | String? | Bin identifier |
| binAccession | String? | ENA accession |
| binFile | String? | Path to bin FASTA |
| completeness | Float? | CheckM completeness (0–100) |
| contamination | Float? | CheckM contamination (0–100) |
| sampleId | String | Foreign key to Sample |
| createdByPipelineRunId | String? | Foreign key to PipelineRun |

Supporting Models

| Model | Purpose |
| --- | --- |
| Department | User grouping with name, description, isActive |
| AdminInvite | Invite codes for admin registration |
| StatusNote | Audit trail on orders (STATUS_CHANGE, SAMPLES_SENT, INTERNAL) |
| SequencingRun | Run-level metadata and QC metrics |
| PipelineConfig | Per-pipeline enabled flag and settings |
| PipelineRunStep | Individual process status within a run |

The remaining supporting models are documented in detail below. Each is either touched by automation, surfaced in a dashboard, or has structure that the summary table above doesn't capture.

Sampleset

Per-order metadata form configuration. One Sampleset per Order (orderId is unique).

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| checklists | JSON string | Array of enabled MIxS checklist accessions, e.g. ["ERC000022"] |
| selectedFields | JSON string? | Subset of checklist fields the order surfaces, when narrower than the full checklist |
| fieldOverrides | JSON string? | Per-field overrides keyed by field id, typically { label, required, helpText } to localize defaults |
| sampleType | Int | Discriminator (default 1 for standard biological samples) |
| orderId | String, unique | Foreign key to Order |

The combination of checklists, selectedFields, and fieldOverrides is how a facility tailors the MIxS metadata form per order without forking the underlying checklist definition.
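
One way to picture how selectedFields and fieldOverrides combine is the sketch below. It is a hypothetical illustration rather than SeqDesk's implementation: ChecklistField and buildSampleForm are invented names, and it assumes selectedFields is a JSON array of field ids and that a null selectedFields means "show the full checklist".

```typescript
// Hypothetical shape of one field in a MIxS checklist definition.
interface ChecklistField {
  id: string;
  label: string;
  required: boolean;
  helpText?: string;
}

// Per-field override, mirroring the { label, required, helpText } shape.
type FieldOverride = Partial<Pick<ChecklistField, "label" | "required" | "helpText">>;

// Narrow the checklist to the selected fields, then layer overrides on top.
// The checklist definition itself is never mutated.
function buildSampleForm(
  checklistFields: ChecklistField[],
  selectedFieldsJson: string | null,
  fieldOverridesJson: string | null,
): ChecklistField[] {
  // Assumption: selectedFields is an array of field ids.
  const selected: string[] | null =
    selectedFieldsJson === null ? null : (JSON.parse(selectedFieldsJson) as string[]);
  const overrides: Record<string, FieldOverride | undefined> =
    fieldOverridesJson === null
      ? {}
      : (JSON.parse(fieldOverridesJson) as Record<string, FieldOverride | undefined>);

  return checklistFields
    .filter((field) => selected === null || selected.indexOf(field.id) !== -1)
    .map((field) => ({ ...field, ...(overrides[field.id] ?? {}) }));
}
```

Because the overrides are applied to copies, two orders can present the same checklist with different labels or required flags.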

Submission

ENA submission tracking per entity. Drives the Submissions Dashboard.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| submissionType | String | STUDY, SAMPLE, READ, ASSEMBLY, BIN |
| status | String | One of PENDING, SUBMITTED, PARTIAL, ACCEPTED, REJECTED, ERROR, CANCELLED |
| xmlContent | String? | Generated XML SeqDesk sent to ENA (kept for debugging) |
| response | JSON string? | ENA response plus a steps[] timeline rendered by the dashboard |
| accessionNumbers | JSON string? | Map of returned accessions, e.g. { "studyAccession": "PRJEB12345" } |
| entityType | String | study or sample |
| entityId | String | FK into Study or Sample |
| createdAt, updatedAt | DateTime | Standard timestamps |

SequencingUpload

Tracks an in-progress chunked upload session for a sequencing artifact (reads or other artifacts). Created by POST /api/orders/[id]/sequencing/uploads and finalized by the matching complete endpoint.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| orderId | String | Foreign key to Order |
| sampleId | String? | Optional; when the upload is a per-sample artifact |
| targetKind | String | read or artifact |
| targetRole | String | For read uploads, R1 or R2 |
| originalName | String | Filename as submitted |
| tempPath | String | Server path receiving chunks |
| finalPath | String? | Promoted path once complete succeeds |
| expectedSize | BigInt | Bytes expected by initiate |
| receivedSize | BigInt | Bytes received so far |
| status | String | PENDING, COMPLETED, ABORTED |
| checksumProvided, checksumComputed | String? | Optional MD5 verification |
| mimeType | String? | MIME type if known |
| metadata | JSON string? | Caller-supplied metadata |
| createdById | String | User who initiated the upload |
| createdAt, updatedAt | DateTime | Standard timestamps |

Indexes on (orderId, status) and (sampleId, status) keep the upload list queries fast.

PipelineRunEvent

Event-feed entry from a pipeline run, populated by the weblog endpoint and internal step transitions.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| pipelineRunId | String | Foreign key to PipelineRun |
| eventType | String | run_started, process_completed, process_failed, etc.; values are open, not gate-kept |
| processName | String? | Nextflow process or SLURM step identifier |
| stepId | String? | SeqDesk-side step id |
| status | String? | RUNNING, COMPLETED, FAILED, … |
| message | String? | Human-readable detail |
| payload | JSON string? | Full event body, trimmed to a reasonable size |
| source | String? | weblog, trace, queue, process |
| occurredAt | DateTime | Indexed alongside pipelineRunId for timeline queries |

PipelineArtifact

Files produced by a pipeline run. Used by the assembly browser, bin viewer, and QC report previewer.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| type | String | reads, assembly, bins, qc_report, alignment |
| name | String? | Display name |
| path | String | Path relative to the configured data base path |
| checksum | String? | MD5 if computed |
| size | BigInt? | Bytes if known |
| outputId | String? | Manifest output id, when the artifact was resolved from a named package output (indexed) |
| studyId, sampleId | String? | Optional lineage links |
| pipelineRunId | String? | Foreign key to PipelineRun |
| producedByStepId | String? | Step that wrote the artifact |
| metadata | JSON string? | Tool-specific info (e.g., MultiQC summary stats) |
| createdAt | DateTime | Standard timestamp |

DemoWorkspace

Disposable demo session record. See Demo mode for the full lifecycle.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| tokenHash | String, unique | SHA-256 of the bootstrap token; raw token only lives in the browser cookie |
| userId | String, unique | Foreign key to User (the demo researcher) |
| adminUserId | String?, unique | Foreign key to User (demo facility admin) |
| seedVersion | Int | Bumped when the seed schema changes |
| lastSeenAt | DateTime | Updated on each request |
| expiresAt | DateTime, indexed | Cleanup query target |
| createdAt, updatedAt | DateTime | Standard timestamps |

SiteSettings

Singleton configuration row (id is always "singleton"). Stores the bulk of facility configuration; large or nested config is held in two JSON columns:

| Column | Shape |
| --- | --- |
| modulesConfig | { "modules": { "sequencing-tech": true, … } } (module enable map) |
| extraSettings | Free-form JSON for everything else: studyFormFields, studyFormGroups, pipelineExecution, telemetry, ena, sequencingTechConfig, etc. |
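
Reading the module enable map might look like the following minimal sketch. The function name isModuleEnabled is hypothetical, and treating a missing or malformed entry as "disabled" is an assumption made for the example, not documented SeqDesk behavior.

```typescript
// Hypothetical helper: check one module flag in the modulesConfig JSON string.
// Assumption: anything other than an explicit `true` counts as disabled.
function isModuleEnabled(modulesConfig: string | null, moduleId: string): boolean {
  if (modulesConfig === null) return false;
  try {
    const parsed: unknown = JSON.parse(modulesConfig);
    if (typeof parsed !== "object" || parsed === null) return false;
    const modules = (parsed as Record<string, unknown>)["modules"];
    if (typeof modules !== "object" || modules === null) return false;
    return (modules as Record<string, unknown>)[moduleId] === true;
  } catch {
    return false; // malformed JSON: fall back to disabled
  }
}
```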

OrderFormConfig

Single-row table holding the order form schema as configured in the Order Form Builder.

| Column | Shape |
| --- | --- |
| schema | JSON { fields: [...], groups: [...] } of the order form, surfaced by GET /api/form-schema |
| version | Int (default 1). Bumped on each save; returned verbatim as the schema version from the form-schema API |

The schema is merge-applied when configuration is applied, so a default form can be shipped without overwriting facility-local additions.
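
The exact merge rules are not spelled out here, so the following is only one plausible reading, sketched under stated assumptions: fields are matched by id, a facility-local field wins over a shipped field with the same id, and local-only fields are appended after the shipped ones. FormField and mergeFormFields are hypothetical names.

```typescript
// Hypothetical, trimmed field shape; the real schema carries more properties.
interface FormField {
  id: string;
  label: string;
}

// Merge shipped default fields with facility-local ones.
// Assumptions: match by id, local version wins, local-only fields go last.
function mergeFormFields(shipped: FormField[], local: FormField[]): FormField[] {
  const localById: Record<string, FormField | undefined> = {};
  for (const field of local) localById[field.id] = field;

  const shippedIds: Record<string, boolean> = {};
  const merged: FormField[] = [];
  for (const field of shipped) {
    shippedIds[field.id] = true;
    merged.push(localById[field.id] ?? field);
  }
  for (const field of local) {
    if (shippedIds[field.id] !== true) merged.push(field);
  }
  return merged;
}
```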

TicketMessage

Per-message row inside a Ticket thread. Tickets back the in-app messaging UI between researchers and facility admins.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| ticketId | String | Foreign key to Ticket |
| userId | String | Foreign key to User (the author) |
| content | String | Message body |
| createdAt | DateTime | Standard timestamp |

Read state is tracked on the parent Ticket, not per message: Ticket carries userReadAt / adminReadAt (plus lastUserMessageAt / lastAdminMessageAt) to drive unread indicators for each side.
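
A minimal sketch of how these four timestamps could drive the unread indicators, assuming a side has unread messages when the other side's latest message is newer than that side's last read time. The function names are hypothetical and the comparison rule is an assumption, not confirmed SeqDesk logic.

```typescript
// The four Ticket timestamps named above; all may be unset.
interface TicketReadState {
  userReadAt: Date | null;
  adminReadAt: Date | null;
  lastUserMessageAt: Date | null;
  lastAdminMessageAt: Date | null;
}

// True when a message exists and is newer than the reader's last read time.
function isNewerThan(messageAt: Date | null, readAt: Date | null): boolean {
  if (messageAt === null) return false; // nothing to read
  if (readAt === null) return true; // never opened the thread
  return messageAt.getTime() > readAt.getTime();
}

// The researcher sees an unread marker for admin messages they have not read.
function unreadForUser(ticket: TicketReadState): boolean {
  return isNewerThan(ticket.lastAdminMessageAt, ticket.userReadAt);
}

// The admin sees an unread marker for researcher messages they have not read.
function unreadForAdmin(ticket: TicketReadState): boolean {
  return isNewerThan(ticket.lastUserMessageAt, ticket.adminReadAt);
}
```

Keeping read state on the Ticket means marking a thread as read is a single-row update, regardless of how many messages it contains.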

PipelineResultSelection

Records the explicit “final result” a facility admin selected among the pipeline runs for one study/order target. Backs the run-selection endpoints.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| pipelineId | String | Pipeline whose runs are being selected from |
| targetKey | String | study:<id> or order:<id>; unique together with pipelineId |
| studyId, orderId | String? | Whichever target applies |
| selectedRunId | String, unique | The chosen PipelineRun |
| selectedById | String? | User who made the selection |
| selectedAt | DateTime | When it was selected |
| createdAt, updatedAt | DateTime | Standard timestamps |
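
The targetKey format can be built and parsed with a pair of small helpers. This is an illustrative sketch; RunTarget, toTargetKey, and parseTargetKey are hypothetical names, not SeqDesk functions.

```typescript
// A pipeline run target: either a study or an order.
interface RunTarget {
  kind: "study" | "order";
  id: string;
}

// Build the `study:<id>` / `order:<id>` key.
function toTargetKey(target: RunTarget): string {
  return `${target.kind}:${target.id}`;
}

// Parse a key back into its parts; null when the key is malformed.
function parseTargetKey(key: string): RunTarget | null {
  const separator = key.indexOf(":");
  if (separator <= 0) return null;
  const kind = key.slice(0, separator);
  const id = key.slice(separator + 1);
  if (id.length === 0) return null;
  if (kind !== "study" && kind !== "order") return null;
  return { kind, id };
}
```

Folding the target into a single string lets one composite unique constraint, on (pipelineId, targetKey), cover both study and order targets.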

SequencingArtifact

Facility-managed file produced or received during sequencing (raw reads, QC outputs, delivery bundles, etc.), distinct from PipelineArtifact.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| orderId | String | Foreign key to Order |
| sampleId | String? | Optional per-sample link |
| sequencingRunId | String? | Optional run link (set null on run delete) |
| stage | String | sample_receipt, sequencing, raw_reads, qc, delivery |
| artifactType | String | Caller-defined artifact type |
| source | String | How it was produced/received |
| visibility | String | facility (default); gates researcher access |
| path | String | Stored path |
| originalName | String | Filename as received |
| size | BigInt? | Bytes |
| checksum | String? | MD5 if computed |
| mimeType | String? | MIME type if known |
| metadata | JSON string? | Caller-supplied metadata |
| createdById | String? | Uploader |
| createdAt, updatedAt | DateTime | Standard timestamps |

SequencingRunSample

Join row assigning a sample (and barcode) to a SequencingRun. Written by the run-plan import and barcode assignment flows.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| sequencingRunId | String | Foreign key to SequencingRun |
| sampleId | String | Foreign key to Sample |
| barcode | String? | Demux barcode; unique within a run |
| customFields | JSON string? | Run-assignment custom fields |
| notes | String? | Free-text note |
| createdAt, updatedAt | DateTime | Standard timestamps |

Unique on (sequencingRunId, sampleId) and (sequencingRunId, barcode).
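
A run-plan import could check a batch against these two constraints before writing, so conflicts are reported up front instead of surfacing as database errors. The sketch below is a hypothetical pre-check, not SeqDesk's import code. It skips rows with a null barcode, matching PostgreSQL's default behavior of treating NULLs as distinct in unique indexes.

```typescript
// Trimmed, hypothetical stand-in for a SequencingRunSample row.
interface RunAssignment {
  sequencingRunId: string;
  sampleId: string;
  barcode: string | null;
}

// Return one entry per violated uniqueness key:
//   sample:<run>:<sample>   for duplicate (sequencingRunId, sampleId)
//   barcode:<run>:<barcode> for duplicate (sequencingRunId, barcode)
function findAssignmentConflicts(rows: RunAssignment[]): string[] {
  const seen: Record<string, boolean> = {};
  const conflicts: string[] = [];
  for (const row of rows) {
    const keys = [`sample:${row.sequencingRunId}:${row.sampleId}`];
    // Null barcodes never collide with each other.
    if (row.barcode !== null) {
      keys.push(`barcode:${row.sequencingRunId}:${row.barcode}`);
    }
    for (const key of keys) {
      if (seen[key] === true) conflicts.push(key);
      seen[key] = true;
    }
  }
  return conflicts;
}
```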

InAppNotification

Per-user in-app notification (e.g. a pipeline run reaching a terminal state).

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| userId | String | Recipient |
| eventType | String | Notification event type |
| severity | String | info (default), etc. |
| title | String | Headline |
| body | String? | Detail text |
| linkPath | String? | In-app link target |
| sourceType | String | Originating entity type |
| sourceId | String? | Originating entity id |
| dedupeKey | String, unique | Prevents duplicate notifications |
| readAt | DateTime? | When the user read it |
| archivedAt | DateTime? | When archived |
| createdAt, updatedAt | DateTime | Standard timestamps |

BackgroundWorkerProcess

Tracks a long-running background worker process (e.g. the stream monitor) for the Admin → Background Workers page.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| name | String | Worker name |
| pid | Int | OS process id |
| status | String | RUNNING (default), terminal states |
| startedAt | DateTime | Process start |
| startedById | String? | User who started it |
| stoppedAt | DateTime? | Process stop |
| exitCode | Int? | Exit code when stopped |
| logPath | String | Path to the worker log file |
| lastErrorMsg | String? | Last recorded error |

Live stream (MinKNOW ingest)

These models back the live ONT stream ingest subsystem, where the stream-monitor watches a MinKNOW output directory and ingests FASTQ files in real time. See the stream endpoints.

StreamRun

One live ingest session bound to an order and a watched output directory. Only one ACTIVE run may watch a given directory at a time.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| orderId | String | Foreign key to Order |
| minknowRunId | String? | MinKNOW run identifier |
| flowCellId, deviceId | String? | Device metadata |
| outputDir | String | Watched directory (validated under the configured root) |
| status | String | ACTIVE (default), then stopping/stopped states |
| totalBases | BigInt | Running total, default 0 |
| totalReads | Int | Running total, default 0 |
| barcodeMap | JSON string? | Barcode → sample mapping |
| startedAt | DateTime | Session start |
| lastSeenAt | DateTime | Updated on each ingest |
| stoppedAt | DateTime? | When stopped |
| monitorId | String? | Owning monitor process id |
| heartbeatAt | DateTime? | Monitor liveness timestamp |

StreamIngestedFile

One FASTQ file ingested into a stream run. The unique (streamRunId, filePath) constraint makes ingest idempotent: re-emitted filesystem events cannot double-count reads (the monitor upserts here before incrementing run totals).

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| streamRunId | String | Foreign key to StreamRun |
| sampleId | String? | Resolved sample (set null on delete) |
| filePath | String | Ingested file path; unique within a run |
| barcode | String? | Demux barcode |
| size | Int | Bytes, default 0 |
| reads | Int | Reads counted, default 0 |
| bases | BigInt | Bases counted, default 0 |
| ingestedAt | DateTime | When ingested |
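
The idempotency that the unique (streamRunId, filePath) constraint provides can be shown with an in-memory sketch. StreamRunLedger is a hypothetical class for one stream run, not the stream-monitor's code: it records a file first and only bumps the totals when the file is new. Totals use plain numbers for brevity, although the totalBases and bases columns are BigInt.

```typescript
// Trimmed, hypothetical stand-in for a StreamIngestedFile row.
interface IngestedFile {
  filePath: string;
  reads: number;
  bases: number; // BigInt in the schema; number here for brevity
}

// In-memory model of one stream run's ingest bookkeeping.
class StreamRunLedger {
  private files: Record<string, IngestedFile> = {};
  totalReads = 0;
  totalBases = 0;

  // Record a file and update totals. Returns false, leaving totals
  // untouched, when the same path was already ingested for this run.
  ingest(file: IngestedFile): boolean {
    if (Object.prototype.hasOwnProperty.call(this.files, file.filePath)) {
      return false; // duplicate filesystem event
    }
    this.files[file.filePath] = file;
    this.totalReads += file.reads;
    this.totalBases += file.bases;
    return true;
  }
}
```

In the database, the same effect comes from the unique constraint: a repeated event hits the existing row instead of creating a second one, so the totals are not incremented twice.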

StreamRunEvent

Append-only event log for a stream run, served via cursor pagination on the monotonic seq.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| streamRunId | String | Foreign key to StreamRun |
| seq | Int | Auto-increment sequence (pagination cursor) |
| ts | DateTime | Event time |
| kind | String | e.g. RUN_STARTED, FILE_INGESTED |
| payload | JSON string? | Event detail |
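
Cursor pagination on seq can be sketched over an in-memory array. The names below are hypothetical, and the response shape of the real stream endpoints is not shown on this page; the point is only that the client passes back the last seq it saw and receives events with a strictly greater seq.

```typescript
// Trimmed, hypothetical stand-in for a StreamRunEvent row.
interface StreamEventRow {
  seq: number;
  kind: string;
}

interface StreamEventPage {
  events: StreamEventRow[];
  nextCursor: number; // pass this as afterSeq on the next call
}

// Return up to `limit` events with seq greater than `afterSeq`, in seq order.
// When nothing is new, the cursor is returned unchanged so the caller can poll.
function pageAfter(
  all: StreamEventRow[],
  afterSeq: number,
  limit: number,
): StreamEventPage {
  const events = all
    .filter((event) => event.seq > afterSeq)
    .sort((a, b) => a.seq - b.seq)
    .slice(0, limit);
  const last = events.length > 0 ? events[events.length - 1] : undefined;
  return { events, nextCursor: last === undefined ? afterSeq : last.seq };
}
```

Because seq only increases and the log is append-only, a client that polls with its last cursor neither misses nor repeats events, even while new rows are being written.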

Workbench (data imports)

The Workbench subsystem lets a researcher import reference datasets into a private workspace and arrange them on an analysis canvas.

WorkbenchWorkspace

One private workspace per owner (ownerId is unique).

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| name | String | Default "Private Workbench" |
| ownerId | String, unique | Foreign key to User |
| isDefault | Boolean | Default true |
| createdAt, updatedAt | DateTime | Standard timestamps |

WorkbenchAnalysis

A canvas of imported datasets inside a workspace.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| workspaceId | String | Foreign key to WorkbenchWorkspace |
| name | String | Default "Untitled analysis" |
| description | String? | Optional |
| canvas | String | Serialized canvas state |
| revision | Int | Default 1 |
| isDefault | Boolean | Default false |
| createdAt, updatedAt | DateTime | Standard timestamps |

WorkbenchDataset

A cached reference dataset, deduplicated by cacheKey.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| providerId | String | Source provider |
| cacheKey | String, unique | Dedupe key |
| name | String | Display name |
| description | String? | Optional |
| sourceType | String | Provider source type |
| sourceMetadata | JSON string? | Provider metadata |
| storagePath | String? | Local storage path |
| sizeBytes | BigInt? | Bytes |
| checksumSha256 | String? | SHA-256 |
| genomeCount | Int? | Number of genomes |
| status | String | ready (default) |
| createdAt, updatedAt | DateTime | Standard timestamps |

WorkbenchWorkspaceDataset links datasets into workspaces, with a unique constraint on (workspaceId, datasetId), and optionally records the import job that created the link.

WorkbenchImportJob

Tracks an asynchronous dataset import.

| Field | Type | Description |
| --- | --- | --- |
| id | String | Primary key (CUID) |
| workspaceId | String | Foreign key to WorkbenchWorkspace |
| providerId | String | Source provider |
| status | String | queued (default), then running/terminal |
| phase | String? | Sub-phase label |
| request | String | Serialized import request |
| preview | String? | Serialized preview |
| progress | Int? | 0–100 |
| logPath, targetPath | String? | Worker log / output paths |
| error | String? | Failure detail |
| createdById | String | Requesting user |
| resultDatasetId | String? | Produced dataset |
| analysisId, analysisNodeId | String? | Canvas placement target |
| startedAt, finishedAt | DateTime? | Execution window |
| createdAt, updatedAt | DateTime | Standard timestamps |