Running a Pipeline

SeqDesk supports two launch contexts:

  • Study pipelines run across the selected samples of a study.
  • Order pipelines run on the linked sequencing files of samples in an order.

In both cases, SeqDesk prepares the package inputs automatically and executes the workflow either locally or on a SLURM cluster.

Common Prerequisites

Before running a pipeline:

  • Pipelines must be enabled in admin settings (pipelines.enabled: true)
  • The execution environment must be configured (local or SLURM)
  • Your account must have the FACILITY_ADMIN role

Some packages require linked reads. For example, MAG, FASTQ Checksum, FastQC, and Read Cleaning require FASTQ files to already be linked. Simulate Reads is the exception because it generates read files instead of consuming existing ones.
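The prerequisites above can be read as a pre-flight check. The sketch below is illustrative only: the `launch_blockers` helper, the `settings` dictionary layout, the `execution_mode` key, and the sample record shape are hypothetical. Only the `pipelines.enabled` setting, the `FACILITY_ADMIN` role, and the package names come from this page.

```python
# Packages that this page says need FASTQ files to be linked already.
REQUIRES_LINKED_READS = {"MAG", "FASTQ Checksum", "FastQC", "Read Cleaning"}


def launch_blockers(package, settings, user_roles, samples):
    """Return human-readable reasons why a launch would be refused.

    All argument shapes are assumptions made for this sketch.
    """
    blockers = []
    if not settings.get("pipelines", {}).get("enabled", False):
        blockers.append("pipelines are disabled in admin settings")
    if settings.get("execution_mode") not in ("local", "slurm"):
        blockers.append("no execution environment configured")
    if "FACILITY_ADMIN" not in user_roles:
        blockers.append("FACILITY_ADMIN role required")
    if package in REQUIRES_LINKED_READS:
        # Simulate Reads is not in the set, so it skips this check.
        missing = [s["id"] for s in samples if not s.get("reads")]
        if missing:
            blockers.append("samples without linked reads: " + ", ".join(missing))
    return blockers
```

An empty list means nothing in this sketch blocks the launch. Simulate Reads passes even when samples have no reads, which matches the exception described above.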

Launching a Study Pipeline

Study pipelines are the right choice for workflows that combine multiple samples into larger analyses, reports, or submission jobs.

Open the study

Navigate to the study that contains your samples. Go to the Pipelines tab.

Select a pipeline

Choose from the available pipelines (e.g., MAG). Each pipeline shows its description and requirements.

Configure parameters

Adjust pipeline-specific settings:

MAG Pipeline options:

| Parameter | Default | Description |
| --- | --- | --- |
| Stub Mode | false | Test mode: runs fast without actual analysis |
| Skip MEGAHIT | false | Skip the MEGAHIT assembler |
| Skip SPAdes | true | Skip the SPAdes assembler |
| Skip Prokka | true | Skip gene annotation |
| Skip CONCOCT | true | Skip CONCOCT binning |
| Skip Bin QC | false | Skip bin quality control |
| Skip GTDB-Tk | false | Skip taxonomy classification |
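To make the defaults concrete, here is a minimal sketch of merging user overrides onto them and rendering the enabled options as command-line flags. The option keys and the `--skip_*` spellings follow the naming style of the nf-core/mag workflow, and the stub option is mapped to Nextflow's `-stub-run`. Both mappings are assumptions, since this page does not state which flags SeqDesk actually emits. The `mag_cli_flags` helper is hypothetical.

```python
# Defaults copied from the MAG options table; the key names are assumptions.
MAG_DEFAULTS = {
    "stub_mode": False,
    "skip_megahit": False,
    "skip_spades": True,
    "skip_prokka": True,
    "skip_concoct": True,
    "skip_binqc": False,
    "skip_gtdbtk": False,
}


def mag_cli_flags(overrides=None):
    """Merge overrides onto the defaults and render enabled options as flags."""
    overrides = overrides or {}
    unknown = set(overrides) - set(MAG_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown MAG option(s): {sorted(unknown)}")
    options = {**MAG_DEFAULTS, **overrides}
    flags = []
    for name, enabled in options.items():
        if not enabled:
            continue
        # Assumption: stub mode maps to -stub-run, everything else to --<name>.
        flags.append("-stub-run" if name == "stub_mode" else f"--{name}")
    return flags
```

With no overrides, only the three options that default to true produce a flag.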

Select samples

Choose which samples from the study to include. All selected samples must have reads assigned.

Launch

Confirm and start the run. SeqDesk:

  1. Generates a samplesheet CSV from your samples and reads
  2. Creates a run directory (e.g., MAG-20240126-001/)
  3. Builds the Nextflow execution command
  4. Starts the pipeline (locally or via SLURM)
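The launch steps above can be sketched roughly as follows. This is not SeqDesk's implementation: the helper names are hypothetical, the `workflow` argument stands in for whatever pipeline reference SeqDesk resolves, and `--input` is assumed as the samplesheet parameter. The `-with-trace`, `-with-report`, `-with-timeline`, and `-with-dag` options are standard `nextflow run` options.

```python
import shlex
from pathlib import Path


def build_nextflow_command(workflow, run_dir, extra_flags=()):
    """Build a `nextflow run` argument list for one run directory."""
    run_dir = Path(run_dir)
    return [
        "nextflow", "run", workflow,
        "--input", str(run_dir / "samplesheet.csv"),   # assumed parameter name
        "--outdir", str(run_dir / "output"),
        "-with-trace", str(run_dir / "trace.txt"),
        "-with-report", str(run_dir / "report.html"),
        "-with-timeline", str(run_dir / "timeline.html"),
        "-with-dag", str(run_dir / "dag.dot"),
        *extra_flags,
    ]


def write_run_script(workflow, run_dir, extra_flags=()):
    """Create the run directory with a logs/ folder and write run.sh into it."""
    run_dir = Path(run_dir)
    (run_dir / "logs").mkdir(parents=True, exist_ok=True)
    command = shlex.join(build_nextflow_command(workflow, run_dir, extra_flags))
    script = run_dir / "run.sh"
    script.write_text("#!/usr/bin/env bash\nset -euo pipefail\n" + command + "\n")
    return script
```

Building the command as a list and quoting it with `shlex.join` keeps paths with spaces intact when the script is later run with `bash run.sh`.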

Launching an Order Pipeline

Order pipelines are the right choice for sample-level sequencing utilities such as read simulation, checksum validation, and read QC.

Open the order

Navigate to the order you want to work on. Use the sequencing or pipeline area for that order, depending on the package and your current workflow.

Review linked sequencing files

Check whether the samples already have linked FASTQ files. This is required for packages such as FASTQ Checksum and FastQC. If the order has no reads yet, you can start with Simulate Reads.

Select an order pipeline

Choose the package you want to run for that order. The current built-in order catalog includes Simulate Reads, FASTQ Checksum, FastQC, and Read Cleaning.

Configure parameters

Order pipelines typically have narrower configuration than study pipelines. Examples:

| Pipeline | Example parameters |
| --- | --- |
| Simulate Reads | Mode, read count, read length, replace existing files |
| FASTQ Checksum | Usually no additional configuration |
| FastQC | Usually no additional configuration |

Launch

Confirm and start the run. SeqDesk:

  1. Generates the package inputs from order samples and linked reads
  2. Creates a run directory for the package
  3. Builds the Nextflow execution command
  4. Starts the pipeline and tracks the run
  5. Resolves artifacts and performs the validated Read writeback after completion

Run Number Format

Each run gets a unique number: {PIPELINE}-{YYYYMMDD}-{NNN} (e.g., MAG-20240126-001).
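A minimal sketch of formatting and parsing this run number format. The function names are hypothetical, and the parser assumes the sequence part has at least three digits and that the pipeline name itself may contain hyphens.

```python
import re
from datetime import date, datetime

# {PIPELINE}-{YYYYMMDD}-{NNN}; the greedy pipeline group tolerates hyphens.
RUN_NUMBER_RE = re.compile(r"^(?P<pipeline>.+)-(?P<date>\d{8})-(?P<seq>\d{3,})$")


def format_run_number(pipeline, day, sequence):
    """Render a run number such as MAG-20240126-001."""
    return f"{pipeline}-{day:%Y%m%d}-{sequence:03d}"


def parse_run_number(run_number):
    """Split a run number into (pipeline, date, sequence)."""
    match = RUN_NUMBER_RE.match(run_number)
    if match is None:
        raise ValueError(f"not a run number: {run_number!r}")
    day = datetime.strptime(match["date"], "%Y%m%d").date()
    return match["pipeline"], day, int(match["seq"])
```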

Run Lifecycle

A run moves through these statuses:

pending → queued → running → completed
                      └─→ failed
                      └─→ cancelled

| Status | Meaning |
| --- | --- |
| pending | Run record created, not yet launched |
| queued | Submitted to SLURM, waiting for resources (SLURM mode) |
| running | The workflow is executing |
| completed | Finished successfully and outputs resolved |
| failed | Pre-launch validation, prep, or execution failed |
| cancelled | Manually cancelled |
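The lifecycle can be modelled as a small state machine. The transition table below is an interpretation, not SeqDesk's actual rules: it assumes local runs go straight from pending to running (queued being SLURM-only), that a run can fail or be cancelled from any non-terminal status, and that completed, failed, and cancelled are terminal. The `advance` helper is hypothetical.

```python
# Assumed transition table; only the status names come from this page.
TRANSITIONS = {
    "pending": {"queued", "running", "failed", "cancelled"},
    "queued": {"running", "failed", "cancelled"},
    "running": {"completed", "failed", "cancelled"},
    "completed": set(),
    "failed": set(),
    "cancelled": set(),
}

# Statuses with no outgoing transitions.
TERMINAL = {status for status, nxt in TRANSITIONS.items() if not nxt}


def advance(current, new):
    """Return the new status, or raise if the transition is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```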

Key behaviors:

  • Pre-launch validation. Before a run is launched, SeqDesk validates pipeline metadata and the derived launch config. If validation or input preparation fails, the run is moved to failed and never launched.
  • Single launch. Launching atomically claims a pending run. A second start request for an already-claimed run is rejected with 409 Conflict, so a run can never be double-started.
  • Local execution. Local runs are started as a detached bash run.sh process and tracked via queueJobId = local-{pid}.
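The single-launch guarantee is a compare-and-set on the run status. The sketch below shows the idea with an in-memory store guarded by a lock. The `RunStore` class, the `handle_start_request` helper, and the `202 Accepted` success code are hypothetical, and the sketch moves the run directly to running to stay short. Only the `409 Conflict` response for an already-claimed run comes from this page.

```python
import threading
from http import HTTPStatus


class RunStore:
    """In-memory stand-in for the run table (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}

    def create(self, run_id):
        with self._lock:
            self._status[run_id] = "pending"

    def claim_for_launch(self, run_id):
        """Atomically move pending -> running; succeeds at most once per run."""
        with self._lock:
            if self._status.get(run_id) != "pending":
                return False
            self._status[run_id] = "running"
            return True


def handle_start_request(store, run_id):
    """Return the HTTP status a start endpoint might send (202 is assumed)."""
    if store.claim_for_launch(run_id):
        return HTTPStatus.ACCEPTED
    return HTTPStatus.CONFLICT
```

In a database-backed service the same effect is usually achieved with a conditional update that only matches rows still in the pending status, so that exactly one concurrent request sees a row change.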

Input Generation

SeqDesk auto-generates the package inputs that Nextflow expects. The exact file shape depends on the package, but the source data always comes from canonical SeqDesk records.

| Scope | Typical generated input |
| --- | --- |
| Study pipeline | A study-level samplesheet built from selected samples and their reads |
| Order pipeline | A samplesheet or manifest generated from order samples and linked reads |

For the MAG pipeline, each row contains:

| Column | Source |
| --- | --- |
| sample | Sample ID (sample.sampleId) |
| group | Study ID (study.id); used for co-assembly grouping |
| short_reads_1 | Path to R1 FASTQ file |
| short_reads_2 | Path to R2 FASTQ file |
| short_reads_platform | Required; derived from the sequencing-technology selector (mapped to ILLUMINA/DNBSEQ/OXFORD/PACBIO…) |

The generated input is saved in the run directory as samplesheet.csv (or a package-specific manifest file).
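As an illustration of the MAG samplesheet layout, the sketch below writes the five columns with Python's `csv` module. The column names come from the table above. The `build_mag_samplesheet` helper and the keys of the sample dictionaries (`sample_id`, `r1`, `r2`, `platform`) are hypothetical and do not reflect SeqDesk's internal record shape.

```python
import csv
import io

MAG_COLUMNS = [
    "sample", "group", "short_reads_1", "short_reads_2", "short_reads_platform",
]


def build_mag_samplesheet(study_id, samples):
    """Return samplesheet CSV text, one row per sample."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=MAG_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        if not sample.get("platform"):
            # short_reads_platform is required, so refuse to write a blank value.
            raise ValueError(f"sample {sample['sample_id']} has no platform")
        writer.writerow({
            "sample": sample["sample_id"],
            "group": study_id,                       # shared group for co-assembly
            "short_reads_1": sample["r1"],
            "short_reads_2": sample.get("r2", ""),   # empty if no R2 file
            "short_reads_platform": sample["platform"],
        })
    return buffer.getvalue()
```

Because every row carries the same study ID in `group`, all samples selected from one study fall into a single co-assembly group.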

Execution Modes

Local

Nextflow runs directly on the SeqDesk server. Suitable for testing and small datasets.

SLURM

Nextflow submits jobs to a SLURM cluster. Configure in admin settings:

| Setting | Default | Description |
| --- | --- | --- |
| Queue | cpu | SLURM partition name |
| Cores | 4 | CPUs per job |
| Memory | 64GB | Memory per job |
| Time Limit | 12h | Maximum run time |
| Additional Options | | Extra SLURM flags |
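One plausible way these settings reach Nextflow is through a `process` scope in the generated config. The sketch below renders such a fragment using the standard Nextflow process directives `executor`, `queue`, `cpus`, `memory`, `time`, and `clusterOptions`. Whether SeqDesk writes exactly this is an assumption, and the `render_slurm_config` helper is hypothetical.

```python
def render_slurm_config(queue="cpu", cores=4, memory="64GB",
                        time_limit="12h", additional_options=""):
    """Render a Nextflow `process` scope that targets the SLURM executor."""

    def quote(value):
        # Escape backslashes and single quotes for a single-quoted Groovy string.
        return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"

    lines = [
        "process {",
        "    executor = 'slurm'",
        f"    queue = {quote(queue)}",
        f"    cpus = {int(cores)}",
        f"    memory = {quote(memory)}",
        f"    time = {quote(time_limit)}",
    ]
    if additional_options.strip():
        # Extra SLURM flags are passed through unchanged.
        lines.append(f"    clusterOptions = {quote(additional_options.strip())}")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Calling the function with no arguments reproduces the defaults from the table, and the `clusterOptions` line is omitted when no additional options are set.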

The SLURM job ID is tracked in the queueJobId field for status monitoring. For local runs, queueJobId is set to local-{pid} (the detached process ID).
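A monitor can use the `local-` prefix to tell the two kinds of `queueJobId` apart. The sketch below parses the field and picks a status command accordingly. The helper names are hypothetical, and the choice of `squeue` and `ps` as status commands is an assumption about how a monitor might poll, not a description of SeqDesk's monitor. The commands are only built here, not executed.

```python
LOCAL_PREFIX = "local-"


def parse_queue_job_id(queue_job_id):
    """Return ("local", pid) or ("slurm", job_id) for a queueJobId value."""
    if not queue_job_id:
        raise ValueError("empty queueJobId")
    if queue_job_id.startswith(LOCAL_PREFIX):
        pid = queue_job_id[len(LOCAL_PREFIX):]
        if not pid.isdigit():
            raise ValueError(f"invalid local queueJobId: {queue_job_id!r}")
        return "local", int(pid)
    return "slurm", queue_job_id


def status_command(queue_job_id):
    """Build (but do not run) a command that reports whether the job is alive."""
    mode, ident = parse_queue_job_id(queue_job_id)
    if mode == "local":
        # POSIX ps: prints the PID if the process still exists.
        return ["ps", "-p", str(ident), "-o", "pid="]
    # squeue: print only the job state, without a header line.
    return ["squeue", "--jobs", ident, "--noheader", "--format", "%T"]
```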

Run Directory Structure

Each run creates a directory under the configured pipelineRunDir. The exact outputs differ by package, but the common execution files are:

{PIPELINE}-{YYYYMMDD}-{NNN}/
├── run.sh              # Generated launch script (run with `bash run.sh`)
├── samplesheet.csv     # Or another generated package input
├── nextflow.config     # Generated Nextflow config (when one is produced)
├── output/             # Pipeline outputs directory (Nextflow --outdir)
├── logs/
│   ├── pipeline.out    # stdout
│   └── pipeline.err    # stderr
├── trace.txt           # Process trace (TSV)
├── dag.dot             # Workflow DAG (Graphviz)
├── report.html         # Nextflow execution report
├── timeline.html       # Nextflow timeline report
└── ...                 # Package-specific outputs and artifacts

In SLURM mode the scheduler also writes its own logs/slurm-%j.out and logs/slurm-%j.err files, where %j is the SLURM job ID. The pipeline monitor reads logs/pipeline.out and logs/pipeline.err.
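When debugging a run by hand, it can help to check which of the common files exist and to look at the end of the stderr log. The sketch below does that for one run directory. The `summarize_run_dir` helper is hypothetical, and the file list simply mirrors the common execution files described in this section.

```python
from pathlib import Path

# Common execution files, relative to the run directory.
COMMON_FILES = [
    "run.sh",
    "samplesheet.csv",
    "nextflow.config",
    "logs/pipeline.out",
    "logs/pipeline.err",
    "trace.txt",
    "dag.dot",
    "report.html",
    "timeline.html",
]


def summarize_run_dir(run_dir, tail_lines=20):
    """Return (presence map, last lines of logs/pipeline.err)."""
    run_dir = Path(run_dir)
    present = {name: (run_dir / name).is_file() for name in COMMON_FILES}
    err_path = run_dir / "logs" / "pipeline.err"
    tail = []
    if err_path.is_file():
        # Tolerate odd bytes in tool output rather than failing on decode.
        tail = err_path.read_text(errors="replace").splitlines()[-tail_lines:]
    return present, tail
```

Some entries may legitimately be absent, for example `nextflow.config` when none is produced or `samplesheet.csv` when the package uses a different manifest.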