Pipeline Runtime

Pipeline runtime settings control how Nextflow workflows are executed — locally or on a SLURM cluster — and which tools are available.

Configuration Location

Configure pipeline runtime settings at Admin → Pipeline Runtime (/admin/pipeline-runtime), in the config file, or through environment variables.

Execution Mode

Mode	Description	Best For
`local`	Run Nextflow directly on the server	Testing, small datasets
`slurm`	Submit jobs to a SLURM cluster	Production, large datasets

The global default maps to the SEQDESK_PIPELINE_MODE environment variable (config key pipelines.execution.mode, values local or slurm).

SeqDesk resolves the execution target for every run in this order:

1. Run override selected by a facility admin when starting the run
2. Per-pipeline override from Admin → Pipeline Runtime
3. Global runtime setting
4. Local default

Because the order ends in a local default, SLURM is optional. Keep the global mode set to local for small utilities, then set heavier pipelines such as MAG or MetaxPath to SLURM with per-pipeline overrides.
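The fallback order above can be sketched as a small shell function. This only illustrates the documented precedence and is not SeqDesk's implementation; the function name resolve_mode and its three positional arguments are hypothetical, and an empty argument stands for "not set" (or "inherit" for a per-pipeline override).

```shell
#!/usr/bin/env bash
# Hypothetical helper: illustrates the documented fallback order only.
# Arguments (any may be empty):
#   $1 run override, $2 per-pipeline override, $3 global runtime setting
resolve_mode() {
  local run_override="${1:-}" pipeline_override="${2:-}" global_mode="${3:-}"
  local candidate
  # The first non-empty value wins, checked from most to least specific.
  for candidate in "$run_override" "$pipeline_override" "$global_mode"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  # Nothing is configured at any level: fall back to the local default.
  printf 'local\n'
}

resolve_mode "" "slurm" "local"   # prints: slurm
resolve_mode "" "" ""             # prints: local
```

In the first call the per-pipeline override beats the global setting; in the second nothing is set, so the run resolves to local.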

Local Execution

Nextflow runs as a child process on the SeqDesk server. All compute happens on the same machine.

SLURM Execution

Nextflow submits individual processes as SLURM jobs. Configure:

Setting	Env Variable	Default	Description
SLURM Enabled	`SEQDESK_SLURM_ENABLED`	`false`	Enable SLURM
Queue	`SEQDESK_SLURM_QUEUE`	`default`	Partition name
Cores	`SEQDESK_SLURM_CORES`	`4`	CPUs per job
Memory	`SEQDESK_SLURM_MEMORY`	`16GB`	Memory per job
Time Limit	`SEQDESK_SLURM_TIME`	`24`	Hours per job
Additional Options	—	—	Extra SLURM flags

SLURM commands are only required for runs that resolve to slurm. Local runs do not call sbatch, squeue, or sacct.
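As a sketch, the same settings could be supplied through the environment variables listed in the table. The values are illustrative, the boolean spelling true is assumed to mirror the documented default false, and where the variables are defined (shell profile, service unit, or env file) depends on your deployment.

```shell
# Illustrative values only; adjust them to your cluster.
export SEQDESK_PIPELINE_MODE="local"   # global default: local or slurm
export SEQDESK_SLURM_ENABLED="true"    # assumed boolean spelling
export SEQDESK_SLURM_QUEUE="cpu"       # partition name
export SEQDESK_SLURM_CORES="8"         # CPUs per job
export SEQDESK_SLURM_MEMORY="64GB"     # memory per job
export SEQDESK_SLURM_TIME="12"         # hours per job
```

With this combination SLURM is available, but a run goes to the cluster only when a per-pipeline or per-run override resolves it to slurm.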

Per-Pipeline Overrides

Use Admin → Pipeline Runtime → Pipeline overrides to choose a default target per installed pipeline. The common production pattern is:

Pipeline	Mode	Reason
`simulate-reads`, `fastq-checksum`, `fastqc`	`local` or inherit	Small sequencing-order-level utilities
`mag`	`slurm`	Larger metagenomic assembly and binning workloads
`metaxpath`	`slurm`	Larger clinical metagenomics workflow

settings.json can express this policy:


```json
{
  "pipelines": {
    "enabled": true,
    "execution": {
      "mode": "local",
      "runDirectory": "/data/pipeline_runs",
      "slurm": {
        "enabled": true,
        "queue": "cpu",
        "cores": 8,
        "memory": "64GB",
        "timeLimit": 12
      },
      "pipelineOverrides": {
        "mag": {
          "mode": "slurm",
          "slurm": {
            "queue": "long",
            "cores": 16,
            "memory": "128GB",
            "timeLimit": 48
          }
        },
        "metaxpath": {
          "mode": "slurm",
          "slurm": {
            "queue": "long",
            "cores": 16,
            "memory": "128GB",
            "timeLimit": 48
          }
        }
      }
    }
  }
}
```

Admins can still override the target for a single run from the launch dialog. Each run records the resolved execution mode and a non-secret scheduler snapshot so later debugging does not depend on changed global settings.

Conda Configuration

Nextflow pipelines can use Conda for dependency management. Conda is disabled by default:

Setting	Env Variable	Default	Description
Conda Enabled	`SEQDESK_CONDA_ENABLED`	`false`	Use Conda
Conda Path	`SEQDESK_CONDA_PATH`	`/opt/conda`	Installation path
Conda Environment	`SEQDESK_CONDA_ENV`	—	Environment name

The Conda environment must have Nextflow installed or be configured to find it.
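One way to confirm that requirement by hand is to ask Conda to run Nextflow inside the configured environment. The sketch below is not part of SeqDesk: the function name nextflow_in_conda_env is hypothetical, and it assumes the conda binary sits at bin/conda under the configured Conda path.

```shell
#!/usr/bin/env bash
# Hypothetical check, not part of SeqDesk: does the configured Conda
# environment provide a working `nextflow` command?
nextflow_in_conda_env() {
  # Assumption: the conda executable lives at <Conda Path>/bin/conda.
  local conda_path="${SEQDESK_CONDA_PATH:-/opt/conda}"
  local env_name="${SEQDESK_CONDA_ENV:-}"
  if [ -z "$env_name" ]; then
    echo "SEQDESK_CONDA_ENV is not set" >&2
    return 2
  fi
  # `conda run -n <env> <command>` executes the command inside the named environment.
  "$conda_path/bin/conda" run -n "$env_name" nextflow -version
}
```

Call nextflow_in_conda_env from the account that runs SeqDesk; a non-zero exit status suggests that Nextflow is missing from the environment or that the path or environment name is wrong.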

Nextflow Settings

The Nextflow binary is resolved automatically — there is no “Nextflow path” admin field. Nextflow must be available on the runtime PATH (or inside the configured Conda environment).

Additional Nextflow configuration available in the admin UI:

Nextflow profile — profile name to use (e.g., conda, docker)

Reference-database download paths (GTDB-Tk and other pipeline databases) are not configured here. They are managed on the Pipelines page (/admin/settings/pipelines), where each pipeline’s databases can be downloaded and their status reviewed. The download directory can also be set with the SEQDESK_PIPELINE_DATABASE_DIR environment variable (config key pipelines.databaseDirectory).

Weblog Setup

Real-time pipeline monitoring requires configuring the Nextflow weblog:

Setting	Description
Weblog URL	Your SeqDesk URL + `/api/pipelines/weblog`
Weblog Secret	Optional authentication token

The weblog URL is automatically included in the Nextflow execution command. See Monitoring & DAG Visualization for details.
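The Weblog URL value is the instance URL with the endpoint path appended. A minimal sketch of building it, using the hypothetical helper name weblog_url and the placeholder host from this page:

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the value for the Weblog URL field.
weblog_url() {
  local base_url="$1"
  # Strip one trailing slash so the result never contains "//api".
  printf '%s/api/pipelines/weblog\n' "${base_url%/}"
}

weblog_url "https://your-seqdesk.example.org/"
# prints: https://your-seqdesk.example.org/api/pipelines/weblog
```

Nextflow itself accepts such a URL through its -with-weblog option; SeqDesk adds the URL to the command for you, so the helper is only useful for filling in the admin field.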

Pipeline Run Directory

Setting	Env Variable	Default	Description
Run Directory	`SEQDESK_PIPELINE_RUN_DIR`	`./pipeline_runs`	Where run outputs are stored

Each pipeline run creates a subdirectory (e.g., MAG-20240126-001/) containing the samplesheet, scripts, logs, and outputs.

Infrastructure Validation

The Admin → Infrastructure page (/admin/data-compute) validates that your runtime environment is correctly configured before you run any pipelines. It tests four areas:

Check	What It Tests
Data Path	The sequencing data directory exists and is accessible
Run Directory	The pipeline run directory exists and is writable
Conda	Conda is installed and the configured environment is available
Weblog	The Nextflow weblog endpoint is reachable

Each check shows a pass/fail indicator. Fix any failures before attempting to run a pipeline.

Import Setup JSON

The infrastructure page also supports importing a complete infrastructure configuration as JSON. You can:

Upload a JSON file or paste a configuration
Load an example to see the expected format
Validate (dry-run) before saving
Save to apply the configuration

This is useful for automating setup or copying configurations between instances. The importer accepts both flat keys (e.g., dataBasePath) and nested keys (e.g., site.dataBasePath), and it also recognizes alternate names for some keys.

Headless Runtime Smoke Test

On a dev or production-like Linux server, run the command-line smoke test from the SeqDesk application directory. It uses only HTTP/API calls and filesystem checks; no browser is required.


```shell
SEQDESK_RUNTIME_E2E_BASE_URL="https://your-seqdesk.example.org" \
SEQDESK_RUNTIME_E2E_EMAIL="admin@example.com" \
SEQDESK_RUNTIME_E2E_PASSWORD="admin-password" \
npm run pipeline:e2e:runtime -- --ensure-dummy-data
```

The default run:

Step	What it verifies
Login	Facility-admin API login works
Sequencing order selection	Uses the admin dummy-data sequencing order, or creates it with `--ensure-dummy-data`
Local run	`simulate-reads` starts locally and does not generate SLURM directives
SLURM run	`simulate-reads` submits through `sbatch`, records a numeric job ID, and writes SLURM logs
Output check	`pipeline.out` and `output/summary/simulation-summary.tsv` exist

Useful variants:


```shell
# Local-only check on a host without SLURM
npm run pipeline:e2e:runtime -- --skip-slurm --ensure-dummy-data

# SLURM-only check
npm run pipeline:e2e:runtime -- --skip-local --ensure-dummy-data

# Check the configured default policy for the selected pipeline
npm run pipeline:e2e:runtime -- --include-default-policy --expect-default-mode slurm

# Use a specific order instead of the dummy-data order
npm run pipeline:e2e:runtime -- --order-id <order-id>
```

For a full local + SLURM check, the host running the command must have sbatch, squeue, and sacct on its PATH and must be able to write to the configured pipeline run directory.
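These host requirements can be checked before starting the smoke test. The function below is a hypothetical preflight sketch, not part of SeqDesk; it assumes the run directory comes from SEQDESK_PIPELINE_RUN_DIR, falling back to the documented default ./pipeline_runs, unless a directory is passed as the first argument.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper, not part of SeqDesk.
# Returns 0 when the SLURM client commands are on PATH and the run
# directory exists and is writable; returns 1 otherwise.
check_slurm_host() {
  local run_dir="${1:-${SEQDESK_PIPELINE_RUN_DIR:-./pipeline_runs}}"
  local failed=0 cmd
  for cmd in sbatch squeue sacct; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing command: $cmd" >&2
      failed=1
    fi
  done
  if [ ! -d "$run_dir" ] || [ ! -w "$run_dir" ]; then
    echo "run directory missing or not writable: $run_dir" >&2
    failed=1
  fi
  return "$failed"
}
```

If check_slurm_host fails on a host, the --skip-slurm variant above still exercises the local path.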