Submitting to ENA
SeqDesk automates the submission process to the European Nucleotide Archive. This page covers the step-by-step submission workflow.
Submission Process
Prepare the study
Ensure all validation requirements are met — a non-empty title and description, at least one sample, and a tax ID on every sample.
Submit the study
The facility admin initiates submission from the study page or the ENA
Submissions page (/submissions). The system:
- Generates Study (PROJECT) XML with the alias, title, and description
- Generates Submission XML (the ADD action wrapper for ENA’s API)
- Sends a multipart POST to ENA’s submit endpoint
- Parses the receipt XML for the project accession
On success, the study receives a studyAccessionId (format: PRJEB...).
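The receipt-parsing step can be illustrated with a minimal sketch. It assumes a receipt shaped like the ones ENA returns for a project ADD (a RECEIPT root with a success attribute, a PROJECT element carrying the accession, and ERROR messages under MESSAGES). The function name, the returned keys, and the sample receipt are illustrative and are not SeqDesk's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical receipt used only for illustration.
SAMPLE_RECEIPT = """<RECEIPT receiptDate="2024-01-01T12:00:00.000Z" success="true">
  <PROJECT accession="PRJEB12345" alias="my-study-alias" status="PRIVATE"/>
  <SUBMISSION accession="ERA0000001" alias="my-submission"/>
  <MESSAGES>
    <INFO>Example informational message</INFO>
  </MESSAGES>
  <ACTIONS>ADD</ACTIONS>
</RECEIPT>"""


def parse_project_receipt(receipt_xml: str) -> dict:
    """Extract the success flag, project accession, and error messages."""
    root = ET.fromstring(receipt_xml)
    success = root.get("success") == "true"
    project = root.find("PROJECT")
    accession = project.get("accession") if project is not None else None
    errors = [(e.text or "").strip() for e in root.findall("./MESSAGES/ERROR")]
    return {
        "success": success,
        # Only trust the accession when ENA reports success.
        "study_accession": accession if success else None,
        "errors": errors,
    }


result = parse_project_receipt(SAMPLE_RECEIPT)
print(result["study_accession"])
```

Returning the error list alongside the accession keeps a failed receipt usable for debugging without a second parse.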
Submit samples
In the same request, after the study is registered, the samples are submitted
together as a single SAMPLE_SET:
- Sample XML is generated with each sample’s alias, title, TAXON_ID, optional scientific name, and MIxS metadata as sample attributes
- The set is POSTed to ENA
- sampleAccessionNumber (format: ERS...) is stored on each sample from the receipt
The XML registration path stores only sampleAccessionNumber. The BioSample
accession (SAMEA..., the biosampleNumber field) is written only by the
submg pipeline — see Submitting reads and assemblies with submg
below.
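As a rough illustration of how the SAMPLE_SET for this step might be assembled, the sketch below builds the XML with Python's standard library. The input dictionary keys (alias, title, tax_id, scientific_name, attributes) are hypothetical and only the element names come from the XML shown on this page.

```python
import xml.etree.ElementTree as ET


def build_sample_set(samples: list) -> str:
    """Build a SAMPLE_SET document from plain dictionaries (hypothetical shape)."""
    root = ET.Element("SAMPLE_SET")
    for s in samples:
        sample = ET.SubElement(root, "SAMPLE", alias=s["alias"])
        ET.SubElement(sample, "TITLE").text = s["title"]
        name = ET.SubElement(sample, "SAMPLE_NAME")
        ET.SubElement(name, "TAXON_ID").text = str(s["tax_id"])
        # The scientific name is optional, so it is only written when present.
        if s.get("scientific_name"):
            ET.SubElement(name, "SCIENTIFIC_NAME").text = s["scientific_name"]
        attributes = s.get("attributes") or {}
        if attributes:
            attrs_el = ET.SubElement(sample, "SAMPLE_ATTRIBUTES")
            for tag, value in attributes.items():
                attr = ET.SubElement(attrs_el, "SAMPLE_ATTRIBUTE")
                ET.SubElement(attr, "TAG").text = tag
                ET.SubElement(attr, "VALUE").text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = build_sample_set([
    {
        "alias": "HG001",
        "title": "Human Gut Sample 1",
        "tax_id": 408170,
        "scientific_name": "human gut metagenome",
        "attributes": {"geographic location (country and/or sea)": "Germany"},
    }
])
print(xml_text)
```

Building the tree with ElementTree rather than string concatenation means titles and attribute values are escaped automatically.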
Track accession numbers
The study accession and sample accessions are stored on the corresponding records and surfaced in the study detail page and the Submissions Dashboard.
XML Generation
SeqDesk generates ENA-compliant XML for each entity type:
Study XML
<PROJECT_SET>
  <PROJECT alias="my-study-alias">
    <TITLE>Study Title</TITLE>
    <DESCRIPTION>Study description...</DESCRIPTION>
    <SUBMISSION_PROJECT>
      <SEQUENCING_PROJECT/>
    </SUBMISSION_PROJECT>
  </PROJECT>
</PROJECT_SET>
Sample XML
<SAMPLE_SET>
  <SAMPLE alias="HG001">
    <TITLE>Human Gut Sample 1</TITLE>
    <SAMPLE_NAME>
      <TAXON_ID>408170</TAXON_ID>
      <SCIENTIFIC_NAME>human gut metagenome</SCIENTIFIC_NAME>
    </SAMPLE_NAME>
    <SAMPLE_ATTRIBUTES>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (country and/or sea)</TAG>
        <VALUE>Germany</VALUE>
      </SAMPLE_ATTRIBUTE>
      <!-- Additional MIxS fields -->
    </SAMPLE_ATTRIBUTES>
  </SAMPLE>
</SAMPLE_SET>
ENA API Endpoints
| Environment | URL |
|---|---|
| Test | https://wwwdev.ebi.ac.uk/ena/submit/drop-box/submit/ |
| Production | https://www.ebi.ac.uk/ena/submit/drop-box/submit/ |
Requests use HTTP Basic authentication with your Webin credentials and multipart form data containing the XML files.
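To make the request shape concrete, here is a minimal sketch that prepares (but does not send) such a request using only the Python standard library. The form field names SUBMISSION and PROJECT follow ENA's documented programmatic submission interface. The function name, the file names, and the credentials are placeholders, and SeqDesk's own HTTP client may differ.

```python
import base64
import urllib.request
import uuid

ENA_ENDPOINTS = {
    "test": "https://wwwdev.ebi.ac.uk/ena/submit/drop-box/submit/",
    "production": "https://www.ebi.ac.uk/ena/submit/drop-box/submit/",
}

# Submission XML wrapping the ADD action.
SUBMISSION_XML = "<SUBMISSION><ACTIONS><ACTION><ADD/></ACTION></ACTIONS></SUBMISSION>"


def build_ena_request(environment, username, password, xml_parts):
    """Prepare a multipart POST with Basic auth; xml_parts maps field name to XML text."""
    boundary = uuid.uuid4().hex
    chunks = []
    for field, xml in xml_parts.items():
        chunks.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; '
            f'filename="{field.lower()}.xml"\r\n'
            "Content-Type: application/xml\r\n\r\n"
            f"{xml}\r\n"
        )
    chunks.append(f"--{boundary}--\r\n")
    body = "".join(chunks).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        ENA_ENDPOINTS[environment],
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_ena_request(
    "test",
    "Webin-12345",            # placeholder username
    "not-a-real-password",    # placeholder password
    {"SUBMISSION": SUBMISSION_XML, "PROJECT": "<PROJECT_SET/>"},
)
print(request.full_url)
```

Passing the prepared request to urllib.request.urlopen would perform the actual POST; the sketch stops short of that so it can be run without touching ENA.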
Connection test
The Test Connection button in ENA settings does not register anything: it
sends a minimal multipart POST to the drop-box endpoint with a VALIDATE
action and a throwaway project. Any receipt, whether it reports
success="true" or success="false", means the credentials authenticated; a 401
means they did not. The username must have the Webin-12345 form (it is checked
against the ^Webin-\d+$ pattern) before the request is even sent.
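The two checks described here, the username pattern and the interpretation of the response, could look roughly like the sketch below. The function names and the returned labels are hypothetical; only the pattern and the success/401 rules come from the description above.

```python
import re

# Pattern quoted in the text above.
WEBIN_USERNAME = re.compile(r"^Webin-\d+$")


def is_valid_webin_username(username: str) -> bool:
    """Pre-flight check applied before any request is sent."""
    # fullmatch also rejects a trailing newline, which a plain match with $ would allow.
    return WEBIN_USERNAME.fullmatch(username) is not None


def interpret_connection_test(status_code: int, body: str) -> str:
    """Classify a connection-test response (labels are illustrative)."""
    if status_code == 401:
        return "invalid-credentials"
    # A receipt with either success value shows that ENA accepted the login.
    if 'success="true"' in body or 'success="false"' in body:
        return "authenticated"
    return "unknown"


print(is_valid_webin_username("Webin-12345"))
print(interpret_connection_test(200, '<RECEIPT success="false"></RECEIPT>'))
```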
Broker-account mode
If the install runs as an ENA broker (configured in ENA settings), the
registration path adds a center_name attribute — taken from the configured
center name — to the PROJECT and SAMPLE XML it generates, so records are
attributed to the brokering center. Broker mode applies to the XML
registration path; the submg pipeline below does not inject a center name.
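A minimal sketch of the broker-mode behaviour, assuming the center name is applied as a post-processing step on already generated XML (SeqDesk may instead set it while building the elements). The function name and the EXAMPLE_CENTER value are placeholders.

```python
import xml.etree.ElementTree as ET


def apply_center_name(xml_text: str, center_name: str) -> str:
    """Set center_name on every PROJECT and SAMPLE element, leaving others untouched."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag in ("PROJECT", "SAMPLE"):
            element.set("center_name", center_name)
    return ET.tostring(root, encoding="unicode")


brokered = apply_center_name(
    '<SAMPLE_SET><SAMPLE alias="HG001"><TITLE>Human Gut Sample 1</TITLE></SAMPLE></SAMPLE_SET>',
    "EXAMPLE_CENTER",  # placeholder for the configured center name
)
print(brokered)
```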
Submission Tracking
Each study registration creates one Submission record (the XML path creates
study submission records; it does not create separate per-sample records):
| Field | Description |
|---|---|
| submissionType | STUDY for this path |
| status | PENDING → ACCEPTED / PARTIAL / ERROR (an operator can set CANCELLED) |
| xmlContent | The combined study + sample + submission XML, kept for debugging |
| response | JSON wrapper: server, test flag, message, the steps[] timeline, and a nested receipt (with the raw receipt XML) and any errors |
| accessionNumbers | JSON map, e.g. { "study": "PRJEB12345", "<sampleId>": "ERS..." } |
There is no separate receiptXml, requestPayload, or error-message column —
the receipt and error text live inside the response JSON. See the
Submissions Dashboard for the
full record shape and the status lifecycle.
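To show how a record of this shape might be read, the sketch below pulls the accessions and errors out of a hypothetical Submission record. It assumes the response and accessionNumbers columns hold JSON strings and that the wrapper uses the key names server, test, message, steps, receipt, and errors. Those key names, the sample values, and the function name are assumptions and are not confirmed by this page.

```python
import json

# Hypothetical record; values are placeholders.
EXAMPLE_RECORD = {
    "submissionType": "STUDY",
    "status": "PARTIAL",
    "xmlContent": "<PROJECT_SET/>",
    "accessionNumbers": json.dumps({"study": "PRJEB12345", "sample-1": "ERS0000001"}),
    "response": json.dumps({
        "server": "https://wwwdev.ebi.ac.uk/ena/submit/drop-box/submit/",
        "test": True,
        "message": "example message",
        "steps": [],
        "receipt": {"xml": "<RECEIPT success=\"false\"/>"},
        "errors": ["sample-2: hypothetical validation error"],
    }),
}


def summarize_submission(record: dict) -> dict:
    """Split the accession map into study and samples and surface any errors."""
    accessions = json.loads(record.get("accessionNumbers") or "{}")
    response = json.loads(record.get("response") or "{}")
    study_accession = accessions.pop("study", None)
    return {
        "status": record.get("status"),
        "study_accession": study_accession,
        "sample_accessions": accessions,  # remaining keys are sample IDs
        "errors": response.get("errors") or [],
    }


print(summarize_submission(EXAMPLE_RECORD))
```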
Resubmitting
The XML path does not batch or retry samples individually. If a registration
fails or partially succeeds, fix the underlying records and re-run it from the
Submissions Dashboard — “Retry” simply re-POSTs /api/admin/submissions for the
same study, regenerating fresh XML. When a study already has an accession, the
re-run skips re-registering the project and submits only the samples that still
lack an accession.
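The re-run rule described above amounts to a small selection step, sketched here. The field names studyAccessionId and sampleAccessionNumber are the ones used on this page; the function name and the dictionary-based records are illustrative.

```python
def plan_resubmission(study: dict, samples: list) -> dict:
    """Decide what a retry needs to send to ENA."""
    # A study that already has an accession is not registered again.
    register_project = not study.get("studyAccessionId")
    # Only samples that still lack an accession are resubmitted.
    pending = [s for s in samples if not s.get("sampleAccessionNumber")]
    return {"register_project": register_project, "samples_to_submit": pending}


plan = plan_resubmission(
    {"studyAccessionId": "PRJEB12345"},
    [
        {"alias": "HG001", "sampleAccessionNumber": "ERS0000001"},
        {"alias": "HG002", "sampleAccessionNumber": None},
    ],
)
print(plan["register_project"], [s["alias"] for s in plan["samples_to_submit"]])
```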
Submitting reads and assemblies with submg
The XML path above registers the study and samples (metadata) only. To
submit the actual sequencing data — reads, the assembly, and optional bins —
SeqDesk runs the submg pipeline (“Submit to ENA”), which wraps the
submg CLI. It runs as a normal pipeline run:
it is visible to users but admin-triggered (userCanStart is false), can be
scoped to a whole study or a subset of samples, and writes accessions back to
the database when it finishes.
Prerequisites
Per sample, submg requires:
- Paired-end reads: each read must have both an R1 and an R2 FASTQ; submg rejects samples with no paired-end read files.
- MD5 checksums on both R1 and R2.
- An assembly FASTA that exists on disk (run the MAG pipeline first if a sample has no assembly).
- A study ENA accession (PRJ...): the study must already be registered via the XML path.
- Checklist metadata: collection date and geographic location (country and/or sea) are required; other checklist fields are passed through.
When the ENA target is the Test server, the study’s test registration must also be less than 24 hours old, because ENA expires test registrations.
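The prerequisites above can be read as a pre-flight check. The sketch below is one possible way to express them and is not the pipeline's actual validation code. The dictionary keys (reads, r1, r2, r1_md5, r2_md5, assembly_fasta, checklist, testRegisteredAt) are hypothetical, and the file-existence check is injectable so the function can be exercised without real files.

```python
import os
from datetime import datetime, timedelta, timezone

REQUIRED_CHECKLIST_FIELDS = (
    "collection date",
    "geographic location (country and/or sea)",
)


def check_submg_prerequisites(sample, study, target, now=None, file_exists=os.path.isfile):
    """Return a list of human-readable problems; an empty list means ready."""
    now = now or datetime.now(timezone.utc)
    problems = []
    reads = sample.get("reads") or []
    if not reads:
        problems.append("no paired-end read files")
    for index, read in enumerate(reads):
        if not (read.get("r1") and read.get("r2")):
            problems.append(f"read {index}: both R1 and R2 FASTQ files are required")
        if not (read.get("r1_md5") and read.get("r2_md5")):
            problems.append(f"read {index}: MD5 checksums are required on R1 and R2")
    assembly = sample.get("assembly_fasta")
    if not assembly or not file_exists(assembly):
        problems.append("assembly FASTA not found on disk")
    if not (study.get("studyAccessionId") or "").startswith("PRJ"):
        problems.append("study has no ENA accession (PRJ...)")
    checklist = sample.get("checklist") or {}
    for field in REQUIRED_CHECKLIST_FIELDS:
        if not checklist.get(field):
            problems.append(f"missing checklist field: {field}")
    if target == "test":
        # Test registrations expire, so the study must have been registered recently.
        registered_at = study.get("testRegisteredAt")
        if registered_at is None or now - registered_at >= timedelta(hours=24):
            problems.append("test registration is missing or older than 24 hours")
    return problems
```

Collecting every problem instead of stopping at the first one lets an operator fix a sample in a single pass.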
An assembly coverage value is resolved from sample or order custom fields
(coverage_depth, coverage_value, target_coverage, and similar keys). If
none is found it defaults to 1 and the run emits a warning so you know the
coverage was assumed rather than measured.
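The coverage fallback can be sketched as a lookup over candidate keys. Only the three key names listed above are used; the precedence (sample fields before order fields, keys in the listed order) and the function name are assumptions, and the real resolver accepts further similar keys that are not named here.

```python
import logging

logger = logging.getLogger(__name__)

# Keys named in the text; the real list includes other similar keys.
COVERAGE_KEYS = ("coverage_depth", "coverage_value", "target_coverage")


def resolve_coverage(sample_fields: dict, order_fields: dict):
    """Return (coverage, assumed); assumed is True when the default of 1 was used."""
    for fields in (sample_fields, order_fields):
        for key in COVERAGE_KEYS:
            raw = fields.get(key)
            if raw is None or raw == "":
                continue
            try:
                value = float(raw)
            except (TypeError, ValueError):
                continue  # ignore values that are not numeric
            if value > 0:
                return value, False
    logger.warning("No coverage value found; defaulting to 1 (assumed, not measured)")
    return 1.0, True


print(resolve_coverage({"coverage_depth": "35.5"}, {}))
```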
What it submits
submg builds a YAML config per sample and calls submg submit with
--submit_samples --submit_reads --submit_assembly (and --submit_bins when
bins are present and enabled). When it completes, SeqDesk parses the submission
logs and writes back:
- sampleAccessionNumber (ERS...) and biosampleNumber (SAMEA...) on each sample
- run and experiment accessions on each read
- the assembly accession (ERZ...)
- bin accessions when bins were submitted
This BioSample (SAMEA...) write-back is unique to the submg path.
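The flag selection described in this section can be sketched as follows. Only the subcommand and flags named above are used; any other arguments the CLI needs (for example, how the per-sample YAML config is passed) are not covered on this page and are deliberately left out, so the resulting list is not a complete command line. The function name is hypothetical.

```python
def build_submg_flags(has_bins: bool, bins_enabled: bool) -> list:
    """Assemble the submission flags named in the text (incomplete command)."""
    args = [
        "submg",
        "submit",
        "--submit_samples",
        "--submit_reads",
        "--submit_assembly",
    ]
    # Bins are optional: they are submitted only when present and enabled.
    if has_bins and bins_enabled:
        args.append("--submit_bins")
    return args


print(" ".join(build_submg_flags(has_bins=True, bins_enabled=True)))
```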
Test vs Production
| Feature | Test Mode | Production |
|---|---|---|
| Server | wwwdev.ebi.ac.uk | www.ebi.ac.uk |
| Data persistence | 24 hours | Permanent |
| Accession numbers | Valid format but temporary | Permanent public accessions |
| Purpose | Validation and testing | Real submissions |
Always validate with test mode before submitting to production. Test submissions expire after 24 hours and do not create permanent records.