Samples¶
All sample operations are accessed via Client.samples.
List sample types, metadata attributes, organisms, and projects:
sample_types = client.samples.get_types()
attributes = client.samples.get_metadata_attributes()
organisms = client.samples.get_organisms()
projects = client.samples.get_owned_projects()
Upload with metadata, project, and organism:
from pathlib import Path
sample = client.samples.upload_sample(
    name="Paired-end Sample",
    sample_type="RNA-Seq",
    data={
        "reads1": Path("R1.fastq.gz"),
        "reads2": Path("R2.fastq.gz"),
    },
    metadata={"strandedness": "reverse"},
    project_id="proj_123",
    organism_id="org_456",
)
Uploading Samples¶
Use upload_sample() to upload
demultiplexed (individual) samples. Each call creates one sample with one
or more data files:
from pathlib import Path
sample = client.samples.upload_sample(
    name="My RNA-Seq Sample",
    sample_type="RNA-Seq",
    data={"reads1": Path("reads_R1.fastq.gz")},
    metadata={"strandedness": "forward"},
)
For paired-end samples, provide both reads1 and reads2:
sample = client.samples.upload_sample(
    name="Paired-end Sample",
    sample_type="RNA-Seq",
    data={
        "reads1": Path("R1.fastq.gz"),
        "reads2": Path("R2.fastq.gz"),
    },
)
See upload_sample() for the full
list of parameters including metadata, project, and organism options.
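When many demultiplexed samples sit in one directory, it can help to build the data mappings before calling upload_sample(). The following is a minimal sketch, not part of flowbio: the helper name group_reads and the file naming convention (<sample>_R1.fastq.gz and <sample>_R2.fastq.gz) are assumptions made for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
_READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq\.gz$")


def group_reads(paths):
    """Group FASTQ paths into {sample_name: {"reads1": ..., "reads2": ...}}.

    Hypothetical helper: files that do not match the assumed naming
    convention are skipped.
    """
    grouped = defaultdict(dict)
    for path in sorted(paths):
        match = _READ_PATTERN.match(Path(path).name)
        if match is None:
            continue  # not an R1/R2 FASTQ under the assumed convention
        grouped[match["sample"]][f"reads{match['read']}"] = Path(path)
    # reads2 without reads1 is rejected on upload, so drop such groups here
    return {name: data for name, data in grouped.items() if "reads1" in data}
```

Each value in the returned dictionary has the shape expected by the data argument, and each key could serve as the sample name.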
Metadata attributes¶
Samples can have metadata attributes attached to them. Some attributes are
required for all samples, while others are only required for specific sample
types. Use get_metadata_attributes()
to discover what’s available.
Each MetadataAttribute tells you:
Whether it’s universally required (required=True)
Whether it’s required for specific sample types (required_for_sample_types lists the sample type identifiers)
Whether it has a fixed set of valid values (options) or accepts any free-text value (options=None)
Discover which attributes are required for a given sample type:
attributes = client.samples.get_metadata_attributes()
for attr in attributes:
    is_required = attr.required or "RNA-Seq" in attr.required_for_sample_types
    if is_required:
        if attr.options is not None:
            print(f"{attr.name} (required): choose from {attr.options}")
        else:
            print(f"{attr.name} (required): any value accepted")
Use the attribute identifiers as keys when uploading:
from pathlib import Path
attributes = client.samples.get_metadata_attributes()
sample = client.samples.upload_sample(
    name="My Sample",
    sample_type="RNA-Seq",
    data={"reads1": Path("reads.fastq.gz")},
    metadata={
        attributes[0].identifier: "some value",
        attributes[1].identifier: attributes[1].options[0],
    },
)
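The rules described in this section can also be checked locally before uploading. This is a rough sketch under stated assumptions: check_metadata and AttributeStub are hypothetical names, the stub only mirrors the fields documented above (identifier, required, required_for_sample_types, options), and treating unknown keys as a problem is a guess. The server remains the authority on what is valid.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AttributeStub:
    """Stand-in exposing the MetadataAttribute fields described above."""
    identifier: str
    required: bool = False
    required_for_sample_types: list = field(default_factory=list)
    options: Optional[list] = None


def check_metadata(attributes, sample_type, metadata):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    known = {attr.identifier: attr for attr in attributes}
    # Required attributes: universal, or required for this sample type
    for attr in attributes:
        needed = attr.required or sample_type in attr.required_for_sample_types
        if needed and attr.identifier not in metadata:
            problems.append(f"missing required attribute: {attr.identifier}")
    # Supplied values: key must be known, value must be one of the options
    for key, value in metadata.items():
        attr = known.get(key)
        if attr is None:
            problems.append(f"unknown attribute: {key}")  # assumption
        elif attr.options is not None and value not in attr.options:
            problems.append(f"invalid value for {key}: {value!r}")
    return problems
```

Objects returned by get_metadata_attributes() could be passed in place of the stubs, provided they expose the same fields.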
Multiplexed uploads¶
Multiplexed uploads let you upload reads that contain data from multiple samples in a single file, along with an annotation sheet that describes how the reads should be demultiplexed.
The workflow has three steps:
Download an annotation template for your sample type
Fill in the template with one row per sample (names, metadata, etc.)
Upload the reads and annotation together
Download a template¶
Use get_annotation_template() to
get a pre-formatted spreadsheet for your sample type:
from pathlib import Path
template = client.samples.get_annotation_template("rna_seq")
Path("annotation_template.xlsx").write_bytes(template)
Open the file in a spreadsheet editor, fill in one row per sample, and save it.
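If a spreadsheet editor refuses to open the downloaded file, a quick structural check can tell a real workbook from, say, an error page saved by mistake. The sketch below relies only on the fact that an .xlsx file is a ZIP package containing a [Content_Types].xml entry; the helper name looks_like_xlsx is hypothetical and not part of flowbio.

```python
import io
import zipfile


def looks_like_xlsx(content: bytes) -> bool:
    """Return True if the bytes form a ZIP package with [Content_Types].xml."""
    buffer = io.BytesIO(content)
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as archive:
        # Every Office Open XML package carries this entry at its root
        return "[Content_Types].xml" in archive.namelist()
```

The check could be applied to the bytes returned by get_annotation_template() before they are written to disk.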
Upload reads and annotation¶
Pass the reads files and the completed annotation sheet to
upload_multiplexed_data():
from pathlib import Path
result = client.samples.upload_multiplexed_data(
    reads={"reads1": Path("multiplexed_R1.fastq.gz")},
    annotation=Path("annotation.xlsx"),
)
print(f"Data IDs: {result.data_ids}")
print(f"Annotation ID: {result.annotation_id}")
For paired-end multiplexed reads, provide both reads1 and reads2:
result = client.samples.upload_multiplexed_data(
    reads={
        "reads1": Path("multiplexed_R1.fastq.gz"),
        "reads2": Path("multiplexed_R2.fastq.gz"),
    },
    annotation=Path("annotation.xlsx"),
)
Handling annotation warnings¶
The server validates the annotation sheet before uploading reads. If
there are hard validation errors, an
AnnotationValidationError is raised
and no reads are uploaded.
Some issues produce warnings rather than errors (e.g. unusual but valid values). By default, warnings are automatically accepted and included in the result for inspection:
result = client.samples.upload_multiplexed_data(
    reads={"reads1": Path("multiplexed_R1.fastq.gz")},
    annotation=Path("annotation.xlsx"),
)
if result.warnings:
    for warning in result.warnings:
        print(f"Warning: {warning}")
To reject the upload when warnings are present, set
ignore_warnings=False:
result = client.samples.upload_multiplexed_data(
    reads={"reads1": Path("multiplexed_R1.fastq.gz")},
    annotation=Path("annotation.xlsx"),
    ignore_warnings=False,
)
This raises AnnotationValidationError
if the annotation has any warnings.
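One way to combine the two modes is to try a strict upload first and only fall back to accepting warnings after a human has looked at them. The wrapper below is a sketch under stated assumptions: upload_with_review is a hypothetical helper, and because this page does not show where AnnotationValidationError is imported from, the exception class is passed in as an argument rather than imported.

```python
def upload_with_review(upload, error_type, confirm):
    """Try a strict upload first; on rejection, ask before retrying leniently.

    upload: callable accepting ignore_warnings=... that performs the upload
    error_type: exception class raised for annotation problems
    confirm: callable receiving the exception, returning True to retry
    """
    try:
        return upload(ignore_warnings=False)
    except error_type as error:
        if not confirm(error):
            raise
        # If this second attempt is also rejected, the sheet has hard errors
        return upload(ignore_warnings=True)
```

The upload argument could be built with functools.partial(client.samples.upload_multiplexed_data, reads=..., annotation=...), and confirm could print the error and prompt the user.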
API Reference¶
- class flowbio.v2.samples.SampleResource(transport, config)¶
Provides access to sample-related API endpoints.
Accessed via Client.samples:
client = Client()
sample_types = client.samples.get_types()
- upload_sample(name, sample_type, data, metadata=None, project_id=None, organism_id=None)¶
Upload a sample with one or more files.
Multiple files are linked together into a single sample. Chunk size and progress display are controlled via flowbio.v2.ClientConfig.
Requires authentication.
Example:
from pathlib import Path
result = client.samples.upload_sample(
    name="My RNA-Seq Sample",
    sample_type="RNA-Seq",
    data={"reads1": Path("reads_R1.fastq.gz")},
    metadata={"strandedness": "forward"},
)
print(f"Sample ID: {result.sample_id}")
- Parameters:
name (str) – The name of the sample.
sample_type (str) – The sample type identifier (e.g. "RNA-Seq"). This must be a valid sample type specified in the Flow application. You can get valid sample types from get_types().
data – A mapping of data type identifiers to file paths. For sequencing samples, use reads1 and optionally reads2; these are the only valid reads keys, and reads1 is always uploaded first:
# Single-end
{"reads1": Path("sample.fastq.gz")}
# Paired-end
{"reads1": Path("R1.fastq.gz"), "reads2": Path("R2.fastq.gz")}
For non-sequencing sample types, any key names are accepted and files are uploaded in the order given:
{"input": Path("counts.csv")}
metadata (dict[str, str] | None) – Optional metadata key-value pairs. See Metadata attributes for details on required attributes.
project_id (str | None) – Optional project ID to assign the sample to. This must be a project you own. You can see available projects by calling get_owned_projects().
organism_id (str | None) – Optional organism ID to associate with the sample.
- Raises:
ValueError – If reads keys are invalid (e.g. reads3) or reads2 is provided without reads1.
FlowApiError – If any of the data is invalid, e.g. the sample_type doesn’t exist or required metadata attributes are missing.
- Return type:
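The key rules for the data mapping described above can be summarised in a few lines. This is an illustrative sketch, not flowbio's implementation: the function name ordered_uploads is hypothetical, and treating every key that starts with "reads" as a reads key is an assumption.

```python
from pathlib import Path

VALID_READS_KEYS = ("reads1", "reads2")


def ordered_uploads(data):
    """Validate reads keys and return (key, path) pairs in upload order."""
    # Assumption: any key beginning with "reads" is meant as a reads key
    reads_keys = [key for key in data if key.startswith("reads")]
    invalid = [key for key in reads_keys if key not in VALID_READS_KEYS]
    if invalid:
        raise ValueError(f"invalid reads keys: {invalid}")
    if "reads2" in data and "reads1" not in data:
        raise ValueError("reads2 was provided without reads1")
    # reads1 goes first, then reads2; other keys keep the caller's order
    first = [(key, data[key]) for key in VALID_READS_KEYS if key in data]
    rest = [(k, v) for k, v in data.items() if k not in VALID_READS_KEYS]
    return first + rest


print(ordered_uploads({"reads2": Path("R2.fastq.gz"), "reads1": Path("R1.fastq.gz")}))
```

Running the same check before a long batch upload surfaces a mistyped key early instead of partway through.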
- upload_multiplexed_data(reads, annotation, ignore_warnings=True)¶
Upload multiplexed reads and an annotation sheet.
Validates and uploads the annotation sheet first, so that reads files are not uploaded if the annotation is invalid. Then uploads one or two reads files to /upload/multiplexed.
By default, annotation warnings are automatically accepted (the upload is retried with ignore_warnings=True) and included in the result for inspection. Set ignore_warnings=False to reject the upload on warnings instead.
Requires authentication.
Example:
from pathlib import Path
result = client.samples.upload_multiplexed_data(
    reads={"reads1": Path("multiplexed_R1.fastq.gz")},
    annotation=Path("annotation.xlsx"),
)
print(f"Data IDs: {result.data_ids}")
print(f"Annotation ID: {result.annotation_id}")
if result.warnings:
    print(f"Warnings: {result.warnings}")
- Parameters:
reads – A mapping of reads keys to file paths. Use reads1 for single-end, or reads1 and reads2 for paired-end. reads1 is always uploaded first:
# Single-end
{"reads1": Path("multiplexed.fastq.gz")}
# Paired-end
{"reads1": Path("R1.fastq.gz"), "reads2": Path("R2.fastq.gz")}
annotation (Path) – Path to the annotation sheet (.xlsx or .csv). Use get_annotation_template() to download a template.
ignore_warnings (bool) – If True (the default), annotation warnings are automatically accepted and included in the result. If False, warnings cause an AnnotationValidationError to be raised.
- Raises:
ValueError – If reads keys are invalid (e.g. reads3) or reads2 is provided without reads1.
AnnotationValidationError – If the annotation has hard validation errors that cannot be ignored.
AnnotationValidationError – If ignore_warnings=False and the annotation has warnings.
- Return type:
- get_annotation_template(sample_type='generic')¶
Download an annotation sheet template for multiplexed uploads.
Annotation sheets are spreadsheets that describe multiple samples in a single file. Download a template, fill in one row per sample with names, file paths, and metadata, then submit the completed sheet to upload all samples in one batch.
A type-specific template (e.g. "rna_seq") includes columns for metadata attributes relevant to that sample type. The "generic" template includes only the base columns shared by all types.
Returns the raw xlsx bytes. Write them to disk to get a usable spreadsheet:
from pathlib import Path
template = client.samples.get_annotation_template("rna_seq")
Path("template.xlsx").write_bytes(template)
- Parameters:
sample_type (str) – The sample type identifier (e.g. "rna_seq"). Defaults to "generic" for a universal template. See get_types() for available sample types.
- Return type:
- Returns:
The raw xlsx file bytes.
- Raises:
NotFoundError – If the sample type does not exist.
- get_types()¶
Return the available sample types.
Example:
sample_types = client.samples.get_types()
for st in sample_types:
    print(f"{st.identifier}: {st.name}")
- Return type:
- get_owned_projects()¶
Return the projects owned by the authenticated user.
Requires authentication. Results are paginated lazily — pages are only fetched from the API as you iterate through the results.
The total count is available via len() without fetching all pages:
projects = client.samples.get_owned_projects()
print(f"You have {len(projects)} projects")
Iterate to access individual projects:
for project in client.samples.get_owned_projects():
    print(f"{project.id}: {project.name}")
Or convert to a list to fetch everything at once:
all_projects = list(client.samples.get_owned_projects())
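To make the lazy behaviour concrete, here is a small self-contained model of a collection that knows its length after one request and fetches further pages only during iteration. It is an illustration of the general pattern, not flowbio's implementation: the class name LazyPages and the fetch_page signature are hypothetical.

```python
class LazyPages:
    """Illustrative lazily paginated collection (not flowbio's own class)."""

    def __init__(self, fetch_page):
        # Hypothetical signature: fetch_page(page_number) -> (items, total)
        self._fetch_page = fetch_page
        self._pages = {}
        self._total = None

    def _load(self, number):
        # Fetch a page at most once and remember the reported total
        if number not in self._pages:
            items, self._total = self._fetch_page(number)
            self._pages[number] = items
        return self._pages[number]

    def __len__(self):
        if self._total is None:
            self._load(1)  # a single request is enough to learn the total
        return self._total

    def __iter__(self):
        number, seen = 1, 0
        while seen < len(self):
            items = self._load(number)
            if not items:
                break  # defensive stop if the server returns an empty page
            yield from items
            seen += len(items)
            number += 1
```

In this model, calling len() triggers one request, iterating triggers one request per page as it is reached, and list() walks through every page.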
- get_organisms()¶
Return the available organisms.
Example:
organisms = client.samples.get_organisms()
for o in organisms:
    print(f"{o.id}: {o.name} ({o.latin_name})")
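Since upload_sample() takes an organism_id rather than a name, a small lookup step is often useful. The following sketch assumes only the fields printed in the example above (id, name, latin_name); the helper find_organism_id is hypothetical, and the SimpleNamespace objects and their ids are made-up stand-ins for real results.

```python
from types import SimpleNamespace


def find_organism_id(organisms, latin_name):
    """Return the id of the organism whose latin_name matches, ignoring case."""
    wanted = latin_name.strip().casefold()
    for organism in organisms:
        if organism.latin_name.casefold() == wanted:
            return organism.id
    raise LookupError(f"no organism with latin name {latin_name!r}")


# Stand-in objects with invented ids, for illustration only
organisms = [
    SimpleNamespace(id="org_1", name="Human", latin_name="Homo sapiens"),
    SimpleNamespace(id="org_2", name="Mouse", latin_name="Mus musculus"),
]
print(find_organism_id(organisms, "mus musculus"))
```

The same pattern would apply to resolving a project ID from the results of get_owned_projects().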
- get_metadata_attributes()¶
Return the available metadata attributes for samples. See Metadata attributes for more detail.
Example:
attributes = client.samples.get_metadata_attributes()
required = [a for a in attributes if a.required]
- Return type:
Models¶
- pydantic model flowbio.v2.samples.Sample¶
A sample on the Flow platform. For now this only includes id, but when we add more methods to retrieve samples with more detail, more fields will be added.
- pydantic model flowbio.v2.samples.SampleType¶
A type of sample that can be uploaded to the Flow platform.
Example:
sample_types = client.samples.get_types()
for st in sample_types:
    print(f"{st.identifier}: {st.name}")
- pydantic model flowbio.v2.samples.MetadataAttribute¶
A metadata attribute that can be attached to a sample. See Metadata attributes for a more detailed explanation.
Example:
attributes = client.samples.get_metadata_attributes()
for attr in attributes:
    if attr.options is not None:
        print(f"{attr.name}: choose from {attr.options}")
- pydantic model flowbio.v2.samples.Project¶
A project that samples can be assigned to.
Example:
projects = client.samples.get_owned_projects()
for p in projects:
    print(f"{p.id}: {p.name}")
- pydantic model flowbio.v2.samples.Organism¶
An organism that a sample can be associated with.
Example:
organisms = client.samples.get_organisms()
for o in organisms:
    print(f"{o.id}: {o.name} ({o.latin_name})")
- pydantic model flowbio.v2.samples.MultiplexedUpload¶
Result of a multiplexed data upload.