CLI

CLI instructions for running scibiomart.

The CLI lets users retrieve attributes directly from Ensembl BioMart. Filters are passed as a JSON string (e.g. --f "{\"ensembl_gene_id\": \"ENSG00000139618,ENSG00000091483\"}") and attributes as a comma-separated list (e.g. --a "ensembl_gene_id,mmusculus_homolog_ensembl_gene").
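Escaping the JSON filter string by hand is error-prone, so the sketch below builds the same arguments from Python values. It is only an illustration: build_scibiomart_args is a hypothetical helper, not part of scibiomart, and it uses only the --m, --d, --f, --a and --o options described on this page.

```python
import json
import shlex


def build_scibiomart_args(mart, dataset, attributes, filters=None, output=None):
    """Hypothetical helper (not part of scibiomart): assemble a CLI argument list."""
    args = ["scibiomart", "--m", mart, "--d", dataset]
    if filters:
        # --f takes one JSON object; several values for one filter are joined
        # with commas inside a single string, as in the examples on this page.
        as_strings = {name: ",".join(values) for name, values in filters.items()}
        args += ["--f", json.dumps(as_strings)]
    # --a takes a plain comma-separated list of attribute names.
    args += ["--a", ",".join(attributes)]
    if output:
        args += ["--o", output]
    return args


args = build_scibiomart_args(
    "ENSEMBL_MART_ENSEMBL",
    "hsapiens_gene_ensembl",
    ["ensembl_gene_id", "mmusculus_homolog_ensembl_gene"],
    filters={"ensembl_gene_id": ["ENSG00000139618", "ENSG00000091483"]},
)

# shlex.join quotes the JSON so the printed line can be pasted into a POSIX shell.
print(shlex.join(args))
```

Passing the list to subprocess.run(args) instead of printing it avoids shell quoting altogether, assuming scibiomart is installed and on the PATH.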

Example:

The first example retrieves the mouse orthologs for two Ensembl gene IDs: ENSG00000139618 and ENSG00000091483.

Ensembl human gene ID to mouse ortholog for only two genes:

scibiomart --m ENSEMBL_MART_ENSEMBL --d hsapiens_gene_ensembl --f "{\"ensembl_gene_id\": \"ENSG00000139618,ENSG00000091483\"}" --a "ensembl_gene_id,mmusculus_homolog_ensembl_gene"

Ensembl human gene ID to mouse ortholog for all genes:

scibiomart --m ENSEMBL_MART_ENSEMBL --d hsapiens_gene_ensembl --a "ensembl_gene_id,external_gene_name,mmusculus_homolog_ensembl_gene,mmusculus_homolog_perc_id_r1" --o mm10_orthologs_

Ensembl human gene ID to Entrez gene ID and HGNC symbol:

scibiomart --m ENSEMBL_MART_ENSEMBL --d hsapiens_gene_ensembl --f "{\"ensembl_gene_id\": \"ENSG00000139618,ENSG00000091483\"}" --a "ensembl_gene_id,entrezgene_id,hgnc_symbol"

Get Ensembl gene IDs, Entrez gene IDs and UniProt (Swiss-Prot) accessions for all mouse genes:

scibiomart --m ENSEMBL_MART_ENSEMBL --d mmusculus_gene_ensembl --a "ensembl_gene_id,entrezgene_id,uniprotswissprot" --o mm10

Get all mouse gene names and positions, sorted by gene start:

scibiomart --m ENSEMBL_MART_ENSEMBL --d mmusculus_gene_ensembl --a "ensembl_gene_id,external_gene_name,chromosome_name,start_position,end_position,strand" --o mm10Sorted --s t

Arguments

scibiomart

usage: scibiomart [-h] [--m M] [--d D] [--a A] [--f F] [--o O] [--s S] [--marts MARTS] [--datasets DATASETS] [--attrs ATTRS] [--filters FILTERS] [--configs CONFIGS]

Named Arguments

--m
Mart, e.g. ENSEMBL_MART_ENSEMBL.

Use --marts to see available marts.

--d
Dataset: e.g. hsapiens_gene_ensembl, mmusculus_gene_ensembl…

Use --datasets to see available datasets.

--a
Attributes as a comma-separated list, e.g. "ensembl_gene_id,hgnc_symbol".

Use --attrs to see available attributes.

--f
Filters as a JSON string surrounded by "", e.g. "{\"ensembl_gene_id\": \"ENSG00000139618,ENSG00000091483\"}".

Use --filters to see available filters.

--o

Output folder

Default: “”

--s

Sort the dataframe on gene starts before returning it (used for programs that require a sorted file, e.g. sciloc2gene).

Default: “f”

--marts

Lists available marts.

--datasets

Lists available datasets for a specific mart (requires the --m option).

--attrs

Lists available attributes for a mart and dataset (requires the --m and --d options).

--filters

Lists available filters for a mart and dataset (requires the --m and --d options).

--configs

Lists configs for a mart and dataset (requires the --m and --d options).