CLI
CLI instructions for running scibiomart.
The CLI lets users retrieve attributes directly from Ensembl BioMart. Filters are passed as a JSON string (e.g. --f "{\"ensembl_gene_id\": \"ENSG00000139618,ENSG00000091483\"}") and attributes as a comma-separated list (e.g. --a "ensembl_gene_id,mmusculus_homolog_ensembl_gene").
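Escaping the inner quotes of the JSON filter by hand is easy to get wrong, so it can help to build the command programmatically. The following is a minimal sketch, not part of scibiomart: build_scibiomart_command is a hypothetical helper name, and it only assembles the --m, --d, --f and --a options documented on this page.

```python
import json
import shlex


def build_scibiomart_command(mart, dataset, attributes, filters=None):
    """Return an argv list for a scibiomart call (hypothetical helper)."""
    argv = ["scibiomart", "--m", mart, "--d", dataset]
    if filters:
        # Each filter's values become one comma-separated string,
        # and the whole mapping is serialised as a JSON object for --f.
        joined = {name: ",".join(values) for name, values in filters.items()}
        argv += ["--f", json.dumps(joined)]
    # Attributes are passed to --a as a plain comma-separated list.
    argv += ["--a", ",".join(attributes)]
    return argv


argv = build_scibiomart_command(
    "ENSEMBL_MART_ENSEMBL",
    "hsapiens_gene_ensembl",
    ["ensembl_gene_id", "mmusculus_homolog_ensembl_gene"],
    {"ensembl_gene_id": ["ENSG00000139618", "ENSG00000091483"]},
)

# shlex.join quotes the JSON so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

The argv list can also be handed to subprocess.run(argv) directly, in which case no shell quoting is involved at all.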
Example:
Here we show an example where we get the mouse orthologs for two Ensembl gene IDs: ENSG00000139618 and ENSG00000091483.
Ensembl human gene ID to mouse ortholog for only two genes:
scibiomart --m ENSEMBL_MART_ENSEMBL --d hsapiens_gene_ensembl --f "{\"ensembl_gene_id\": \"ENSG00000139618,ENSG00000091483\"}" --a "ensembl_gene_id,mmusculus_homolog_ensembl_gene"
Ensembl human gene ID to mouse ortholog for all genes:
scibiomart --m ENSEMBL_MART_ENSEMBL --d hsapiens_gene_ensembl --a "ensembl_gene_id,external_gene_name,mmusculus_homolog_ensembl_gene,mmusculus_homolog_perc_id_r1" --o mm10_orthologs_
Ensembl human gene ID to gene name:
scibiomart --m ENSEMBL_MART_ENSEMBL --d hsapiens_gene_ensembl --f "{\"ensembl_gene_id\": \"ENSG00000139618,ENSG00000091483\"}" --a "ensembl_gene_id,entrezgene_id,hgnc_symbol"
Get the Ensembl gene IDs, Entrez gene IDs and UniProtKB/Swiss-Prot IDs for all mouse genes:
scibiomart --m ENSEMBL_MART_ENSEMBL --d mmusculus_gene_ensembl --a "ensembl_gene_id,entrezgene_id,uniprotswissprot" --o mm10
Get all mouse gene names and positions and sort the data by gene starts:
scibiomart --m ENSEMBL_MART_ENSEMBL --d mmusculus_gene_ensembl --a "ensembl_gene_id,external_gene_name,chromosome_name,start_position,end_position,strand" --o mm10Sorted --s t
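To illustrate what sorting by gene start means for downstream tools, here is a small sketch on a toy table. The gene IDs and coordinates are made up, and grouping by chromosome before sorting on start position is an assumption made for this example; this page does not specify how scibiomart orders chromosomes or breaks ties.

```python
import csv
import io

# Toy rows with made-up IDs and coordinates; the column names follow
# the attributes requested in the command above.
toy_tsv = (
    "ensembl_gene_id\tchromosome_name\tstart_position\tend_position\n"
    "GENE_C\t2\t500\t900\n"
    "GENE_A\t1\t3000\t4000\n"
    "GENE_B\t1\t100\t250\n"
)

rows = list(csv.DictReader(io.StringIO(toy_tsv), delimiter="\t"))

# Compare start positions as integers: as strings, "3000" would sort
# before "500". Grouping by chromosome first is an assumption.
rows.sort(key=lambda r: (r["chromosome_name"], int(r["start_position"])))

sorted_ids = [r["ensembl_gene_id"] for r in rows]
print(sorted_ids)  # ['GENE_B', 'GENE_A', 'GENE_C']
```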
Arguments
scibiomart
usage: scibiomart [-h] [--m M] [--d D] [--a A] [--f F] [--o O] [--s S] [--marts MARTS] [--datasets DATASETS] [--attrs ATTRS] [--filters FILTERS] [--configs CONFIGS]
Named Arguments
- --m
- Mart, e.g. ENSEMBL_MART_ENSEMBL;
use --marts to see available marts.
- --d
- Dataset, e.g. hsapiens_gene_ensembl, mmusculus_gene_ensembl…;
use --datasets to see available datasets.
- --a
- Attributes as a comma-separated list surrounded by "";
use --attrs to see available attributes.
- --f
- Filters formatted as a JSON object passed as a string;
use --filters to see available filters.
- --o
Output folder
Default: “”
- --s
Sort the dataframe by gene start position before returning it (used for programs that require a sorted file, e.g. sciloc2gene).
Default: “f”
- --marts
Lists available marts.
- --datasets
Lists available datasets for a specific mart (must use the --m option).
- --attrs
Lists available attributes for a mart and dataset (must use the --m and --d options).
- --filters
Lists available filters for a mart and dataset (must use the --m and --d options).
- --configs
Lists available configs for a mart and dataset (must use the --m and --d options).
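The dependencies between the listing options and --m / --d described in the list above can be summarised in a small lookup table. This is only a sketch for scripts that wrap the CLI: REQUIRED and missing_options are hypothetical names, not part of scibiomart, and the table encodes nothing beyond what the option descriptions state.

```python
# Which options each listing flag needs, as stated in the descriptions above.
REQUIRED = {
    "--marts": [],
    "--datasets": ["--m"],
    "--attrs": ["--m", "--d"],
    "--filters": ["--m", "--d"],
    "--configs": ["--m", "--d"],
}


def missing_options(listing_flag, provided):
    """Return the required options that are absent from `provided`."""
    return [opt for opt in REQUIRED[listing_flag] if opt not in provided]


# Asking for attributes with only a mart selected still lacks the dataset.
print(missing_options("--attrs", {"--m": "ENSEMBL_MART_ENSEMBL"}))  # ['--d']
```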