gene.etl.hgnc
Defines the HGNC ETL methods.
- class gene.etl.hgnc.HGNC(database, seqrepo_dir=SEQREPO_ROOT_DIR, data_path=None, silent=True)
ETL the HGNC source into the normalized database.
- __init__(database, seqrepo_dir=SEQREPO_ROOT_DIR, data_path=None, silent=True)
Instantiate Base class.
- Parameters:
database (AbstractDatabase) – database instance
seqrepo_dir (Path) – Path to seqrepo directory
data_path (Optional[Path]) – path to app data directory
silent (bool) – if True, don’t print ETL result to console
- get_seqrepo(seqrepo_dir)
Return SeqRepo instance if seqrepo_dir exists.
- Parameters:
seqrepo_dir (Path) – Path to seqrepo directory
- Return type:
SeqRepo
- Returns:
SeqRepo instance
- perform_etl(use_existing=False)
Public-facing method to begin ETL procedures on the given data. The returned concept IDs can be passed to the Merge method to compute merged concepts.
- Parameters:
use_existing (bool) – if True, don’t try to retrieve latest source data
- Return type:
list[str]
- Returns:
list of concept IDs which were successfully processed and uploaded.
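The constructor and perform_etl signatures documented above suggest a simple calling pattern, sketched below. This is an illustration under stated assumptions, not the library's own code: `run_source_etl` and `FakeETL` are hypothetical names introduced here, and the concept IDs are placeholders. The stand-in class only mirrors the documented signatures so that the sketch runs without the gene package, a database, or a SeqRepo directory.

```python
from pathlib import Path
from typing import Any, List, Optional


def run_source_etl(
    etl_cls: type,
    database: Any,
    seqrepo_dir: Path,
    data_path: Optional[Path] = None,
    use_existing: bool = False,
) -> List[str]:
    """Instantiate an ETL class and run it (hypothetical helper).

    etl_cls is assumed to follow the interface documented above:
    __init__(database, seqrepo_dir, data_path, silent) and
    perform_etl(use_existing) returning a list of concept IDs.
    """
    etl = etl_cls(database, seqrepo_dir=seqrepo_dir, data_path=data_path, silent=True)
    return etl.perform_etl(use_existing=use_existing)


class FakeETL:
    """Stand-in for gene.etl.hgnc.HGNC, for illustration only."""

    def __init__(self, database, seqrepo_dir, data_path=None, silent=True):
        self.database = database
        self.seqrepo_dir = seqrepo_dir
        self.data_path = data_path
        self.silent = silent

    def perform_etl(self, use_existing=False):
        # A real run would extract, transform, and load HGNC records;
        # here we only record two placeholder concept IDs.
        processed = ["hgnc:1", "hgnc:2"]
        self.database.extend(processed)
        return processed


if __name__ == "__main__":
    fake_db: List[str] = []  # placeholder for an AbstractDatabase instance
    ids = run_source_etl(FakeETL, fake_db, Path("seqrepo"), use_existing=True)
    print(ids)
```

In real use, the class passed in would be gene.etl.hgnc.HGNC and the database argument an AbstractDatabase instance; how that database is created is not covered in this section. The returned list is what would then be handed to the Merge method mentioned above.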