varity.ref-gene
Handles refGene.txt(.gz) and ncbiRefSeq.txt(.gz) content.
cds->genomic-pos
(cds->genomic-pos cds-pos rg)
(cds->genomic-pos cds-pos region {:keys [strand cds-start cds-end exon-ranges]})
cds-coord
(cds-coord pos rg)
Converts a genomic position into a coding DNA coordinate. The return
value is a clj-hgvs.coordinate/CodingDNACoordinate record.
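The arithmetic behind such a conversion can be sketched in plain Clojure for the simplest case, a position inside a coding exon. This is only an illustration, not varity's implementation: `cds-offset-sketch` is a hypothetical name, coordinates are assumed to be 1-based and inclusive, `:strand` is assumed to be `:forward` or `:reverse`, and the result is a plain integer rather than a CodingDNACoordinate record.

```clojure
(defn cds-offset-sketch
  "Hypothetical helper: 1-based offset of genomic `pos` within the CDS,
  or nil when `pos` does not fall inside a coding part of an exon."
  [pos {:keys [strand cds-start cds-end exon-ranges]}]
  (let [;; Crop every exon to the closed interval [cds-start, cds-end].
        coding (keep (fn [[s e]]
                       (let [s' (max s cds-start)
                             e' (min e cds-end)]
                         (when (<= s' e') [s' e'])))
                     (sort-by first exon-ranges))]
    (when (some (fn [[s e]] (<= s pos e)) coding)
      ;; Forward strand: count coding bases at or before pos.
      ;; Reverse strand: count coding bases at or after pos.
      (reduce + (map (fn [[s e]]
                       (if (= strand :reverse)
                         (max 0 (inc (- e (max s pos))))
                         (max 0 (inc (- (min e pos) s)))))
                     coding)))))

(def example-rg
  {:strand :forward
   :cds-start 150
   :cds-end 350
   :exon-ranges [[100 200] [300 400]]})

(cds-offset-sketch 150 example-rg) ;; => 1
(cds-offset-sketch 300 example-rg) ;; => 52
(cds-offset-sketch 250 example-rg) ;; => nil (intronic)
```

Positions in UTRs or introns need offset-style coordinates, which this sketch deliberately leaves out.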
cds-coord->genomic-pos
(cds-coord->genomic-pos coord {:keys [strand], :as rg})
Converts a coding DNA coordinate into a genomic position. coord must be a
clj-hgvs.coordinate/CodingDNACoordinate record.
cds-pos
(cds-pos pos {:keys [strand cds-start cds-end exon-ranges]})
cds-region
(cds-region {:keys [chr cds-start cds-end strand]})
Returns a genomic region of a coding sequence of the given gene. Returns nil
if the gene is a non-coding RNA.
cds-seq
(cds-seq {:keys [cds-start cds-end], :as ref-gene-record})
Returns a lazy sequence of exons included in a coding region of a
`ref-gene-record`. Note that exons outside of the CDS are removed and
partially overlapping ones are cropped in the result. Returns nil if the record
is a non-coding RNA.
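The cropping behaviour described above can be illustrated with a small stand-alone sketch. It is not the library's code: `crop-exons-to-cds` is a hypothetical helper, and exon ranges are assumed to be `[start end]` pairs in 1-based, inclusive coordinates.

```clojure
(defn crop-exons-to-cds
  "Hypothetical helper, not part of varity: keeps only the parts of
  `exon-ranges` that overlap the closed interval [cds-start, cds-end]."
  [exon-ranges cds-start cds-end]
  (keep (fn [[s e]]
          (let [s' (max s cds-start)
                e' (min e cds-end)]
            ;; An exon entirely outside the CDS gives an empty interval
            ;; and is dropped; a partially overlapping one is cropped.
            (when (<= s' e')
              [s' e'])))
        exon-ranges))

(crop-exons-to-cds [[100 200] [300 400] [500 600]] 150 350)
;; => ([150 200] [300 350])
```

The third exon lies past the CDS end and disappears, while the first two are trimmed to the CDS boundaries.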
exon-ranges->intron-ranges
(exon-ranges->intron-ranges exon-ranges)
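This entry has no docstring. Judging by the name alone, it derives the gaps between adjacent exons. A minimal sketch of that idea follows, assuming `[start end]` pairs in 1-based, inclusive coordinates; `intron-ranges-sketch` is a hypothetical name, and the actual function may differ in ordering or edge cases.

```clojure
(defn intron-ranges-sketch
  "Hypothetical helper: returns the gaps between consecutive exons,
  in ascending genomic order."
  [exon-ranges]
  (->> (sort-by first exon-ranges)
       (partition 2 1)                       ; adjacent exon pairs
       (mapv (fn [[[_ left-end] [right-start _]]]
               ;; The intron spans the bases strictly between two exons.
               [(inc left-end) (dec right-start)]))))

(intron-ranges-sketch [[100 200] [300 400] [500 600]])
;; => [[201 299] [401 499]]
```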
exon-seq
(exon-seq {:keys [chr strand exon-ranges]})
Returns a lazy sequence of regions corresponding to each exon in a gene. The
exons are ordered by their index, so they appear in descending genomic order
if the refGene record is on the reverse strand.
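The ordering rule can be seen in a short stand-alone sketch. It assumes that `:strand` is `:forward` or `:reverse` and that exon ranges are stored in ascending genomic order; `exon-regions-sketch` and the keys of the returned maps are illustrative assumptions, not the library's exact output.

```clojure
(defn exon-regions-sketch
  "Hypothetical helper: one map per exon, numbered in transcript order."
  [{:keys [chr strand exon-ranges]}]
  (let [ascending (sort-by first exon-ranges)
        ;; On the reverse strand, exon 1 is the rightmost range.
        ordered   (if (= strand :reverse) (reverse ascending) ascending)]
    (map-indexed (fn [i [s e]]
                   {:exon-index (inc i)
                    :chr chr
                    :start s
                    :end e
                    :strand strand})
                 ordered)))

(map (juxt :exon-index :start)
     (exon-regions-sketch {:chr "chr1"
                           :strand :reverse
                           :exon-ranges [[100 200] [300 400]]}))
;; => ([1 300] [2 100])
```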
in-any-exon?
(in-any-exon? chr pos rgidx)
Returns true if chr:pos is located in any ref-gene exon, else false.
in-cds?
(in-cds? pos {:keys [cds-start cds-end]})
Returns true if pos is in the coding region, false otherwise.
in-exon?
(in-exon? pos {:keys [exon-ranges]})
Returns true if pos is in the exon region, false otherwise.
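Both predicates amount to interval membership tests. The sketch below shows that idea, assuming 1-based, inclusive bounds and `[start end]` exon pairs; the `-sketch?` names are hypothetical, and the library's own boundary handling should be checked against its source.

```clojure
(defn in-cds-sketch?
  "Hypothetical: true if pos lies within [cds-start, cds-end]."
  [pos {:keys [cds-start cds-end]}]
  (<= cds-start pos cds-end))

(defn in-exon-sketch?
  "Hypothetical: true if pos lies within at least one exon range."
  [pos {:keys [exon-ranges]}]
  ;; `some` returns nil when nothing matches, so coerce to a boolean.
  (boolean (some (fn [[s e]] (<= s pos e)) exon-ranges)))

(def rg-example
  {:cds-start 150
   :cds-end 350
   :exon-ranges [[100 200] [300 400]]})

(in-cds-sketch? 250 rg-example)  ;; => true  (between the CDS bounds)
(in-exon-sketch? 250 rg-example) ;; => false (inside an intron)
```

The example position sits between the CDS bounds but in an intron, so the two checks give different answers.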
index
(index rgs)
Creates a refGene index for searching.
load-gencode
(load-gencode f parse-line & {:keys [chunk-size], :or {chunk-size 10000}})
load-gff3
(load-gff3 f & {:keys [chunk-size], :or {chunk-size 10000}, :as opts})
load-gtf
(load-gtf f & {:keys [chunk-size], :or {chunk-size 10000}, :as opts})
load-ref-genes
deprecated in 0.8.0
(load-ref-genes f & {:keys [filter-fns], :or {filter-fns [identity]}})
DEPRECATED: Loads f (e.g. refGene.txt(.gz)), returning all of its contents as a sequence.
load-ref-seqs
(load-ref-seqs f & {:keys [filter-fns], :or {filter-fns [identity]}})
Loads f (e.g. ncbiRefSeq.txt(.gz)), returning all of its contents as a sequence.
read-coding-sequence
(read-coding-sequence seq-rdr ref-gene-record)
Reads a coding sequence of a ref-gene record `ref-gene-record` from
`seq-rdr`. Returns nil if the gene is a non-coding RNA.
read-exon-sequence
(read-exon-sequence seq-rdr {:keys [strand], :as exon})
Reads a base sequence of an `exon` from `seq-rdr`.
read-transcript-sequence
(read-transcript-sequence seq-rdr ref-gene-record)
Reads a DNA base sequence of a `ref-gene-record` from `seq-rdr`. The sequence
contains the 5'-UTR, CDS, and 3'-UTR.
ref-genes
(ref-genes s rgidx)
(ref-genes chr pos rgidx)
(ref-genes chr pos rgidx tx-margin)
Searches refGene entries by ref-seq, gene, or (chr, pos) using the index,
returning the results as a sequence. See also varity.ref-gene/index.
rna-accession?
(rna-accession? s)
seek-gene-region
(seek-gene-region chr pos rgidx)
(seek-gene-region chr pos rgidx name)
Seeks chr:pos through exon entries in refGene and returns those indices.
tx-region
(tx-region {:keys [chr tx-start tx-end strand]})
Returns a genomic region of the given gene.