varity.ref-gene
Handles refGene.txt(.gz) and ncbiRefSeq.txt(.gz) content.
cds->genomic-pos
(cds->genomic-pos cds-pos rg)
(cds->genomic-pos cds-pos region {:keys [strand cds-start cds-end exon-ranges]})
cds-coord
(cds-coord pos rg)
Converts the genomic position into the coding DNA coordinate. The return
value is a clj-hgvs.coordinate/CodingDNACoordinate record.
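For a position that falls inside a coding exon, the core of this conversion can be sketched as follows. This is a minimal illustration, not the library's implementation: `sketch-cds-pos` is a hypothetical name, coordinates are assumed to be 1-based and inclusive, `strand` is assumed to be `:forward` or `:reverse`, and the intronic offsets and UTR positions that a full coding DNA coordinate also encodes are ignored.

```clojure
(defn sketch-cds-pos
  "Hypothetical helper: returns the 1-based position of genomic `pos` within
  the coding sequence, or nil if `pos` is not in a coding exon."
  [pos {:keys [strand cds-start cds-end exon-ranges]}]
  (let [reverse? (= strand :reverse)
        ;; Clip each exon to the CDS and drop exons that lie outside it.
        coding (for [[s e] exon-ranges
                     :let [s' (max s cds-start)
                           e' (min e cds-end)]
                     :when (<= s' e')]
                 [s' e'])]
    ;; Walk the coding exons in transcript order, accumulating their lengths.
    (loop [exons (if reverse? (reverse coding) coding)
           passed 0]
      (when-let [[s e] (first exons)]
        (if (<= s pos e)
          (+ passed 1 (if reverse? (- e pos) (- pos s)))
          (recur (rest exons) (+ passed (inc (- e s)))))))))

;; Assumed record shape, mirroring the keys in the signatures on this page.
(def rg {:strand :forward, :cds-start 15, :cds-end 55,
         :exon-ranges [[10 20] [30 40] [50 60]]})

(sketch-cds-pos 30 rg) ;; => 7
(sketch-cds-pos 25 rg) ;; => nil (intronic)
```

On the reverse strand the same walk starts from the last exon, so `(sketch-cds-pos 55 (assoc rg :strand :reverse))` gives 1.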
cds-coord->genomic-pos
(cds-coord->genomic-pos coord {:keys [strand], :as rg})
Converts the coding DNA coordinate into the genomic position. coord must be a
clj-hgvs.coordinate/CodingDNACoordinate record.
cds-pos
(cds-pos pos {:keys [strand cds-start cds-end exon-ranges]})
cds-region
(cds-region {:keys [chr cds-start cds-end strand]})
Returns a genomic region of a coding sequence of the given gene. Returns nil
if the gene is a non-coding RNA.
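The nil-for-non-coding behavior can be illustrated with a small sketch. It rests on assumptions: `sketch-cds-region` is a hypothetical name, the returned map shape is a guess, and a non-coding record is taken to be one whose CDS is empty (`cds-start` greater than `cds-end` in 1-based inclusive coordinates).

```clojure
(defn sketch-cds-region
  "Hypothetical helper: the genomic span of the CDS, or nil when it is empty."
  [{:keys [chr cds-start cds-end strand]}]
  ;; Assumption: an empty CDS (cds-start > cds-end) marks a non-coding RNA.
  (when (<= cds-start cds-end)
    {:chr chr, :start cds-start, :end cds-end, :strand strand}))

(sketch-cds-region {:chr "chr1", :cds-start 15, :cds-end 55, :strand :forward})
;; => {:chr "chr1", :start 15, :end 55, :strand :forward}
(sketch-cds-region {:chr "chr1", :cds-start 61, :cds-end 60, :strand :forward})
;; => nil
```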
cds-seq
(cds-seq {:keys [cds-start cds-end], :as ref-gene-record})
Returns a lazy sequence of exons included in a coding region of a
`ref-gene-record`. Note that exons outside of the CDS are removed and
partially overlapping ones are cropped in the result. Returns nil if the record
is a non-coding RNA.
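The removal and cropping described above can be sketched on plain `[start end]` pairs. This is an illustration under assumptions, not the library's code: `sketch-cds-exons` is a hypothetical name, ranges are 1-based and inclusive, an empty CDS stands in for a non-coding record, and strand-dependent ordering is left out.

```clojure
(defn sketch-cds-exons
  "Hypothetical helper: exon ranges restricted to the CDS, or nil if the
  CDS is empty."
  [{:keys [cds-start cds-end exon-ranges]}]
  (when (<= cds-start cds-end)
    (->> exon-ranges
         ;; Drop exons that end before the CDS starts or start after it ends.
         (remove (fn [[s e]] (or (< e cds-start) (< cds-end s))))
         ;; Crop the exons that only partially overlap the CDS.
         (map (fn [[s e]] [(max s cds-start) (min e cds-end)])))))

(sketch-cds-exons {:cds-start 15, :cds-end 55,
                   :exon-ranges [[1 5] [10 20] [30 40] [50 60]]})
;; => ([15 20] [30 40] [50 55])
```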
exon-ranges->intron-ranges
(exon-ranges->intron-ranges exon-ranges)
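No description is given for this function. Going by its name alone, it presumably derives the gaps between consecutive exons. A minimal sketch of that idea, assuming sorted, 1-based inclusive `[start end]` pairs and using the hypothetical name `sketch-intron-ranges`:

```clojure
(defn sketch-intron-ranges
  "Hypothetical helper: the ranges lying between each pair of adjacent exons."
  [exon-ranges]
  (->> (partition 2 1 exon-ranges)
       ;; An intron spans from just after one exon to just before the next.
       (map (fn [[[_ e1] [s2 _]]] [(inc e1) (dec s2)]))))

(sketch-intron-ranges [[10 20] [30 40] [50 60]])
;; => ([21 29] [41 49])
```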
exon-seq
(exon-seq {:keys [chr strand exon-ranges]})
Returns a lazy sequence of regions corresponding to each exon in a gene. The
exons are ordered by their index, thus they're reversed in genomic coordinate
if the refGene record is on the reverse strand.
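The index ordering can be illustrated with a short sketch. The name `sketch-exon-seq` and the shape of the returned maps are assumptions made for illustration, as is the use of `:forward` and `:reverse` for `strand`.

```clojure
(defn sketch-exon-seq
  "Hypothetical helper: one region map per exon, numbered in transcript order."
  [{:keys [chr strand exon-ranges]}]
  (let [;; On the reverse strand, exon 1 is the rightmost range on the genome.
        ordered (if (= strand :reverse) (reverse exon-ranges) exon-ranges)]
    (map-indexed (fn [i [s e]]
                   {:exon-index (inc i), :chr chr, :start s, :end e, :strand strand})
                 ordered)))

(map (juxt :exon-index :start :end)
     (sketch-exon-seq {:chr "chr1", :strand :reverse,
                       :exon-ranges [[10 20] [30 40]]}))
;; => ([1 30 40] [2 10 20])
```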
GeneAnnotationIndex
protocol
members
lookup
(lookup this ks)
Returns ref-gene records specified by the ks.
in-any-exon?
(in-any-exon? chr pos gaidx)
Returns true if chr:pos is located in any ref-gene exon, else false.
in-cds?
(in-cds? pos {:keys [cds-start cds-end]})
Returns true if pos is in the coding region, false otherwise.
in-exon?
(in-exon? pos {:keys [exon-ranges]})
Returns true if pos is in an exon region, false otherwise.
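Both `in-cds?` and `in-exon?` amount to interval checks on the record. A minimal sketch, assuming 1-based inclusive coordinates and using hypothetical `sketch-` names so as not to suggest this is the library's own code:

```clojure
(defn sketch-in-cds?
  "Hypothetical helper: true if `pos` lies within [cds-start, cds-end]."
  [pos {:keys [cds-start cds-end]}]
  (<= cds-start pos cds-end))

(defn sketch-in-exon?
  "Hypothetical helper: true if `pos` lies within any exon range."
  [pos {:keys [exon-ranges]}]
  (boolean (some (fn [[s e]] (<= s pos e)) exon-ranges)))

(def rg {:cds-start 15, :cds-end 55, :exon-ranges [[10 20] [30 40] [50 60]]})

(sketch-in-cds? 25 rg)  ;; => true  (inside the CDS span, although intronic)
(sketch-in-exon? 25 rg) ;; => false
(sketch-in-exon? 12 rg) ;; => true  (exonic, but upstream of the CDS)
```

Under these assumptions the two checks are independent, so a caller that needs "coding and exonic" would combine them.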
index
(index rgs)
Creates a refGene index for searching.
load-gencode
(load-gencode f parse-line & {:keys [chunk-size], :or {chunk-size 10000}})
load-gff3
(load-gff3 f & {:keys [chunk-size], :or {chunk-size 10000}, :as opts})
load-gtf
(load-gtf f & {:keys [chunk-size], :or {chunk-size 10000}, :as opts})
load-ref-genes
deprecated in 0.8.0
(load-ref-genes f & {:keys [filter-fns], :or {filter-fns [identity]}})
DEPRECATED: Loads f (e.g. refGene.txt(.gz)), returning all of its contents as a sequence.
load-ref-seqs
(load-ref-seqs f & {:keys [filter-fns], :or {filter-fns [identity]}})
Loads f (e.g. ncbiRefSeq.txt(.gz)), returning all of its contents as a sequence.
read-coding-sequence
(read-coding-sequence seq-rdr ref-gene-record)
Reads a coding sequence of a ref-gene record `ref-gene-record` from
`seq-rdr`. Returns nil if the gene is a non-coding RNA.
read-exon-sequence
(read-exon-sequence seq-rdr {:keys [strand], :as exon})
Reads a base sequence of an `exon` from `seq-rdr`.
read-transcript-sequence
(read-transcript-sequence seq-rdr ref-gene-record)
Reads a DNA base sequence of a `ref-gene-record` from `seq-rdr`. The sequence
contains 5'-UTR, CDS and 3'-UTR.
ref-genes
(ref-genes s gaidx)
(ref-genes chr pos gaidx)
(ref-genes chr pos gaidx tx-margin)
Searches refGene entries by ref-seq, gene or (chr, pos) using the index,
returning the results as a sequence. See also varity.ref-gene/index.
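The three search modes can be illustrated with a toy index. Everything here is an assumption made for the sketch: the `sketch-` names, the `:name` and `:name2` keys for accession and gene symbol, the accession pattern, and the linear scan for positional lookup (a real index would likely use something faster).

```clojure
(defn sketch-index
  "Hypothetical index: records grouped by accession and by gene symbol."
  [rgs]
  {:by-name (group-by :name rgs)
   :by-gene (group-by :name2 rgs)})

(defn sketch-rna-accession?
  "Assumed pattern for RefSeq RNA accessions such as NM_000001.3."
  [s]
  (boolean (re-matches #"(NM|NR|XM|XR)_\d+(\.\d+)?" s)))

(defn sketch-ref-genes
  ;; Search by accession or by gene symbol, depending on what `s` looks like.
  ([s idx]
   (get-in idx [(if (sketch-rna-accession? s) :by-name :by-gene) s] []))
  ([chr pos idx] (sketch-ref-genes chr pos idx 0))
  ;; Search by position, widening each transcript by `tx-margin` bases.
  ([chr pos idx tx-margin]
   (->> (vals (:by-name idx))
        (apply concat)
        (filter (fn [{c :chr, :keys [tx-start tx-end]}]
                  (and (= c chr)
                       (<= (- tx-start tx-margin) pos (+ tx-end tx-margin))))))))

(def idx
  (sketch-index
   [{:name "NM_000001", :name2 "GENEA", :chr "chr1", :tx-start 100, :tx-end 200}
    {:name "NR_000002", :name2 "GENEB", :chr "chr1", :tx-start 500, :tx-end 600}]))

(map :name (sketch-ref-genes "GENEB" idx))       ;; => ("NR_000002")
(map :name (sketch-ref-genes "chr1" 210 idx 10)) ;; => ("NM_000001")
```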
rna-accession?
(rna-accession? s)
seek-gene-region
(seek-gene-region chr pos gaidx)
(seek-gene-region chr pos gaidx name)
Seeks chr:pos through exon entries in refGene and returns their indices.
tx-region
(tx-region {:keys [chr tx-start tx-end strand]})
Returns a genomic region of the given gene.