Here is a list of software I developed.


Functional noncoding variant prioritization:

  • traseR
    An R package for performing GWAS trait-associated SNP enrichment analyses in given genomic intervals.
  • DIVAN
    An R pipeline for prioritizing disease-specific noncoding risk variants in 45 diseases/traits using genome-wide precomputed functional scores.
  • WEVar
    A Python package that improves the prediction of noncoding regulatory variants with a weighted ensemble approach that integrates precomputed functional scores from multiple existing methods.
  • TIVAN
    An R pipeline for predicting noncoding regulatory variants in 44 tissues/cell types.
  • TIVAN-indel
    A Python package for implementing a novel computational method for predicting noncoding regulatory small insertions and deletions.
  • TLVar
    A Python package for implementing a deep transfer learning approach to improve the prediction of experimentally validated regulatory variants.
  • DeepPerVar
    A Python package for implementing a multi-modal deep learning model to predict individual-level noncoding functional variants.

Microbiome data analysis:

  • GMPR
    An R package for normalizing zero-inflated count data, particularly microbiome sequencing data.
  • glmgraph
    An R package for implementing sparse generalized linear models with graph-constrained regularization.
  • SICS
    An R package for implementing sparse generalized linear models with a phylogeny-constrained regularization that encourages local smoothing, for predictive modeling of microbiome data.
  • glmmTree
    An R package for implementing a phylogenetic tree-based generalized mixed effects model for predictive modeling of microbiome data.
  • powmic
    An R package for performing power assessment for microbiome sequencing data.
  • MDeep
    A Python package for implementing a novel deep learning model that predicts phenotypes from microbiome data by embedding the phylogenetic tree in the model.

Other multi-omics data analysis (single-cell genomics, epigenetics, noncoding RNA):

  • BAMMSC
    An R package for implementing a novel Bayesian mixture model to cluster droplet-based single-cell transcriptomic data from multiple individuals.
  • DeepPHiC
    A Python package for implementing a novel multi-task and transfer deep learning model to predict promoter-centered chromatin interactions using promoter-centered Hi-C data.
  • circMeta
    An R package for performing genomic feature annotation and implementing a novel Bayesian hierarchical model to detect differential expression of circular RNAs.
  • tfLDA
    An R package for applying state-of-the-art topic models to decipher the combinatorial binding events of multiple transcription factors (TFs) by integrating multiple ChIP-seq datasets.
  • MTAE
    A Python package for implementing a novel multi-task deep autoencoder to predict Alzheimer's disease (AD) progression using longitudinal DNA methylation data in peripheral blood.
  • ChIPComp
    An R package for implementing a novel Bayesian hierarchical model for quantitative comparison of multiple ChIP-seq datasets.
  • hmChIP
    A web server and database for cell type-specific ChIP-seq and ChIP-chip data query.