Category: How to tutorials
-
Microbial ortholog gene clustering using real-dataset
Once we have assembled and annotated bacterial genomes, the next step after inferring phylogeny is usually to find and cluster orthologous genes. What are orthologs? Well, Genetics 101 tells us that they are genes evolutionarily shared between multiple taxa descended from a common ancestor. That means a series of (pretty much) reliable replications and cell divisions passed between these…
-
Hybrid Assembly of Bacterial Genome
This is Part 4 of the tutorial series: NGS Workflow for Genome Assembly to Annotation for Hybrid Bacterial Data. We’ll be using hybrid sequencing data (Illumina and Nanopore). This tutorial has five parts. Disclaimer: This post is a work in progress. This is the genome assembly and annotation workflow that I use for microbial genomics. Previously, I used…
-
Microbial De Novo Genome Assembly from Long-Read Data
This is Part 3 of the tutorial series: NGS Workflow for Genome Assembly to Annotation for Hybrid Bacterial Data. We’ll be using hybrid sequencing data (Illumina and Nanopore). This tutorial has five parts. Disclaimer: This post is a work in progress. This is the genome assembly and annotation workflow that I use for microbial genomics. Previously, I used…
-
Microbial De Novo Genome Assembly from Short-Read Data
This is yet another tutorial on microbial genome assembly. It is actually Part 2 of the tutorial series: NGS Workflow for Genome Assembly to Annotation for Hybrid Bacterial Data. In this series we’ll be using hybrid sequencing data (Illumina and Nanopore). This tutorial has five parts. Part 1: Downloading and preparing data. Part 2: Assembly with…
-
Programmatically Downloading Raw Data from NCBI
This is part of the tutorial series: NGS Workflow for Genome Assembly to Annotation for Hybrid Bacterial Data. We’ll be using hybrid sequencing data (Illumina and Nanopore). This tutorial has five parts. Disclaimer: This post is a work in progress. This is the genome assembly and annotation workflow that I use for microbial genomics. Previously, I used…
-
How to Locally Build and Test a Bioconda Program
Bioconda is a curated set of packages and programs useful in computational biology and bioinformatics. It’s similar to Anaconda or, more accurately, it is a Conda channel that provides platform-independent, bioinformatics-related packages. I came to fully appreciate the Conda package manager for Python and R (and some other command-line tools) once I started heavily analyzing biological…
-
Creating a Publication Quality Phylogeny Using ggtree
A decade ago, circa 2012-2013, I used MEGA5 to infer phylogeny using simple Neighbour-Joining methods, and used the figure generated by MEGA5 to present and publish my results. Later, when I started learning other phylogeny reconstruction methods like Maximum Likelihood (ML) and Bayesian (which does not draw the tree for you), I started to explore…
-
Leveraging Power of Parallelization in UNIX Scripts
I needed to convert a bunch of BAM files back into FASTQ files. I was using a regular for loop to load the files and convert them one by one, serially. After running the first batch of conversions, I realized that the process was too slow, and was not the best thing to do when…
-
Phylogeny from multi-locus sequences aka MLSA
For one of my Ph.D. projects, I had to generate a phylogeny from multi-locus sequence data. I often have to repeat similar analyses and need to go back to a previous workflow to check what I actually did. I’m sharing the protocol here mainly to help my future self, and maybe it will be useful to…