Description
2nd Edition
by Tiago Antao (Author)
Discover modern, next-generation sequencing libraries from the Python ecosystem to analyze large amounts of biological data
Key Features
Perform complex bioinformatics analysis using the most important Python libraries and applications
Implement next-generation sequencing, metagenomics, automated analysis, population genetics, and more
Explore various statistical and machine learning techniques for bioinformatics data analysis
Book Description
Bioinformatics is an active research field that uses a range of simple-to-advanced computations to extract valuable information from biological data.
This book covers next-generation sequencing, genomics, metagenomics, population genetics, phylogenetics, and proteomics. You'll learn modern programming techniques to analyze large amounts of biological data. With the help of real-world examples, you'll convert, analyze, and visualize datasets using various Python tools and libraries.
This book will help you better understand how to work with a Galaxy server, which is the most widely used web-based bioinformatics pipeline system. This updated edition also includes advanced next-generation sequencing filtering techniques. You'll also explore topics such as SNP discovery using statistical approaches under high-performance computing frameworks such as Dask and Spark.
By the end of this book, you'll be able to use and implement modern programming techniques and frameworks to deal with the ever-increasing deluge of bioinformatics data.
What you will learn
Learn how to process large next-generation sequencing (NGS) datasets
Work with genomic datasets using the FASTQ, BAM, and VCF formats
Learn to perform sequence comparison and phylogenetic reconstruction
Perform complex analysis with proteomics data
Use Python to interact with Galaxy servers
Use high-performance computing techniques with Dask and Spark
Visualize protein dataset interactions using Cytoscape
Use PCA and Decision Trees, two machine learning techniques, with biological datasets
Who this book is for
This book is for data scientists, bioinformatics analysts, researchers, and Python developers who want to address intermediate-to-advanced biological and bioinformatics problems using a recipe-based approach. Working knowledge of the Python programming language is expected.
Table of Contents
Python and the Surrounding Software Ecology
Next-generation Sequencing
Working with Genomes
Population Genetics
Population Genetics Simulation
Phylogenetics
Using the Protein Data Bank
Bioinformatics pipelines
Python for Big Genomics Datasets
Other Topics in Bioinformatics
Machine learning in Bioinformatics