Cover
Copyright
Credits
About the Author
About the Reviewers
www.PacktPub.com
Table of Contents
Preface
Chapter 1: Python and the Surrounding Software Ecology
Introduction
Installing the required software with Anaconda
Installing the required software with Docker
Interfacing with R via rpy2
Performing R magic with IPython
Chapter 2: Next-generation Sequencing
Introduction
Accessing GenBank and moving around NCBI databases
Performing basic sequence analysis
Working with modern sequence formats
Working with alignment data
Analyzing data in variant call format
Studying genome accessibility and filtering SNP data
Chapter 3: Working with Genomes
Introduction
Working with high-quality reference genomes
Dealing with low-quality genome references
Traversing genome annotations
Extracting genes from a reference using annotations
Finding orthologues with the Ensembl REST API
Retrieving gene ontology information from Ensembl
Chapter 4: Population Genetics
Introduction
Managing datasets with PLINK
Introducing the Genepop format
Exploring a dataset with Bio.PopGen
Computing F-statistics
Performing Principal Components Analysis
Investigating population structure with Admixture
Chapter 5: Population Genetics Simulation
Introduction
Introducing forward-time simulations
Simulating selection
Simulating population structure using island and stepping-stone models
Modeling complex demographic scenarios
Simulating the coalescent with Biopython and fastsimcoal
Chapter 6: Phylogenetics
Introduction
Preparing the Ebola dataset
Aligning genetic and genomic data
Comparing sequences
Reconstructing phylogenetic trees
Playing recursively with trees
Visualizing phylogenetic data
Chapter 7: Using the Protein Data Bank
Introduction
Finding a protein in multiple databases
Introducing Bio.PDB
Extracting more information from a PDB file
Computing molecular distances on a PDB file
Performing geometric operations
Implementing a basic PDB parser
Animating with PyMOL
Parsing mmCIF files using Biopython
Chapter 8: Other Topics in Bioinformatics
Introduction
Accessing the Global Biodiversity Information Facility
Geo-referencing GBIF datasets
Accessing molecular-interaction databases with PSICQUIC
Plotting protein interactions with Cytoscape the hard way
Chapter 9: Python for Big Genomics Datasets
Introduction
Setting the stage for high-performance computing
Designing a poor human concurrent executor
Performing parallel computing with IPython
Computing the median in a large dataset
Optimizing code with Cython and Numba
Programming with laziness
Thinking with generators
Index