Python for Biologists: A Complete Programming Course for Beginners

Learning to program is one of the best investments that you can make for your research and your career. Python for Biologists is a complete programming course for beginners that will give you the skills you need to tackle common biological and bioinformatics problems.

Why learn programming?

Maybe you see colleagues writing programs to save time and deal with large datasets. Maybe your supervisor has told you that you need to learn programming for your next project. Maybe you've been looking at job ads and noticed just how many of them are asking for programming skills.

Table of contents

In chapter one, you'll learn why Python is a good choice for biologists and beginners alike. You'll also learn how to install Python for your operating system and how to set up your programming environment, complete with links to all the free software you'll need.

In chapter two, you'll learn how to manipulate text (including DNA and protein sequences) and how to fix errors in your programs. Exercises: calculating AT content, splicing introns.

In chapter three, you'll learn how to read data from files and write data to them. You'll also learn how to deal with file paths and the FASTA file format. Exercises: splitting genomic DNA, writing a FASTA file.

In chapter four, you'll learn how to process many pieces of data in a single program, along with more advanced tools for sequence manipulation. Exercises: trimming adapter sequences, concatenating exons.

In chapter five, you'll learn how to make Python even more useful by creating your own functions, including the best ways to test those functions in order to speed up development. Exercises: analyzing the amino acid composition of protein sequences.

In chapter six, you'll learn how to write programs that can make smart decisions about how to handle data and how to make your programs follow complex rules. Exercises: filtering genes based on multiple criteria.

In chapter seven, you'll learn about regular expressions, an incredibly powerful tool for working with patterns in text, and how to use them to search in DNA and protein sequences. Exercises: filtering accession names and calculating restriction fragment sizes.

In chapter eight, you'll learn how to store huge amounts of data in a way that still allows it to be retrieved very efficiently. This allows simplification of much of the code from previous chapters. Exercises: translating DNA sequences to protein.

In chapter nine, you'll learn how to make your Python programs work in harmony with existing tools, and how to polish up your programs so that they're ready for other people to use. Exercises: counting k-mers, binning DNA sequences by length.

About the author

Dr. Martin Jones has been teaching biologists to write software for over five years and has taught everyone from postgraduates to PIs. He is currently Lecturer in Bioinformatics at Edinburgh University.

Author(s): Martin O. Jones
Publisher: Createspace Independent Publishing Platform
Year: 2013
Language: English
Pages: 244

About the author
Preface
1: Introduction and environment
Why have a programming book for biologists?
Why Python?
Python vs. Perl
How to use this book
Exercises and solutions
Getting in touch
Setting up your environment
Installing Python
Running Python programs
Python 2 vs. Python 3
Text editors
Reading the documentation
2: Printing and manipulating text
Why are we so interested in working with text?
Printing a message to the screen
Quotes are important
Use comments to annotate your code
Error messages and debugging
Forgetting quotes
Spelling mistakes
Splitting a statement over two lines
Printing special characters
Storing strings in variables
Tools for manipulating strings
Concatenation
Finding the length of a string
Changing case
Replacement
Extracting part of a string
Counting and finding substrings
Splitting up a string into multiple bits
Recap
Exercises
Calculating AT content
Complementing DNA
Restriction fragment lengths
Splicing out introns, part one
Splicing out introns, part two
Splicing out introns, part three
Solutions
Calculating AT content
Complementing DNA
Restriction fragment lengths
Splicing out introns, part one
Splicing out introns, part two
Splicing out introns, part three
3: Reading and writing files
Why are we so interested in working with files?
Reading text from a file
Using open to read a file
Files, contents and file names
Dealing with newlines
Missing files
Writing text to files
Opening files for writing
Closing files
Paths and folders
Recap
Exercises
Splitting genomic DNA
Writing a FASTA file
Writing multiple FASTA files
Solutions
Splitting genomic DNA
Writing a FASTA file
Writing multiple FASTA files
4: Lists and loops
Why do we need lists and loops?
Creating lists and retrieving elements
Working with list elements
Writing a loop
Indentation errors
Using a string as a list
Splitting a string to make a list
Iterating over lines in a file
Looping with ranges
Recap
Exercises
Processing DNA in a file
Multiple exons from genomic DNA
Solutions
Processing DNA in a file
Multiple exons from genomic DNA
5: Writing our own functions
Why do we want to write our own functions?
Defining a function
Calling and improving our function
Encapsulation with functions
Functions don't always have to take an argument
Functions don't always have to return a value
Functions can be called with named arguments
Function arguments can have defaults
Testing functions
Recap
Exercises
Percentage of amino acid residues, part one
Percentage of amino acid residues, part two
Solutions
Percentage of amino acid residues, part one
Percentage of amino acid residues, part two
6: Conditional tests
Programs need to make decisions
Conditions, True and False
if statements
else statements
elif statements
while loops
Building up complex conditions
Writing true/false functions
Recap
Exercises
Several species
Length range
AT content
Complex condition
High low medium
Solutions
Several species
Length range
AT content
Complex condition
High low medium
7: Regular expressions
The importance of patterns in biology
Modules in Python
Raw strings
Searching for a pattern in a string
Alternation
Character groups
Quantifiers
Positions
Combining
Extracting the part of the string that matched
Getting the position of a match
Splitting a string using a regular expression
Finding multiple matches
Recap
Exercises
Accession names
Double digest
Solutions
Accession names
Double digest
8: Dictionaries
Storing paired data
Creating a dictionary
Iterating over a dictionary
Iterating over keys
Iterating over items
Recap
Exercises
DNA translation
Solutions
DNA translation
9: Files, programs, and user input
File contents and manipulation
A note on the code examples
Basic file manipulation
Deleting files and folders
Listing folder contents
Running external programs
Running a program
Saving program output
User input makes our programs more flexible
Interactive user input
Command line arguments
Recap
Exercises
Binning DNA sequences
Kmer counting
Solutions
Binning DNA sequences
Kmer counting