Next Generation Sequencing and Data Analysis

This textbook provides step-by-step protocols and detailed explanations for RNA Sequencing, ChIP-Sequencing and Epigenetic Sequencing applications.

The reader learns how to perform Next Generation Sequencing data analysis and how to interpret and visualize the data, and gains an understanding of the statistical background of the software tools used.

Written for biomedical scientists and medical students, this textbook enables the end user to perform and comprehend various Next Generation Sequencing applications and their analytics without prior knowledge of bioinformatics or computer science.

Author(s): Melanie Kappelmann-Fenzl
Series: Learning Materials in Biosciences
Publisher: Springer
Year: 2021

Language: English
Pages: 230
City: Cham

Preface
Legend: How to Read this Textbook
Contents
Contributors
1: Next Generation Sequencing (NGS): What Can Be Sequenced?
What You Will Learn in This Chapter
1.1 Introduction
1.2 Biological Sequences
1.2.1 DNA
1.2.1.1 RNA
1.2.2 Protein
1.2.3 Other Important Features of the Genome
Take Home Message
Review Questions
Answers to Review Questions
References
2: Opportunities and Perspectives of NGS Applications in Cancer Research
What You Will Learn in This Chapter
2.1 Introduction: Using Genomic Data to Understand Cancer
2.2 Driver Mutations and Their Biological Mechanisms of Action
2.2.1 Oncogenes
Review Question 1
2.2.2 Tumor Suppressors
Review Question 2
2.2.3 Gene Fusions
Review Question 3
2.3 Sequencing in Cancer Diagnosis
2.4 Genome Sequences Can Reveal Cancer Origins
Review Question 4
2.5 Genome and Transcriptome Sequences Are Useful for Elucidating Cancer Biology
2.5.1 Sequencing of In Vitro and In Vivo Tumor Models Identifies Fundamental Biological Properties
2.5.2 Single-Cell DNA and RNA Sequencing
2.5.3 Exploring Intra-Tumor Heterogeneity Through Sequencing
2.6 Sequencing in Cancer Treatment
2.7 International Collaborative Efforts in Cancer Sequencing and Mutation Classification
2.7.1 The Cancer Genome Atlas (TCGA)
2.7.2 International Cancer Genome Consortium (ICGC)
2.7.3 Pan-Cancer Analysis of Whole Genomes (PCAWG)
2.7.4 Catalog of Somatic Mutations in Cancer (COSMIC)
2.7.5 ClinVar
2.8 Opportunities, Challenges, and Perspectives
Take Home Message
Answer to Question 1
Answer to Question 2
Answer to Question 3
Answer to Question 4
References
3: Library Construction for NGS
What You Will Learn in This Chapter
3.1 Introduction
3.2 Library Preparation Workflow
Take Home Message
Review Questions
Answers to Review Questions
References
4: NGS Technologies
What You Will Learn in This Chapter
4.1 Introduction
4.2 Illumina
4.3 Ion Torrent
4.4 Pacific Biosciences
4.5 Oxford Nanopore
4.6 NGS Technologies: An Overview
Take Home Message
Review Questions
Answers to Review Questions
References
5: Computer Setup
What You Will Learn in This Chapter
5.1 Introduction
5.2 Computer Setup for NGS Data Analysis
5.3 Installation
Take Home Message
References
6: Introduction to Command Line (Linux/Unix)
What You Will Learn in This Chapter
6.1 Introduction
6.2 The Linux/Unix File System
6.3 The Command Line Tool
Take Home Message
Review Questions
Answers to Review Questions
7: NGS Data
What You Will Learn in This Chapter
7.1 Introduction
7.2 File Formats
7.2.1 Basic Notations
Review Question 1
7.2.2 FASTA
7.2.3 FASTQ
Review Question 2
7.2.4 SAM
7.2.5 BAM
7.2.6 GFF/GTF
Review Question 3
7.2.7 BED
7.2.8 BedGraph
7.2.9 VCF
7.2.10 SRA (Sequence Read Archive)
Review Question 4
7.3 Quality Check and Preprocessing of NGS Data
7.3.1 Quality Check via FastQC
7.3.1.1 The Basic Statistics Module
7.3.1.2 Per Base Sequence Quality
7.3.1.3 Per Tile Sequence Quality
7.3.1.4 Per Sequence Quality Scores
7.3.1.5 Per Base Sequence Content
7.3.1.6 Per Base GC Content
7.3.1.7 Per Sequence GC Content
7.3.1.8 Per Base N Content
7.3.1.9 Sequence Length Distribution
7.3.1.10 Sequence Duplication Levels
7.3.1.11 Overrepresented Sequences
7.3.2 Preprocessing of NGS Data: Adapter Clipping
Take Home Message
Answers to Review Questions
References
8: Reference Genome
What You Will Learn in This Chapter
8.1 Introduction
8.2 Generate Genome Index via STAR
8.3 Generate Genome Index via Bowtie2
Take Home Message
References
9: Alignment
What You Will Learn in This Chapter
9.1 Introduction
9.2 Alignment Definition
9.2.1 Global Alignment (Needleman-Wunsch Algorithm)
9.2.2 Local Alignment (Smith-Waterman Algorithm)
9.2.3 Alignment Tools
9.2.3.1 STAR
9.2.3.2 Bowtie
9.2.3.3 Bowtie2
9.2.3.4 TopHat/TopHat2
9.2.3.5 Burrows-Wheeler Aligner (BWA)
9.2.3.6 HISAT2
Take Home Message
Review Questions
Answers to Review Questions
References
10: Identification of Genetic Variants and de novo Mutations Based on NGS
What You Will Learn in This Chapter
10.1 Introduction: Quick Recap of a Sequencing Experiment Design
10.2 How Are Novel Genetic Variants Identified?
Review Question 1
10.2.1 Naive Variant Calling
10.2.2 Bayesian Variant Calling
10.2.3 Heuristic Variant Calling
10.2.4 Other Factors to Take into Account When Performing Variant Calling
10.2.5 How to Choose an Appropriate Algorithm for Variant Calling?
10.3 Working with Variants
Review Question 2
10.4 Applying Post-variant Calling Filters
Review Question 3
10.5 De novo Genetic Variants: Population-Level Studies and Analyses Using Pedigree Information
10.6 Filtering Genetic Variants to Identify Those Associated to Phenotypes
10.6.1 Variant Annotation
10.6.2 Evaluating the Evidence Linking Variants Causally to Phenotypes
10.6.3 Variant Filtering and Visualization Programs
Take Home Message
Answers to Review Questions
Box: Genome Assembly
10.7 A Practical Example Workflow
References
11: Design and Analysis of RNA Sequencing Data
What You Will Learn in This Chapter
11.1 Introduction
11.2 RNA Quality
11.3 RNA-Seq Library Preparation
11.4 Choice of Sequencing Platform
11.5 Quality Check (QC) and Sequence Pre-processing
11.6 RNA-Seq Analysis
11.6.1 Reference-Based Alignment
11.6.1.1 Choice of Reference-Based Alignment Program
Review Question 1
11.6.2 De novo or Reference-Free Assembly
11.6.2.1 Choice of de novo Assembly Tools
11.7 Functional Annotation of de novo Transcripts
11.8 Post-alignment/assembly Assessment and Statistics
11.9 Visualization of Mapped Reads
11.10 Quantification of Gene Expression
11.11 Counting Reads Per Genes
11.12 Counting Reads Per Transcripts
11.13 Counting Reads Per Exons
Review Question 2
11.14 Normalization and Differential Expression (DE) Analysis
Review Question 3
11.15 Functional Analysis
Take Home Message
Answers to Review Questions
References
12: Design and Analysis of Epigenetics and ChIP-Sequencing Data
What You Will Learn in This Chapter
12.1 Introduction
12.2 DNA Quality and ChIP-Seq Library Preparation
12.3 Quality Check (QC) and Sequencing Pre-processing
12.4 Copy Number Variation (CNV) of Input Samples
12.5 Peak/Region Calling
12.6 Further ChIP-Seq Analysis
Take Home Message
Review Questions
Answers to Review Questions
References
Appendix
Library Construction for NGS
Example Protocol: TruSeq Stranded Total RNA LT Sample Preparation Kit (Illumina)
Example Protocol: TruSeq ChIP Library Prep Kit (Illumina)
NGS Technologies
Index