One-Year Internship Program in Computational Biology & Bioinformatics
Program Duration: One year, from September 16, 2019 – August 31, 2020
Number of slots: 10
Venue: Bio-Innovation Centre (BIC), Rajiv Gandhi Centre for Biotechnology, Trivandrum.
Application opens: July 29, 2019
Application deadline: August 31, 2019
Course Fee: ₹60,000
This one-year internship program is a platform for highly motivated students to explore bioinformatics through practical experience. It gives a solid grounding in bioinformatics through theory and hands-on training in methods and resources relevant to all major fields of biological research. The internship covers best strategies for bioinformatics analysis, computer programming, statistical analysis, data management and reproducibility. All participants will be closely mentored by RGCB faculty. Special invited lectures by distinguished scientists and academicians will be arranged.
The RGCB Academic Committee will screen all applications, and potential candidates will be invited for a personal (or online) interview. If more than 15 candidates are short-listed after screening, an online test will be conducted before the selection interview.
The total fee is ₹60,000. No certificate will be issued unless the curriculum is completed and the total fee is paid. The program fee covers admission, study materials, access to internal computational facilities and consumables used in the computational biology laboratory. It does not cover travel or local accommodation.
RGCB Hostel facility will be limited. Assistance will be provided to find suitable local accommodation if hostel rooms are not available.
Who should apply?
This internship course is aimed at people with a background in biological sciences who have little or no experience in bioinformatics. Applicants are expected to be at an early stage of their career, with an interest in developing their bioinformatics skills. Essential qualifications include a first-class bachelor’s degree in medical/engineering sciences or a master’s degree in any branch of life science. Previous knowledge of computer programming is not required for this program.
Submit your online application at www.rgcb.res.in/training. Please note that your application will not be considered without a Statement of Purpose (why do you want to train in this specialization?) and a resume.
The curriculum is divided into 12 core bioinformatics modules (Theory + Practical exercises) plus a 3-month final dissertation. The syllabus includes:
Module 1: Bioinformatics: The Big Picture, Open challenges, Web-Based & Command-Line Software culture, Introduction to the UNIX environment, Unix file system; Installing & executing programs in the Linux environment; Navigating your computer from the shell; Basic command line operations; Fundamentals of computer programming & Biostatistics – Perl, Python, R, shell scripting, Database development using MySQL, working with remote machines. Introduction to common text editors like gedit, nedit, emacs & vi, with special emphasis on basic vi editor commands.
Module 2: Biological data resources, access & management – Genomes across the tree of life, Major sequencing projects, Major centralized bioinformatics databases to store DNA, RNA & protein sequences. Major resources and services at NCBI, Web-based and command-line access to information. Navigating through major resources and services at NCBI; Overview of major web resources for the study of genomes: Ensembl, NCBI-Genome and UCSC genome browser. Basic programming in Python and Perl – Introduction to Perl variables (Scalar, Array and Hash) and Python variables (String, List, Dictionary, Tuple and Set) with examples & exercises.
Module 3: Biological sequence analysis – Homology, Similarity & Identity; Scoring matrices; EMBOSS tools; NCBI BLAST programs; Evaluation of significance of results using E-value and Bit score; Profile searches, HMMER, Sequence alignment programs. Different approaches to perform Multiple Sequence Alignment, Best strategies to perform pairwise and multiple sequence alignment. Multiple sequence alignment of genomic regions. Databases of Multiple sequence alignment. Basic control flow in Perl & Python; Use of conditionals and loops (if, if-else, if-elsif-else, unless, while, for and foreach) with simple data structures.
Module 4: Molecular phylogeny & Evolution – Principles of molecular phylogeny and evolution; Stages of Phylogenetic Analysis, Distance-Based, Character-Based & Model-Based Phylogenetic Inference; Maximum Likelihood (ML), Bayesian inference methods, PHYLIP, MEGA, Evaluation of phylogenetic trees; Phylogenetic networks.
Module 5: Advanced programming in Perl and Python – Complex data structures; Array of arrays, array of hashes, hash of hashes in Perl and list of lists, tuples & dictionaries in Python; Use of loops through complex data structures; Referencing and Dereferencing in Perl; Common useful Perl modules from CPAN; Useful Python libraries for Biologists.
Module 6: Genomics: Next generation sequence analysis (DNA) – Introduction to DNA Sequencing Technologies; Overview of Next-Generation Sequencing Data Analysis: From Generating Sequence Data to FASTQ; Quality control; Different genome assembly programs; Multiple read alignment software programs; The SAM format & SAMtools; Variant calling, VCF format & VCFtools; Interpreting variants; Visualizing & Tabulating NGS data; Storing Data in public repositories; Applications of NGS.
Module 7: Advanced programming in Perl and Python – Complex data structures; Array of arrays, array of hashes, hash of hashes in Perl and list of lists, tuples & dictionaries in Python; Use of loops through complex data structures; Referencing and Dereferencing in Perl; Common useful Perl modules from CPAN; Useful Python libraries for Biologists.
Module 8: Transcriptomics and Proteomics: Next generation sequence analysis (RNA) – Introduction to Microarrays and RNA-Seq: Data acquisition & Analysis. Microarray data analysis with NCBI-GEO2R/Bioconductor; RNA-Seq analysis using TopHat and Cufflinks; Functional annotation of microarray/RNA-Seq data. Proteomics: Protein analysis & prediction – Principles of Protein Structure (Primary, Secondary & Tertiary), Protein Data Bank (PDB), Protein structure visualization tools, Protein Domains and Motifs, SCOP & CATH Database; COG database; Basics of Protein Structure Prediction (Homology Modeling, Fold Recognition, Ab-Initio Prediction). Proteomic resources; Fundamentals of molecular docking; ChIP-Seq data analysis.
Module 9: Fundamentals of systems biology (Networks & Pathways) – Introduction to systems biology; Functional annotation of gene expression data; Biological data integration (NCBI Biosystems); Bioinformatics resources for Pathways, Networks, and their Integration (KEGG, REACTOME, MetaCyc); Protein-Protein interaction databases; Reconstruction of signaling pathways.
Module 10: R package – Introduction to the R package; Installation in Windows/Mac/Linux environments, basic commands to store and print variables; Use of commands like read.table, read.csv, write.table to read/write data in the R console. Basic statistics (mean, standard deviation, correlation coefficient and p-value) in R, Use of loops, operators and assignments in R, Generating simple plots on screen and/or in pdf/png/jpg files (publication-quality figures).
Module 11: Bioconductor in R; Bioconductor packages for NGS; Quality assessment (packages: qrqc, seqbias, ReQON, htSeqTools, TEQC, Rolexa & ShortRead), RNA-seq (packages: DEXSeq, EDASeq, edgeR etc.), Alignment (packages: Rsubread & Biostrings), Microbiome (packages: phyloseq, DirichletMultinomial, clstutils, manta & mcaGUI), Workflows (packages: ArrayExpressHTS, Genominator, easyRNASeq, oneChannelGUI & rnaSeqMap), Database (SRAdb).
Module 12: Genome analysis – Completed genomes: Viruses, Bacteria, Archaea & Eukaryotes; Comparison of prokaryotic genomes; Plant genomes; Major genome analysis projects; ENCODE project; Finding Genes in Eukaryotic Genomes; Human Genome project; A Bioinformatics perspective on Human Disease.
Final dissertation: Candidate can choose a six-month project from any of the on-going research at the Computational Biology & Bioinformatics Facility.
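As an illustration of the level at which the introductory programming topics start (Python variable types in Module 2, conditionals and loops in Module 3), here is a minimal sketch of a typical first exercise. It is not taken from the course material; the DNA sequence and the record name are made up for this example.

```python
# Illustrative only: the sequence and the name "example_seq" are invented.
sequence = "ATGGCCATTGTAATGGGCCGC"  # string

# list: split the sequence into codons (chunks of three bases)
codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]

# dictionary: count how often each codon occurs (for loop + if-else)
codon_counts = {}
for codon in codons:
    if codon in codon_counts:
        codon_counts[codon] += 1
    else:
        codon_counts[codon] = 1

# set: the distinct codons, duplicates removed
unique_codons = set(codons)

# tuple: a fixed (name, length) record
record = ("example_seq", len(sequence))

# while loop: index of the first "G" in the sequence
position = 0
while position < len(sequence) and sequence[position] != "G":
    position += 1

print(codons)
print(codon_counts)
print(sorted(unique_codons))
print(record, position)
```

The same exercise can be written in Perl with a scalar, an array and a hash in place of the string, list and dictionary.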
Morning lecture (9.30 to 10.30 am) followed by hands-on sessions & exercises until 5.00 pm.
Exam and Grades
An online exam will be conducted upon completion of each module. The final grade will be calculated based on module exams, lab activities (journal presentation, assignments, discussion etc.) and the final project. Grade A: >80%, Grade B: 70 – 80%, Grade C: 60 – 70%, Grade D: <60%
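The grade bands can be read as a simple lookup on the final percentage. The sketch below is one possible reading, not an official rule: the function name letter_grade is hypothetical, and since the bands share their boundary values (70% appears under both B and C), it assumes a shared boundary score earns the higher grade. Only exactly 80% goes to B, because Grade A is stated as strictly above 80%. The weighting of exams, lab activities and the project is not specified, so the function takes an already computed final percentage.

```python
def letter_grade(percent: float) -> str:
    """Map a final percentage to a grade band (hypothetical helper)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent > 80:       # Grade A: >80%
        return "A"
    if percent >= 70:      # Grade B: 70 - 80% (assumption: 70 counts as B)
        return "B"
    if percent >= 60:      # Grade C: 60 - 70% (60 is not below 60, so C)
        return "C"
    return "D"             # Grade D: <60%


for score in (85, 80, 70, 65, 59.5):
    print(score, letter_grade(score))
```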
For more information on the internship please contact:
Prof. Jagadeesh Chandran
Office of Academic Affairs
Rajiv Gandhi Centre for Biotechnology (RGCB)
Trivandrum, Kerala – 695014