Before we begin...
Python is widely used in bioinformatics. Before diving into Python, let us examine some context: what tools are typically found in the "bioinformatics survival kit"?
- The Terminal, Shell and Command Line Interface. For a refresher on this topic, consult the Software Carpentry lessons on the Unix shell.
- Working with remote computers: transferring files and logging in to computers over the network.
- Text transformation: filtering and transforming files in FASTA, VCF, TSV and other formats. Python and Shell tools are good for this, as are some of the tools available via Galaxy (see the Text Manipulation section on most Galaxy servers).
- Plotting and doing statistics. R and Python both have options for these tasks: ggplot2 for R, and matplotlib and Altair for Python, for example.
- Dependency management: dependencies are tools and software modules that you need to use. This is a huge topic and there are two main approaches to know about:
- Scientific workflow languages and workflow systems that organise your work into reusable units. Some examples are:
- A software development environment:
- Familiarity with software version control and the git and GitHub (or GitLab) systems.
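As a small illustration of the text transformation item above, here is a minimal sketch of filtering FASTA records by sequence length using only the Python standard library. The function names `read_fasta` and `filter_by_length`, the length threshold, and the example sequences are all made up for illustration; real projects often use an established library for parsing instead.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_length):
    """Keep only records whose sequence has at least min_length characters."""
    return [(h, s) for h, s in records if len(s) >= min_length]


# Made-up example data standing in for a real FASTA file.
example = StringIO(
    ">seq1 short example\n"
    "ACGT\n"
    ">seq2 longer example\n"
    "ACGTACGT\n"
    "ACGT\n"
)

long_records = filter_by_length(read_fasta(example), min_length=10)
for header, sequence in long_records:
    print(f">{header}\n{sequence}")
```

To run the same filter on a file on disk, the `StringIO` object could be replaced with a handle from `open(...)` on a FASTA file of your own.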
This list can seem overwhelming. Remember that it is here to offer context; you are not expected to master all of these topics at once. Finally, some further collections of training material are:
- An introduction to skills for microbial bioinformatics
- The Swiss Institute of Bioinformatics (SIB) list of training material
- The ELIXIR TeSS training search engine
Python for Bioinformatics: Outline
We will be working with the book Python for Biologists by Dr Martin Jones. An early edition of Dr Jones' book is available as a PDF under a permissive license. Hard copies of the follow-on books, "Advanced Python for Biologists" and "Effective Python Development for Biologists", are available in the UWC Library. The Jupyter Lab interface will be used for working with the examples from this book, except where the subject matter focuses on writing standalone scripts.
Setting up your Python environment
The best way to install Python is using conda. There are two options for setting up your conda install: Anaconda, an all-in-one installer that installs Python and many other packages (including Jupyter Lab and a lot more), and Miniconda, a more compact installer that installs Python and the conda package manager and then leaves you free to install further packages yourself as you need them.
Please see the instructions on the page about setting up your Python environment.
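Once your environment is set up, a quick check like the following can confirm which Python version is running and whether Jupyter Lab can be found. This is only a sketch: the function name `check_environment` is hypothetical, and it assumes that Jupyter Lab is importable under the package name `jupyterlab`.

```python
import sys
from importlib.util import find_spec


def check_environment(packages=("jupyterlab",)):
    """Report the Python version and whether each named package is installed."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        # find_spec returns None when a top-level package cannot be found.
        report[name] = find_spec(name) is not None
    return report


report = check_environment()
for name, value in report.items():
    print(f"{name}: {value}")
```

If `jupyterlab` is reported as `False`, the package is probably not installed in the environment that is currently active.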