How to get to Ha Giang from Hanoi Airport (Noi Bai)

I just spent an amazing three days motorbiking round the Ha Giang loop in northern Vietnam. Fantastic thing to do, just mind-blowingly, amazingly spectacular. However, I had a bit of a stressful time making the connection between my plane arriving into Noi Bai airport and the bus to Ha Giang, but it all worked…

Alternative ways of giving credit

It’s a truth universally acknowledged that for a research career to flourish you need to publish first-author papers. However, like many things concerning scientific publishing, this norm fails the ‘explain it to a non-biologist’ test. Non-biologist: So, how do you get credit for the papers you publish? Biologist: Well, if you are listed first…

TB incidence cartogram

I saw this amazing ‘natures-heartbeat’ cartogram, and thought it would be cool to have something similar for TB incidence. There don’t seem to be any TB cartograms already available, so I thought I would make my own. I used the cartogram package in R. After spending some time in the deeper circles of R dependency…

Causal inference and the spectrum of association studies

I’m reading an interesting article which Marc Lipsitch tweeted about: ‘The C-Word: Scientific Euphemisms Do Not Improve Causal Inference From Observational Data’ by Miguel Hernan. The main take-away messages for me are that almost all scientific studies are aiming for causal inference, but people working on non-intervention, non-randomised studies (aka association/descriptive/exploratory studies) are generally discouraged from…

Intro to bacterial genomics

Here, in the interests of ‘if you have to email it twice, write a blog’, is my high-level overview of what a bacterial genomics pipeline looks like. 1. Quality assess fastqs with e.g. FastQC, and visualise these across your dataset with MultiQC. If data is particularly bad, do quality trimming; if not, then don’t. 2. do species…

Salmonella genomic epidemiology exercise

Adapted by Lauren Cowley from original PHE training material by Philip Ashton and Tim Dallman. Answers at the end. Salmonella Enteritidis PT14b outbreak exercise. Bioinformatics training, interpretation of phylogenies. Prepared by the Gastrointestinal, Emerging and Zoonotic Infections department and the Gastrointestinal Bacteria Reference Unit, Public Health England, Colindale, London. Aim of Session: To develop…

Guideline to bioinformatics tools

Quality Assessment and Trimming. Trimmomatic (http://www.usadellab.org/cms/index.php?page=trimmomatic): Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170. A flexible read-trimming tool that will remove Illumina adapters, reads below a certain length and low-quality ends of the read. Seqtk (https://github.com/lh3/seqtk): Tool for processing sequences in the FASTA…

My time, your time, compute time

I have recently started using the high-performance cluster at the Sanger Institute, and it comes with an interesting quandary. You have to be explicit about the amount of RAM you request when you submit your job (the PHE cluster wasn’t like this). Then, that amount of RAM is assigned to your job and won’t…