5 useful things in Excel

There is a certain amount of snootiness amongst bioinformaticians when it comes to Excel. If a graph with the default Excel colour scheme is shown at a conference, there may be sideways glances and sniggers amongst the technorati. However, Excel remains the most commonly used tool for bioinformatics (citation needed). When lab staff first join our…

Interesting new model for data sharing

Mike Schatz has posted an interesting paper on bioRxiv on ‘The next 20 years of genome research’. In it, he argues that in future ‘it will become less and less practical to transfer data into these [NCBI/EBI] archives as they exist today’ and that ‘In its place, we will see the rise of federated approaches…

CLIMB hackathon outcome

TL;DR: CLIMB is fricking awesome. Last weekend, the MRC CLIMB initiative hosted a hackathon, with the broad aim of doing some cool stuff with the CLIMB resource, which Nick Loman has recently got up and running at Birmingham. There was lots of pizza, beer and bbq, as well as hacking. One of the things we…

Using GATK to call indels in bacterial genomes

TL;DR: If you are interested in calling indels with GATK, check out the details below. If not, don’t. So, this annoying guy asked me to add an analysis of indels* to a paper that has been itching to get off my desk for months. I finally got around to doing this, so thought I would…

Link postcode with constituency

TL;DR: If you are interested in linking postcode and constituency, see 2015.04.03.postcode_to_constituency_lookup.tsv.gz in this git repo. You can then link constituency with demographic info here. I was impressed by the Democratic Dashboard website recently. You pop in your postcode and it tells you lots of info about your electoral constituency, the demographic make-up, as well as the…

PhD viva advice

I was just asked for some advice on the PhD viva, so I have turned the email into a blog post. Examiners often start by asking you to place your work into the context of the existing literature and summarise your main findings in 5-10 minutes, so have your summary ready. The main prep I did was to…

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog. Here’s an excerpt: The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 10,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 4 sold-out performances for that many…

Lighter: better, faster, longer?

TL;DR? Lighter is an excellent sequencing read error correction tool, fast (90 seconds for 700 MB of unzipped FASTQ) and well engineered (install was completely painless). It significantly speeds up assembly – 20% in my quick benchmark using a Salmonella genome. It reduces the number of positions that have an AD ratio of <0.9, remember – every…