TL;DR
- CLIMB is fricking awesome
Last weekend, the MRC CLIMB initiative hosted a hackathon, with the broad aim of doing some cool stuff with the CLIMB resource, which Nick Loman has recently got up and running at Birmingham. There was lots of pizza, beer and bbq, as well as hacking.
One of the things we wanted to achieve was to set up our SNP calling pipeline so that it can be easily used by other people. CLIMB is the perfect place to do this, for reasons I will go into below.
What is CLIMB?
CLIMB is a cloud infrastructure for microbial bioinformatics. The cloud basically means that anyone with an internet connection can log on to a big ass server and create an ‘instance’, which is essentially your own computer in the sky. Normally, you have to pay Amazon or Google a not insignificant amount to use such things, but CLIMB gives it to you for free.
The really brilliant thing about the cloud is that a bioinformatician can set up a machine just right, install all the tools needed for their pipeline, and then create an ‘image’ of that machine. This image can then be loaded by anyone when they start up an instance. The tools get used and there are no install headaches – everyone wins!
As it stands, people still have to use the command line to interact with CLIMB, but the guys from the Genomics Virtual Lab will hopefully have a solution for that soon.
I thought you said SNPs?
So how does this relate to SNPs? Well, I have created an image of our SNP calling pipeline. Essentially, all this means is installing and configuring the requirements (bwa, GATK, various python libraries, postgresql) and the python scripts that wrap it all together. So instead of having to install the correct version of every dependency for our pipeline, you can just load an instance on CLIMB, base it on the phe-gastro-snapperdb image, and you have everything set up and ready to use. It is all a bit rough and ready at the moment, and not quite ready for prime time, but if anyone does use it, I would be interested to hear about their experiences.
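If you do fire up an instance, a quick sanity check is to confirm that the tools are actually on the PATH. Here is a minimal sketch of that idea in Python. The executable names in REQUIRED_TOOLS are my assumptions for illustration (GATK in particular may be shipped as a jar rather than a command), so adjust them to whatever the image really provides.

```python
import shutil

# Hypothetical executable names; the real image may expose these tools differently.
REQUIRED_TOOLS = ["bwa", "gatk", "psql", "python"]


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All expected tools were found.")
```

This only checks that something with that name is installed, not that it is the correct version.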
Is it useful in real life?
It sure is! For example, we can set up an instance with some of our strains and their SNPs in a SNPdb. A collaborator can then upload some of their own strains to their own CLIMB instance, which has our pipeline installed (by us), and run the pipeline, with the results going into the same SNPdb. Everyone retains control of their data and does their own analysis, and yet we get all the advantages of collaboration.
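To make the shared SNPdb idea a bit more concrete, here is a toy sketch in Python. It is not our actual schema or pipeline: the table layout, the function names and the strain data are all made up for illustration, and it uses an in-memory sqlite3 database as a stand-in for the postgresql database the real pipeline uses. The point is simply that two groups write variants for their own strains into one database, and either group can then compare strains across the whole set.

```python
import sqlite3


def create_snpdb(conn):
    """Create a toy variant table (hypothetical layout, not the real schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snps ("
        " owner TEXT NOT NULL,"
        " strain TEXT NOT NULL,"
        " position INTEGER NOT NULL,"
        " alt_base TEXT NOT NULL,"
        " PRIMARY KEY (strain, position))"
    )


def add_snps(conn, owner, strain, snps):
    """Store (position, alt_base) pairs for one strain, tagged with its owner."""
    conn.executemany(
        "INSERT INTO snps (owner, strain, position, alt_base) VALUES (?, ?, ?, ?)",
        [(owner, strain, pos, base) for pos, base in snps],
    )
    conn.commit()


def snp_distance(conn, strain_a, strain_b):
    """Count positions where the variant calls of the two strains differ."""
    def calls(strain):
        rows = conn.execute(
            "SELECT position, alt_base FROM snps WHERE strain = ?", (strain,)
        )
        return dict(rows.fetchall())

    a, b = calls(strain_a), calls(strain_b)
    return sum(1 for pos in set(a) | set(b) if a.get(pos) != b.get(pos))


# Made-up example data: one strain from us, one from a collaborator.
conn = sqlite3.connect(":memory:")
create_snpdb(conn)
add_snps(conn, "us", "strain_A", [(101, "T"), (2050, "G")])
add_snps(conn, "collaborator", "strain_B", [(101, "T"), (3300, "C")])
distance = snp_distance(conn, "strain_A", "strain_B")
```

The owner column is there so that each contribution stays attributable to whoever uploaded it, which is the "everyone retains control" part.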
Lingua franca
Cloud – computers in the sky! (actually Birmingham)
Instance – your own slice of the cloud, a ‘virtual machine’ where you can install programs and generally run amok.
Image – basically, a pre-configured virtual machine, with lots of useful stuff (hopefully) installed on it.
CLIMB – a special cloud for microbial bioinformatics, aren’t we lucky.
Shame that there’s nothing like this to do analysis on clinical samples for NHS molecular genetics.
Sounds great! I guess the catch is it only works for the people involved in this collaboration, right? You will not be able to share the pipeline/image you create on CLIMB with “strangers”? Or am I missing something 🙂
I mean, it makes sense… Birmingham can’t provide a free cloud to the world.
Hi Rolf, I’m not 100% sure what the access policy is for CLIMB, but I think it will be broadly available for microbial genomics. Especially for people who want to share tools on there, not just be users.
Also, as it uses the common OpenStack platform, it might be possible to transfer images from cloud to cloud.
Is there going to be anything like this as part of COMPARE?
I think so. EBI is supposed to provide some kind of platform. The exact details are kind of vague right now, but the idea is that you log on to the platform and get a virtual machine on which you can upload your own pipelines/tools and/or run the default ones.
I believe the idea is also that you are able to share pipelines and benchmark them in a standardized way, and when better tools and pipelines are developed they will then become the new default tools… that is at least the dream 😉
With that said, we haven’t seen anything specific yet to play around with. Think Guy mentioned they would have something in December.
And of course it would be nice if the different initiatives were able to exchange images, but we all know that it’s unfortunately often harder than it sounds.
Some good ‘decryption’ of Amazon Web Services lingo: https://www.expeditedssl.com/aws-in-plain-english