He is a member of the scientific team behind a large genome analysis pipeline running on the National Computational Infrastructure computer cluster in Canberra. To date, over 3000 exomes and 1000 genomes have been analysed, generated from samples representing a wide variety of complex human diseases, such as melanoma, lupus, diabetes, depression, and arthritis.
The aim of the computational system is to discover the underlying genetic mechanisms of such diseases. But, Field says, because the analysis of a single human genome generates 100GB of output, storing and distributing the results is problematic for most researchers.
“In order to work with sufficiently large numbers of samples required to unlock the genetic mechanisms of complex diseases we require access to large computational resources as well as access to robust and scalable methods for reliably distributing our results,” he says.
Many researchers working with genomics data lack the computational resources to store and analyse such large datasets on site at their own institutions. As a result, Field and his colleagues are currently collaborating with more than 50 research projects across Australia and internationally.
This is where the AARNet CloudStor service helps out. The large volumes of genomic data generated by the analysis process are transferred quickly and easily between ANU’s computational facility and the researchers.
“We rely heavily on CloudStor to distribute all our results and are always extremely impressed with fast loading times, intuitive interface, and extra security features,” says Field.
CloudStor is a file sharing and storage service designed and built to support data-intensive research collaborations. It is an on-net service for AARNet-connected institutions, providing individual researchers with 100GB of free storage. Group quotas are also available on request.
Image: This CloudStor heatmap shows users logging in across the globe during the 7 days to 12 May 2016.