Since the project’s inception, we’ve subscribed over 30,000 users and sent well in excess of a petabyte of files; we’re storing well over 130 terabytes of working files and connecting several dozen large research groups internationally.
Why we built a tool for the long tail of researchers
Extreme tech will only ever be adopted by extreme technologists – unless you make extreme tech as approachable as an iPhone.
AARNet is in the business of providing extreme networking (currently 100 gigabits per second, or crudely “10,000 times consumer-grade ADSL”) to its constituency of Research & Education users. About 10 years ago, it began noticing that its extreme tech had extreme trouble reaching beyond the extreme technologists (particle physicists, astronomers, climate scientists) and into the hands of “less digitally obsessed” researchers.
We decided then that we needed to develop tools that would sit right with this long tail of research adopters, and chose to tackle three distinct but synergistic workflows: direct sending of bundles of files from one person to another, synchronisation of files between different devices owned by the same person, and on-demand sharing of files or directories between groups of users.
Inventions are never made in splendid isolation, and indeed in those same 10 years these use cases have become more familiar to people as commercial file sharing services have evolved – except our versions deal with terabytes of files and gigabits of speed, because that’s the scale of science data these days.
Open source componentry
We built the architecture to be as low-cost, as modular and as maintainable as possible.
The file-sending part of the software codebase was developed in collaboration with a number of overseas R&E networks (this is FileSender).
We then spent time sourcing optimal componentry from the open source world; the great majority of project effort was spent on sourcing, hardening and proof-of-concept trialling of componentry, never settling for 90% performance, constantly killing bits.
This project demonstrates the remarkable democratising power of the modern open source landscape: any competent entity with a vision can stack together readily available componentry and build a bespoke solution that 20 years ago would have cost millions to build and millions again in licences to operate. We thank in particular the following projects: ownCloud, Apache, HAProxy, CERN EOS, and MariaDB.
CloudStor is our first and most successful foray into “democratised tooling”, where simplicity and massive uptake are the goal, rather than super-specialist niche application. By providing these tools, we’ve helped many more research disciplines realise the power of networked science, which enables us to extract a much greater social dividend out of infrastructure that is ultimately paid for by tax dollars – and improved social dividend (through research) is the core of AARNet’s mission.
A CloudStor user story
The Basin GENESIS Hub project, led by the EarthByte Group at the School of Geosciences at the University of Sydney, is using global research network infrastructure, AARNet’s CloudStor service and Australia’s most powerful supercomputer to access, share and analyse huge datasets, to explain how sedimentary basins have formed and changed over hundreds of millions of years.
The project needed efficient ways to transfer very large files, for example to share large raw datasets, model outputs and visualisations with international colleagues. CloudStor has provided the solution.
Image: Since launching in 2014, CloudStor has subscribed more than 30,000 users and sent well in excess of a petabyte of files; it stores well over 130 terabytes of working files and connects several dozen large research groups internationally.