At Gaia Resources, we have been working with a range of archives over the 20 years we’ve been operating. In the next few months - and leading up to some big conferences such as iPRES and the ASA at the end of the year - we thought we’d put forward some of our thoughts and experience on various aspects of the digital side of archives.
Let’s start at the beginning: choosing your digital infrastructure.
At the very highest level you have two choices for digital infrastructure: either you go with “on-premise” or you go with “the cloud”.
On-premise
When we talk about on-premise (or “on-prem” as you’ll hear it shortened to), you’re using your capital budget to purchase your own hardware - servers, disk space, backups, networking gear, the works. This is something that we used to do at Gaia Resources back in the day, with our own server racks in our office, including ensuring that someone took the backup tapes offsite each day for storage.
A lot of smaller archival institutions have very minimal on-prem infrastructure - anything from on-site servers and network drives, through Network Attached Storage (NAS) boxes, to hard drives bought from the local stationery supplier. If you’re at the far end of that range, with archival data on separate physical hard drives scattered around the office, then you’ve definitely got a problem. These drives can fail, and if they’re not backed up anywhere else then anything on them is lost. This is the stuff of nightmares from a digital infrastructure perspective - but more on what we can do about that later.

When you go with on-prem hardware, you also need to factor in the costs of running and maintaining it out of your operating budgets. This means that as hardware fails (and it does) you need to be able to replace it - and more importantly, you need to know when it fails! So you are committing not only to having the hardware within your organisation, but also to having the people who can manage it.
On-prem options can be attractive if you’ve got larger capital works budgets, but if you do this without ongoing operational support and budget to match, you are going to end up with a real problem on your hands as the hardware ages and starts to fail - you have to maintain both the hardware and the people that are managing it.
The cloud
The cloud offers some nice solutions to this. The cloud is basically the same hardware that we mention above for the on-prem situation - except instead of buying it with your capital budget, you’re leasing it from a provider that has it in a data centre, so it comes out of your operating budgets. You will also still need to have staff on board that can help manage this - the cloud providers do not provide “do-ers”, just infrastructure.
The big companies that provide cloud services - namely Microsoft, Amazon, and Google - are global players in this market. That raises a range of concerns, especially for archives, and especially at the time of writing, when there is a lot of uncertainty over the US market. The biggest concern has traditionally been where your data is stored geographically (i.e. onshore in Australia) - and thankfully most of the cloud providers now have a range of options to choose from in Australia. The setup of their “availability zones” - basically geographically close clusters of physical data centres - means that they can also keep multiple copies of your data across multiple data centres, giving your system a far greater level of resilience than the single server you might have on-prem.
As part of the arrangements with cloud providers, you don’t have to worry about the hardware getting rusty and old; these cloud providers have impressive maintenance cycles that mean that they swap out older hardware before it becomes a problem - and they have a few layers of abstraction from the actual metal that means for the most part, you don’t even notice.
Our journey
It might be worth explaining the journey that Gaia Resources went on to show the thinking and decision points that have helped us become an industry leader. We moved from on-prem to the cloud over a few jumps, and for different reasons, one of which might surprise you - sustainability.
Like anyone running their own metal, we had times when people had to drive to the office in the middle of the night to do some hardware resets or repairs. That wasn’t fun for a small business; on top of which we’d had to spring for a pretty amazing internet connection to our office at that time. Running our own metal was painful: buying parts was expensive, those 2am runs to the office weren’t fun at all, and we had to manage the physical security around the servers as well. We were maintaining a skill set in house that we only needed rarely, and at very inconvenient times - this wasn’t making sense.
We decided that we’d move from on-prem to hosted infrastructure - our servers were trucked out to a data centre in Perth, which we had chosen because of its sustainability ethos, with the centre running fully on solar power, with backups and the like. We didn’t have to worry as much about the servers, as the data centre provided remote hands to do resets if needed. That move solved a few problems at once - it improved the sustainability of our digital footprint, let the team members relax a bit (by stopping those 2am server runs), and provided very strong physical security.

When we decided to move into the cloud, we did our research around the sustainability part of the cloud first. This was something that we were significantly worried about, but we were pleasantly surprised to find a lot more sustainability information on the providers than we expected - and based on this we chose to partner with Amazon Web Services (AWS), who had the best sustainability initiatives at the time - green power, sustainable water use, local economic inputs, the works.
Another reason we started our journey with AWS was that they had the best coverage of data centres across the Australian jurisdictions that we really needed, especially for archives. Since then, we’ve decided that we need to be more agnostic about our infrastructure, and nowadays we deliver systems on top of either AWS or Microsoft Azure infrastructure, depending on what best suits the system and client.
So our journey started at on-prem, went through hosting in other data centres, right through to full cloud hosting - we’ve used them all.
So what suits an archive in 2025?
Things seem to come and go in cycles; with the turbulence and tech-bros in the US where most of the cloud providers are headquartered, there has been a bit more discussion about what is a resilient and sustainable solution for archives and other collecting institutions.
With all that in mind, I think the cloud is still the best solution for an archive, provided that:
- You can afford it (remember this is coming from your operating budget, not capital expenditure),
- You utilise Australian-based data centres (so that you are ensuring that the archival data stays onshore), and
- Ideally, you utilise more than one provider to give yourself redundancy (at the very least, keeping backups with a second provider).
This is the approach that we’ve worked really hard on with the Queensland State Archives, where we have delivered their archive as a service, in the cloud, for coming up on seven years.
For those of you who were still wondering about those hard drives - get them backed up to the cloud in a small “storage only” account, and do it soon. This is a really easy way to create a longer term backup, and the cost of storage is relatively low - but more importantly it removes the “single point of failure” when that hard drive gets wiped from sitting on a magnet, or baked in the sun coming in the window. Depending on how cluttered your office is, some of the cloud providers’ specific “big data” upload options, like Microsoft Azure Data Box or AWS Snowball, are potential ways to get started.
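If you do start copying those drives to the cloud, it can help to record a checksum for every file before you upload, so that you can later confirm the cloud copy matches what was on the drive. The Python sketch below is one possible way to do that, not a prescribed method: the function names (`sha256_of_file`, `build_manifest`, `write_manifest`) and the manifest layout are our own assumptions for illustration, and it uses only the Python standard library rather than any cloud provider’s tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root, with forward slashes) to its digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = sha256_of_file(path)
    return manifest


def write_manifest(manifest: dict[str, str], destination: Path) -> None:
    """Write one '<digest>  <relative path>' line per file (hypothetical layout)."""
    lines = [f"{digest}  {name}" for name, digest in sorted(manifest.items())]
    destination.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

To use it, call `build_manifest` on the folder where the drive is mounted and pass the result to `write_manifest`, then keep the manifest file alongside the uploaded copy. Re-running `build_manifest` against a downloaded copy and comparing the two results is a simple way to spot files that were missed or damaged along the way.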

The considerations you need to think about when deciding on your infrastructure are varied (e.g. what’s the projected size of your collection, are you also digitally preserving the files that you digitise*, etc.) but if you stick to the three general principles above then the cloud can provide an excellent infrastructure for your archive.
If you want to know more about infrastructure and archives from our own journey, then reach out for a chat, or start a conversation with us on one of our social media channels, or drop me an email.
Piers
* This is a bugbear of mine: digitisation IS NOT digital preservation. More on that to come later.
Photo credit for featured image - Photo by <a href="https://stockcake.com/i/digital-infrastructure-corridor_1531492_1174295">Stockcake</a>