Managing archives with PROV

Over the last few months we have been working with the Public Record Office Victoria (PROV) to migrate their archival management from a legacy system into CollectiveAccess (CA). This migration forms part of a wider Digital Archive renewal program and is also an opportunity to implement an updated Access Control Model for archival data management.

While the project is still very much under way, we thought it was time to introduce the work we’re doing and share some insights from the project that may help others looking at archival data management.

Why CollectiveAccess?

Whenever you choose or recommend a system, the first question people usually ask is “Why choose System Y?” or, more often, “Why not choose System Z?” The answer in this scenario is that CA is highly configurable. It has a base of 11 content bundles that can be repurposed to manage all kinds of data models and configurations. Importantly, configuring these bundles to match data requirements does not cause issues with version updates. Further, it has a web services API, which means it can interact with other systems quite easily.

The data model we are supporting is novel and the system has to be interoperable with a range of other specialist systems. For these reasons, we believed CA was the best fit of the available packages that we often recommend to archives.

Preparing yields dividends

What makes this project so exciting for us is the preparation that’s gone into it from the client side. A lot of time and effort has gone into thinking about how they manage data, what the future requirements are, and what each system in their digital archive needs to perform. As a result, it’s a well-defined program of work. This project sees us working in the Archival Management System (AMS) space, so we have a clear idea of what it needs to do and how it needs to communicate with other components.

The preparation on the client side has included assessing the purpose of all individual data fields, what additional data needs to be captured in the future, and how to retain legacy data. This high level of preparation has helped the project run smoothly and has let us create a logical plan for approaching the migration and configuration of the system. In short, a huge thanks to PROV for all the planning and foresight!

Entry to the Victorian Archives Centre in North Melbourne

Build a base profile rapidly, then iterate on top of it

One way we got the process moving quickly was to develop a draft “rapid build” installation profile (this contains all the information about the fields, relationships, displays, report templates, etc.), and iterate against that on a content type by content type basis. The initial rapid build was very rough, probably a lot rougher than our team was comfortable with (sorry Mieke), but it allowed us to get moving quickly and have a base that covered the entire data model within a few weeks of starting the project.

Having a base to iterate from means that instead of trying to perfect the data model and migration scripts for a big bang release and testing round, we repeatedly re-run migrations and make minor changes, so we are constantly building and improving. Within a week we might re-run a full content import of a data subset, make half a dozen changes to the fields and layouts after that import, and re-run the import incorporating feedback at the start of the next week. All the while, the system remains up for testing and everyone can see the changes as we implement them.

To support this workflow, we’ve introduced a technical process that logs incremental changes to the data model so we can build “mini-profile” updates and not need to re-install the system and re-import the data each time we make a significant change (though this is sometimes still required). I personally recommend this approach, as it allows feedback to be reported and implemented in extremely tight loops and ensures that assumptions or issues are uncovered very quickly.
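
To make the idea of logging incremental changes a little more concrete, here is a minimal Python sketch. It is not our actual tooling and it does not use the CollectiveAccess profile format: the `Change`, `ChangeLog` and `build_mini_profile` names, the field codes and the settings are all hypothetical, chosen only to illustrate the pattern. Each data model change gets a sequence number, and a “mini-profile” is simply the ordered set of changes made since the last one applied to the running system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Change:
    """One logged change to the data model (hypothetical structure)."""
    seq: int                      # position in the log, starting at 1
    content_type: str             # e.g. "series" or "agency"
    field_code: str               # illustrative field identifier
    action: str                   # "add", "update" or "remove"
    settings: Dict[str, str] = field(default_factory=dict)


class ChangeLog:
    """Append-only log of data model changes."""

    def __init__(self) -> None:
        self._changes: List[Change] = []

    def record(self, content_type: str, field_code: str, action: str,
               **settings: str) -> Change:
        change = Change(len(self._changes) + 1, content_type,
                        field_code, action, dict(settings))
        self._changes.append(change)
        return change

    def since(self, last_applied_seq: int) -> List[Change]:
        """Return the changes not yet applied to the running system."""
        return [c for c in self._changes if c.seq > last_applied_seq]


def build_mini_profile(changes: List[Change]) -> Dict[str, List[dict]]:
    """Group pending changes by content type, preserving log order."""
    profile: Dict[str, List[dict]] = {}
    for change in sorted(changes, key=lambda c: c.seq):
        profile.setdefault(change.content_type, []).append({
            "field": change.field_code,
            "action": change.action,
            "settings": change.settings,
        })
    return profile


# Example: three changes are logged; the system already has change 1 applied.
log = ChangeLog()
log.record("series", "date_range", "add", datatype="DateRange")
log.record("agency", "function", "add", datatype="Text")
log.record("series", "date_range", "update", label="Date range of contents")

mini = build_mini_profile(log.since(1))
```

In this sketch `mini` holds only the second and third changes, so the update can be applied to the existing installation without rebuilding it. A change that alters how existing data is stored would still call for a full re-install and re-import, as noted above.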

Above I’ve shared a few brief insights into a project that’s now well underway. I do look forward to writing a few more as the project progresses and we get into the meaty components of the big item level imports and configuring the displays of relationships between all the content types.

If you would like to know more about how Gaia Resources can help you with your metadata or with system choice, design and implementation, feel free to get in touch with me directly, or start a conversation with us on Facebook, Twitter or LinkedIn.

