GLAM Wiki » Gaia Resources

I’m just back from a trip to Melbourne and Canberra to kick off a few new projects for Gaia Resources, to catch up with some friends and attend the Galleries, Libraries, Archives and Museums (GLAM) Wikimedia event in Canberra (GLAM-WIKI). I’ll eventually put up a couple of additional blog posts about the other projects, but I want to focus on the GLAM-WIKI event in this one.

Rather than try to explain the fun we had, I’ll simply point you at the Twitter feed, where you can browse to your heart’s content and see it for yourself (more and more, Twitter is becoming a solid part of my work practices).

I felt like a bit of an outsider at GLAM-WIKI. I’m not employed by a GLAM organization as a staff member, but Gaia Resources undertakes projects with a range of GLAMs around Australia. I don’t edit Wikipedia much, but use it occasionally in our projects as an additional source of information. I was observing these two groups rather than participating much of the time, but I think that “neutrality” has helped my understanding of the possibilities.

One of the big benefits of GLAM-WIKI for me was getting two days to think about how this can help the bioinformatics areas of Museums and Herbaria, where I have a lot of professional (and personal) interest. I drew a lot of diagrams and took a lot of notes during the time there, so here’s a summary of what I learned in two days of thought-provoking, fun discussions…

Question 0: What’s in a GLAM, and what’s in Wikipedia?
The GLAM information and the Wikipedia information are very different things. This sets the scene for much of the rest of the discussion points below, as context (hence Question 0).

GLAM information/data:
- On all objects in their collections (assuming they have the resources!)
- Authoritative data

Wikipedia information/data:
- On “notable” objects/things/people (not everything)
- Crowd-sourced (i.e. not necessarily authoritative)

So that’s the big difference I saw, but it is by no means a killer of more collaboration. I think it is a difference that both groups should keep in mind, so as to better understand each other. That learning process was interesting to watch during the event (and the tension disappeared to be replaced by enthusiasm, which was great to see).

Question 1: Does GLAM information belong in Wikipedia, Flickr and other “free” places?
GLAM-WIKI highlighted for me that there are benefits of putting collections data out “where the people are” so that they can interact with it. Seb Chan and Lynda Kelly have been talking about this for a while, so it was great to see them during the event.

As the public sees more information and data from GLAMs at their fingertips (through any means, really, but primarily here we are talking about putting it “on the web”), there was a recognition that they become interested, start interacting more with the GLAM organization and start building personal relationships with it (more about this in Question 2).

There is a whole discussion about free data, copyright, creative commons licences and so on to be had in this space. There were a bunch of people at the GLAM-WIKI event who are far more qualified to talk on this than I am, but there was quite a bit of support from attendees for this data to be made open, and I think there were some options for institutions worried about this. One thing that is important under any method is attribution: the source information has to be attributed as being from the GLAM (which can be done on most of these web sites, like Wikipedia and Flickr).

And something I didn’t think of until the event: a lot of the GLAM authoritative content is already out there on Wikipedia. Wikipedians who are interested in the “things” in a collection will take information about these objects from GLAM web sites and load them to a Wikipedia page or as a Wikimedia Commons image themselves. If you work in a GLAM and have an iconic object, try searching Google for it. Where do you appear in the ranking? Where does Wikipedia appear?

A couple of quick pre-canned examples:

Phar Lap which is at Museum Victoria – Google Search (wiki #1, MV #2)
G for George at the Australian War Memorial – Google Search (wiki #4, AWM #2)

Once a piece of data hits Wikipedia and is open to editing, it is no longer the authoritative source of information. That’s something to remember for when we get to Question 3.

Question 2: Will opening GLAM data to the public hurt GLAMs?
The problem I’ve often dealt with is the GLAMs having the means to put their data online. In the bioinformatics area, we’ve tried to help with HermesLite, and I hope that making it open source will help. I hope our current work on developing an open source collections management system will help as well. I don’t think there are any unresolved technical issues with putting data on-line.

However, no matter which tools we can create that will help, there are still many more issues about resources to digitize and capture that information that the GLAMs also have to face.

In addition, revenue loss is one concern from the GLAMs that I come across regularly, mainly in regards to bioinformatics data. I became convinced after hearing Seb Chan from the Powerhouse Museum (PHM) speak that there are fewer downsides than I thought to this. Seb had a few figures about the percentage of the PHM’s sources of revenue: 76% of their funds come from government. This led to a range of discussions about revenue, funding and profit – and as was pointed out with a grin by Tim Hart, GLAMs are not about making money (nor should they be).

Sources of revenue for a GLAM include:

Visitations,
Shops,
Venue hire,
Image sales,
Data sales,
Grants, and
Government (the 76% Seb mentioned)

So really, the amount of funding coming in from things like image and data sales is not a significant proportion of the budget for a GLAM (but any revenue is important).

So here’s the bit to consider. Seb’s experience with making a small subset of their images (1,200) freely available in a decent (not best quality) format has demonstrated an increase in their revenue from image sales. In the same way, I think I’m starting to see the same thing happen in my role at the WA Museum with data – publishing data through the Online Zoological Collections of Australian Museums (OZCAM) has led to more people being interested in the data that the WA Museum is the custodian of.

Engaging more people in the data, images, etc, and putting the information out there may also lead to more visitors. So, in the experience of PHM (and someone from Queensland Library echoed this in a comment after Seb’s talk), there was an increase in image sales, and there might also be more visitors to the PHM as a result (although that is hard to measure). So there should be no real loss of revenue, and possibly an increase, from putting data out there, which should enable GLAMs to continue to meet their financial Key Performance Indicators (KPIs).

Speaking of KPIs, there was a bit of discussion about KPIs in a business model session. Usually, GLAM KPIs are financial or research based – there are few (if any?) that are about access to information online. I wrote in my notes that this could be changed: GLAMs could have KPIs that include things like:

“how many records are online”,
“how many comments made online”,
“how many hits on the web site”, or
“how high in the page rank for our iconic objects are we in Google” (see previous searches!).

These KPIs would encourage making data more available; however, it does require changes to occur in the operation and management of GLAMs and their funding bodies. This is something that the gov 2.0 task force could look at? I’ll leave that for them to ruminate on.

Question 3: Does Wikipedia information belong in a GLAM?
This is the reverse of Question 1, and the answer in my mind isn’t as clear cut. Effectively, the GLAMs have to ensure that their data is the authoritative version.

However, as the public interact with GLAM data, some of them will find that they have some really valuable information about objects in a collection. PHM staff talked about the public improving the geo-locations of imagery. While I was wandering around the Australian War Memorial, I heard a guide explain that an unfinished painting of a soldier was identified by a visitor as her father, and a whole treasure trove of information about the painting was determined.

In these cases, though, do you take the person’s word for it at face value? There have to be some background checks on this information, which means curators have to review this data for veracity before they accept it and act on it. This could be an overhead for curators, but effectively this is a lot of what they already do when they consider a donation to their organisation.

This applies to Wikipedia information too: how do we know the information typed onto a wiki page is legitimate? We don’t, without further investigation. So it is possible to think of a wiki page as being like that curator-donor discussion, but preserved for when a curator has time to deal with it (which probably won’t happen for a while in the current resourcing climate for GLAMs).

If a Wikipedia page has something in it that a curator – the “arbiter of authority” for an object in the collection – decides is true and verifiable, then they can add it to their own information.
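The review step described above can be pictured as a simple queue: a suggestion from the public (or from a wiki page) sits as “pending” until a curator has time to check it, and only verified claims are added to the collection record. Here is a minimal sketch in Python; every class, field and function name in it is hypothetical and made up for this illustration, not taken from any real collections management system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"    # waiting for a curator to look at it
    ACCEPTED = "accepted"  # verified and added to the record
    REJECTED = "rejected"  # checked and not accepted


@dataclass
class Suggestion:
    # Hypothetical structure for a claim contributed from outside the GLAM.
    object_id: str
    claim: str
    source: str
    status: Status = Status.PENDING


@dataclass
class CollectionRecord:
    # Hypothetical, heavily simplified record for one collection object.
    object_id: str
    notes: list = field(default_factory=list)


def review(suggestion, record, verified):
    """Record the curator's decision; only verified claims reach the record."""
    if suggestion.object_id != record.object_id:
        raise ValueError("suggestion does not refer to this record")
    if verified:
        suggestion.status = Status.ACCEPTED
        record.notes.append(f"{suggestion.claim} (source: {suggestion.source})")
    else:
        suggestion.status = Status.REJECTED
    return suggestion.status


# Invented example data, for illustration only.
record = CollectionRecord(object_id="ART-001")
tip = Suggestion("ART-001", "Sitter identified by a relative", "visitor report")
review(tip, record, verified=True)
print(tip.status, record.notes)
```

The point of the design is that the GLAM’s own record only changes inside `review`, so the curator stays the arbiter of what counts as authoritative, while nothing contributed by the public is lost in the meantime.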

Question 4: So, how do we move forward?
I’ve tried to lay out these last three questions in the order that the GLAMs themselves would need to consider them, and these are my “short” answers:

- GLAMs should put their data on-line
- It won’t hurt – in fact, it could help
- You only need to take data back that you want to

So we need clear ways to move forward with this.

Putting data online?
In the bioinformatics field, there are tools to put data online, like HermesLite, and projects that have developed portals, like OZCAM and Australia’s Virtual Herbarium. As I’ve already mentioned, I don’t think there are any technical issues to putting data on-line, just ones relating to having the will to do so.

There are some lessons the rest of the GLAM sector could learn from these bioinformatics projects that are up and currently delivering data on-line. At the same time, there are a whole heap of projects out there in the rest of the GLAM community that I didn’t know about when some of these projects started, and I’m sure we could learn a lot from them, too.

I can only encourage the GLAM groups out there, such as Museums Australia, Collections Australia Network, Council of the Heads of Australian Faunal Collections (CHAFC), Council of the Heads of Australian Herbaria (CHAH) and others, to get together and start learning more from each other about how to put their data on-line. When the technical groups of CHAH and CHAFC started to get together, we learned a lot from each other in a short time. Collaboration works.

Hopefully I’ll see more GLAM events in the future. That would be a great start.

Getting data back?
Well, this is much more difficult. Rod Page had a go at trying to automatically harvest information from Wikipedia pages, and it wasn’t a big success. This is something that Wikipedia could take on board.

Right now, the only method that would easily and reliably work is to copy and paste data from the pages back into a collections management system in chunks. That’s not particularly sustainable. I’ve been wondering if Wikipedia could implement some automated feeds out of their pages or underlying databases. This is done in bioinformatics through methods including TAPIRLink XML feeds using the TDWG (Biodiversity Information Standards) standards such as Darwin Core.
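To give a feel for what a structured feed could look like, here is a minimal sketch in Python that writes records out as XML using Darwin Core term names. It is an illustration only: it does not implement the TAPIR protocol, the `records` and `record` wrapper elements and the `records_to_xml` function are hypothetical names I made up for the example, and the sample data is invented. Only the Darwin Core terms namespace and the term names themselves come from the TDWG standard.

```python
import xml.etree.ElementTree as ET

# Namespace for Darwin Core terms, as published by TDWG.
DWC_NS = "http://rs.tdwg.org/dwc/terms/"
ET.register_namespace("dwc", DWC_NS)


def records_to_xml(records):
    """Serialize dicts keyed by Darwin Core term names into one XML string.

    The <records> and <record> wrapper elements are hypothetical, chosen
    for this sketch; they are not part of Darwin Core or TAPIR.
    """
    root = ET.Element("records")
    for rec in records:
        node = ET.SubElement(root, "record")
        for term, value in rec.items():
            # Each field becomes an element in the Darwin Core namespace.
            item = ET.SubElement(node, f"{{{DWC_NS}}}{term}")
            item.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Invented sample data, for illustration only.
sample = [
    {
        "institutionCode": "EXAMPLE",
        "catalogNumber": "X0001",
        "scientificName": "Macropus fuliginosus",
        "decimalLatitude": -31.95,
        "decimalLongitude": 115.86,
    }
]

feed = records_to_xml(sample)
print(feed)
```

Because every field is named with a shared, published term, a collections management system on the receiving end can pick out `catalogNumber` or `scientificName` without anyone having to copy and paste.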

Schemas and standards are not easy things to generate and implement. I am currently involved with standards via my various volunteer roles with TDWG, and I know it’s not an easy job for any group. This is the sort of thing that GLAM groups should be making a concerted effort towards, I think: standards for all aspects of collections should be developed and implemented (I’m sure there are a few out there that are underway and being used already).

So there are some ways forward and suggestions for the GLAM-WIKI attendees to think about. That said, I go back to one of my earlier points: I’m not really in a GLAM or a WIKI group, and this is just what I got out of the couple of days. I’m sure there will be a lot more ideas from GLAM-WIKI attendees over the next few weeks, so keep an eye out for those too.

I would really welcome any discussion on the subject, and if there’s a lot, I’ll do a follow-up post at a later time.

Contact me via email or Twitter.
