Handwriting recognition is being used to transform the Sydney Stock Exchange collection, which sheds light on Australia’s economic and social past, into data that will be stored in CloudStor and made available to researchers and the community via the AARNet network.
University Librarian Roxanne Missingham says that once the information is available as data, researchers will be able to access, interrogate, categorise and manipulate it more easily to suit their needs.
“The Sydney Stock Exchange records tell interesting stories beyond economics; researchers can look at how local communities grow based on the amount of coal sold, the impact on social welfare, and on migration within the state; there are so many possibilities for cross-disciplinary research.
“But really important research like this gets lost unless we take the opportunity to make the information fully accessible.”
Missingham explains that the project is an example of the way ANU Library has flipped its thinking about collections.
Breaking down archive walls
While the first wave of digitisation involved images (facsimiles of an original object) made available on an access-controlled ‘digital shelf’, the new philosophy aims to break down the walls and make collections available to increasingly technology-savvy researchers.
“The project reflects a wider drive to get as many people as possible to engage with our collections, as what brings them to life is use by researchers,” Missingham said.
“We are also taking advantage of new technologies such as AI, data mining and data visualisation; by using tools like optical character recognition and mapping interfaces we can unleash a treasure trove of information and knowledge that can be used by researchers in different ways.”
Converting handwriting to data
For the Sydney Stock Exchange project, an image facsimile of the paper record will be stored in the university’s digital repository. Alongside this, a cross-disciplinary team is working to convert the handwritten text into digital text and data that will be uploaded to AARNet’s CloudStor, a file storage and transfer service for research and education.
AARNet underpins the connectivity and storage for the Sydney Stock Exchange project, providing high-bandwidth connectivity that supports the transfer of large volumes of data between ANU Library, CloudStor, and researchers across the globe.
CloudStor is specifically designed to help researchers across Australia share and access large volumes of data. With all data stored on the AARNet network and backed up in three locations, high-bandwidth, on-net access is available to virtually all of Australia’s universities and GLAM institutions and to researchers across the nation.
Missingham explains that CloudStor is ideally suited to storing research data because of its ease-of-use and efficient workflows.
“I’m not a technologist but from my point of view, everything happens by magic with CloudStor. It’s extremely simple in terms of logging in, the messaging is great, and it’s hyper-efficient.”
But beyond the network and storage, Missingham says that AARNet’s role as a technology expert is just as crucial to the project’s success.
“Our workflows and the way we do things need to be quite different. We’re dealing with a lot of new technologies and processes associated with liberating the information held within the collection.
“Doing this together with AARNet is a great partnership because of its enormous technical expertise. AARNet plays a wonderful role as an interpreter between technology, researchers and collections, and this is helping us find new ways to provide better access to resources.”
Where digitisation previously meant dealing with metadata and accessibility, Missingham and her team are now learning new languages around data and text mining, as well as working with new and diverse teams.
“Previously we often did all of the digitisation ourselves or worked with a couple of academics to help interpret the material. The Sydney Stock Exchange project is a genuine partnership between digital humanities technologists, IT people, library people and scholars, so it’s quite a different approach.”
Turn to the community
Between learning new skills and the time required to engage with researchers, transforming collections into research data is not a simple process. But Missingham recommends that collections managers turn to the support of AARNet and the community to get started.
“Don’t wait until you have fully-fledged collection management systems before you start digitising; here at ANU Library, a lot of the materials we are digitising wouldn’t be digitisable in another five years, often because the paper is breaking down.
“Don’t wait for fancy systems. Talk to potential users, pick up the telephone and talk to AARNet. Use the magic of the AARNet network and CloudStor, and the power of the group to put material up and learn through doing.”
Image: A record from the Sydney Stock Exchange collection (ANU Library).