Open Access Code Breakers

If standards can be established across industries for emojis, books and music, the same can be done in the realm of cultural heritage and the visual arts

Rachel Wright
CultureTech

--

In the world of cultural heritage, many Open Access initiatives aim to connect siloed records that span galleries, libraries, archives and museums. The hope is that an interdisciplinary approach between scholars, technologists and entrepreneurs will surface new insights across a broad fabric of freely shared information. Although the vision may seem daunting or overly aspirational, the possibility of connectivity is not far-fetched.

Databases rely on the language of code: the translation of human-readable words and concepts into machine-readable identifiers. Think of that smiling emoji you can add to a text message or an email. You can confidently send 🙂 without knowing what type of device the recipient uses. That’s possible thanks to the little emoji’s unique code, its identifier. Tech companies around the world agree to follow the Unicode Standard so that text stays interoperable between devices. Your phone and computer both recognize 🙂 as the unique code point U+1F642.
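To make the idea concrete, here is a minimal Python sketch that asks the language's built-in Unicode tables about the emoji. The variable names are only illustrative, and UTF-8 is shown as one common way the code point travels between devices, not the only one.

```python
import unicodedata

emoji = "🙂"

# ord() returns the Unicode code point of a single character as an integer.
code_point = ord(emoji)
label = f"U+{code_point:04X}"

# The character's official name comes from the Unicode Character Database
# that ships with Python.
name = unicodedata.name(emoji)

# On the wire the character is sent as bytes; UTF-8 is one common encoding.
utf8_bytes = emoji.encode("utf-8")

print(label)             # U+1F642
print(name)              # SLIGHTLY SMILING FACE
print(utf8_bytes.hex())  # f09f9982
```

Any device that follows the same standard maps those bytes back to the same character, whatever artwork it chooses to draw for it.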

The book publishing industry follows a similar protocol. Libraries, publishers, archives and retail distributors rely on the ISBN (International Standard Book Number) as the standard identifier for books. This is why Amazon could bring book sales (and a marketplace of book sellers) to the internet. No matter who is selling the book, whether used or new, humans and machines can identify the correct book thanks to the cross-referenced ISBN code.

“At Harvard Art Museums, we record and share public identifiers as part of our constituent records. Using these identifiers, we signal the ability to link entities across databases. When a researcher identifies this sign, they are free to traverse sources to get more information, knowing that a record represents the same person across datasets. What they find in our record may be unique to Harvard, or may reinforce findings from another source. The point is that the researcher can be confident in navigating between multiple sources of information to support scholarship and analysis.”

Jeff Steward, Harvard Art Museums Director of Digital Infrastructure and Emerging Technology

Efforts to bring an international standard forward for people and organizations involved in creative activities emerged in 2010 with the creation of the International Standard Name Identifier (ISNI). A performer may have a range of “names” including given name, married name, public personas, pseudonyms or stage names. The ISNI ID links all of these names together as one entity. Humans and machines can agree that all of these names are related to the same person:

https://isni.org/isni/000000010782262X
Madonna (American singer, songwriter, and actress)
Ciccone (Maria Louis; 1958-)
Ciccone, Madonna Louise
Ciccone-Ritchie, Madonna
…and many more names
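An ISNI is 16 characters long: 15 digits followed by a check character (a digit or X) calculated with the ISO 7064 MOD 11-2 scheme. A record keeper can therefore catch many typing errors before an identifier is ever saved. The sketch below is one way to do that in Python; the function names are illustrative and are not part of any official ISNI tool.

```python
import re


def isni_check_character(first_15_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for digit in first_15_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_isni(isni: str) -> bool:
    """Return True if the ISNI is well formed and its check character matches."""
    # ISNIs are often displayed in groups of four, so drop spaces and hyphens.
    compact = isni.replace(" ", "").replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return isni_check_character(compact[:15]) == compact[15]


print(is_valid_isni("000000010782262X"))  # True  (the identifier listed above)
print(is_valid_isni("0000000107822620"))  # False (wrong check character)
```

A check like this only confirms that the identifier is well formed. It does not confirm that the identifier belongs to the intended person, which still requires looking up the record.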

Within the music industry, ISNI adoption enables streamlined royalty tracking and payments between artists, performers, songwriters, music labels and platforms such as YouTube and Apple. These codes travel with a song through cataloging and distribution across systems and platforms, and ultimately simplify how the individual artist is paid when we listen to their song. Capturing that unique ISNI ID as a part of the digital record clarifies and disambiguates the artist across systems: the correct performers and songwriters connected to specific songs, all tied back to contracts with deal terms and the accounting details for royalty calculations and payments.

If standards can be established across industries for language and emojis, books and music, the same can be done in the realm of cultural heritage and the visual arts.

For any organization, a small yet significant step towards sharing and interoperability is to begin recording unique machine-readable identifiers for authors, visual artists, scholars, and arts organizations. When two institutions capture the same identifier, they signal that their records describe the same person.

https://isni.org/isni/0000000121235624
Kandinsky, Vassily (Russian painter, 1866–1944)
Kandinsky, Vasili
Kandinsky, Vasili Vasilevich
Kandinsky (Wassily; 1866–1944)
Wassily Kandinsky (Russian painter)
…and many more names

It doesn’t matter if one institution refers to Vasily Kandinsky and the other spells his name as Wassily. The ID allows for flexibility at the level of name display, yet we can be assured each variant of the name is tied to the intended creator.
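A minimal Python sketch shows how little machinery this takes. The two museum lists, their record ids and their field names are hypothetical and invented for illustration; only the ISNI values come from the listings above.

```python
from collections import defaultdict

# Hypothetical constituent records from two institutions.
museum_a = [
    {"id": "A-101", "display_name": "Vasily Kandinsky", "isni": "0000000121235624"},
    {"id": "A-102", "display_name": "Unidentified artist", "isni": None},
]
museum_b = [
    {"id": "B-7", "display_name": "Wassily Kandinsky", "isni": "0000000121235624"},
    {"id": "B-8", "display_name": "Madonna", "isni": "000000010782262X"},
]


def link_by_isni(*collections):
    """Group records from any number of collections by shared ISNI."""
    groups = defaultdict(list)
    for records in collections:
        for record in records:
            # Records without an identifier cannot be linked this way.
            if record["isni"]:
                groups[record["isni"]].append(record)
    # Keep only identifiers that appear on more than one record.
    return {isni: recs for isni, recs in groups.items() if len(recs) > 1}


links = link_by_isni(museum_a, museum_b)
for isni, records in links.items():
    print(isni, [r["display_name"] for r in records])
# 0000000121235624 ['Vasily Kandinsky', 'Wassily Kandinsky']
```

The two spellings never have to be reconciled. Each institution keeps its preferred display name, and the shared identifier carries the link.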

Information and objects move quickly and efficiently through systems and supply chains when we can leverage shared language within technology. When data from multiple sources is woven together to create a whole that is greater than the sum of its parts, we call this a data fabric. Public identifiers act as the thread within our fabric, providing a machine readable language that can help researchers better connect, discover, and analyze relationships.

Using constituent records from more than 20 museums, researchers at CultureTech and Harvard Art Museums explored the impacts and success rates of matching records on artists across museums and public data sources. These institutions’ efforts to record and share unique, machine-readable identifiers, such as Wikidata IDs and Getty Vocabularies Union List of Artist Names (ULAN) IDs, allow for rapid connectivity across museums and public data sources.

Our research demonstrates that with a small range of data points about a constituent, successful auto-matching algorithms can be established within a knowledge graph to fuel collaborative research and analysis across institutions. Through our research creating and exploring a knowledge graph of linked constituent records, we’ve established methods to validate data accuracy, flag categories of data that need the attention of experts, and identify areas where expanding a category of data can yield dramatic gains in data enhancement.
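For readers who want a feel for what auto-matching on a few data points can look like, here is a deliberately simple Python sketch. It is not the algorithm used in the research described here. The choice of data points (name, birth year, death year), the record layout and the 0.8 similarity threshold are all assumptions made for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and turn 'Last, First' into 'first last'."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_accents.lower().split())


def is_probable_match(a: dict, b: dict, name_threshold: float = 0.8) -> bool:
    """Require identical life dates, then compare normalized names."""
    if a["birth_year"] != b["birth_year"] or a["death_year"] != b["death_year"]:
        return False
    similarity = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    return similarity >= name_threshold


# Hypothetical records that lack a shared identifier.
record_a = {"name": "Kandinsky, Vassily", "birth_year": 1866, "death_year": 1944}
record_b = {"name": "Wassily Kandinsky", "birth_year": 1866, "death_year": 1944}

print(is_probable_match(record_a, record_b))  # True
```

A rule this blunt will produce both false matches and misses on real data, which is why candidate matches near the threshold are the kind of records that call for expert review.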

Museums leading the charge to open up access to information can add public identifiers such as ISNI to help digital humanists, data scientists, developers and entrepreneurs connect information about artists confidently and fluently. Recording and sharing a standard ID code provides a simple yet powerful translation that enables smooth communication, increases accessibility, and amplifies interdisciplinary innovation.

Rachel Wright leads Business Strategy at CultureTech, where we’re on a mission to Open Up Art through technology.
