Thursday, March 30, 2017

#Wikidata - Librarians and Mrs Carla Hayden

As a group, librarians are not very visible. At the same time, librarians are the people who provided information before there was an Internet. In this day and age, they still take care that much of the published information is there for us and for the generations to come.

Mrs Carla Hayden will address a Wikipedia edit-a-thon in Washington, DC hosted by the Library of Congress and the US National Archives. Mrs Hayden is the Librarian of Congress. It is always fun to update Wikidata information that is in the news.

It is amazing how little information there is for librarians. Of the "Librarians of the year", there seem to be only two with a Wikipedia article. Anyway, adding information for Mrs Hayden is a privilege and adding information on many librarians is easy to do.

What I am not sure about is whether giving a lecture like the Jean E. Coleman Library Outreach Lecture may be seen as an award. Mrs Hayden gave this lecture twice.

#Quality - #DBpedia and Kappa Alpha Psi

Kappa Alpha Psi is a fraternity of students and alumni. There is a Wikipedia article in English, a Commons category and a Wikidata item.

The information about Kappa Alpha Psi at Wikidata is based on the Wikipedia article. Information was added to the items for the members. This was done because work on a related item showed that the influence of fraternities and sororities is considerable. Concentrating for a moment on Kappa Alpha Psi has only a secondary impact on the quality of what is of primary concern, but when this is done for three such organisations, it quickly affects thousands of notable people.

When people find it of interest to add information about a membership to a Wikipedia article, it has some impact. Having a category helps more to make the relevance of Kappa Alpha Psi visible. Adding this information to Wikidata is easy and it may show up in any language when membership information is part of a template.

DBpedia is a project similar to Wikidata. It harvests data from Wikipedias more consistently than Wikidata. Wikidata items are mapped to its internal items, making it possible to compare Wikidata with DBpedia.

When quality is an objective, when quality is to be improved effectively, the differences between DBpedia and Wikidata are one of the easier and more obvious starting points. For some Wikipedias, DBpedia updates are based on the RSS feed of the changes. So once a difference has been curated and changed in either Wikipedia or Wikidata, it results in an improved DBpedia entry and the desired improvement in quality. It does not need any math to understand this.

What we need is a tool that uses these differences as input for a subset that is of interest to a Wikidata volunteer. That might be Kappa Alpha Psi, the Black Lunch Table or whatever else can be defined with a query.
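
A minimal sketch of what the core of such a tool could look like, in Python. It assumes that the statements for the subset have already been fetched from both projects, for instance with a query, and are held as plain dictionaries keyed by item. The function name, the data layout and the sample values are all hypothetical; this is not an existing tool or API.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: item id -> property name -> set of values.
Statements = Dict[str, Dict[str, Set[str]]]
Difference = Tuple[str, str, Set[str], Set[str]]


def find_differences(wikidata: Statements, dbpedia: Statements) -> List[Difference]:
    """Return (item, property, only_in_wikidata, only_in_dbpedia) for every mismatch."""
    differences: List[Difference] = []
    # Walk over every item known to either side, in a stable order.
    for item in sorted(set(wikidata) | set(dbpedia)):
        wd_props = wikidata.get(item, {})
        db_props = dbpedia.get(item, {})
        for prop in sorted(set(wd_props) | set(db_props)):
            wd_values = wd_props.get(prop, set())
            db_values = db_props.get(prop, set())
            if wd_values != db_values:
                differences.append(
                    (item, prop, wd_values - db_values, db_values - wd_values)
                )
    return differences


# Made-up sample data; the item ids are placeholders, not real Wikidata items.
wikidata_subset: Statements = {
    "Q_example_1": {"member of": {"Kappa Alpha Psi"}},
    "Q_example_2": {"member of": set()},
}
dbpedia_subset: Statements = {
    "Q_example_1": {"member of": {"Kappa Alpha Psi"}},
    "Q_example_2": {"member of": {"Kappa Alpha Psi"}},
}

for item, prop, only_wd, only_db in find_differences(wikidata_subset, dbpedia_subset):
    print(item, prop, "| only in Wikidata:", sorted(only_wd), "| only in DBpedia:", sorted(only_db))
```

Each reported difference is a candidate for curation: a volunteer checks the sources and fixes whichever side is wrong.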

Sunday, March 26, 2017

#Wikidata - Gladys and Reginald Laubin

According to the documentation of the Capezio award, both Gladys and Reginald Laubin are awardees. The Capezio award is a dance award and it got some attention because a person of interest received the award in 2007. Wikipedia information was available until 2006.

Adding information for Mrs Laubin makes sense; she is as notable as her husband. She has her own VIAF registration and it completes the Capezio award information.

When you add an award and its awardees, some quality is expected. Adding what Wikipedia knows borrows from the sources at Wikipedia, but new information is authoritative when it is from the associated website. When you then seek later information, it becomes more fuzzy; it becomes less obvious. It may not even be correct.

That is, however, how the cookie crumbles; Wikidata, like Wikipedia, also relies on the interpretation of sources.

Friday, March 24, 2017

#Wikipedia - Professor Joseph Torgesen

The article on Professor Joseph Torgesen is a stub. The cool thing is that the information on a minimal article allows for improvements in the data at Wikidata. The author of the article included information on education and employment. This was done through categories.

PetScan was used and as a result 244 staff members of Florida State University and 107 alumni of the University of Michigan were added, including Mr Torgesen.

As Mr Torgesen is a professor and "must" publish, finding a VIAF registration was possible. Adding the {{authority control}} template to the article enriched the article. One fact is not in the article: Mr Torgesen was awarded the Samuel Torrey Orton award in 2006. This is why there was already an item in Wikidata for Mr Torgesen.

Thursday, March 23, 2017

#Wikipedia vs #Wikidata - Quality and low hanging fruit

When Wikipedia is to be the best, it has to understand and preserve its quality. When Wikidata is to be the best, it has to understand and preserve its quality. Both Wikipedia and Wikidata are wikis but their quality and how it manifests itself are utterly different. At the same time they intersect and this is where we find low hanging fruit.

In Wikidata we have "Author"s and subclasses of author. Many of them have a VIAF identifier and this means that libraries know about them. Information like VIAF is shown in the English Wikipedia when there is an {{authority control}} template. It shows nothing when there is nothing to show but it will update Wikipedia when the information is added to Wikidata.

The low hanging fruit:
  • English Wikipedia - All articles about someone who is known as an author of any kind get the template.
  • Wikidata - For all the items for someone who is known as an author of any kind we seek the VIAF identifier.
  • OCLC - All the libraries in the world will be updated with a link to Wikidata within a month. This will make it easy for a librarian to find Wikipedia articles in any language.
  • Internet Archive - It has a project called "Open Library" and it has freely licensed e-books. Wikidata includes Open Library identifiers. OCLC and OL have links combined with Wikidata identifiers. As these numbers increase, people in libraries or from Wikipedia could find authors with free books.
  • other Wikipedias - They could include VIAF and OL identifiers as well. Open Library has books in languages other than English.
We live in an interconnected world. Wikimedia quality is in not being on an island but increasing the reach and enabling our readers.
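
The Wikidata part of this list can be sketched as a query. The Python snippet below only builds a SPARQL query string for the Wikidata Query Service; it does not send it anywhere. It assumes that P106 is "occupation", P279 is "subclass of", P214 is the VIAF identifier and Q482980 is "author"; please verify these identifiers before relying on them. The function name is hypothetical, and on the live service a query this broad may well time out without further narrowing.

```python
def build_missing_viaf_query(limit: int = 100) -> str:
    """Build a SPARQL query for people whose occupation is author
    (or a subclass of author) and who have no VIAF identifier yet."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Braces are doubled because this is an f-string; SPARQL sees single braces.
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P106/wdt:P279* wd:Q482980 .        # occupation: author or a subclass of it (assumed ids)
  FILTER NOT EXISTS {{ ?person wdt:P214 [] }}    # no VIAF identifier yet (assumed id)
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


# Print the query so it can be pasted into the query service by hand.
print(build_missing_viaf_query(10))
```

The result of such a query is a work list: every item on it is an author for whom a VIAF identifier may still be found.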

Tuesday, March 21, 2017

#Wikidata and #activism

When you care about something, you want to make sure that when you do something, it has an impact. There are many ways a difference can be made: you can protest, you can write in a blog, you can write Wikipedia articles and you can try to connect things in Wikidata.

For Wikimedians like me, sharing the sum of all knowledge is why we are involved. As knowledge is key, it is important to make sure that facts are registered and that access to knowledge is enabled.

The problem is that it is not obvious how and where a difference can be made. When the BBC gives diversity a prominent place because of its 100 women program, it seems obvious that we will write articles about these women. It is however not the first time that the BBC has run this program. We have written articles for women celebrated in 2013, 2014, 2015 and 2016. But in what language are these articles written? How much are they read? How well connected are these women to universities, to political parties, to organisations, and what countries are they from?

For a Wikimedian these are interesting questions. For an organiser of editathons they are what measures success. Is this activism? Sure. Does it affect the legitimate concern of impartiality? Not really, as Wikimedia has always been about what people fancy to work on.

Saturday, March 18, 2017

#Wikidata - the #Rome Prize

The Rome Prize is given to a large number of American artists. It is awarded every year to 15 artists and 15 scholars; they stay for an extended period in Rome. The first awards were given in 1905.

The award winners are mentioned in many articles; when there is no article yet, there is a red link. New articles are written all the time, so problems can be anticipated.

The problem is in names: different people bearing the same name. When new articles are written, there is no consideration for these red links. When an article is written for a Rome Prize winner, he or she may be included in the category for Rome Prize winners and that works well.

Some will say that red links are bad. They have a point. However, it is all in the delivery. When there is no article, it does not follow that there is no information. The information could already be in Wikidata and I added a few statements for 2016 winners.

Authors, the #OpenLibrary, #Wikidata and libraries

The Open Library is part of the Internet Archive. It makes books available for you to read. That is awesome and that is why Open Library is a natural ally of the Wikimedia community.

At our end we can do more of the things that we do anyway and share what we do. The good news is that Wikidata has a CC-0 license. The people at Open Library can use everything that we do and they do not even have to bother to say thanks.

When we add more Open Library identifiers and VIAF identifiers to Wikidata, we connect them, us and all the libraries in the world. Yes, individual libraries may have different ways of spelling an author's name, but by using these connections disambiguation slowly but surely becomes a thing of the past for Open Librarians.

What will we have in return? All the books at Open Library of these authors become available to our readers and editors. We are already in the process of adding identifiers to Wikidata for Open Library. For all the authors that have been connected, we can provide our identifiers to Open Library. This helps them with their outreach and disambiguation.

Through Wikidata more and more authors become connected to VIAF. This allows the librarians of the world to share these freely licensed books with their readers. A clear win-win situation, don't you think?

Friday, March 17, 2017

#Wikimedia - Professor Chuck Stone, Tuskegee airman and member of Alpha Phi Alpha

Professor Stone is the founding NABJ President; he was included in the National Association of Black Journalists Hall of Fame in 2004 and he received the Congressional Gold Medal from President Bush.

The description for the Wikidata item for Mr Stone is "American air force officer". This will not change; it is based on a bot that at one time decided that this would do. The automated description is: "US-American journalist (1924–2014); National Association of Black Journalists Hall of Fame and Congressional Gold Medal; member of Tuskegee Airmen, Alpha Phi Alpha, and World Policy Council ♂" and the beauty is that this is updated as more information becomes available.

Consider the quality of the information for Mr Stone in Wikidata: today 10 statements were added to the item. He has been added to the hall of fame with many others, including some people Wikipedia does not know about. The World Policy Council is connected to Alpha Phi Alpha. The data is not complete; there is more to add.

When we consider quality, most of the data was added thanks to information available in the English Wikipedia article. Yet there is information available that could find its way back from Wikidata; how do we inform Wikipedia about the people who became part of the hall of fame, for instance? Quality for Wikidata is not in single items, it is in how it connects and how it is used. With this realisation we learn from where some say Wikidata and Wikipedia fail and achieve the success that our combined data offers.