Interactive function that presents possible QID matches for a list of text strings and offers options for choosing between alternatives, rejecting all presented alternatives, or creating new items. Useful when a list of text strings may have missing Wikidata items or multiple potential matches that need to be manually disambiguated. Can also be used on lists of lists (see examples). For long lists of items, the process can be stopped partway through, and the returned vector will indicate where the process was stopped.
Usage
disambiguate_QIDs(
list,
variablename = "variables",
variableinfo = NULL,
filter_property = NULL,
filter_variable = NULL,
filter_firsthit = FALSE,
Q_min = NULL,
auto_create = FALSE,
limit = 10
)
Arguments
- list
a list or vector of text strings for which to find potential QID matches. Can also be a list of lists (see examples)
- variablename
type of items in the list that are being disambiguated (used in messages)
- variableinfo
additional information about items that are being disambiguated (used in messages)
- filter_property
property to filter on (e.g. "P31" to filter on "instance of")
- filter_variable
values of that property to exclude (e.g. "Q571" to filter out books)
- filter_firsthit
apply the filter to the first match presented, or only when alternatives are requested? (default = FALSE; note: TRUE is slower if the filter is not needed for most matches)
- Q_min
return only possible hits with QIDs above the provided value
- auto_create
if no match found, automatically assign "CREATE"
- limit
number of alternative possible Wikidata items to present when there are multiple potential matches
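To illustrate what "QIDs above the provided value" could mean for Q_min, here is a minimal sketch that compares the numeric part of each QID against a threshold. The helper `qid_above` is hypothetical and not part of the package, and both the strict comparison and the numeric form of the threshold are assumptions; the function's internal filtering may differ.

```r
# Hypothetical helper for illustration only; not part of the package.
# Assumes Q_min is numeric and that "above" means strictly greater than.
qid_above <- function(qids, Q_min) {
  # Strip the leading "Q" and compare the remaining digits as numbers
  n <- as.numeric(sub("^Q", "", qids))
  qids[!is.na(n) & n > Q_min]
}

# Keeps only the candidates whose numeric part exceeds 20000
kept <- qid_above(c("Q571", "Q22731", "Q147538"), 20000)
print(kept)
```

A threshold like this can be handy when only recently created items (which have higher QIDs) are of interest.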
Value
a vector of:
- QID
Selected QID (for when an appropriate Wikidata match exists)
- CREATE
Mark that a new Wikidata item should be created (for when no appropriate Wikidata match yet exists)
- NA
Mark that no Wikidata item is needed
- STOP
Mark that the process was halted at this point (so that output can be used as input to the function later)
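The four kinds of value described above can be separated with base R once a session has finished or been stopped. The sketch below uses a made-up result vector rather than real output; the names on the vector and the exact layout around "STOP" are assumptions for illustration.

```r
# Hypothetical result vector; real output comes from an interactive session.
# Naming the elements by input string is an assumption made for readability.
result <- c(Rock = "Q22731", Pop = "CREATE", House = NA, Jazz = "STOP")

is_qid    <- grepl("^Q[0-9]+$", result)            # a Wikidata match was selected
to_create <- !is.na(result) & result == "CREATE"   # a new item should be created
skipped   <- is.na(result)                         # no Wikidata item is needed
stop_at   <- match("STOP", result)                 # position of the halt, NA if none

summary_counts <- c(matched = sum(is_qid),
                    create  = sum(to_create),
                    skipped = sum(skipped))
print(summary_counts)
print(stop_at)
```

If `stop_at` is not NA, the same vector can be kept and passed back to the function later to continue from that point, as noted for STOP above.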
Examples
if (FALSE) { # \dontrun{
#Disambiguating possible QID matches for these music genres
#Results should be:
# "Q22731" as the first match
# "Q147538" as the first match
# "Q3947" as the second alternative match
disambiguate_QIDs(list=c("Rock","Pop","House"),
                  variablename="music genre")
#Disambiguating possible QID matches for these three words, but not the music genres
#This will take longer as the filtering step is slower
#Results should be:
# "Q22731" (the material) as the first match
# "Q147538" (the soft drink) as the second alternative match
# "Q3947" (the building) as the first match
disambiguate_QIDs(list=c("Rock","Pop","House"),
                  filter_property="instance of",
                  filter_variable="music genre",
                  filter_firsthit=TRUE,
                  variablename="concept, not the music genre")
#Disambiguating possible QID matches for the multiple expertise of
#these three people as list of lists
disambiguate_QIDs(list=list(alice=list("physics","chemistry","maths"),
                            barry=list("history"),
                            clair=list("law","genetics","ethics")),
                  variablename="expertise")
} # }