Prevent loading data from Wikipedia pages for specific wikidata pages #29
Open
opened 2025-02-13 22:40:41 +01:00 by jrenslin
·
5 comments
Reference: museum-digital/MDNodaHelpers#29
Sometimes a Wikipedia page tries to be more specific than Wikidata. The Wikidata item links to the target entity in almost all languages, but to a disambiguation page in others. In such cases, the data from Wikidata is altogether better than whatever Wikipedia offers.
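One way to picture the exclusion asked for in the title is a per-item blocklist consulted before any Wikipedia data is loaded. The following is only a sketch under assumptions: the function name `allowed_sitelinks`, the blocklist shape (QID mapped to a set of wiki keys such as `dewiki`), and the example identifiers are all hypothetical and not taken from MDNodaHelpers.

```python
def allowed_sitelinks(
    qid: str,
    sitelinks: dict[str, str],
    blocklist: dict[str, set[str]],
) -> dict[str, str]:
    """Return only the sitelinks that Wikipedia data may be loaded from.

    qid       -- Wikidata item ID, e.g. "Q42"
    sitelinks -- wiki key (e.g. "dewiki") mapped to the page title
    blocklist -- hypothetical structure: QID mapped to the wiki keys
                 whose Wikipedia page should be ignored for that item
    """
    blocked = blocklist.get(qid, set())
    return {wiki: title for wiki, title in sitelinks.items() if wiki not in blocked}


# Placeholder data for illustration only; not a real problem case.
example_blocklist = {"Q_EXAMPLE": {"dewiki"}}
example_sitelinks = {"dewiki": "Some disambiguation page", "enwiki": "Some article"}
remaining = allowed_sitelinks("Q_EXAMPLE", example_sitelinks, example_blocklist)
```

Keeping the blocklist per language rather than per item would still allow Wikipedia data to be used for the languages where the link is precise.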
From a Wikidata point of view, the only viable solution seems to be to ask for Wikidata URIs / QIDs instead of Wikipedia URLs. Every Wikipedia page has a Wikidata item attached, but that's not the case the other way round. As to the label (cf. #28), you can always use the Wikipedia labels if need be.
Well, point is that we can handle both (and need to be able to). But that's essentially a different issue. Here I'm concerned with cases where one Wikidata page essentially has two Wikipedia pages in one language. It's certainly an edge case, but essentially what I described above: Coming from the Wikidata item, all languages except for one (usually the German Wikipedia) reference a precise match in Wikidata. That single outlier references a disambiguation page.
@stefan
At least in Wikidata I can't seem to find an item with two Wikipedia pages in the same language: https://qlever.cs.uni-freiburg.de/wikidata/Zkub6F
There are, however, 180 cases where more than one Wikipedia page refers to the same Wikidata item (https://qlever.cs.uni-freiburg.de/wikidata/FbVWQW). Is that what you mean? But maybe @stefan has a concrete example?
Don't know if it is a different problem ...
Wikidata knows that the Wikipedia link in this example points to a disambiguation page: https://www.wikidata.org/wiki/Q487772. It is given in the main box for German, English, and French, and it is given in the Wikipedia link box (on the right) NOT for German, but for English and other languages.
Yeah, that one's the case already solved in #23, since the whole Wikidata item is marked as a Wikimedia disambiguation entity. The problem arises only if a Wikipedia disambiguation page is linked to a valid Wikidata entity (I chanced upon that only once, so it's really an extreme edge case, but we should keep it in mind).
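The remaining edge case could in principle be detected automatically. Here is a rough sketch, assuming the entity is available in the Wikidata JSON format (with `claims` and `sitelinks` keys) and that the caller supplies a predicate telling whether a given Wikipedia page is a disambiguation page. The names `instance_of`, `mismatched_sitelinks`, and `is_disambiguation_page` are made up for illustration; `Q4167410` is the Wikidata item for "Wikimedia disambiguation page".

```python
from typing import Callable

DISAMBIGUATION_QID = "Q4167410"  # Wikidata item "Wikimedia disambiguation page"


def instance_of(entity: dict) -> set[str]:
    """Collect the QIDs from the entity's 'instance of' (P31) statements."""
    ids = set()
    for claim in entity.get("claims", {}).get("P31", []):
        # Snaks of type "somevalue"/"novalue" carry no datavalue; skip them.
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and "id" in value:
            ids.add(value["id"])
    return ids


def mismatched_sitelinks(
    entity: dict,
    is_disambiguation_page: Callable[[str, str], bool],
) -> list[str]:
    """Return wiki keys whose page is a disambiguation page although the
    Wikidata item itself is not marked as a disambiguation entity."""
    if DISAMBIGUATION_QID in instance_of(entity):
        return []  # whole item is a disambiguation entity (the case from #23)
    return sorted(
        wiki
        for wiki, link in entity.get("sitelinks", {}).items()
        if is_disambiguation_page(wiki, link["title"])
    )
```

The predicate is left open on purpose. One option might be the MediaWiki API query with `prop=pageprops` and `ppprop=disambiguation`, but whether that fits the existing code base is an assumption.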