356 Commits

Author SHA1 Message Date
7a2856ffad
Split times in more cases (300-20 BCE, 300-4000 CE) 2025-04-08 15:18:32 +02:00
00638152cf
Prevent splitting of non-existing exact dates (e.g. 31.04.XXXX)
Close #35
2025-04-08 03:48:04 +02:00
dba60dbce6
Fix order of split days and months within a single year BCE
Close #32
2025-04-07 18:32:14 +02:00
f84fe1bca5
Fix type error / reference to values that no longer consistently exist
2025-04-06 22:56:36 +02:00
423959ac94
Stop early if autotranslation cannot proceed after validation 2025-04-05 00:11:03 +02:00
e8edb4a459
Time splitter: Handle first/second half
Close #31
2025-04-05 00:09:39 +02:00
8491b62a83
Validate against time errors in autogenerating translations for times
Close #30
2025-04-04 20:03:59 +02:00
bb2b1c2c32
Update NodaGroup 2025-03-13 00:30:33 +01:00
5054d3c62f
Use more rigorous trimming in NodaConsolidatedNamesForPersinst 2025-03-10 04:18:00 +01:00
beba838c0d
Correctly handle multibyte hyphens in XXXX-XXXX 2025-03-10 04:13:59 +01:00
54dd958073
See before 2025-03-10 04:05:00 +01:00
5b99304b5c
Accept an additional type of hyphen / dash in time splitting 2025-03-10 03:59:44 +01:00
5cce98f15b
Extend tests 2025-03-10 03:20:46 +01:00
5036c77f32
Extend test for getting actor ID by life dates + name 2025-03-10 02:18:28 +01:00
e95415be8f
Add test for getting actor ID by name with life dates 2025-03-10 01:48:09 +01:00
5192781494
Use Wikipedia API for getting descriptions from Wikipedia rather than parsing HTML in Wikidata fetcher

Thanks @awinkler
2025-03-09 02:08:26 +01:00
d9d9f7fcdc
Continue refactoring tests for time splitter to run provider-based 2025-02-24 14:02:42 +01:00
dbfa0df17f
Begin restructuring NodaTimeSplitterTest to use data providers 2025-02-21 10:32:07 +01:00
3409ec7afe
Begin adding autotranslation language CRH / Crimean Tatar
Some formatting is still unclear. See https://forum.museum-digital.info/d/52-additional-languages-for-translations-crimean-tatar/9
2025-02-18 17:51:36 +01:00
27ac3f255a
Minor typing improvements 2025-02-15 13:36:50 +01:00
9d7d53a858
Disallow fetching from Wikidata disambiguation pages
Close #23
2025-02-13 22:37:17 +01:00
28f6db67ff
Disable XML error warnings when parsing unclean inputs from Wikidata 2025-02-13 21:48:07 +01:00
2f3bc5f2fa
Prefer Wikipedia page titles over Wikidata labels
Close #28
2025-02-13 21:38:13 +01:00
39362f537a
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers 2025-02-13 17:19:43 +01:00
de0357473a
Make constant for test language in NodaWikidataFetcherTest public, allowing reuse 2025-02-13 17:19:06 +01:00
ef43270fb2
Map suffixes material and technique to their respective tag relation types
2025-02-13 14:04:38 +01:00
338e09f001
Add Kannada to list of languages fetched from Wikidata 2025-02-13 13:10:45 +01:00
4cf9eaf4fa
Remove superfluous params passed to function 2025-02-13 13:10:30 +01:00
18438251a7
Add functions for getting IDs by any translated entry irrespective of the language
2025-02-12 17:15:19 +01:00
1cf0f9858a
Add tests for loading translations in NodaWikidataFetcher 2025-02-12 16:02:04 +01:00
1d50027809
Make function getWikidataEntity public 2025-02-12 15:48:52 +01:00
d1cee17ef5
Add Telugu to list of languages to fetch in Wikidata fetcher
Close #24
2025-02-12 12:47:02 +01:00
baf7905e0b
Map gender Q207959
Q207959 is androgyny, mapping is a preliminary solution
2025-02-03 09:41:16 +01:00
9bf14d7d91
Add search function for getting entries in NodaIDGetter across vocabs 2025-01-31 23:25:40 +01:00
a621534136
Update NodaBlacklistedTerms 2025-01-24 13:45:28 +01:00
51fe9a5e45
Cover more edge cases for splitting time names 2025-01-15 11:49:20 +01:00
9c2eaa2929
Allow splitting 1945-48 2025-01-15 10:35:35 +01:00
546c17031a
Make NodaImportLogger more resilient, prevent error in case of duplicate import names 2024-12-12 12:43:11 +01:00
bf22f5541d
Retrieve "displayed subject" relationship from suffix "<Motiv>", "[Motiv]" 2024-12-03 16:07:41 +01:00
e036d7881a
Add missing strict typing in function params 2024-12-01 22:11:17 +01:00
d8db941485
Disallow tags of name "Nichtmünzliches" (de) 2024-11-24 16:08:14 +01:00
b7bb7364d4
Ensure duplicate time names can be parsed in NodaTimeSplitter (e.g. 1.1.2024-1.1.2024)
2024-11-20 10:02:10 +01:00
4dcd93b947
Better validate input JSON fetched from Wikipedia 2024-11-12 15:36:32 +01:00
c72ad51dda
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers 2024-11-11 09:11:35 +01:00
d6dea3e280
Remove use of SESSION in NodaWikidataFetcher 2024-11-11 09:11:15 +01:00
6f7ad13c4e
Add class NodaTagRelationIdentifier for parsing tag relation types from input tag names
2024-11-09 19:44:09 +01:00
48355a6a36
Identify uncertainty before brackets ("Berlin ? (Germany)" > "Berlin (Germany)" + Uncertain)
2024-11-09 18:42:18 +01:00
7cfe752c94
Handle commas when guessing time certainty 2024-11-09 15:40:27 +01:00
29ca05f552
Properly handle commas at the end of names when guessing certainty 2024-11-09 15:33:49 +01:00
eb371d4270
Ensure times can be split despite spaces at random points in given name 2024-10-23 18:02:23 +02:00