Commit Graph

363 Commits

Author SHA1 Message Date
119f216907 Merge branch 'master' of gitea:museum-digital/MDNodaHelpers 2025-06-08 17:20:24 +02:00
25668b7b16 Ping and reconnect DB in fulltext sync for actors fulltext index 2025-06-08 17:19:47 +02:00
8a31cf216e Add shortened 100x A to list of blacklisted tags 2025-05-22 16:25:27 +02:00
ff474341ed Add iconclass terms BB, CC, DD to blacklist 2025-05-08 16:18:05 +02:00
1051e10732 Prevent ambiguous splitting of [0-9]{4}-[0-9]{2} 2025-05-06 22:32:00 +02:00
057cac0f1b Ensure 1903/1904 cannot be split 2025-05-05 17:05:47 +02:00
0053fbe030 Support splitting times like "1. Hälfte des 19. Jahrhunderts" 2025-04-28 17:00:32 +02:00
7a2856ffad Split times in more cases (300-20 BC, 300-4000 CE) 2025-04-08 15:18:32 +02:00
00638152cf Prevent splitting of non-existing exact dates (e.g. 31.04.XXXX)
Close #35
2025-04-08 03:48:04 +02:00
dba60dbce6 Fix order of split days and months within a single year BCE
Close #32
2025-04-07 18:32:14 +02:00
f84fe1bca5 Fix type error / reference to values now not consistently existing anymore
2025-04-06 22:56:36 +02:00
423959ac94 Stop early if autotranslation cannot proceed after validation 2025-04-05 00:11:03 +02:00
e8edb4a459 Time splitter: Handle first/second half
Close #31
2025-04-05 00:09:39 +02:00
8491b62a83 Validate against time errors in autogenerating translations for times
Close #30
2025-04-04 20:03:59 +02:00
bb2b1c2c32 Update NodaGroup 2025-03-13 00:30:33 +01:00
5054d3c62f Use more rigorous trimming in NodaConsolidatedNamesForPersinst 2025-03-10 04:18:00 +01:00
beba838c0d Correctly handle multibyte hyphens in XXXX-XXXX 2025-03-10 04:13:59 +01:00
54dd958073 See before 2025-03-10 04:05:00 +01:00
5b99304b5c Accept an additional type of hyphen / dash in time splitting 2025-03-10 03:59:44 +01:00
5cce98f15b Extend tests 2025-03-10 03:20:46 +01:00
5036c77f32 Extend test for getting actor ID by life dates + name 2025-03-10 02:18:28 +01:00
e95415be8f Add test for getting actor ID by name with life dates 2025-03-10 01:48:09 +01:00
5192781494 Use Wikipedia API for getting descriptions from Wikipedia rather than parsing HTML in Wikidata fetcher

Thanks @awinkler
2025-03-09 02:08:26 +01:00
d9d9f7fcdc Continue refactoring tests for time splitter to run provider-based 2025-02-24 14:02:42 +01:00
dbfa0df17f Begin restructuring NodaTimeSplitterTest to use data providers 2025-02-21 10:32:07 +01:00
3409ec7afe Begin adding autotranslation language CRH / Crimean Tatar
Some formatting is still unclear. See https://forum.museum-digital.info/d/52-additional-languages-for-translations-crimean-tatar/9
2025-02-18 17:51:36 +01:00
27ac3f255a Minor typing improvements 2025-02-15 13:36:50 +01:00
9d7d53a858 Disallow fetching from Wikidata disambiguation pages
Close #23
2025-02-13 22:37:17 +01:00
28f6db67ff Disable XML error warnings when parsing unclean inputs from Wikidata 2025-02-13 21:48:07 +01:00
2f3bc5f2fa Prefer Wikipedia page titles over Wikidata labels
Close #28
2025-02-13 21:38:13 +01:00
39362f537a Merge branch 'master' of gitea:museum-digital/MDNodaHelpers 2025-02-13 17:19:43 +01:00
de0357473a Make constant for test language in NodaWikidataFetcherTest public, allowing reuse 2025-02-13 17:19:06 +01:00
ef43270fb2 Map suffixes material and technique to their respective tag relation types
2025-02-13 14:04:38 +01:00
338e09f001 Add Kannada to list of languages fetched from Wikidata 2025-02-13 13:10:45 +01:00
4cf9eaf4fa Remove superfluous params passed to function 2025-02-13 13:10:30 +01:00
18438251a7 Add functions for getting IDs by any translated entry irrespective of the language
2025-02-12 17:15:19 +01:00
1cf0f9858a Add tests for loading translations in NodaWikidataFetcher 2025-02-12 16:02:04 +01:00
1d50027809 Make function getWikidataEntity public 2025-02-12 15:48:52 +01:00
d1cee17ef5 Add Telugu to list of languages to fetch in Wikidata fetcher
Close #24
2025-02-12 12:47:02 +01:00
baf7905e0b Map gender Q207959
Q207959 is androgyny, mapping is a preliminary solution
2025-02-03 09:41:16 +01:00
9bf14d7d91 Add search function for getting entries in NodaIDGetter across vocabs 2025-01-31 23:25:40 +01:00
a621534136 Update NodaBlacklistedTerms 2025-01-24 13:45:28 +01:00
51fe9a5e45 Cover more edge cases for splitting time names 2025-01-15 11:49:20 +01:00
9c2eaa2929 Allow splitting 1945-48 2025-01-15 10:35:35 +01:00
546c17031a Make NodaImportLogger more resilient, prevent error in case of duplicate import names 2024-12-12 12:43:11 +01:00
bf22f5541d Retrieve "displayed subject" relationship from suffix "<Motiv>", "[Motiv]" 2024-12-03 16:07:41 +01:00
e036d7881a Add missing strict typing in function params 2024-12-01 22:11:17 +01:00
d8db941485 Disallow tags of name "Nichtmünzliches" (de) 2024-11-24 16:08:14 +01:00
b7bb7364d4 Ensure duplicate time names can be parsed in NodaTimeSplitter (e.g. 1.1.2024-1.1.2024)
2024-11-20 10:02:10 +01:00
4dcd93b947 Better validate input JSON fetched from Wikipedia 2024-11-12 15:36:32 +01:00