287 Commits

Author SHA1 Message Date
4582f6a697
Fix another edge case in time splitter 2023-11-14 03:32:17 +01:00
54a30e683e
Add class for loading info from distinctly_typed_strings table 2023-11-13 00:11:56 +01:00
1a7dbcd6f6
Fix edge cases in time splitter where inputs start with many digits but
are not dates
2023-11-07 00:27:20 +01:00
53c645b132
Add "vermutl." to list of uncertainty indicators 2023-10-28 21:17:54 +02:00
95de1615ef
Identify, parse and remove some more uncertainty indicators 2023-10-27 19:06:08 +02:00
bbbc84015b
Fix handling of misassigned lcsh / loc links in NodaWikidataFetcher 2023-10-18 02:46:11 +02:00
d55361e29b
Add function to check if a time name is blacklisted 2023-10-18 01:54:40 +02:00
37715bc3e8
Support BCE / CE times 2023-10-15 19:20:16 +02:00
9942c58b12
Improve parsing of LOC / LCSH from Wikidata 2023-09-29 16:20:53 +02:00
0a18449e06
Re-enable infix length in search indexes 2023-09-17 10:59:01 +02:00
efc67b57d3
Remove infix length, increase memory consumed by search indexes 2023-09-17 00:06:42 +02:00
835da05c38
Use wikidata description as fallback if wikipedia description is not
parsable in Wikidata fetcher

Close #16
2023-09-01 12:43:24 +02:00
12a7937218
Comment out debugging lines in NodaWikidataFetcher 2023-08-31 16:11:37 +02:00
a68a03e628
Improve wikidata fetcher 2023-08-31 16:09:21 +02:00
107a4cd640
Improve NodaWikidataFetcher's loading of descriptions
Close #15
2023-08-31 15:38:12 +02:00
0b5d5bdd12
Add functions for getting main synonym in list of synonyms 2023-08-31 03:29:04 +02:00
05fb965d8c
Add class NodaLinkedEntityGetter for getting linked entries 2023-08-30 17:39:25 +02:00
2720adf9ed
Limit linking norm data repositories via NodaBatchInserter to those
applicable for a given target vocabulary
2023-08-29 20:14:15 +02:00
67f7bf9fab
Add new functions for linking norm data repositories in batch and use
them in Wikidata fetcher
2023-08-29 17:32:22 +02:00
f27d0900ae
Further modularize syncing of tags with fulltext search index 2023-08-15 15:55:48 +02:00
cb6d0d7b06
Add class NodaDbAdmin 2023-08-15 14:42:07 +02:00
831dbca091
Fix indentation in comment 2023-05-24 03:38:22 +02:00
5906ddd97a
Add additional disallowed time names 2023-04-27 17:44:34 +02:00
574c9cf005
Add "o. D." (with spaces) to list of banned time terms 2023-04-17 18:41:43 +02:00
b6a5b44103
Add "vermutlich um" to list of uncertainty prefixes for time 2023-04-17 00:45:06 +02:00
838a991256
Expect new class MDNodaLink for parameter in NodaIDGetter 2023-04-16 02:08:14 +02:00
d63f811367
Add "Ca. " as an uncertainty prefix for times 2023-04-14 22:56:55 +02:00
c5a7a62eb0
Add "Vermutlich" as an uncertainty indicator for places 2023-03-24 16:03:27 +01:00
b6d229eed9
Add functions for logging to import log 2023-03-01 11:43:01 +01:00
446c5d26f4
Extend uncertainty helper with more terms 2023-02-01 15:01:24 +01:00
6d40ae4c83
Fix bug in generating Indonesian date names and add Ukrainian as a
language for autogenerating time translations
2022-12-11 17:18:12 +01:00
d0e11c323e
Further modularize fetching of translations, add new class
NodaBatchInserter for batch inserting translations
2022-11-18 00:26:23 +01:00
b318b5b471
Better modularize NodaWikidataFetcher's loading of translations 2022-11-14 00:51:56 +01:00
511304b6f2
Fix bug in setting names for months 2022-11-03 16:02:10 +01:00
d641b64630
Extend list of uncertainty prefixes for times 2022-10-29 17:13:07 +02:00
1a9b195067
Fix type safety error 2022-09-15 21:35:36 +02:00
5819caff91
Remove superfluous variable assignments 2022-09-15 21:29:07 +02:00
7c0ad9fa37
Set time in autotranslater's use of IntlCalendar to prevent issues with
DST
2022-09-13 10:58:38 +02:00
a1ef24afba
Fix: Set function check_is_translatable to static 2022-09-13 00:17:49 +02:00
3246a37c63
Add simpler to use function check_is_translatable to autotranslater 2022-09-12 23:51:25 +02:00
59c858fb0f
Transfer remaining class constants in autotranslater to enum 2022-09-10 13:23:35 +02:00
ea832e2160
Remove use of deprecated strftime, use enums instead of class constants
for many parts of NodaTimeAutotranslater
2022-09-10 02:13:35 +02:00
df1d2c10eb
Remove superfluous checks 2022-09-08 16:41:34 +02:00
c8d0292ca8
Fix bug in German dates like "1 November 1921" 2022-09-08 16:28:27 +02:00
6f41ffeb9f
Remove deprecated strftime in NodaTimeSplitter 2022-09-08 16:07:30 +02:00
1ed2959b62
Prevent noda time splitter from suggesting times with months beyond 12,
days beyond 31
2022-08-25 22:06:15 +02:00
3c67d425fb
Fix broken function comment 2022-08-16 11:04:15 +02:00
3ca4e3fb23
Add nomen nominandum to black list for actors 2022-08-11 01:02:18 +02:00
1b702bfe31
Add logically missing functions to NodaIDGetter 2022-07-21 01:15:26 +02:00
b7a49eb9b4
Add "evtl." to list of uncertainty markers for places 2022-07-20 21:36:06 +02:00
ece1e44a9e
Skip importing uncertain birth years in Wikidata fetcher 2022-07-20 15:50:42 +02:00
027d4b3506
Throw a dedicated exception for fully empty actor descriptions in
NodaValidationHelper
2022-05-26 18:02:05 +02:00
c00cb6b629
Add class NodaValidationHelper, for now for validating actor
descriptions
2022-05-17 23:27:40 +02:00
ac79f421ff
Add Ukrainian to list of languages for which to fetch translations 2022-05-12 16:40:37 +02:00
47226b6538
Fix bug caused by missing handling of different retrieval modes for
Wikidata fetcher
2022-04-18 20:45:32 +02:00
d5b593c334
Expect usage of function setRetrivalMode instead of a GET param for
setting retrieval modes in NodaWikidataFetcher
2022-04-18 13:19:00 +02:00
7ff986bdd8
Add function getPersinstIDByNodaLink for getting actor IDs by their noda
IDs in NodaIDGetter
2022-04-16 23:05:36 +02:00
7dde870afb
Improve type-safety of wikidata fetcher 2022-04-13 00:16:05 +02:00
3f26666123
Extend lists of uncertainty prefixes and suffixes
Close #10, close #11, close #12
2022-04-02 19:31:53 +02:00
ad6b4728f5
Add "vermutlich" to list of uncertainty prefixes for actors 2022-04-02 15:56:26 +02:00
885f05a2d3
Add "unbekannter Künstler" to list of disallowed actor names 2022-04-01 21:12:37 +02:00
ab11e30d59
Add "verm." to list of signifiers for uncertain time names 2022-03-31 18:05:19 +02:00
8689d2f45f
Add some German uncertainty prefixes for time 2022-03-31 15:45:03 +02:00
7379d768df
Allow logging updates to logged import concordances 2022-03-31 14:34:20 +02:00
f7e49e67a0
Add function for getting time IDs by their logged import concordances 2022-03-31 14:33:03 +02:00
2912b4a206
Add "ohne Jahr" (no year) to list of disallowed time names 2022-03-30 16:14:28 +02:00
b3f7845023
Handle Jhdt as a form of century in time splitter
Close #9
2022-03-12 19:32:28 +01:00
3001976b1b
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers 2022-03-05 13:58:57 +01:00
6347de2635
Validate Wikidata IDs before attempting to fetch from Wikidata
Close #8
2022-03-05 13:58:18 +01:00
a7a9da4718
Add "Dinge" and "Objekt" to list of blacklisted tags 2022-02-27 22:26:46 +01:00
b1bd14bc56
Add functions for checking if places or actor names are blacklisted
Close #6, close #7
2022-02-18 22:09:26 +01:00
e877e4edf1
Add "wohl" and "vermutlich" as uncertainty specifiers for places 2022-02-18 16:34:59 +01:00
39c7585e91
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers 2022-02-06 23:30:55 +01:00
c52550c789
Extend NodaNameGetter with functions for getting single entries' correct
place, tag, time, and actor names
2022-02-06 23:30:22 +01:00
747a516a49
Add class NodaMailChecker for checking mail address validity and caching that 2022-02-05 01:32:06 +01:00
a4be5e876c
Add tól to list of handled suffixes in NodaTimeSplitter
"tól" is a suffix equivalent to "after" in Hungarian.
2022-02-04 02:49:44 +01:00
d28618bb14
Try / catch invalid dates in NodaTimeSplitter 2022-02-03 21:12:54 +01:00
09a5096588
Remove superfluous check for yet undescribed external noda repos 2022-01-18 00:48:32 +01:00
e7f1515227
Use strict comparisons in NodaWikidataFetcher in remaining places 2022-01-16 15:18:04 +01:00
03330a933c
Add "sonstiges" to tag blacklist 2022-01-13 18:46:51 +01:00
9132745631
Fix bug in time splitter, make code more explicit 2022-01-09 22:19:22 +01:00
109f18e63c
Use a more explicit !empty for checking string contents 2022-01-08 14:15:51 +01:00
52a90d669c
Validate geonames and TGN IDs fetched from Wikidata 2021-12-14 15:40:07 +01:00
20f609f6d0
Use integers for geonames and TGN IDs 2021-12-14 15:38:44 +01:00
93cd09ed23
Fix bug in preventing impossible noda relations 2021-12-12 03:36:23 +01:00
340bfac96c
Prevent attempts to write links to noda repositories for the incorrect
linkable types (e.g. iconclass for places)
2021-12-11 15:33:31 +01:00
4a26ab60ca
Fix missing URL prefix for iconclass 2021-12-11 15:15:31 +01:00
e00dd08c23
Use ON DUPLICATE KEY UPDATE instead of checking value existence with a
separate query
2021-12-11 01:29:48 +01:00
9471a030d5
Remove disabled noda repositories to link 2021-12-11 01:19:57 +01:00
97341cd466
Reduce number of entries synced into manticore per commit 2021-12-09 02:11:40 +01:00
5dc2ef0862
Fix bug in removing entries by ID (wrong ID column name) 2021-12-09 00:08:56 +01:00
55b2b61ef7
Add classes for syncing fulltext indexes in manticore, not mysql
directly
2021-12-08 23:00:58 +01:00
24714265c2
Prevent error if wikidata doesn't return a search result 2021-11-30 17:53:24 +01:00
ea280fc144
Add non-empty-string phpdoc types to NodaWikidataFetcher 2021-11-29 22:31:17 +01:00
0ab6f5e608
Add "Anonym" and "Anonymus" to list of disallowed actor names 2021-11-21 01:39:57 +01:00
054dc731f1
Disable libxml errors when parsing Wikipedia information 2021-11-18 23:53:11 +01:00
d0c4bfcf1f
Add more blacklisted tag names (e.g. "weitere", "other") 2021-10-16 19:27:43 +02:00
99a5303773
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers 2021-10-10 13:22:44 +02:00
790dcd92fa
Add [vermutlich] to list of uncertainty suffixes for actors 2021-10-10 13:22:13 +02:00
581d9c7079
Use "d" for coordinates fetched through wikidata, remove useless
parentheses
2021-10-10 12:32:55 +02:00