|
3ca4e3fb23
|
Add "nomen nominandum" to blacklist for actors
|
2022-08-11 01:02:18 +02:00 |
|
|
1b702bfe31
|
Add logically missing functions to NodaIDGetter
|
2022-07-21 01:15:26 +02:00 |
|
|
b7a49eb9b4
|
Add "evtl." to list of uncertainty markers for places
|
2022-07-20 21:36:06 +02:00 |
|
|
ece1e44a9e
|
Skip importing uncertain birth years in Wikidata fetcher
|
2022-07-20 15:50:42 +02:00 |
|
|
027d4b3506
|
Throw a dedicated exception for fully empty actor descriptions in
NodaValidationHelper
|
2022-05-26 18:02:05 +02:00 |
|
|
c00cb6b629
|
Add class NodaValidationHelper, for now for validating actor
descriptions
|
2022-05-17 23:27:40 +02:00 |
|
|
ac79f421ff
|
Add Ukrainian to list of languages for which to fetch translations
|
2022-05-12 16:40:37 +02:00 |
|
|
47226b6538
|
Fix bug caused by missing handling of different retrieval modes for
Wikidata fetcher
|
2022-04-18 20:45:32 +02:00 |
|
|
d5b593c334
|
Expect usage of function setRetrivalMode instead of a GET param for
setting retrieval modes in NodaWikidataFetcher
|
2022-04-18 13:19:00 +02:00 |
|
|
7ff986bdd8
|
Add function getPersinstIDByNodaLink for getting actor IDs by their noda
IDs in NodaIDGetter
|
2022-04-16 23:05:36 +02:00 |
|
|
7dde870afb
|
Improve type-safety of wikidata fetcher
|
2022-04-13 00:16:05 +02:00 |
|
|
3f26666123
|
Extend lists of uncertainty prefixes and suffixes
Close #10, close #11, close #12
|
2022-04-02 19:31:53 +02:00 |
|
|
ad6b4728f5
|
Add "vermutlich" to list of uncertainty prefixes for actors
|
2022-04-02 15:56:26 +02:00 |
|
|
885f05a2d3
|
Add "unbekannter Künstler" to list of disallowed actor names
|
2022-04-01 21:12:37 +02:00 |
|
|
ab11e30d59
|
Add "verm." to list of signifiers for uncertain time names
|
2022-03-31 18:05:19 +02:00 |
|
|
8689d2f45f
|
Add some German uncertainty prefixes for time
|
2022-03-31 15:45:03 +02:00 |
|
|
7379d768df
|
Allow logging updates to logged import concordances
|
2022-03-31 14:34:20 +02:00 |
|
|
f7e49e67a0
|
Add function for getting time IDs by their logged import concordances
|
2022-03-31 14:33:03 +02:00 |
|
|
2912b4a206
|
Add "ohne Jahr" (no year) to list of disallowed time names
|
2022-03-30 16:14:28 +02:00 |
|
|
b3f7845023
|
Handle Jhdt as a form of century in time splitter
Close #9
|
2022-03-12 19:32:28 +01:00 |
|
|
3001976b1b
|
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers
|
2022-03-05 13:58:57 +01:00 |
|
|
6347de2635
|
Validate Wikidata IDs before attempting to fetch from Wikidata
Close #8
|
2022-03-05 13:58:18 +01:00 |
|
|
a7a9da4718
|
Add "Dinge" and "Objekt" to list of blacklisted tags
|
2022-02-27 22:26:46 +01:00 |
|
|
b1bd14bc56
|
Add functions for checking if places or actor names are blacklisted
Close #6, close #7
|
2022-02-18 22:09:26 +01:00 |
|
|
e877e4edf1
|
Add "wohl" and "vermutlich" as uncertainty specifiers for places
|
2022-02-18 16:34:59 +01:00 |
|
|
39c7585e91
|
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers
|
2022-02-06 23:30:55 +01:00 |
|
|
c52550c789
|
Extend NodaNameGetter with functions for getting single entries' correct
place, tag, time, and actor names
|
2022-02-06 23:30:22 +01:00 |
|
|
747a516a49
|
Add class NodaMailChecker for checking mail address validity and caching that
|
2022-02-05 01:32:06 +01:00 |
|
|
a4be5e876c
|
Add tól to list of handled suffixes in NodaTimeSplitter
"tól" is a suffix equivalent to "after" in Hungarian.
|
2022-02-04 02:49:44 +01:00 |
|
|
d28618bb14
|
Try / catch invalid dates in NodaTimeSplitter
|
2022-02-03 21:12:54 +01:00 |
|
|
09a5096588
|
Remove superfluous check for yet undescribed external noda repos
|
2022-01-18 00:48:32 +01:00 |
|
|
e7f1515227
|
Use strict comparisons in NodaWikidataFetcher in remaining places
|
2022-01-16 15:18:04 +01:00 |
|
|
03330a933c
|
Add "sonstiges" to tag blacklist
|
2022-01-13 18:46:51 +01:00 |
|
|
9132745631
|
Fix bug in time splitter, make code more explicit
|
2022-01-09 22:19:22 +01:00 |
|
|
109f18e63c
|
Use a more explicit !empty for checking string contents
|
2022-01-08 14:15:51 +01:00 |
|
|
52a90d669c
|
Validate geonames and TGN IDs fetched from Wikidata
|
2021-12-14 15:40:07 +01:00 |
|
|
20f609f6d0
|
Use integers for geonames and TGN IDs
|
2021-12-14 15:38:44 +01:00 |
|
|
93cd09ed23
|
Fix bug in preventing impossible noda relations
|
2021-12-12 03:36:23 +01:00 |
|
|
340bfac96c
|
Prevent attempts to write links to noda repositories for incorrect
linkable types (e.g. iconclass for places)
|
2021-12-11 15:33:31 +01:00 |
|
|
4a26ab60ca
|
Fix missing URL prefix for iconclass
|
2021-12-11 15:15:31 +01:00 |
|
|
e00dd08c23
|
Use ON DUPLICATE KEY UPDATE instead of checking value existence with a
separate query
|
2021-12-11 01:29:48 +01:00 |
|
|
9471a030d5
|
Remove disabled noda repositories to link
|
2021-12-11 01:19:57 +01:00 |
|
|
97341cd466
|
Reduce number of entries synced into manticore per commit
|
2021-12-09 02:11:40 +01:00 |
|
|
5dc2ef0862
|
Fix bug in removing entries by ID (wrong ID column name)
|
2021-12-09 00:08:56 +01:00 |
|
|
55b2b61ef7
|
Add classes for syncing fulltext indexes in manticore, not mysql
directly
|
2021-12-08 23:00:58 +01:00 |
|
|
24714265c2
|
Prevent error if wikidata doesn't return a search result
|
2021-11-30 17:53:24 +01:00 |
|
|
ea280fc144
|
Add non-empty-string phpdoc types to NodaWikidataFetcher
|
2021-11-29 22:31:17 +01:00 |
|
|
0ab6f5e608
|
Add "Anonym" and "Anonymus" to list of disallowed actor names
|
2021-11-21 01:39:57 +01:00 |
|
|
054dc731f1
|
Disable libxml errors when parsing Wikipedia information
|
2021-11-18 23:53:11 +01:00 |
|
|
d0c4bfcf1f
|
Add more blacklisted tag names (e.g. "weitere", "other")
|
2021-10-16 19:27:43 +02:00 |
|
|
99a5303773
|
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers
|
2021-10-10 13:22:44 +02:00 |
|
|
790dcd92fa
|
Add [vermutlich] to list of uncertainty suffixes for actors
|
2021-10-10 13:22:13 +02:00 |
|
|
581d9c7079
|
Use "d" for coordinates fetched through wikidata, remove useless
parentheses
|
2021-10-10 12:32:55 +02:00 |
|
|
edbe0230af
|
Add "research_note" to list of accepted edited sections in noda edit log
|
2021-10-09 14:10:10 +02:00 |
|
|
ca7424b043
|
Add NodaNameGetter for batch retrieval of names
|
2021-09-16 01:09:10 +02:00 |
|
|
fe31a29159
|
Add "vermtl. " to list of uncertainty prefixes
|
2021-08-27 18:50:51 +02:00 |
|
|
fb327762dc
|
Add capability to split English decade terms (1920s)
|
2021-08-27 16:19:19 +02:00 |
|
|
bb4e2a727a
|
Add "c. " to list of uncertainty prefixes
|
2021-08-27 15:24:50 +02:00 |
|
|
7d89596286
|
Add option to save edits to name variants (for actors) in edit log
|
2021-08-19 12:23:08 +02:00 |
|
|
a0b6207f81
|
Add missing htmlspecialchars in Wikidata results list
|
2021-08-15 20:03:25 +02:00 |
|
|
6d60d9eec7
|
Significantly extend the timeout for SPARQL queries to Wikidata
|
2021-08-13 13:07:29 +02:00 |
|
|
87fd2a25df
|
Add functions for identifying Wikidata IDs by external IDs
|
2021-08-12 15:33:48 +02:00 |
|
|
8eb576f43d
|
Extend list of known genders to parse from Wikidata
|
2021-08-12 13:59:20 +02:00 |
|
|
aa9f307c55
|
Expect minified JS file when injecting JS into Wikidata results pages
|
2021-08-10 14:35:25 +02:00 |
|
|
0167890147
|
Add option to inject JS on wikidata results lists
|
2021-08-08 17:38:21 +02:00 |
|
|
e773bab7ce
|
Allow a special wikidata results list for actors, making suggestions
based on birth and death dates
|
2021-08-07 17:38:49 +02:00 |
|
|
e69be5b2b1
|
Use === over == in more cases
|
2021-07-24 23:21:00 +02:00 |
|
|
d269b6644b
|
Extend blacklist of disallowed tag names
|
2021-07-14 22:01:46 +02:00 |
|
|
0f2f7b2787
|
Add the different variants of "verschiedenes" ("various" in German) to
tag blacklist
|
2021-07-14 21:58:14 +02:00 |
|
|
7e27f15515
|
Remove specific blacklist file for tags
|
2021-07-14 13:46:27 +02:00 |
|
|
f930ca794e
|
Add a list of blacklisted tags
|
2021-07-14 13:37:18 +02:00 |
|
|
0fa759c604
|
Add check against empty wikidata / wikipedia descriptions
|
2021-07-06 12:50:33 +02:00 |
|
|
fba4706b67
|
Add check against empty source ID in references to controlled
vocabularies in wikidata fetcher
|
2021-07-06 12:15:31 +02:00 |
|
|
af13f747b7
|
Allow listed and searched Wikidata entries to be without descriptions
|
2021-07-03 15:18:02 +02:00 |
|
|
c56ae6ce66
|
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers
|
2021-07-01 15:36:55 +02:00 |
|
|
3a5790853c
|
Add ", um" to the list of suffixes to indicate a time entry being
uncertain
|
2021-07-01 15:35:18 +02:00 |
|
|
bd3851ccf4
|
Use separate function for generating overview lists in
NodaWikidataFetcher
|
2021-06-30 22:55:37 +02:00 |
|
|
2c0d8e041e
|
Add "nicht benannt" to list of unwanted place and actor names
|
2021-06-30 14:36:11 +02:00 |
|
|
062c0d12dc
|
Extend list of known unwanted actor names
|
2021-06-29 15:37:05 +02:00 |
|
|
7bd315aa0c
|
Add "ism." as a shorthand for ismeretlen to list of disallowed place
names
Ismeretlen means unknown / no information in Hungarian and is thus a
non-value.
|
2021-06-29 13:50:09 +02:00 |
|
|
b6b2bbccff
|
Extend list of empty place names to remove
|
2021-06-18 23:30:03 +02:00 |
|
|
d244065dbe
|
Update base update time of places when editing a place
|
2021-05-27 02:33:51 +02:00 |
|
|
ce453a3d69
|
Allow noda_link as a loggable section in NodaLogEdit
|
2021-05-27 00:35:15 +02:00 |
|
|
a50610b640
|
Fix wrong table name
|
2021-05-27 00:34:56 +02:00 |
|
|
bc3f2a94d6
|
Add function for getting tags by base name, log base edits for tags in
NodaLogEdit
|
2021-05-26 22:36:02 +02:00 |
|
|
a4f24e5478
|
Add logging of synchronization with wikidata
|
2021-05-26 17:12:15 +02:00 |
|
|
8a30cf2c2a
|
Add class NodaLogEdit for easily logging updates to the main noda tables
|
2021-05-26 16:24:20 +02:00 |
|
|
6a91e31f41
|
Improve handling of timespans
|
2021-05-13 23:00:23 +02:00 |
|
|
2a8ba31410
|
Import place hierarchy from Wikidata
Close #5
|
2021-05-11 01:37:49 +02:00 |
|
|
874cfb8a6f
|
Extend time splitter to handle e.g. "17./18. Jh."
|
2021-05-07 16:25:50 +02:00 |
|
|
db6953ca51
|
Move syncers to src/Sync subdirectory
|
2021-05-06 23:37:31 +02:00 |
|
|
74caf06280
|
Use lowercase spelling: instanceof instead of instanceOf
|
2021-05-06 23:34:55 +02:00 |
|
|
c84e3401ff
|
Add classes for keeping the fulltext (search) tables in sync
|
2021-05-06 23:24:43 +02:00 |
|
|
041d3598eb
|
Allow using Wikidata links for fetching information for actors
|
2021-05-05 01:26:32 +02:00 |
|
|
6dcdb3aff6
|
Add function for assembling display names by given name and family name
|
2021-05-04 23:57:30 +02:00 |
|
|
bde3c2cb9e
|
Add a class NodaNameSplitterTest, for now splitting names into given
name and family name
|
2021-05-04 23:04:41 +02:00 |
|
|
1dd05a3822
|
Add blacklist for unwanted tag names
Close #4
|
2021-04-25 00:16:53 +02:00 |
|
|
9157e8a0f1
|
Add fix for empty noda references in fetching tags from Wikidata
|
2021-04-24 01:13:27 +02:00 |
|
|
e1a9a99797
|
Use ++$i over $i++ outside of loops in Wikidata fetcher
This is a slightly more performant way of incrementing an integer.
|
2021-04-12 12:54:07 +02:00 |
|
|
792754c20c
|
Fetch orcid IDs in wikidata fetcher
|
2021-04-07 11:33:49 +02:00 |
|
|
e957db4210
|
Add condition to split times like "xxxx bis yyyy"
|
2021-03-26 12:32:27 +01:00 |
|
|
c964053c91
|
Add function for reading Wikidata ID from a Wikipedia page
|
2021-03-18 01:23:45 +01:00 |
|
|
1fd87c7e6d
|
Simplify NodaWikidataFetcher, unify list of langs, simplify linking to noda sources
Close #2
|
2021-03-17 22:06:08 +01:00 |
|
|
f0b5a08cdf
|
Move NodaWikidataFetcher to this repository
|
2021-03-17 16:11:06 +01:00 |
|
|
1fe795d219
|
Use mysqli->autocommit(false) to speed up autotranslating
|
2021-03-08 21:23:38 +01:00 |
|
|
668477f199
|
Add missing check in NodaTimeAutotranslater
|
2021-01-31 19:39:09 +01:00 |
|
|
7ccdfd4659
|
Fix function comment in NodaTimeSplitter
|
2021-01-31 01:50:25 +01:00 |
|
|
aca4f86da5
|
Add "Neu" and "Neu hergestellt" to list of disallowed time entries
|
2021-01-29 20:03:06 +01:00 |
|
|
a761a9dfd7
|
Stop time splitter for start / end, if common time splitter can be used
|
2021-01-07 11:43:20 +01:00 |
|
|
c02165df7b
|
Add exception catching in splitting times / dates
|
2021-01-06 23:11:05 +01:00 |
|
|
54764e741a
|
Add option to split and translate times with start and end dates
Close #1
|
2021-01-06 23:05:26 +01:00 |
|
|
fcc63c4ea0
|
Merge branch 'master' of gitea:museum-digital/MDNodaHelpers
|
2021-01-06 16:07:46 +01:00 |
|
|
a6030e4a5f
|
Fix bug in month names similar in English and German
|
2021-01-06 16:07:21 +01:00 |
|
|
7ef09db72c
|
Add static function in NodaIDGetter to get tag ID by import log
|
2021-01-04 23:06:36 +01:00 |
|
|
9f67d253da
|
Add functions to get actor and place IDs by import logs
|
2021-01-04 22:50:26 +01:00 |
|
|
8f612dede1
|
Read 1917-ig. as similar to 1917-ig in time splitter
|
2020-12-28 14:40:04 +01:00 |
|
|
6e910cd676
|
Add English month names for splitting time terms
|
2020-12-22 12:22:14 +01:00 |
|
|
8ac22165fc
|
Add "ohne angabe" to list of disallowed terms
|
2020-12-21 15:32:16 +01:00 |
|
|
d8e44550fc
|
Add "Ohne Datum" to list of disallowed time terms
|
2020-12-21 15:15:00 +01:00 |
|
|
af454ec013
|
Set up ID getter by rewrite for tags to return arrays
Tag rewrites can now be set for multiple target tags.
|
2020-12-20 23:24:08 +01:00 |
|
|
fce933c12a
|
Extend list of disallowed noda terms
|
2020-12-20 15:40:30 +01:00 |
|
|
b27f0ec918
|
Add "Keine Angaben" to list of disallowed inputs for places
|
2020-12-19 02:37:38 +01:00 |
|
|
a070970554
|
Remove empty newlines in class defs
|
2020-12-19 02:36:38 +01:00 |
|
|
ca13f36c0d
|
Add function to get tag IDs by their translated names
|
2020-12-07 13:43:07 +01:00 |
|
|
50ff1a2339
|
Add script to get highest related tag
|
2020-10-30 16:30:23 +01:00 |
|
|
0ea9c31845
|
Explicitly use global namespace in function calls
|
2020-10-23 17:03:51 +02:00 |
|
|
14e82826ae
|
Fix bug in getting place IDs by noda links
|
2020-10-05 12:05:48 +02:00 |
|
|
99aa1d74ad
|
Improve type safety and make it more explicit
|
2020-10-04 23:59:40 +02:00 |
|
|
97566ea2d9
|
Split more time variations
|
2020-10-04 23:57:59 +02:00 |
|
|
8a4a8f7ed8
|
Split more variations of dots in dates, century ranges
|
2020-10-04 23:20:58 +02:00 |
|
|
d0fe1e89ed
|
Improve trimming inputs when cleaning certainty indicators
|
2020-10-04 22:52:15 +02:00 |
|
|
1f4d692fb5
|
Enable automatic translations of times "before" a given date
|
2020-10-04 19:34:17 +02:00 |
|
|
1685d78f65
|
Allow splitting times "before <X>"
|
2020-10-04 19:27:23 +02:00 |
|
|
a0037c9883
|
Allow splitting times after <year><month>
|
2020-10-04 19:17:18 +02:00 |
|
|
be46c39efd
|
Fix wrong assumption on handling counting times when autotranslating
"after <month>"
|
2020-10-04 18:36:03 +02:00 |
|
|
c9a1a74bce
|
Enable autotranslating of times 'after' a certain date
|
2020-10-04 18:21:33 +02:00 |
|
|
5e90e5d3f2
|
Add strings for expressing times 'after' and 'before'
|
2020-10-04 17:40:51 +02:00 |
|
|
2a57537436
|
Allow splitting times "Nach 1905" ("Nach " followed by 4 digit time
number)
|
2020-10-04 17:39:34 +02:00 |
|
|
36d27e0f73
|
Remove / disallow certain input names in NodaUncertaintyHelper
|
2020-10-04 02:40:21 +02:00 |
|
|
4e934e380c
|
Use [0-9]{4} spelling time
|
2020-10-03 19:13:27 +02:00 |
|
|
ff35ca7bd9
|
Enable time splitter to deal with some Roman numerals
|
2020-10-03 16:10:43 +02:00 |
|
|
80cd88222d
|
Enable time splitter to recognize sz as abbr. for század
|
2020-10-03 15:59:49 +02:00 |
|
|
3664bcf3f6
|
Add getting places by noda links to NodaIDGetter
|
2020-10-01 12:53:27 +02:00 |
|
|
67cc76cff9
|
Allow splitting of German short decade names: 20er or 1920er
|
2020-09-27 17:12:34 +02:00 |
|
|
91f435a2e4
|
Enable parsing of months: 2020-01
|
2020-09-27 17:10:17 +02:00 |
|
|
de7968fbbd
|
Only allow splitting by international format if month < 13
|
2020-09-27 12:38:38 +02:00 |
|
|
48f3bd2c3f
|
Allow splitting international dates (2020-12-20)
|
2020-09-27 12:36:34 +02:00 |
|
|
830b37f547
|
Improve autotranslating of times before 1.1.1000
|
2020-09-26 16:10:26 +02:00 |
|
|
c9d8d4bdbd
|
Allow automatic translations of days before 1000 CE
|
2020-09-26 16:02:18 +02:00 |
|
|
b405855fc2
|
Disallow translating as decade before 1000 CE
|
2020-09-26 15:30:30 +02:00 |
|