From b8ae4f8b3ead85ad175cb5acc0d57d2efa009f9b Mon Sep 17 00:00:00 2001
From: Joshua Ramon Enslin
Date: Sat, 6 Feb 2021 13:40:54 +0100
Subject: [PATCH] Add README

---
 README.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..40bc995
--- /dev/null
+++ b/README.md
@@ -0,0 +1,15 @@
+# Tools to automatically clean and enrich vocabulary entries at museum-digital
+
+This repository contains a set of tools that can be hooked into an existing application working with museum-digital's structures and libraries to simplify the handling of vocabulary entries.
+
+## General applicability
+
+Most scripts in this repository require a DB connection to a museum-digital vocabulary database and are thus unlikely to be useful outside of museum-digital's own ecosystem. The exceptions are `src/NodaTimeSplitter.php` and `src/NodaUncertaintyHelper.php`.
+
+## NodaTimeSplitter
+
+`src/NodaTimeSplitter.php` contains a list of rules to reformulate and parse entered time names into an array.
+
+## NodaUncertaintyHelper
+
+`src/NodaUncertaintyHelper.php` contains lists of indicators for invalid or uncertain inputs and functions that use those lists to clean inputs. If, e.g., "Berlin?" has been entered as a place, this means that the entered place is "Berlin" and that the entry is uncertain.
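
The "Berlin?" case described for `src/NodaUncertaintyHelper.php` can be illustrated with a minimal sketch. It is written in Python rather than the repository's PHP and does not reproduce the actual API: the function name `clean_uncertain_input` and the indicator list `UNCERTAINTY_SUFFIXES` are hypothetical and were chosen only for this example.

```python
from __future__ import annotations

# Hypothetical indicator list for illustration only; the real lists live in
# src/NodaUncertaintyHelper.php and are not reproduced here.
# Longer indicators come first so "(?)" is matched before the bare "?".
UNCERTAINTY_SUFFIXES = ("(?)", "?")


def clean_uncertain_input(raw: str) -> tuple[str, bool]:
    """Return (cleaned value, is_uncertain) for a user-entered name."""
    value = raw.strip()
    for suffix in UNCERTAINTY_SUFFIXES:
        if value.endswith(suffix):
            # Drop the indicator and any whitespace left in front of it.
            return value[: -len(suffix)].rstrip(), True
    return value, False
```

Under these assumptions, `clean_uncertain_input("Berlin?")` returns `("Berlin", True)`, while `clean_uncertain_input("Berlin")` returns `("Berlin", False)`. Returning the certainty flag alongside the cleaned name keeps the two pieces of information separate, so a caller could store the place as "Berlin" and record the uncertainty in its own field.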