This page documents the metadata field mappings for our VAULT migration using Mermaid flowcharts. The left-side nodes are EQUELLA XML paths, or JSON paths in the non-XML diagram. Each one flows to the Invenio field on the right, which has its cardinality in parentheses.
Dotted lines and rounded nodes represent fields we're not using (see item.version in the non-XML diagram, for example).
---
title: MODS (/xml/mods) Mappings
---
flowchart LR
    TITLE[titleInfo/title] --> |titles after the 1st, type based on @type attribute| ATITLETYPE["type defaults to other"]
    SUBTITLE[titleInfo/subtitle] -->|type = subtitle| ATITLETYPE
    ATITLETYPE --> ADTITLE["Additional Titles (0-n)"]
    ABSTRACTS[abstract] ---> |1st one| DESCRIPTION["Description (0-1)"]
    ABSTRACTS --> |abstracts after the 1st, type = abstract| ADDLDESC["Additional Descriptions (0-n)"]
    NOTES[noteWrapper/note] -->|See MODS note types, type = other| ADDLDESC
    RTYPE[typeOfResourceWrapper/typeOfResource] -->|1st one, map| TYPE["Resource Type (1)"]
    NAMES[mods/name/namePart] -->|Parse, org or person, 1 or many| NAMEDETAILS["roleTerm -> role, subName affiliations"]
    NAMEDETAILS --> CREATORS["Creators (1-n)"]
    DATECREATED[origininfo/dateCreatedWrapper/dateCreated] -->DATELOGIC["Use MODS if present else item.dateCreated"]
    SEMCREATED[origininfo/semesterCreated] -->DATELOGIC
    DATELOGIC -->PUBDATE["Publication Date (1)"]
    DATECAPT["origininfo/dateCaptured"] -->|"date.type = collected"| DATES["Dates (0-n)"]
    DATEOTHER["origininfo/dateOtherWrapper/dateOther"] -->|"date.type = other"| DATES
    %% CONTRIBUTORS["Contributors (0-n)"]
    ACCESSCONDITION["mods/accessCondition"] -->|default to copyright if no license| RIGHTS["Rights (Licenses) (0-n)"]
    MODSUBJECT["mods/subject"] -->|look up in subjects map| SUBJECTS["Subjects (0-n)"]
    GENRE["mods/genreWrapper/genre"] -->|look up in subjects map| SUBJECTS
    OIPUB["originInfo/publisher"] --> PUBLISHER["Publisher (0-1)"]
    DBR["relatedItem/title = Design Book Review"] --> |different depending on date| PUBLISHER
    EXTENT["physicalDescription/extent"] --> SIZES["Sizes (0-n)"]
    %% We don't have structured location information & it does not display in Invenio, skip
    %% LOCATIONS["Locations. We only have place names, no IDs or coordinates. (0-n)"]
    ID[identifier] -->|@type = DOI in Faculty Research collection| ALTID["Alternate Identifiers (0-n)"]
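As an illustration of two branches in the MODS diagram, here is a minimal Python sketch of the abstract/note split and the publication date fallback. It is not the migration code itself. The function names are hypothetical, the output shape follows our understanding of Invenio's `description` and `additional_descriptions` fields, and the fallback assumes `item.dateCreated` is an ISO 8601 timestamp. Handling of `semesterCreated` is left out because its format is not described here.

```python
import xml.etree.ElementTree as ET


def _texts(parent, path):
    """Return the non-empty, stripped text of every element matching path."""
    return [el.text.strip() for el in parent.findall(path) if el.text and el.text.strip()]


def map_descriptions(mods):
    """1st abstract -> description; later abstracts and notes -> additional descriptions."""
    abstracts = _texts(mods, "abstract")
    notes = _texts(mods, "noteWrapper/note")
    record = {}
    if abstracts:
        record["description"] = abstracts[0]
    additional = [{"description": t, "type": {"id": "abstract"}} for t in abstracts[1:]]
    additional += [{"description": t, "type": {"id": "other"}} for t in notes]
    if additional:
        record["additional_descriptions"] = additional
    return record


def publication_date(mods, item):
    """Use the MODS dateCreated if present, else the item's creation date."""
    value = mods.findtext("origininfo/dateCreatedWrapper/dateCreated")
    if value and value.strip():
        return value.strip()
    # Assumption: item["dateCreated"] is ISO 8601, so the first 10 characters are YYYY-MM-DD
    return item["dateCreated"][:10]
```

Both functions take the `mods` element as parsed by `xml.etree.ElementTree`, so the paths match the left-side node labels in the diagram.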
We are not using the Funders or References fields. Few, if any, VAULT items have funding information, and none have the identifiers that Invenio expects. We don't have reference lists for any items.
---
title: Local (/xml/local) Mappings
---
flowchart LR
    ASERIESV["archivesWrapper/series, archivesWrapper/subseries"] --> ASERIESI["cca:archive_series custom field"]
    viewlevel[viewLevel] --> |TODO Many, many translations| ACCESS["Access restricted/public"]
    courseworktype[courseWorkWrapper/courseWorkType] --> |TODO if we have no mods/typeOfResource, map 1st| TYPE["Resource Type (1)"]
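The archives series mapping is the only branch of the local diagram that is not marked TODO, so here is a rough Python sketch of it. The function name and the shape of the custom field value (a dict with `series` and `subseries` keys) are assumptions for illustration; the real `cca:archive_series` field may store its value differently.

```python
import xml.etree.ElementTree as ET


def archive_series(local):
    """Map archivesWrapper/series and subseries to the cca:archive_series custom field."""
    series = (local.findtext("archivesWrapper/series") or "").strip()
    subseries = (local.findtext("archivesWrapper/subseries") or "").strip()
    if not series:
        # No series means there is nothing to record
        return {}
    value = {"series": series}
    if subseries:
        value["subseries"] = subseries
    # Assumption: custom fields are keyed by their namespaced name
    return {"cca:archive_series": value}
```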
---
title: Non-XML Mappings
---
flowchart LR
    NAME["item.name"] --> |Use 'Untitled' if absent| TITLE["Title (1)"]
    VAULTID["item.uuid + item.version"] -->|"'is new version of' VAULT URL"| REL["Related Identifiers/Works (0-n)"]
    OWNER["item.owner.id"] -->|TODO map CCA username to Invenio numeric ID| PARENT["parent.owned_by.id"]
    DATECREATED["item.dateCreated"] -->|Use item creation timestamp if no date in MODS| PUBDATE["Publication Date (1)"]
    ATTACHMENTS[item.attachments] -->|guess MIME type from filenames| FORMATS["Formats (0-n)"]
    ATTACHMENTS -->|File, HTML, & Zip attachments| FILES["Files (0-n)"]
    ATTACHMENTS -->|URL, YouTube attachments are related works| REL
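The title default and the attachment branches of the non-XML diagram could look roughly like the Python sketch below. The attachment keys (`type`, `filename`, `url`), the type strings, and the function name are assumptions about the EQUELLA item JSON rather than confirmed field names. MIME types are guessed from filenames with the standard library's `mimetypes` module.

```python
import mimetypes

# Assumed attachment type strings; the real EQUELLA values may differ
FILE_TYPES = {"file", "html", "zip"}
LINK_TYPES = {"url", "youtube"}


def map_item(item):
    """Sketch of the title, formats, files, and related-works mappings."""
    files, formats, related = [], [], []
    for att in item.get("attachments", []):
        kind = att.get("type", "").lower()
        if kind in FILE_TYPES:
            name = att.get("filename", "")
            files.append(name)
            # guess_type returns (type, encoding); type is None when unknown
            mime, _ = mimetypes.guess_type(name)
            if mime and mime not in formats:
                formats.append(mime)
        elif kind in LINK_TYPES:
            # URL and YouTube attachments become related works, not files
            related.append(att.get("url"))
    return {
        "title": item.get("name") or "Untitled",
        "formats": formats,
        "files": files,
        "related": related,
    }
```

Formats are de-duplicated so an item with several PDFs lists `application/pdf` once.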
There's additional, mostly administrative, metadata in the EQUELLA item JSON outside the XML. We can represent file/attachment operations here, too.
While VAULT has version information, we plan to migrate only the most recent (live) version of each item. Displaying the version number with no way to access prior iterations would only lead to confusion. The version number will still be accessible in the copies of VAULT metadata we store on migrated items.