Reduce usage of EtikettMaker #2106 #2123

Open
wants to merge 21 commits into
base: master
21 commits
599fd79
Map marcRelator with lookup #2106
TobiasNx Jan 10, 2025
f70ad41
Only create contributions if marcRel exists #2106
TobiasNx Jan 10, 2025
3167793
Map collections label with lookup #2106
TobiasNx Jan 10, 2025
bf844c0
Map publication frequency label with lookup #2106
TobiasNx Jan 10, 2025
aa1eee3
Delete labels that are already provided by fix #2106
TobiasNx Jan 10, 2025
39be2c4
Delete non marcRelator label since alma only has marc relator roles #…
TobiasNx Jan 10, 2025
03ee067
Delete license label that is already provided in fix #2106
TobiasNx Jan 10, 2025
d80e576
Delete culturegraph label since it is already provided in fix #2106
TobiasNx Jan 10, 2025
21e2186
Adjust macro and fallback for provenance metadata #2106
TobiasNx Jan 10, 2025
9a749bc
Delete item label since it is provided in the fix #2106
TobiasNx Jan 10, 2025
d2a1bce
Delete Klappentext fallback since it is provided by the source marc d…
TobiasNx Jan 10, 2025
643b49f
Delete digitool as label #2016
TobiasNx Jan 13, 2025
063558f
Delete edoweb label since "Archivierte Online-Ressource" is provided …
TobiasNx Jan 13, 2025
dd889c9
Use fallback label for related lobid resources in fix instead of etti…
TobiasNx Jan 13, 2025
348dd38
Replace lobid Organisation Fallback for ZDB Collection with fix mappin…
TobiasNx Jan 13, 2025
8d178d9
Remove dewqy.json since we already map the dewey in the fix #2106
TobiasNx Jan 13, 2025
52f8484
Adjust transformation to use https instead of http #2106
TobiasNx Jan 13, 2025
f7fefbb
Add labels for ebookcentral and ebrary #2106
TobiasNx Jan 14, 2025
c5cbb69
Set sameAs urn and doi labels #2106
TobiasNx Jan 14, 2025
9cee215
Add fallback collection labels #2106
TobiasNx Jan 14, 2025
6ebdf7a
Set url without http as label #2106
TobiasNx Jan 14, 2025
@@ -139,6 +139,9 @@ public void run() {
fixVariables.put("isil2opac_issn.tsv", "../../../../../../lookup-tables/data/opacLinks/isil2opac_issn.tsv");
fixVariables.put("isil2opac_zdbId.tsv", "../../../../../../lookup-tables/data/opacLinks/isil2opac_zdbId.tsv");
fixVariables.put("isil2opac_almaMmsId.tsv", "../../../../../../lookup-tables/data/opacLinks/isil2opac_almaMmsId.tsv");
fixVariables.put("marcRel.tsv", "./maps/marcRel.tsv");
fixVariables.put("collectionLabels.tsv", "./maps/collectionLabels.tsv");



XmlElementSplitter xmlElementSplitter = new XmlElementSplitter();
300 changes: 155 additions & 145 deletions src/main/resources/alma/fix/contribution.fix

Large diffs are not rendered by default.

12 changes: 3 additions & 9 deletions src/main/resources/alma/fix/describedBy.fix
@@ -107,16 +107,10 @@ add_array("describedBy.resultOf.object.modifiedBy[]")

end

-call_macro("provenanceLinks",field: "describedBy.resultOf.object.sourceOrganization.id")
-copy_field("describedBy.resultOf.object.sourceOrganization.id","describedBy.resultOf.object.sourceOrganization.label")
-lookup("describedBy.resultOf.object.sourceOrganization.label","lobidOrgLabels",delete:"true")
-call_macro("provenanceLinks",field: "describedBy.resultOf.object.provider.id")
-copy_field("describedBy.resultOf.object.provider.id","describedBy.resultOf.object.provider.label")
-lookup("describedBy.resultOf.object.provider.label","lobidOrgLabels",delete:"true")
+call_macro("provenanceLinks",field: "describedBy.resultOf.object.sourceOrganization.id",label: "describedBy.resultOf.object.sourceOrganization.label")
+call_macro("provenanceLinks",field: "describedBy.resultOf.object.provider.id",label: "describedBy.resultOf.object.provider.label")
 do list(path:"describedBy.resultOf.object.modifiedBy[]","var":"$i")
-call_macro("provenanceLinks",field: "$i.id")
-copy_field("$i.id","$i.label")
+call_macro("provenanceLinks",field: "$i.id",label:"$i.label")
 end
-lookup("describedBy.resultOf.object.modifiedBy[].*.label","lobidOrgLabels",delete:"true")

uniq("describedBy.resultOf.object.modifiedBy[]")
25 changes: 25 additions & 0 deletions src/main/resources/alma/fix/macros.fix
@@ -289,6 +289,18 @@ do put_macro("provenanceLinks")
end
prepend("$[field]", "http://lobid.org/organisations/")
append("$[field]", "#!")
copy_field("$[field]","$[label]")
lookup("$[label]","lobidOrgLabels",delete:"true")
end

unless exists("$[label]")
if any_contain("$[field]","lobid")
add_field("$[label]","lobid Organisation")
elsif any_contain("$[field]","ebookcentral")
add_field("$[label]","Ebookcentral Proquest")
elsif any_contain("$[field]","ebrary")
add_field("$[label]","Ebrary")
end
end
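
The fallback order added to the `provenanceLinks` macro above can be read as: take the label from `lobidOrgLabels` if there is one, otherwise derive a generic label from a substring of the id. The following Java sketch only illustrates that order. The class name, the method name and the single map entry are hypothetical, and it simplifies the macro by applying the map lookup to every id.

```java
import java.util.Map;
import java.util.Optional;

/** Sketch of the label fallback order; not the Metafacture implementation. */
final class ProvenanceLabelSketch {

    // Hypothetical sample entry standing in for the real lobidOrgLabels map.
    static final Map<String, String> LOBID_ORG_LABELS =
            Map.of("http://lobid.org/organisations/DE-605#!", "Sample organisation label");

    static Optional<String> label(String id) {
        // Step 1: a label found in the lookup map wins.
        String known = LOBID_ORG_LABELS.get(id);
        if (known != null) {
            return Optional.of(known);
        }
        // Step 2: fallbacks keyed on a substring of the id, in the macro's order.
        if (id.contains("lobid")) {
            return Optional.of("lobid Organisation");
        } else if (id.contains("ebookcentral")) {
            return Optional.of("Ebookcentral Proquest");
        } else if (id.contains("ebrary")) {
            return Optional.of("Ebrary");
        }
        // No label is set when nothing matches.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(label("http://lobid.org/organisations/DE-605#!"));
        System.out.println(label("https://ebookcentral.proquest.com/lib/sample"));
        System.out.println(label("https://example.org/"));
    }
}
```

Because the `lobid` check comes first, an id that contains both `lobid` and another keyword would get the generic "lobid Organisation" label.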


@@ -560,3 +572,16 @@ do put_macro("manufacture")
end
end
end


# lobid resources label
do put_macro("lobidResourcesFallbackLabel")
do list(path:"$[field]","var":"$array")
unless exists("$array.label")
if any_contain("$array.id","lobid")
add_field("$array.label","lobid Ressource")
end
end
end
end

25 changes: 23 additions & 2 deletions src/main/resources/alma/fix/maps.fix
@@ -55,6 +55,13 @@ put_filemap("$[isil2opac_issn.tsv]","isil2opac_issn", sep_char:"\t")
put_filemap("$[isil2opac_zdbId.tsv]","isil2opac_zdbId", sep_char:"\t")
put_filemap("$[isil2opac_almaMmsId.tsv]","isil2opac_almaMmsId", sep_char:"\t")

# marcRel
put_filemap("$[marcRel.tsv]","marcRel", sep_char:"\t",key_column:"0",value_column:"1",expected_columns:"-1")

# collection labels
put_filemap("$[collectionLabels.tsv]","collectionLabels", sep_char:"\t",key_column:"0",value_column:"1",expected_columns:"-1")



put_map("rswk-indicator",
"p": "Person",
@@ -65,8 +72,22 @@ put_map("rswk-indicator",
"s": "SubjectHeading"
)



put_map("marc-publication-frequency-label",
"http://marc21rdf.info/terms/continuingfre#d" : "täglich",
"http://marc21rdf.info/terms/continuingfre#i" : "dreimal wöchentlich",
"http://marc21rdf.info/terms/continuingfre#c" : "zweimal wöchentlich",
"http://marc21rdf.info/terms/continuingfre#w" : "wöchentlich",
"http://marc21rdf.info/terms/continuingfre#e" : "vierzehntägig",
"http://marc21rdf.info/terms/continuingfre#s" : "halbmonatlich",
"http://marc21rdf.info/terms/continuingfre#m" : "monatlich",
"http://marc21rdf.info/terms/continuingfre#b" : "alle zwei Monate",
"http://marc21rdf.info/terms/continuingfre#q" : "vierteljährlich",
"http://marc21rdf.info/terms/continuingfre#f" : "halbjährlich",
"http://marc21rdf.info/terms/continuingfre#a" : "jährlich",
"http://marc21rdf.info/terms/continuingfre#g" : "alle zwei Jahre",
"http://marc21rdf.info/terms/continuingfre#h" : "alle drei Jahre",
"http://marc21rdf.info/terms/continuingfre#z" : "unregelmäßig oder sonstige Erscheinungsfrequenz"
)

put_map("medium-id-to-label",
"Audio-Dokument": "http://purl.org/ontology/bibo/AudioDocument",
2 changes: 2 additions & 0 deletions src/main/resources/alma/fix/mediumAndType.fix
@@ -364,6 +364,8 @@ if any_equal("natureOfContent[].*.label","Website")
do list(path: "856??", "var": "$i")
unless any_contain("$i.u", "edoweb")
copy_field("$i.u","webPageArchived[].$append.id")
copy_field("$i.u","webPageArchived[].$last.label")
replace_all("webPageArchived[].$last.label","http[s]?://(.*?)[/]?$","$1")
end
end
end
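
The `replace_all` call in the `webPageArchived[]` hunk above turns the copied URL into a label by dropping the scheme and an optional trailing slash. A small Java sketch of the same pattern, assuming Fix's `replace_all` follows Java regex semantics (Metafacture runs on the JVM); the class and method names are hypothetical:

```java
/** Sketch of deriving a label from a URL with the pattern used in the hunk. */
final class UrlLabelSketch {

    static String urlLabel(String url) {
        // Lazy group (.*?) captures everything after the scheme;
        // [/]?$ lets one trailing slash fall outside the captured group.
        return url.replaceAll("http[s]?://(.*?)[/]?$", "$1");
    }

    public static void main(String[] args) {
        System.out.println(urlLabel("https://example.org/"));
        System.out.println(urlLabel("http://example.org/path"));
        System.out.println(urlLabel("www.example.org"));
    }
}
```

A value without an `http://` or `https://` prefix does not match the pattern and is kept unchanged as the label.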
29 changes: 24 additions & 5 deletions src/main/resources/alma/fix/relatedRessourcesAndLinks.fix
@@ -72,6 +72,7 @@ end
replace_all("supplement[].*.id","^\\(DE-605\\)(.*)$","http://lobid.org/resources/$1#!")
replace_all("supplement[].*.id","^\\(DE-600\\)(.*)$","http://lobid.org/resources/ZDB-$1#!")
replace_all("supplement[].*.label","<<|>>","")
call_macro("lobidResourcesFallbackLabel",field:"supplement[]")

# isPartOf
# it describes the relation between a published ressource and its superordinate series or collection.
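
The `supplement[]` hunk above first rewrites prefixed identifiers into lobid URLs and then calls `lobidResourcesFallbackLabel` for entries that still have no label. The Java sketch below mirrors both steps for a single entry. It assumes Java regex semantics for `replace_all`; the class name, method names and sample identifiers are hypothetical.

```java
/** Sketch of the id rewrite plus fallback label for related resources. */
final class RelatedResourceSketch {

    static String toLobidId(String raw) {
        // (DE-605)... becomes a lobid resource URL, (DE-600)... a ZDB-prefixed one.
        return raw
                .replaceAll("^\\(DE-605\\)(.*)$", "http://lobid.org/resources/$1#!")
                .replaceAll("^\\(DE-600\\)(.*)$", "http://lobid.org/resources/ZDB-$1#!");
    }

    static String fallbackLabel(String id, String existingLabel) {
        // An existing label is never overwritten.
        if (existingLabel != null) {
            return existingLabel;
        }
        // Only lobid ids get the generic fallback; other entries stay unlabelled.
        return id.contains("lobid") ? "lobid Ressource" : null;
    }

    public static void main(String[] args) {
        String id = toLobidId("(DE-605)HT012345678");
        System.out.println(id);
        System.out.println(fallbackLabel(id, null));
    }
}
```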
@@ -201,10 +202,15 @@ replace_all("isPartOf[].*.hasSuperordinate[].*.id", "^\\(DE-605\\)(.*)$", "http:
replace_all("isPartOf[].*.hasSuperordinate[].*.id", "^\\(DE-600\\)(.*)$", "http://lobid.org/resources/ZDB-$1#!")

replace_all("isPartOf[].*.numbering", "^[©]|\\s?[,.:;/=]?$", "")
do list(path:"isPartOf[]","var":"$i")
call_macro("lobidResourcesFallbackLabel",field:"$i.hasSuperordinate[]")
end

uniq("isPartOf[]")
replace_all("containedIn[].*.id", "^\\(DE-605\\)(.*)$", "http://lobid.org/resources/$1#!")
replace_all("containedIn[].*.id", "^\\(DE-600\\)(.*)$", "http://lobid.org/resources/ZDB-$1#!")
replace_all("containedIn[].*.label","<<|>>","")
call_macro("lobidResourcesFallbackLabel",field:"containedIn[]")

uniq("containedIn[]")

@@ -231,6 +237,7 @@ end

replace_all("primaryForm[].*.id", "^\\(DE-605\\)(.*)$", "http://lobid.org/resources/$1#!")
replace_all("primaryForm[].*.id", "^\\(DE-600\\)(.*)$", "http://lobid.org/resources/ZDB-$1#!")
call_macro("lobidResourcesFallbackLabel",field:"primaryForm[]")

# secondaryForm

@@ -250,6 +257,9 @@ do list(path: "77608", "var":"$i")
end
end

call_macro("lobidResourcesFallbackLabel",field:"secondaryForm[]")



# 856 - Electronic Location and Access (R) - Subfield: $u (R) $3 (NR)
# 1. Indicator: 4 = HTTP
@@ -311,6 +321,7 @@ do list(path:"doi[]","var":"$i")
prepend("fulltextOnline[].$last.id","https://doi.org/")
copy_field("fulltextOnline[].$last.id", "sameAs[].$append.id")
add_field("fulltextOnline[].$last.label", "DOI-Link")
add_field("sameAs[].$last.label", "DOI-Link")
end

# urn for fullTextOnline and sameAs
@@ -319,6 +330,7 @@ do list(path:"@urnLinks","var":"$i")
copy_field("$i", "fulltextOnline[].$append.id")
copy_field("fulltextOnline[].$last.id", "sameAs[].$append.id")
add_field("fulltextOnline[].$last.label", "URN-Link")
add_field("sameAs[].$last.label", "URN-Link")
end

if is_empty("@urnLinks")
@@ -327,6 +339,7 @@ if is_empty("@urnLinks")
prepend("fulltextOnline[].$last.id","https://nbn-resolving.org/")
copy_field("fulltextOnline[].$last.id", "sameAs[].$append.id")
add_field("fulltextOnline[].$last.label", "URN-Link")
add_field("sameAs[].$last.label", "URN-Link")
end
end

@@ -406,6 +419,7 @@ end
replace_all("related[].*.id", "^\\(DE-605\\)(.*)$", "http://lobid.org/resources/$1#!")
replace_all("related[].*.id", "^\\(DE-600\\)(.*)$", "http://lobid.org/resources/ZDB-$1#!")
replace_all("related[].*.label","<<|>>","")
call_macro("lobidResourcesFallbackLabel",field:"related[]")

add_array("inCollection[]")

@@ -544,6 +558,9 @@ do list(path:"912 ", "var":"$i")
replace_all("inCollection[].$last.id", "(ZDB-[0-9]{1,6}-[a-zA-Z|0-9\\-]*).*", "http://lobid.org/organisations/$1#!")
copy_field("inCollection[].$last.id","$i.@label")
lookup("$i.@label","lobidOrgLabels",delete:"true")
unless exists("$i.@label")
add_field("$i.@label","lobid Organisation ZDB Collection")
end
move_field("$i.@label","inCollection[].$last.label")
end
end
@@ -558,28 +575,30 @@ do list(path:"962 ", "var":"$i")
unless any_match("$j", "^ZDB.*")
copy_field("$j", "inCollection[].$append.id")
replace_all("inCollection[].$last.id", "^(.*)$", "https://lobid.org/collections#$1")
# TODO: Do we need a label?
copy_field("inCollection[].$last.id","inCollection[].$last.label")
lookup("inCollection[].$last.label","collectionLabels")
if any_match("inCollection[].$last.label","https://lobid.org/collections.*") #Fallback label for hbz collections
replace_all("inCollection[].$last.label","https://lobid.org/collections#(.*)","$1 Collection")
end
end
end
end
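
The collection hunk above copies the id into the label, runs it through `collectionLabels`, and rewrites any label that still looks like a collection URL. That last check suggests `lookup` without `delete:"true"` leaves unmatched values in place. The Java sketch below illustrates this under that assumption. The class name, method names and the map entry are hypothetical.

```java
import java.util.Map;

/** Sketch of the collection label lookup with its regex fallback. */
final class CollectionLabelSketch {

    // Hypothetical sample entry standing in for collectionLabels.tsv.
    static final Map<String, String> COLLECTION_LABELS =
            Map.of("https://lobid.org/collections#sample", "Sample collection");

    static String collectionId(String code) {
        return "https://lobid.org/collections#" + code;
    }

    static String collectionLabel(String id) {
        // Assumption: an id missing from the map is kept as-is by the lookup.
        String label = COLLECTION_LABELS.getOrDefault(id, id);
        // Fallback: a label that is still a collection URL becomes "<code> Collection".
        if (label.matches("https://lobid.org/collections.*")) {
            label = label.replaceAll("https://lobid.org/collections#(.*)", "$1 Collection");
        }
        return label;
    }

    public static void main(String[] args) {
        System.out.println(collectionLabel(collectionId("sample")));
        System.out.println(collectionLabel(collectionId("xyz")));
    }
}
```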


# 960 ## no Information about repeatability
# TODO: This needs further inspection if we need a collection fr all subfields: https://service-wiki.hbz-nrw.de/display/VDBE/960+-+Selektionskennzeichen+NZ
# Values from r can be invalid.

# do list(path:"960??", "var":"$i")
# do list(path:"$i.?", "var": "$j")
# copy_field("$j", "inCollection[].$append.id")
# replace_all("inCollection[].$last.id", "^(.*)$", "https://lobid.org/collections#$1")
# replace_all("inCollection[].$last.id", "^(.*)$", "http://lobid.org/collections#$1")
# # TODO: Do we need a label? https://github.com/hbz/lobid-resources/issues/1305#issuecomment-912312471, also labels seem wrong.
# end
# end


add_array("inCollection[].*.type[]","Collection")


# predecessor

# 780 - Preceding Entry (R) - Subfield: $t (NR), $w (R)
@@ -603,7 +622,7 @@ end

replace_all("predecessor[].*.id", "^\\(DE-605\\)(.*)$", "http://lobid.org/resources/$1#!")
replace_all("predecessor[].*.id", "^\\(DE-600\\)(.*)$", "http://lobid.org/resources/ZDB-$1#!")

call_macro("lobidResourcesFallbackLabel",field:"predecessor[]")

replace_all("predecessor[].*.label","Vorg. ---> ","")

4 changes: 4 additions & 0 deletions src/main/resources/alma/fix/titleRelatedFields.fix
@@ -216,13 +216,17 @@ if exists("publication[].$first")
unless any_match("008","^.{18}[#\\| u].*$") # filters out not matching values and also the value unknown
copy_field("008","publication[].$first.frequency[].$append.id")
replace_all("publication[].$first.frequency[].$last.id", "^.{18}(.).*$", "http://marc21rdf.info/terms/continuingfre#$1")
copy_field("publication[].$first.frequency[].$last.id","publication[].$first.frequency[].$last.label")
lookup("publication[].$first.frequency[].$last.label","marc-publication-frequency-label")
end
elsif any_match("006","^s.*$")
do list(path: "006", "var":"$z")
if any_match("$z","^s.*$")
unless any_match("$z","^.[#\\| u].*$")
copy_field("$z","publication[].$first.frequency[].$append.id")
replace_all("publication[].$first.frequency[].$last.id", "^.(.).*$", "http://marc21rdf.info/terms/continuingfre#$1")
copy_field("publication[].$first.frequency[].$last.id","publication[].$first.frequency[].$last.label")
lookup("publication[].$first.frequency[].$last.label","marc-publication-frequency-label")
end
end
end
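
The frequency hunk above takes the character at position 18 of the 008 field, builds a `continuingfre` URI from it, and maps that URI to a label through `marc-publication-frequency-label`. The Java sketch below illustrates that path, assuming Java regex semantics for `any_match` and `replace_all`. The class and method names are hypothetical, only three of the map entries are copied over, and the length guard is an addition of the sketch.

```java
import java.util.Map;
import java.util.Optional;

/** Sketch of deriving frequency id and label from a MARC 008 value. */
final class FrequencySketch {

    // Three entries copied from the put_map in maps.fix; the real map has more.
    static final Map<String, String> FREQUENCY_LABELS = Map.of(
            "http://marc21rdf.info/terms/continuingfre#m", "monatlich",
            "http://marc21rdf.info/terms/continuingfre#b", "alle zwei Monate",
            "http://marc21rdf.info/terms/continuingfre#g", "alle zwei Jahre");

    static Optional<String> frequencyId(String field008) {
        // Guard added for the sketch: position 18 must exist.
        if (field008.length() < 19) {
            return Optional.empty();
        }
        // Same filter as the Fix: skip '#', '|', blank and 'u' (unknown).
        if (field008.matches("^.{18}[#\\| u].*$")) {
            return Optional.empty();
        }
        return Optional.of(field008.replaceAll(
                "^.{18}(.).*$", "http://marc21rdf.info/terms/continuingfre#$1"));
    }

    static Optional<String> frequencyLabel(String field008) {
        // An id missing from the map is kept as the label, like a lookup without delete.
        return frequencyId(field008).map(id -> FREQUENCY_LABELS.getOrDefault(id, id));
    }

    public static void main(String[] args) {
        // Synthetic 008 value: 18 filler characters followed by the frequency code.
        String field008 = "000000000" + "000000000" + "m";
        System.out.println(frequencyId(field008));
        System.out.println(frequencyLabel(field008));
    }
}
```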