- remove rows 1,2 (title, empty)
- remove empty columns A and G
- save both the DE and EN sheets as CSV (to allow the following operations)
  - CSV export: check "Quote all text cells" to avoid issues with commas within cells
  - from this point onward, work only on the CSVs and not the .xlsx
- add headers for columns A and B. EN: `Subject Number` and `Subject`; DE: `Fachnummer` and `Fach`
- add to header (row 1) "Subject Area" and "Scientific Discipline" in columns D, E
- remove header rows (except row 1): 57, 137, 169
- remove empty rows (search in column A)
- fill in the missing values (in the Review Board, Subject Area, and Scientific Discipline columns) - this is tedious but important, as we cannot rely on merged cells in the CSV, and these values are at the core of the tree structure
- simplified the number notation by removing the dots, i.e. '1.22-01' -> '122-01'. Used the regex `^(\d)\.(\d)` with replacement `$1$2`, and the regex `,"(\d)\.(\d\d)` with replacement `,"$1$2` for references to parent nodes
- just a copy-pasta
- ensure that EN comes before the DE terms
- headers should be in the following sequence:
  - Subject Number
  - Subject
  - Review Board
  - Subject Area
  - Scientific Discipline
  - Fachnummer
  - Fach
  - Fachkollegium
  - Fachgebiet
  - Wissenschaftsbereich
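
Some of the manual steps above could probably be scripted. The following is only a minimal Python sketch, not the procedure that was actually used. The function names (`strip_dots`, `fill_down`, `merge_en_de`) are hypothetical. It assumes that blank cells left by merged spreadsheet cells can be filled with the last value seen above them, and that the EN and DE rows line up one-to-one. The regexes are the ones from the notes, written with Python's `\1\2` backreferences instead of `$1$2`.

```python
import re

EN_HEADERS = ["Subject Number", "Subject", "Review Board",
              "Subject Area", "Scientific Discipline"]
DE_HEADERS = ["Fachnummer", "Fach", "Fachkollegium",
              "Fachgebiet", "Wissenschaftsbereich"]


def strip_dots(line):
    """Apply the two substitutions from the notes to one raw CSV line."""
    # Number at the start of the line, e.g. 1.22-01 -> 122-01
    line = re.sub(r'^(\d)\.(\d)', r'\1\2', line)
    # Quoted references to parent nodes, e.g. ,"1.22 -> ,"122
    line = re.sub(r',"(\d)\.(\d\d)', r',"\1\2', line)
    return line


def fill_down(rows, columns):
    """Fill blank cells in `columns` with the last non-blank value above.

    Assumption: a blank cell is the remainder of a merged cell, so it
    should inherit the value of the nearest filled cell above it.
    """
    last_seen = {}
    filled = []
    for row in rows:
        row = dict(row)  # copy, so the input rows stay untouched
        for col in columns:
            if row.get(col, "").strip():
                last_seen[col] = row[col]
            else:
                row[col] = last_seen.get(col, "")
        filled.append(row)
    return filled


def merge_en_de(en_rows, de_rows):
    """Join EN and DE rows pairwise, with the EN columns first.

    Assumption: both lists have the same length and the same row order.
    """
    if len(en_rows) != len(de_rows):
        raise ValueError("EN and DE sheets have different row counts")
    merged = []
    for en, de in zip(en_rows, de_rows):
        row = {h: en[h] for h in EN_HEADERS}
        row.update({h: de[h] for h in DE_HEADERS})
        merged.append(row)
    return merged
```

The rows are plain dictionaries keyed by header, as produced by `csv.DictReader`. When writing the result, `csv.DictWriter` with `fieldnames=EN_HEADERS + DE_HEADERS` and `quoting=csv.QUOTE_ALL` would keep the header sequence listed above and quote every cell.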