Kielipankki Korp frontend upgrade to 2023 MM DD (9.3.0)
- Goal: Merge our additions or modifications feature by feature and create pull requests.
- Our features are in separate branches, merged to the current master branch. Many of them are based on a single base commit in Språkbanken’s code, but some depend on each other.
- Språkbanken’s code has been refactored in places, so simply merging (or rebasing) our existing feature branches onto the new code will most likely not work in general. The changes could perhaps be rebased interactively (to retain the original commit information), but in most cases that would require moving and modifying the changes. In that case, it might also be better to make some changes (such as converting configuration variable access) in a separate commit.
- Some of our modifications have been made obsolete or unnecessary by the changes and fixes in Språkbanken’s code.
- Our frontend configuration also contains code that will need to be changed because of changes in Språkbanken’s code. At least the following needs to be done:
  - `config.js` needs to be replaced with `config.yml`.
  - Extended search components should be defined differently: they should not be in-line in attribute definitions but in `custom/extended.js` and referred to by name in attribute definitions.
- Corpus configuration has moved to the backend, so most of the configuration will be moved there.
- However, some JavaScript code will remain in the frontend configuration.
- Because of this, the backend corpus configurations will need to be functional before the updated frontend can be used.
- Configuration variables should be accessed as `settings["config_var"]`, not `settings.configVar` as previously.
- Some code has been converted from AngularJS controllers to components, in preparation for moving to Vue.js.
- The corpus selector has been completely reimplemented.
- Language codes now use ISO 639-2 (three-letter codes) instead of two-letter codes.
- Authentication has been reimplemented, apparently with Shibboleth support, so we should probably check whether we could also use the new implementation.
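The switch from two-letter to three-letter language codes affects every place where our configuration keys something by language. The following is only a minimal sketch of such a conversion: the mapping table, the function names and the shape of the `labels` object are assumptions made for illustration, not part of Språkbanken’s code.

```javascript
// Assumed mapping for illustration; extend it with the languages actually in use.
const TWO_TO_THREE_LETTER = new Map([
    ["fi", "fin"],
    ["sv", "swe"],
    ["en", "eng"],
]);

// Hypothetical helper: return the three-letter code for a two-letter code.
// Codes that already have three letters are returned unchanged.
function toThreeLetterCode(code) {
    if (code.length === 3) {
        return code;
    }
    const converted = TWO_TO_THREE_LETTER.get(code);
    if (converted === undefined) {
        throw new Error(`No three-letter code known for "${code}"`);
    }
    return converted;
}

// Hypothetical helper: convert the keys of an object keyed by language code.
function convertLanguageKeys(labels) {
    return Object.fromEntries(
        Object.entries(labels).map(([lang, text]) => [toThreeLetterCode(lang), text])
    );
}

// Example: the keys fi, sv and en become fin, swe and eng.
console.log(convertLanguageKeys({ fi: "sana", sv: "ord", en: "word" }));
```

Throwing on an unknown code is a deliberate choice here, so that an unconverted language does not silently slip through.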
Branches have several different prefixes, which often carry some meaning:
- `dev-v9-kp-`: Strictly Kielipankki-specific branches. We should find a different way of implementing these changes (maybe as plugins or via configuration), so that these branches would not be needed.
- `f-v9-`: These branches may need more work. Some of them may have become obsolete, and many features should be discussed with Språkbanken first, as the present approach might no longer work or be the best one.
- `fix-`: These contain bug fixes, but the bugs might already have been fixed at Språkbanken.
- `t-`: In general, these might be most relevant to Språkbanken and perhaps also easiest to port.
- `dev-v9-kp-mods`: The only thing this branch currently does is to disable the GitHub workflow “Korp with SB config”, which requires Språkbanken’s configuration. It would be better if we could make do without this, but how?
- `dev-v9-kp-ui-style`: This currently only adds Kielipankki logo images. They should probably be moved to the frontend, but using images from the configuration will probably require changes elsewhere. It should be discussed with Språkbanken how that could be done.
- `f-v9-allow-no-preselected-corpora`: Allow selecting no corpora initially (configurable).
- `f-v9-configurable-lemgram-completion`: Support configurable lemgram completion, instead of using only Karp.
- `f-v9-corpusfolder-extra-nonfolder-props`: Specify a list of property names in a corpus folder that do not designate a subfolder. This has most probably been obsoleted by changes in the corpus configuration format.
- `f-v9-generate-ui-lang-menu`: Specify UI languages in the configuration instead of editing `index.pug`. This is most probably obsolete, as similar functionality has been implemented by Språkbanken.
- `f-v9-grouped-select`: Add a `datasetSelectController` with value grouping to the extended search.
- `f-v9-handle-unavail-corpora`: Support handling configured but unavailable corpora (configurable behaviour). This requires that the backend `/corpus_info` endpoint recognizes the parameter `report_undefined_corpora`.
- `f-v9-load-customstyles`: Load custom styles (modifications) in the configuration. There might be better ways to implement this kind of functionality, so this could be discussed (in GitHub) with Språkbanken.
- `f-v9-main-menu-include`: This will need to be implemented differently, as Pug is no longer used as the source format for `index.html`. This should be discussed with Språkbanken.
- `f-v9-merge-transl-files`: A Webpack configuration modification to merge `locale-*.json` translation files from configuration and plugins, to allow overriding some translations locally. There might be better ways to do this; maybe ask in a GitHub discussion?
- `f-v9-plugin-facility`: A plugin facility for the Korp frontend. In addition to `plugins.js`, the changes include hook points in many parts of the code that may have been refactored. The Webpack configuration also contains code for discovering plugins. Some hook points may have become obsolete.

  This incorporates `f-v9-merge-transl-files` (see the previous item) and `f-v9-pug-multiple-paths-plugin` (see the next item), so some major changes are likely to be required for the plugins to work as they currently do.
- `f-v9-pug-multiple-paths-plugin`: It seems that as of 2023-02-08, Pug is no longer used as the source format for `index.html`, so this is apparently no longer relevant.

  We should then find another way to override e.g. the main menu without having to modify the main code for that. We should probably ask in a GitHub discussion what would be a good way to do that.
- `fix-backend-based-kwic-download`: Fix the KWIC download that uses the backend CGI script `korp_download.cgi`. Språkbanken has implemented KWIC download in the frontend, so they do not currently use the backend-based approach, and some changes elsewhere had broken it.

  The backend CGI script should be converted to an endpoint plugin for the current Korp backend.
- `fix-corpuschooser-intermediate`: Fix the corpus chooser to correctly mark folders containing both locked and unlocked corpora. This may have become obsolete, or at least it probably needs to be rewritten, because the corpus chooser has been completely reimplemented.
- `fix-corpuschooser-multiple-linked-langs`: Fix the corpus chooser to support multiple linked languages in parallel corpora. We should check if the functionality has been implemented at Språkbanken, but at least this has to be rewritten as the corpus chooser has been reimplemented.
- `fix-require-geo-map-attr-name`: Fix to actually require `geo` in the names of map attributes. This has been fixed at Språkbanken, so this is obsolete.
- `fix-sidebar-dataset-attr-mapping`: Map values of attributes with non-array datasets through the dataset mapping to show their values in the sidebar. It might be good to ask if this is the desired behaviour or not.
- `t-calc-map-center-dynamically`: Calculate the map centre coordinates dynamically based on the points on the map.
- `t-config-attr-sidebarHideLabel`: Support attribute property `sidebarHideLabel` to hide a label in the sidebar. At least the name of the property should be converted to `sidebar_hide_label`.
- `t-copy-markup-from-config`: A Webpack configuration change to copy markup files from the configuration directory. This might be obsolete, or at least we should discuss how to implement it.
- `t-default-transl-langs`: Configure languages to fall back to if a translation is not found for a UI item or attribute value. We should check how this currently works in Språkbanken’s Korp.
- `t-faster-attribute-mapping`: Speed up some functions used in the corpus selector. We should check if this is still relevant.
- `t-localize-kwic`: Allow localizing “KWIC” in the Korp UI. This is a minor change that will have to be rewritten because Pug is no longer used.
- `t-localize-support-within-postposition`: Support adding a postposition after the “within” selection list. A tiny change that will have to be partly rewritten because Pug is no longer used.
- `t-mode-switch-restore-params`: Support saving and restoring hash parameters when switching modes in Korp (configurable).
- `t-reduce-attr-select-width-in-stylesheet`: Move the width of the statistics attribute selection list to the stylesheet instead of having it fixed in the Pug/HTML code.
- `t-remove-fixed-sv`: Remove fixed references to Swedish as the default UI language, and allow configuring the locales used by different UI languages. We should check if this is needed any more.
- `t-settings-SimpleSearchGetLemgramCQP`: Support a configurable way to construct lemgrams in the simple search, instead of using Språkbanken’s default. This probably requires changes because of changes in the configuration system: we should find out where the function for constructing lemgrams should be defined.
- `t-settings-isKorpLabsURL`: Specify a function in the configuration to test whether the “Korp Labs” version of Korp (a test version) is being used. This probably requires changes because of changes in the configuration system: we should find out where this test function should be defined.
- `t-settings-logoKorpVersion`: Allow specifying in the configuration the version number shown with the Korp logo. This is not essential: it was implemented so that Korp Labs would also show “v9” instead of “v10” as in Språkbanken’s Korp.
- `t-sidebar-video-configurable-size`: Allow configuring the sidebar video size, instead of using `100%`.
- `t-stats_cqp-pass-attribute-name`: Pass the attribute name to a `stats_cqp` function to allow the same custom function to be used for different attributes.
- `t-tokens-in-custom-attr-patterns`: Allow referring to all tokens in the hit in custom attribute patterns.
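As an illustration of the fallback behaviour described for `t-default-transl-langs` above, the lookup could work roughly as in the following sketch. The shape of the `translations` object, the function name and the choice to return the key itself as a last resort are assumptions for illustration; they do not describe how the branch or Språkbanken’s Korp actually implements this.

```javascript
// Hypothetical lookup: try the requested language first, then each configured
// fallback language in order, and finally return the key itself.
// `translations` is assumed to map language code -> (key -> translated text).
function translate(translations, key, lang, fallbackLangs) {
    for (const candidate of [lang, ...fallbackLangs]) {
        const table = translations[candidate];
        if (table && Object.prototype.hasOwnProperty.call(table, key)) {
            return table[key];
        }
    }
    return key;
}

// Example data (made up): the Finnish table lacks a translation for "kwic".
const exampleTranslations = {
    fin: { search: "Haku" },
    eng: { search: "Search", kwic: "KWIC" },
};

console.log(translate(exampleTranslations, "search", "fin", ["eng"]));
console.log(translate(exampleTranslations, "kwic", "fin", ["eng"]));
```

The first call finds the Finnish translation directly; the second falls back to the English table because the Finnish one has no entry for the key.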
Some features in Kielipankki Korp 5 have not yet been ported to Kielipankki Korp frontend 9. These features are in most cases not in separate branches, so the relevant changes should be extracted based on commit information and diffs. At least some of them could or should perhaps be implemented as plugins.
The features include at least the following:
- Restricted corpora modal: When the user tries to access a restricted corpus, a modal dialogue opens informing them of that and offering login.
- Support a “prequery” in the simple search: Search only within structures (texts, paragraphs or sentences) containing the given word forms or lemmas. This should perhaps be implemented as a plugin or at least be configurable.
- Support implicitly adding a given CQP expression between any two tokens in the extended search, defined corpus-wise. This is useful for corpora containing some kind of markup represented as tokens.
- Plugins are in the Kielipankki-korp-frontend repository as branches `plugins/*`, merged to `plugins/master`.
  - `about_modifier`
  - `config_augment_info`
  - `config_corpus_features`
  - `config_corpusinfo_copier`
  - `config_corpus_settings_modifier`
  - `config_licence_category`
  - `config_logical_corpora`
  - `config_sidebar_order`
  - `config_url_opts`
  - `corpus_aliases`
  - `corpuschooser_item_formatter`
  - `corpuschooser_prompt_empty`
  - `corpusinfo_formatter`
  - `news_banner`
  - `shibboleth_auth`
  - `sidebar_link_section`
- Frontend configuration is in the Kielipankki-korp-frontend repository as `config/master`, with test instances as `config/*`.
- Some of the JavaScript code that has been in the mode files needs to be moved to `custom` and referenced by name from the corpus configuration in the backend.
  - Some custom code has not been ported even to Korp 9.1.0 yet (e.g. ScotsCorr).
- Configuration variables should be changed from camelCase to snake_case.
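The renaming from camelCase to snake_case could be partly mechanized. The following is a minimal sketch, assuming a flat settings object; the helper names `camelToSnake` and `convertKeys` are made up for illustration. The mechanically converted names may not always match the names that Språkbanken’s code actually uses, so each result should still be checked by hand.

```javascript
// Hypothetical helper: convert a camelCase name to snake_case by inserting an
// underscore between a lower-case letter or digit and a following upper-case letter.
function camelToSnake(name) {
    return name.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// Hypothetical helper: return a copy of a flat settings object with converted keys.
function convertKeys(oldSettings) {
    return Object.fromEntries(
        Object.entries(oldSettings).map(([key, value]) => [camelToSnake(key), value])
    );
}

// Example with made-up values: access uses the bracket form with the new names.
const convertedSettings = convertKeys({ configVar: 1, sidebarHideLabel: true });
console.log(convertedSettings["config_var"], convertedSettings["sidebar_hide_label"]);
```

Runs of capitals are kept together by this sketch, so a name such as `isKorpLabsURL` becomes `is_korp_labs_url`.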