-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add natural translation for DSL #574
base: master
Are you sure you want to change the base?
Conversation
Maybe we could support translations of a testplan (and rollout of templates?) as
with an extra option (or argument) to pass the natural language for the translation. |
This should work. |
@pdawyndt in #559 it also says that translations for files should be provided. In what sense? - files: !natural_language
en:
- name: "file.txt"
url: "media/workdir/file.txt"
nl:
- name: "fileNL.txt"
url: "media/workdir/fileNL.txt" This seems pointless since I could also just do: - files:
- name: "file.txt"
url: "media/workdir/file.txt"
- name: "fileNL.txt"
url: "media/workdir/fileNL.txt" |
Not really pointless as TESTed will show all "linked files" to the students. For each file, TESTed will try to find its name in the expression/statement and then turn that into a hyperlink. If it doesn't find the filename, it will add it to a list of files that is displayed for the testcase. |
@pdawyndt , I started looking for adding a translation table like translation:
animal:
en: "animal"
nl: "dier"
result:
en: "result"
nl: "resultaat" Is it even usefull to then also add support in a statement like the following: - statement: !natural_language
en: 'result = Trying(10, "{animal}")'
nl: 'resultaat = Proberen(10, "{animal}")' I would suggest not even searching lookingany deeper when a natural_language map is already found and only using translation map when the expected (like a string) is given. |
I definitely have many exercises where this (the combination of translation and template strings) is useful. So I would say yes. If we use Python format strings, then we could even write your example as - statement: !natural_language
en: 'result = Trying(10, {animal!r})'
nl: 'resultaat = Proberen(10, {animal!r})' And not even bother about using single or double quotes or escaping any quotes in the thing you put in the placeholders (which otherwise adds a lot of complication on the side of the DSL-author). If you have a variable statement = statement.format(**translation, **data) If we also allow data to be an YAML-array instead of a YAML map (positional instead of named placeholders), then formatting is done by statement = statement.format(*data, **translation) For example, if '{} + {} = {}' or with explicit positions (which would also allow reodering and reusing the array values) '{0} + {1} = {2}' |
Currently I've implemented support for The
|
|
||
|
||
def create_enviroment() -> Environment: | ||
enviroment = Environment() |
Check warning
Code scanning / CodeQL
Jinja2 templating with autoescape=False Medium test
Copilot Autofix AI 26 days ago
Copilot could not generate an autofix suggestion
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to be relevant for our case. Since, we're not passing any html or xml through this.
# def represent_str(dumper, data): | ||
# return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"') |
Check notice
Code scanning / CodeQL
Commented-out code Note test
Documentation Natural language translationIn this documentation 2 things will be discussed. GloballyYou can start with a list of tabs or a directory. translations:
animals:
en: "animals"
nl: "dieren"
humans:
en: "humans"
nl: "mensen"
tabs:
- tab: "{{animals}}"
...
- tab: "{{humans}}"
... This defined a tabs:
- tab: "dieren"
...
- tab: "mensen"
... TESTed would be able to understand this again.
Inside a tabAs discussed above, a tab could have a title that can be - tab: !natural_language
en: 'animal/{{animals}}'
nl: 'dier/{{animals}}'
... In this case, the In a tab you can also define a Inside a tab, context and testcase, you can also define files: !natural_language
en:
- name: "file_{{animal}}.txt"
url: "media/workdir/file_{{animal}}.txt"
nl:
- name: "bestand_{{animal}}.txt"
url: "media/workdir/bestand_{{animal}}.txt" Inside a contextA tab can contain a context object that contains all the Inside a testcaseA testcase can contain all kinds of different things. One of those - statement: !natural_language
en: '{{result}} = Trying(10)'
nl: '{{result}} = Proberen(10)'
- expression: !natural_language
en: 'count_words({{result}})'
nl: 'tel_woorden({{result}})' For - expression: !programming_language
javascript: !natural_language
en: "{{animal}}_javascript_en(1 + 1)"
nl: "{{animal}}_javascript_nl(1 + 1)"
typescript: !natural_language
en: "{{animal}}_typescript_en(1 + 1)"
nl: "{{animal}}_typescript_nl(1 + 1)"
java: !natural_language
en: "Submission.{{animal}}_java_en(1 + 1)"
nl: "Submission.{{animal}}_java_nl(1 + 1)"
python: !natural_language
en: "{{animal}}_python_en(1 + 1)"
nl: "{{animal}}_python_nl(1 + 1)" An equivalent of this would be: - expression: !natural_language
en: !programming_language
javascript: "{{animal}}_javascript_en(1 + 1)"
typescript: "{{animal}}_typescript_en(1 + 1)"
java: "Submission.{{animal}}_java_en(1 + 1)"
python: "{{animal}}_python_en(1 + 1)"
nl: !programming_language
javascript: "{{animal}}_javascript_nl(1 + 1)"
typescript: "{{animal}}_typescript_nl(1 + 1)"
java: "Submission.{{animal}}_java_nl(1 + 1)"
python: "{{animal}}_python_nl(1 + 1)" If no For both of these, you can specify a For Next up, you can also specify
An example could be the following: stderr:
data: !natural_language
en: "Nothing to see here {{User}}"
nl: "Hier is niets te zien {{User}}"
config:
ignoreWhitespace: true You can also add a When specifying the
An example of this would be the following: return: !oracle
value: !natural_language
en: "The {{result}} 10 is OK!"
nl: "Het {{result}} 10 is OK!"
oracle: "custom_check"
file: "test.py"
name: "evaluate_test"
arguments: !natural_language
en: ["The value", "is OK!", "is not OK!"]
nl: ["Het {{result}}", "is OK!", "is niet OK!"] Lastly, there is a
An example of this would be the following: description:
description: !natural_language
en: "Eleven_{{elf}}"
nl: "Elf_{{elf}}"
format: "code" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not looked at all the implementation code itself, I mostly read the "documentation".
- I don't see any immediate conflicts in the DSL with other features, so I think the proposal is good.
- I think the choice for explicit !natural_language is a good choice, even if it adds some verbosity to the test suites. It is always easier to add shorthands later if the experience of using the feature indicates that they are needed (the reverse is much more difficult).
So in summary, I think this is a good way of adding translation to the DSL, while keeping simple things simple but still providing a fairly flexible approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all sorry for the late review.
As Niko already reviewed the docs, and I agree with his take, I have mostly looked at the code.
I have left more questions then actual comments, sorry for that, I am not very familiar with this part of the code.
I think it would be a big improvement should it be possible to write the code a bit more agnostic of the actually TESTed DSL. But it might be best to discuss this with @pdawyndt before you start a rewrite, as he might deem it more worthwhile for you to focus on other features instead of over-optimizing this one.
@@ -148,6 +184,8 @@ def _parse_yaml(yaml_stream: str) -> YamlObject: | |||
yaml.add_constructor("!" + actual_type, _custom_type_constructors, loader) | |||
yaml.add_constructor("!expression", _expression_string, loader) | |||
yaml.add_constructor("!oracle", _return_oracle, loader) | |||
yaml.add_constructor("!natural_language", _natural_language_map, loader) | |||
yaml.add_constructor("!programming_language", _programming_language_map, loader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit confused why !programming_language
is added in this pr, as I assumed this to already work before this pr. But I am not very familiar with this part of the code.
Could you explain it to me?
@@ -20,7 +20,14 @@ | |||
StringTypes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is easy to do, I would put these new tests in a separate file.
(But do keep them here if creating a second testfile requires a lot of code duplication)
@@ -0,0 +1,412 @@ | |||
import sys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must say I expected this file to be much simpler.
In principle the !natural_language
should simply be replaced by the content of the specified language in the map.
So in my mind, this code should not know whether it is working within a tab, context, testcase,...
The fact that this preprocess script is so heavily linked to the precise TESTed DSL, will make it harder to maintain in the future. Any change to the TESTed DSL will also have to be verified here.
Do you think it is possible to write a more abstract solution, or have I missed some potential issues?
|
||
|
||
def wrap_in_braces(value): | ||
return f"{{{value}}}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a standard way of doing this in jinja2?
To me it felt a bit odd. But I assume this was an explicit feature request?
You can run the pre-processor by using
python -m tested.nat_translation ./exercise/simple-example/program_language_map/suite.yaml en # English translation