[WIP] Convert private WebUI strings to i18next #20610

sledgehammer999 · 2024-03-26T00:13:15Z

This draft PR is meant for collaboration for the i18next transition of the WebUI private part. The public part was done with PR #20520

You can use my commits into your own work without any kind of attribution.
Each commit provides the regex I used for each converstion from QBT_TR(string)QBT_TR to the i18next way.

There are .html files with nested elements that the regex hasn't converted. I think they need manucal conversion. About 124 instances.
Almost the same issue with the 3 remaining instances in the .js files.

@Chocobo1
For translating HTML attributes we need to do some like the following. For example:

<textarea placeholder="Format: IPv4:port / [IPv6]:port"></textarea>

<textarea class="qbt-translatable" data-i18n="[placeholder]Format: IPv4:port / [IPv6]:port" placeholder="Format: IPv4:port / [IPv6]:port"></textarea>

Aka we set the data-i18n attribute's value to [attribute-name]string. And if we have multiple attributes to translate for the same HTML element we separate it with ; eg [attribute-name1]string1;[attribute-name2]string2
However, I haven't tested if i18next-parser can correctly break out the different strings.
Source: https://stackoverflow.com/a/22822236/10260481

@ OTHER contributors
If you have time please take look at the diffs and try to verify the conversion.
Example:
This HTML

<p class="random">QBT_TR(Some text)QBT_TR</P>

becomes this:

<p class="random qbt-translatable" data-i18n="Some text">Some text</P>

and this Javascript:

let a = 'QBT_TR(Some text)QBT_TR';

becomes this:

let a = i18next.t('Some text');

Search: (<((?!class=).)*?)>QBT_TR$(.+?)$QBT_TR(\[CONTEXT=.+?\])? Replace: $1 class="qbt-translatable" data-i18n="$3">$3 File glob: src/webui/www/private/**/*.html

Search: (<.+?class=".+?)(".*?)>QBT_TR$(.+?)$QBT_TR(\[CONTEXT=.+?\])? Replace: $1 qbt-translatable$2 data-i18n="$3">$3 File glob: src/webui/www/private/**/*.html

Search: (<(?:(?!class=).)+?)(\S+?)="QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?" Replace: $1class="qbt-translatable" data-i18n="[$2]$3" $2="$3" File glob: src/webui/www/private/**/*.html

Search: (<.+?class=".+?)(".*?)(\S+?)="QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?" Replace: $1 qbt-translatable$2data-i18n="[$3]$4" $3="$4" File glob: src/webui/www/private/**/*.html

Search: (<.+?data-i18n=".+?)(".*?)(\S+?)="QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?" Replace: $1;[$3]$4$2$3="$4" File glob: src/webui/www/private/**/*.html

Search: (["'])QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?\1 Replace: i18next.t($1$2$1) File glob: src/webui/www/private/**/*.html

Manual conversion to template literal Search: (["'])QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?\s*?\1 Replace: i18next.t($1$2$1) File glob: src/webui/www/private/**/*.html

Search: `(.*?)QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])? Replace: `$1\${i18next.t('$2')} File glob: src/webui/www/private/**/*.html

Search: (["'])(\s*? )?QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?((\\n)*?)\1 Replace: `$2${i18next.t($1$3$1)}$4` File glob: src/webui/www/private/**/*.html

Search: (["'])QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?\1 Replace: i18next.t($1$2$1) File glob: src/webui/www/private/**/*.js

Search: (["'])(\s*? )?QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?\1 Replace: `$2${i18next.t($1$3$1)}` File glob: src/webui/www/private/**/*.js

Manual conversion

sledgehammer999 · 2024-03-26T00:15:23Z

Damn it. I worked on an older state of the master branch.
I'll update it tomorrow.
You get the idea. Please comment if you any comment for the methodology.

sledgehammer999 · 2024-03-26T00:50:52Z

Aka we set the data-i18n attribute's value to [attribute-name]string. And if we have multiple attributes to translate for the same HTML element we separate it with ; eg [attribute-name1]string1;[attribute-name2]string2
However, I haven't tested if i18next-parser can correctly break out the different strings.

@Chocobo1
I looked at the source of the HTMLLexer. Very simple code. It does indeed what I describe above.
It also has this tidbit in the end:

// remove any leading [] in the key
key = key.replace(/^\[[a-zA-Z0-9_-]*\]/, '')

// if empty grab innerHTML from regex
key = key || $node.text()

So we can probably de-duplicate the HTML files. Instead of doing this:

<p class="random qbt-translatable" data-i18n="Some text">Some text</P>

we can do this:

<p class="random qbt-translatable">Some text</P>

And our loading function will use the .text() as key if the data-i18n attribute isn't present.

Chocobo1 · 2024-03-26T04:45:37Z

src/webui/www/private/index.html

@@ -43,7 +43,7 @@

 <body>
    <noscript id="noscript">
-        <h1>QBT_TR(JavaScript Required! You must enable JavaScript for the WebUI to work properly)QBT_TR[CONTEXT=HttpServer]</h1>
+        <h1 class="qbt-translatable" data-i18n="JavaScript Required! You must enable JavaScript for the WebUI to work properly">JavaScript Required! You must enable JavaScript for the WebUI to work properly</h1>


this is noscript, no point in providing translation strings.

Chocobo1 · 2024-03-26T08:07:33Z

Aka we set the data-i18n attribute's value to [attribute-name]string. And if we have multiple attributes to translate for the same HTML element we separate it with ; eg [attribute-name1]string1;[attribute-name2]string2
Source: https://stackoverflow.com/a/22822236/10260481

It seems the link was using jquery-i18next which we don't use and therefore we don't have .i18n(). This function seems to be capable of parsing the [attribute] part and applying the translated text (to various element types). We probably need to write our own version.

So we can probably de-duplicate the HTML files. Instead of doing this:
Some text
we can do this:
Some text
And our loading function will use the .text() as key if the data-i18n attribute isn't present.

I tried to apply it to the public login page. There are a few issues with it:

i18next-parser couldn't extract the the translation string.
Probably it only recognize data-i18n by default. Using a empty data-i18n would work though:
```
Some text
```
We would need to manually handle and apply the [attribute] part.
i18next itself doesn't seem to provide function for it.

I quickly skimmed through the changes. It seems that the CONTEXT part was thrown away. Would it be better to preserve it in some way?
i18next supports it but I reckon it might be confusing for the translators. I mean the context string would be appended to the key name and that might confuse the translators. Also I didn't found a way to have context on html.
https://www.i18next.com/translation-function/context#basic

Or perhaps we really need to write our own custom string extractor...

sledgehammer999 · 2024-03-26T08:31:04Z

We would need to manually handle and apply the [attribute] part

Yes, because we use the vanilla version. We are responsible for applying the translated strings.
It should be fairly easy to implement in our replaceI18nText() function though.

Using a empty data-i18n would work though:

I wonder if we could further optimize replaceI18nText() to filter nodes that have the data-i18n attr and drop the requirement for adding the qbt-translatable class.

I quickly skimmed through the changes. It seems that the CONTEXT part was thrown away. Would it be better to preserve it in some way?
i18next supports it but I reckon it might be confusing for the translators. I mean the context string would be appended to the key name and that might confuse the translators. Also I didn't found a way to have context on html.

I wanted to also talk about this.
I am not sure what to do here. I understand the value of having context but the ergonomics don't fit our workflow. i18next is designed around having a single variable-like key and not a string-like key.
i18next-parser has a config setting contextSeparator: '_'. But I am not sure what the hell Transifex translators will be presented with.

Or perhaps we really need to write our own custom string extractor...

The regex I have in each commit can probably be made to also capture the CONTEXT value. However, i18next's JSON format saves it as "key_context: translated string".

Or you mean having our own format for runtime and extraction? This might work. I don't know how hard the implementation will be.
In any case, we also need to investigate how it will work with Transifex.

Chocobo1 · 2024-03-28T03:48:13Z

I wonder if we could further optimize replaceI18nText() to filter nodes that have the data-i18n attr and drop the requirement for adding the qbt-translatable class.

Maybe this, I haven't test: https://stackoverflow.com/questions/2694640/find-an-element-in-dom-based-on-an-attribute-value

I quickly skimmed through the changes. It seems that the CONTEXT part was thrown away. Would it be better to preserve it in some way?
i18next supports it but I reckon it might be confusing for the translators. I mean the context string would be appended to the key name and that might confuse the translators. Also I didn't found a way to have context on html.

I wanted to also talk about this. I am not sure what to do here. I understand the value of having context but the ergonomics don't fit our workflow. i18next is designed around having a single variable-like key and not a string-like key. i18next-parser has a config setting contextSeparator: '_'. But I am not sure what the hell Transifex translators will be presented with.

Transifex would very likely to present the full string which is not what we want for the translators.

I thought about this and I reckon that the CONTEXT isn't that useful. It would be useful if the translation strings is too many to hold it in one js Object/Map and needed to use CONTEXT to access different Object/Map.
What is really useful is the disambiguation field from Qt (https://doc.qt.io/qt-6/qobject.html#tr). We do encounter cases in the past that required this field.
However JSON on Transifex doesn't support it and we would need other file format such as Gettext (.po) or Qt Linguist (.ts).
So the workflow would look like this:
HTML source:

<p class="qbt-translatable" data-i18n="Some text{disambiguation}">Some text</P>

Extracted JSON:

"Some text{disambiguation}": ""

translations_qt2i18next.py would convert the JSON file into .po (or .ts) and the {disambiguation} string will be removed from key field and moved into comment field. Then upload to Transifex.
Downloading from Transifex would reverse the process.

sledgehammer999 · 2024-03-28T08:00:00Z

I wonder if we could further optimize replaceI18nText() to filter nodes that have the data-i18n attr and drop the requirement for adding the qbt-translatable class.

Maybe this, I haven't test: https://stackoverflow.com/questions/2694640/find-an-element-in-dom-based-on-an-attribute-value

I quickly skimmed through the changes. It seems that the CONTEXT part was thrown away. Would it be better to preserve it in some way?
i18next supports it but I reckon it might be confusing for the translators. I mean the context string would be appended to the key name and that might confuse the translators. Also I didn't found a way to have context on html.

I wanted to also talk about this. I am not sure what to do here. I understand the value of having context but the ergonomics don't fit our workflow. i18next is designed around having a single variable-like key and not a string-like key. i18next-parser has a config setting contextSeparator: '_'. But I am not sure what the hell Transifex translators will be presented with.

Transifex would very likely to present the full string which is not what we want for the translators.

I thought about this and I reckon that the CONTEXT isn't that useful. It would be useful if the translation strings is too many to hold it in one js Object/Map and needed to use CONTEXT to access different Object/Map. What is really useful is the disambiguation field from Qt (https://doc.qt.io/qt-6/qobject.html#tr). We do encounter cases in the past that required this field. However JSON on Transifex doesn't support it and we would need other file format such as Gettext (.po) or Qt Linguist (.ts). So the workflow would look like this: HTML source:
Some text
Extracted JSON:
"Some text{disambiguation}": ""
translations_qt2i18next.py would convert the JSON file into .po (or .ts) and the {disambiguation} string will be removed from key field and moved into comment field. Then upload to Transifex. Downloading from Transifex would reverse the process.

I was thinking about the problem this past day and I came to a similar conclusion.
Optimize replaceI18nText() according to the SO answer. Rework the tstool.py to merge/split JSON from Qt .ts files, therefore not needing to create a new Transifex resource.
Same thoughts about disambugation/context.
Currently our QBT_TR(string)QBT_TR template doesn't support disambugation so I thought we might not need it after all. However after reading your thoughts how about this: Use i18next's context as Qt's disambugation?

Chocobo1 · 2024-03-29T07:08:42Z

Optimize replaceI18nText() according to the SO answer.

I'll experiment on public login page first and submit a PR if the results are satisfactory.

However after reading your thoughts how about this: Use i18next's context as Qt's disambugation?

Unless you mean to create our own string extractor, I think there are some problems with i18next context (and probably i18next-parser):

The context will prepend an underscore and appended to the key. The resulting key would be ambiguous to decide which is which. Imagine there are underscores in both key and disambugation field. This might introduce problems to our auxiliary python scripts.
The resulting key will confuse translators on Transifex. They can't tell which is disambugation and will probably translate the string as a whole.
The data-i18n attribute doesn't have a field for disambugation. Only i18next.t() have it.

Which is why I used the form: string{disambugation} (or string{{disambugation}}).
https://www.i18next.com/translation-function/context#basic

sledgehammer999 · 2024-03-29T08:10:25Z

Optimize replaceI18nText() according to the SO answer.

I'll experiment on public login page first and submit a PR if the results are satisfactory.

Yeah, I wanted to introduce a PR this weekend too.

However after reading your thoughts how about this: Use i18next's context as Qt's disambugation?

Unless you mean to create our own string extractor, I think there are some problems with i18next context (and probably i18next-parser):
1. The `context` will prepend an underscore and appended to the key. The resulting key would be ambiguous to decide which is which. Imagine there are underscores in both `key` and `disambugation` field. This might introduce problems to our auxiliary python scripts.

2. The resulting key will confuse translators on Transifex. They can't tell which is disambugation and will probably translate the string as a whole.

3. The `data-i18n` attribute doesn't have a field for `disambugation`. Only `i18next.t()` have it.
Which is why I used the form: string{disambugation} (or string{{disambugation}}). https://www.i18next.com/translation-function/context#basic

I have slightly different understanding on how these work and how I intend them to work:

The context separator character (aka underscore) is configurable in both i18next and i18next-parser. We can choose another character that is highly unlikely to appear as a legit character in the english string.
The data-i18n attribute will need extra parsing logic anyway on our end to support multiple keys separated via ; and prefixed with [attr-name]. We just add logic to separate out the context too.
Our python script will separate the context too, and insert it into the <comment> tag of Qt's .ts files. The .ts file will be on Transifex and Transifex will correctly present the main string and the comment as separate things.

Maybe my thoughts will be conveyed better in code. I'll try to have a PR ready today/tomorrow.

sledgehammer999 · 2024-03-31T02:53:49Z

Take a look at #20643 first.

sledgehammer999 added 13 commits March 25, 2024 23:15

Convert translatable strings part1

93a4b98

Search: (<((?!class=).)*?)>QBT_TR$(.+?)$QBT_TR(\[CONTEXT=.+?\])? Replace: $1 class="qbt-translatable" data-i18n="$3">$3 File glob: src/webui/www/private/**/*.html

Convert translatable strings part2

ec528f2

Search: (<.+?class=".+?)(".*?)>QBT_TR$(.+?)$QBT_TR(\[CONTEXT=.+?\])? Replace: $1 qbt-translatable$2 data-i18n="$3">$3 File glob: src/webui/www/private/**/*.html

Convert translatable strings part3

2c6f63d

Search: (<(?:(?!class=).)+?)(\S+?)="QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?" Replace: $1class="qbt-translatable" data-i18n="[$2]$3" $2="$3" File glob: src/webui/www/private/**/*.html

Convert translatable strings part4

74274e7

Search: (<.+?class=".+?)(".*?)(\S+?)="QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?" Replace: $1 qbt-translatable$2data-i18n="[$3]$4" $3="$4" File glob: src/webui/www/private/**/*.html

Convert translatable strings part5

45f1891

Search: (<.+?data-i18n=".+?)(".*?)(\S+?)="QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?" Replace: $1;[$3]$4$2$3="$4" File glob: src/webui/www/private/**/*.html

Convert translatable strings part6

e45c36d

Search: (["'])QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?\1 Replace: i18next.t($1$2$1) File glob: src/webui/www/private/**/*.html

Convert translatable strings part7

9cc3f16

Manual conversion to template literal Search: (["'])QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?\s*?\1 Replace: i18next.t($1$2$1) File glob: src/webui/www/private/**/*.html

Convert translatable strings part8

709bee0

Search: `(.*?)QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])? Replace: `$1\${i18next.t('$2')} File glob: src/webui/www/private/**/*.html

Convert translatable strings part9

ccb2cce

Search: (["'])(\s*? )?QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?((\\n)*?)\1 Replace: `$2${i18next.t($1$3$1)}$4` File glob: src/webui/www/private/**/*.html

Convert translatable strings part10

c4d67dd

Search: (["'])QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?\1 Replace: i18next.t($1$2$1) File glob: src/webui/www/private/**/*.js

Convert translatable strings part11

e16ca08

Search: (["'])(\s*? )?QBT_TR$(.+?)$QBT_TR(?:\[CONTEXT=.+?\])?\1 Replace: `$2${i18next.t($1$3$1)}` File glob: src/webui/www/private/**/*.js

Convert translatable strings part12

3179e29

Manual conversion

Manually replace double classes

e205291

sledgehammer999 added WebUI WebUI-related issues/changes Translations Related to i18n or Transifex labels Mar 26, 2024

sledgehammer999 requested a review from Chocobo1 March 26, 2024 00:13

Chocobo1 reviewed Mar 26, 2024

View reviewed changes

This comment was marked as off-topic.

Sign in to view

Chocobo1 mentioned this pull request Apr 6, 2024

WebUI: Optimize i18next stuff #20643

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Convert private WebUI strings to i18next #20610

[WIP] Convert private WebUI strings to i18next #20610

sledgehammer999 commented Mar 26, 2024

sledgehammer999 commented Mar 26, 2024 •

edited

sledgehammer999 commented Mar 26, 2024

Chocobo1 Mar 26, 2024

Chocobo1 commented Mar 26, 2024

sledgehammer999 commented Mar 26, 2024

Chocobo1 commented Mar 28, 2024

sledgehammer999 commented Mar 28, 2024 •

edited

Chocobo1 commented Mar 29, 2024 •

edited

sledgehammer999 commented Mar 29, 2024 •

edited

This comment was marked as off-topic.

sledgehammer999 commented Mar 31, 2024

[WIP] Convert private WebUI strings to i18next #20610

Are you sure you want to change the base?

[WIP] Convert private WebUI strings to i18next #20610

Conversation

sledgehammer999 commented Mar 26, 2024

sledgehammer999 commented Mar 26, 2024 • edited

sledgehammer999 commented Mar 26, 2024

Chocobo1 Mar 26, 2024

Choose a reason for hiding this comment

Chocobo1 commented Mar 26, 2024

sledgehammer999 commented Mar 26, 2024

Chocobo1 commented Mar 28, 2024

sledgehammer999 commented Mar 28, 2024 • edited

Chocobo1 commented Mar 29, 2024 • edited

sledgehammer999 commented Mar 29, 2024 • edited

This comment was marked as off-topic.

sledgehammer999 commented Mar 31, 2024

sledgehammer999 commented Mar 26, 2024 •

edited

sledgehammer999 commented Mar 28, 2024 •

edited

Chocobo1 commented Mar 29, 2024 •

edited

sledgehammer999 commented Mar 29, 2024 •

edited