Allow extracting (deeply) nested calls in Python and Javascript #1127
During the refactor, the order of extraction was also changed, as you can see in this test:
It is now the same as the extraction order of e.g.
    # NOTE: Main Comment
    _("Hello %s",
      # NOTE: Nested Comment
      _("Nested Gettext")
    )
Both terms would get their right comment extracted.
Not saying this is not worth fixing, but out of curiosity, do nested gettext calls actually come up often? I don't think I've ever come across one.
@tomasr8 In our own codebase, which has lots of developers, people assume it works, so nested gettext calls get added from time to time. Even deeply nested ones happen, like this example: https://github.com/odoo/odoo/pull/149921/files#diff-e073b7fa9d45d46ba8d7f011257b0e77e1f87bf47982abc63dd618ff05dddb1aL267-L268
e6995c9 to 9131a83
UPDATE: I added an extra commit to also allow nested calls in the Javascript extractor. If it's better to open a separate PR for that, no problem.
9131a83 to 4df7e66
Some initial comments within, including some that would make this easier to review for me 😄
babel/messages/extract.py
Outdated
function_stack.append({
    'function_line_no': line_no,
    'function_name': last_name,
    'message_line_no': None,
    'messages': [],
    'translator_comments': cur_translator_comments,
})
I think a `typing.NamedTuple` or a dataclass would be more appropriate than a dict for this state.
Makes sense. I changed it into a dataclass, since it's supposed to be mutable.
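For illustration, a minimal sketch of how the dict above might look as a mutable dataclass. The class name `FunctionStackItem` and the field types are assumptions; only the field names come from the dict in the diff, and the actual class in the PR may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FunctionStackItem:  # hypothetical name, not necessarily the one used in the PR
    function_line_no: int
    function_name: str
    message_line_no: Optional[int] = None
    # default_factory gives every instance its own list instead of a shared one
    messages: list = field(default_factory=list)
    translator_comments: list = field(default_factory=list)


function_stack: List[FunctionStackItem] = []
function_stack.append(FunctionStackItem(function_line_no=3, function_name="_"))

# Unlike a NamedTuple, a dataclass instance can be updated in place
function_stack[-1].messages.append("Hello %s")
function_stack[-1].message_line_no = 3
```

A `typing.NamedTuple` would need `_replace()` to change `message_line_no`, which is the practical reason to prefer a dataclass when the state is mutated while tokens are consumed.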
babel/messages/extract.py
Outdated
# Keep track of the (split) strings encountered
message_buffer = []

for token, value, (line_no, _), _, _ in tokens:
Tiny thing, but could the local `line_no` be renamed back to `lineno`? It would make reviewing easier since the diff is smaller 😅
(Similarly, `line_no` elsewhere should maybe be `lineno` for consistency and compat.)
You're absolutely right. I was a bit too eager with the renaming here. I changed everything back to its original name.
babel/messages/extract.py
Outdated
jsx=options.get('jsx', True),
template_string=options.get('template_string', True),
Spurious changes, please revert?
Reverted
tests/messages/test_extract.py
Outdated
assert messages[0][2] == ('Hello, {name}!', None)
assert messages[0][2] == 'Foo Bar'
assert messages[0][3] == ['NOTE: First']
assert messages[1][2] == 'Foo Bar'
assert messages[1][3] == []
assert messages[2][2] == ('Hello, {name1} and {name2}!', None)
assert messages[1][2] == ('Hello, {name}!', None)
assert messages[1][3] == ['NOTE: First']
assert messages[2][2] == 'Heungsub'
assert messages[2][3] == ['NOTE: Second']
assert messages[3][2] == 'Heungsub'
assert messages[3][2] == 'Armin'
assert messages[3][3] == []
assert messages[4][2] == 'Armin'
assert messages[4][3] == []
assert messages[5][2] == ('Hello, {0} and {1}!', None)
assert messages[4][2] == ('Hello, {name1} and {name2}!', None, None)
assert messages[4][3] == ['NOTE: Second']
assert messages[5][2] == 'Heungsub'
assert messages[5][3] == ['NOTE: Third']
assert messages[6][2] == 'Heungsub'
assert messages[6][2] == 'Armin'
assert messages[6][3] == []
assert messages[7][2] == 'Armin'
assert messages[7][3] == []
assert messages[7][2] == ('Hello, {0} and {1}!', None, None)
assert messages[7][3] == ['NOTE: Third']
assert messages[8][2] == 'Person'
assert messages[8][3] == ['NOTE: Fourth']
assert messages[9][2] == ('Hello %(person)', None)
assert messages[9][3] == ['NOTE: Fourth']
assert messages[10][2] == 'Person 1'
assert messages[10][3] == []
assert messages[11][2] == 'Person 2'
assert messages[11][3] == []
assert messages[12][2] == ('Hello %(people)', None)
assert messages[12][3] == ['NOTE: Fifth']
Could this test be rewritten in a... less verbose way? Looks like it's only looking at indices 2 and 3 of each message, so maybe redo it as something like
assert [(m[2], m[3]) for m in messages] == [
    (..., ...),
    (..., ...),
    (..., ...),
    ...
]
?
I reckon it would be easy to generate the `...` segment by doing `assert [(m[2], m[3]) for m in messages] == 8` or similar and copy-pasting the complaint `pytest -vv` would inevitably throw :)
Very good point. I changed it like that, and it looks a lot cleaner.
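As a rough illustration of the suggested style, here is a sketch using made-up sample data in the `(lineno, funcname, message, comments)` shape that the test indexes into. The values and the helper name `summarize` are assumptions for the example and are not copied from the rewritten test.

```python
# Made-up extraction results: (lineno, funcname, message, comments)
messages = [
    (2, "_", "Foo Bar", ["NOTE: First"]),
    (4, "_", ("Hello, {name}!", None), []),
    (4, "_", "Heungsub", ["NOTE: Second"]),
]


def summarize(messages):
    """Keep only the message (index 2) and its translator comments (index 3)."""
    return [(m[2], m[3]) for m in messages]


# One list comparison replaces a pair of assertions per message
assert summarize(messages) == [
    ("Foo Bar", ["NOTE: First"]),
    (("Hello, {name}!", None), []),
    ("Heungsub", ["NOTE: Second"]),
]
```

When such a comparison fails, pytest prints a diff of the two lists, so an off-by-one shift in extraction order shows up in a single failure instead of a cascade of unrelated ones.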
I'm not a big fan of the token-based extractor getting even more complex. I'm thinking we might be able to replace the python extractor with a
So I did some investigation and an AST-based extractor cuts down the complexity quite a bit. However, it's about twice as slow compared to the current extractor. @akx Given the slowdown, is this something worth pursuing in your opinion?
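To give an idea of why an AST-based approach tends to be simpler, here is a minimal sketch built on the standard `ast` module. It is not the extractor that was investigated in this thread: the function name `extract_calls` and the default keywords are assumptions, and it ignores translator comments, plural forms, and keyword arguments.

```python
import ast


def extract_calls(source, keywords=("_", "gettext")):
    """Return (lineno, funcname, first string argument) for each matching call."""
    results = []

    class Visitor(ast.NodeVisitor):
        def visit_Call(self, node):
            func = node.func
            if isinstance(func, ast.Name) and func.id in keywords:
                strings = [
                    arg.value
                    for arg in node.args
                    if isinstance(arg, ast.Constant) and isinstance(arg.value, str)
                ]
                if strings:
                    results.append((node.lineno, func.id, strings[0]))
            # Keep descending so calls nested inside the arguments are found too
            self.generic_visit(node)

    Visitor().visit(ast.parse(source))
    return results
```

Nesting needs no explicit bookkeeping here because the tree already encodes it. The trade-offs are that the whole file has to be parsed first and that comments are not part of the AST, so translator comments would have to be recovered separately.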
friendly ping @akx :)
4df7e66 to 43fed4c
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #1127 +/- ##
==========================================
+ Coverage 91.37% 91.46% +0.08%
==========================================
Files 27 27
Lines 4672 4673 +1
==========================================
+ Hits 4269 4274 +5
+ Misses 403 399 -4
Currently the Python extractor does not support deeply nested gettext calls (deeper than as a direct argument to the top-level gettext call).

e.g.
```py
_("Hello %s", _("Person"))
_("Hello %s", random_function(", ".join([_("Person 1"), _("Person 2")])))
```

The extraction code was refactored quite a bit to simplify the flow and support this use-case.

Fixes python-babel#1125 (meanwhile also fixes python-babel#1123)
43fed4c to 54d6dd9
54d6dd9 to dc6908f
Currently the Javascript extractor does not support nested gettext calls at all. The extraction code was refactored a bit to resemble the Python code as much as possible and support this use-case.
1942e74 to 50be29e
Currently the Python extractor does not support deeply nested gettext calls (deeper than as a direct argument to the top-level gettext call), e.g.
    _("Hello %s", _("Person"))
    _("Hello %s", random_function(", ".join([_("Person 1"), _("Person 2")])))
The extraction code was refactored quite a bit to simplify the flow and support this use-case.
Currently the Javascript extractor does not support nested gettext calls at all.
The extraction code was refactored a bit to resemble the Python code as much as possible and support this use-case.
Fixes #1125 (meanwhile also fixes #1123)
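The general idea of tracking nested calls with a stack can be sketched with the standard `tokenize` module. This is a simplified illustration and not the code from this PR: the names `Frame` and `extract_nested` are hypothetical, only the first string of each call is kept, and translator comments, implicit string concatenation, and plural forms are ignored.

```python
import ast
import io
import tokenize
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:  # hypothetical per-parenthesis state
    name: Optional[str]  # gettext keyword, or None for any other "("
    lineno: int
    strings: List[str] = field(default_factory=list)


def extract_nested(source, keywords=("_", "gettext")):
    results = []
    stack = []
    last_name = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            last_name = tok.string
            continue
        if tok.type == tokenize.OP and tok.string == "(":
            # Push a frame for every "(" so each ")" pops the matching one
            name = last_name if last_name in keywords else None
            stack.append(Frame(name, tok.start[0]))
        elif tok.type == tokenize.OP and tok.string == ")":
            if stack:
                frame = stack.pop()
                if frame.name and frame.strings:
                    results.append((frame.lineno, frame.name, frame.strings[0]))
        elif tok.type == tokenize.STRING and stack and stack[-1].name:
            # Only strings directly inside a keyword call belong to that call
            stack[-1].strings.append(ast.literal_eval(tok.string))
        last_name = None
    return results
```

In this sketch a call is emitted when its closing parenthesis is reached, so inner calls appear before the call that contains them. That ordering is a property of the sketch only and says nothing about the order Babel uses.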