Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(treesitter): get_node_text() newline handling consistency #28841

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dmehala
Copy link

@dmehala dmehala commented May 18, 2024

Description

This pull request addresses the issue where get_node_text() behaves inconsistently between buffer and string sources.

Resolves #28677

Changes

  • Modified buf_range_get_text to insert the trailing newline char.

Rationale

My understanding is taht the range represents the part of the source to be captured. When set on a grid:

  0 1 2 3 4 5 6 7
0 " " " t e s t \n
1 " " "

With the range being Range = { 0, 3, 1, 0 }. I would then expect \n to be part of the output. Hence, I modified the behavior of the buffer source.

nvim_buf_get_text replaces \n with \0, and they are artificially recreated with table.concat(lines, "\n"). This function does not add the newline if there's only one input or if the last input is a newline. I added logic to remember if the part of the buffer to release ends with a newline to add it later.

Note to reviewers

Please feel free to correct me if my understanding is incorrect. I would be happy to discuss it further.

- Modified `buf_range_get_text` to insert the trailing newline char.
@wookayin
Copy link
Member

I think the fix makes sense, but please review failing tests and fix accordingly.

@cshuaimin
Copy link

I don't understand what the if end_col == 0 condition is doing. Seems it's the same logic as in ts_utils.get_vim_range() function? Can anyone explain it to me?

Comment on lines 191 to 199
if end_col == 0 then
if start_row == end_row then
start_col = -1
start_row = start_row - 1
end
end_col = -1
end_row = end_row - 1
insert_nl = true
end
Copy link
Member

@wookayin wookayin May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we could also get rid of this if end_col == 0 then ... end block completely to achieve the same behavior, and use directly what nvim_buf_get_text() returns. But we should make sure the behavior is exactly the same for every possible path w.r.t {start_end}_{row,col}.

That said, one might wonder why get_node_text was designed in the current way. It was introduced in 58bbc2e --- but it's not clear why or if it was deliberate. Maybe to match the behavior as the initial implementation 1f3c059 getting text with rows only (i.e. using nvim_buf_get_lines).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ack the message. I will have a deeper look this week. Thanks for the review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

get_node_text() behaves differently between buffer and string sources on EOL char
3 participants