Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Markdown table fixes #601

Merged
merged 7 commits into from
May 30, 2024
Merged

Markdown table fixes #601

merged 7 commits into from
May 30, 2024

Conversation

naktinis
Copy link
Contributor

@naktinis naktinis commented May 17, 2024

Contains the following fixes:

  • do not add new lines in markdown cells
  • markdown tables can only have one header
  • add a space after text before a paragraph in a cell
  • match maximum cell count on each row
  • cells always need to append vertical bars

Partly fixes #599.

Paragraph is a block level element so we would normally have a new line before
it. However, here we are in a markdown cell and can't have new lines, so add a
space. Otherwise words will get concatenated.
Markdown does not support colspan, but at least this way we don't lose any cell
data.
Currently there was a scenario where if a cell only contains a single <p> with
some text but no text directly, then vertical bars would not get appended for
that cells.
@adbar
Copy link
Owner

adbar commented May 21, 2024

@naktinis Thanks for the PR, the code looks good but tests fails in test_table_processing() (part of unit_tests.py).

Could you please have a look, change the code (or the test if it doesn't work as intended), and add tests there for your PR if necessary?

Added the following table tests in text format:
- removing new lines in cells,
- only allowing a single header row,
- handling colspan by appending columns.
@naktinis
Copy link
Contributor Author

@adbar I believe I have now fixed the tests and added three more relating to these changes.

Copy link

codecov bot commented May 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.94%. Comparing base (1ce0e76) to head (91f7a78).
Report is 4 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #601   +/-   ##
=======================================
  Coverage   97.94%   97.94%           
=======================================
  Files          21       21           
  Lines        3498     3508   +10     
=======================================
+ Hits         3426     3436   +10     
  Misses         72       72           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@adbar
Copy link
Owner

adbar commented May 24, 2024

@naktinis Thanks, it's much better now!

I have a few questions:

  • You're introducing a span attribute because otherwise it would be difficult to keep track of the total number of columns, right?
  • Why does it yield <rowspan="2"> (without space) and <row span="7"> in the tests? I would suggest to remove the attribute from the output once it has been used.

@naktinis
Copy link
Contributor Author

@adbar

You're introducing a span attribute because otherwise it would be difficult to keep track of the total number of columns, right?

Yes, so that each row is aware of any other rows having more columns than it does itself.

Why does it yield <rowspan="2"> (without space) and in the tests? I would suggest to remove the attribute from the output once it has been used.

Because the previous line (that was already there before my changes) removes spaces:

result = processed.replace('\n', '').replace(' ', '')

@adbar
Copy link
Owner

adbar commented May 27, 2024

Then I'd be in favor of removing the attribute from the output:

  • somewhere after for subelement in table_elem.iterdescendants():
  • del newrow.attrib["span"] (if it's present)

Could you please implement the change (or a similar one) and update the tests?

@naktinis
Copy link
Contributor Author

@adbar row attribute cleanup should now be done (also updated the tests).

@adbar adbar merged commit bbf7bec into adbar:master May 30, 2024
15 checks passed
@adbar
Copy link
Owner

adbar commented May 30, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Table markdown syntax incorrect in some cases
2 participants