Support encodings other than UTF-8 in codespan-reporting #187

brendanzab · 2020-03-09T04:50:31Z

Some languages (like JavaScript and Haskell) use encodings like UTF-16 for their source files. It would be great if codespan-reporting supported use cases like this in a reasonably painless way.


Johann150 · 2021-01-29T15:21:27Z

Since Rust's string types are required to be valid UTF-8, I think it would be reasonable to require implementers to convert source code to UTF-8 before passing it to codespan-reporting, especially because this is trivially doable for UTF-16 using std::string::String::from_utf16. Otherwise, converting the ranges in the Labels would perform quite badly, because we would have to iterate through the original source for each index. Alternatively, we could keep track of all indices and convert them just before rendering, but that would introduce quite a bit of complexity compared to using UTF-8 text in both the library and the binary using it.

For other encodings it might be more difficult to convert to UTF-8 (e.g. the standard library might not support them, though a way to do it probably already exists elsewhere). But I think the added complexity for codespan (and probably for anything else working with it) would not be justified here.
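As a rough illustration of the caller-side conversion described above, here is a minimal sketch that decodes UTF-16 with String::from_utf16 and builds a lookup table in a single pass, so that ranges expressed in UTF-16 code units can be translated without rescanning the source for each index. The function name utf16_to_utf8_with_offsets is hypothetical and not part of codespan-reporting, and the sketch assumes the label ranges are meant to be byte offsets into the UTF-8 text.

```rust
use std::string::FromUtf16Error;

/// Hypothetical helper (not part of codespan-reporting): decode UTF-16 code
/// units into a UTF-8 `String` and return a table that maps every UTF-16
/// code unit index to the byte offset of the same character in the result.
/// The table has one extra entry so that an end-of-input index is valid too.
fn utf16_to_utf8_with_offsets(
    units: &[u16],
) -> Result<(String, Vec<usize>), FromUtf16Error> {
    // Fails on unpaired surrogates, i.e. input that is not valid UTF-16.
    let text = String::from_utf16(units)?;

    let mut offsets = Vec::with_capacity(units.len() + 1);
    for (byte_idx, ch) in text.char_indices() {
        // Characters outside the BMP take two UTF-16 code units (a surrogate
        // pair); both units map to the start of the same UTF-8 sequence.
        for _ in 0..ch.len_utf16() {
            offsets.push(byte_idx);
        }
    }
    // One-past-the-end index maps to the total byte length.
    offsets.push(text.len());

    Ok((text, offsets))
}
```

A range `start..end` given in UTF-16 code units would then become `offsets[start]..offsets[end]` in bytes. The table costs one `usize` per code unit, which trades memory for avoiding the per-index scan mentioned above.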

brendanzab mentioned this issue Jun 17, 2020

support rendering of errors if a file is only partially encoded in UTF-8 #246

Open
