
Support encodings other than UTF-8 in codespan-reporting #187

Open
brendanzab opened this issue Mar 9, 2020 · 1 comment

Comments

@brendanzab
Owner

Some languages (like JavaScript and Haskell) use encodings like UTF-16 for their source files. It would be great if codespan-reporting supported use cases like this in a fairly painless way.

@Johann150
Collaborator

Since Rust's string types are required to be valid UTF-8, I think it would be reasonable to require implementers to convert source code to UTF-8 before passing it to codespan-reporting, especially since this is trivial for UTF-16 using std::string::String::from_utf16. Otherwise, converting the ranges in the Labels would perform quite badly, because we would have to iterate through the original source for each index. Alternatively, we could keep track of all indices and convert them just before rendering, but that would add quite a bit of complexity compared to using UTF-8 text in both the library and the binary using it.
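As a rough sketch of what the caller-side conversion could look like: the snippet below decodes UTF-16 with `String::from_utf16` and, in the same pass, builds a table from UTF-16 code-unit offsets to UTF-8 byte offsets, so label ranges can be translated without rescanning the source for each index. The function names `utf16_to_utf8_with_offsets` and `convert_range` are hypothetical helpers made up for illustration, not part of codespan-reporting.

```rust
use std::ops::Range;
use std::string::FromUtf16Error;

/// Hypothetical helper: decode UTF-16 source text and build a table that maps
/// every UTF-16 code-unit offset to a byte offset in the resulting UTF-8 string.
fn utf16_to_utf8_with_offsets(units: &[u16]) -> Result<(String, Vec<usize>), FromUtf16Error> {
    let text = String::from_utf16(units)?;
    let mut offsets = Vec::with_capacity(units.len() + 1);
    for (byte_idx, ch) in text.char_indices() {
        // Characters outside the BMP occupy two UTF-16 code units;
        // both units map to the start of the same UTF-8 sequence.
        for _ in 0..ch.len_utf16() {
            offsets.push(byte_idx);
        }
    }
    // One extra entry so that an end-of-file offset can be converted too.
    offsets.push(text.len());
    Ok((text, offsets))
}

/// Hypothetical helper: translate a range of UTF-16 code-unit offsets into a
/// UTF-8 byte range using the table built above. Panics if out of bounds.
fn convert_range(offsets: &[usize], range: Range<usize>) -> Range<usize> {
    offsets[range.start]..offsets[range.end]
}
```

The table costs one `usize` per code unit, which trades memory for constant-time lookups; the converted byte ranges are then what would be passed on when building labels.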

For other encodings, converting to UTF-8 might be more difficult (e.g. the standard library might not support it, though a way to do it probably already exists). But I don't think the added complexity for codespan (and probably for anything else working with it) would be justified here.
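To illustrate with one encoding the standard library has no dedicated decoder for: ISO-8859-1 (Latin-1) bytes map one-to-one onto the first 256 Unicode code points, so a conversion can be sketched by hand. The name `latin1_to_utf8` is a hypothetical helper, and this shortcut only works for Latin-1; other legacy encodings would most likely need a third-party decoding crate.

```rust
/// Hypothetical helper: convert ISO-8859-1 (Latin-1) bytes to a UTF-8 `String`.
/// Each Latin-1 byte value equals its Unicode code point, so `u8 as char`
/// is a lossless conversion here.
fn latin1_to_utf8(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}
```

Note that bytes at or above 0x80 become two bytes in UTF-8, so any offsets into the original data would need the same kind of remapping as in the UTF-16 case.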
