This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Correctly detect the encoding #4160

Closed
alexandernst opened this issue Nov 12, 2014 · 10 comments

Comments

@alexandernst

Atom should try its best to guess the encoding when opening a file and, only if that fails, fall back to the default encoding option (which was implemented a few days ago).

Missing libraries (that would presumably help with this issue) shouldn't be a factor here. If no existing library can be used, one should be created from scratch, or the closest candidate should be patched to get the job done with as little work as possible.

The main issue here is that Atom will try to open files in UTF8 (or the default encoding, if set), which will potentially break any non-UTF8 files. And that, imho, is not an acceptable/valid behavior for any editor.

This should be tagged as 1.0 blocker.

Also, maybe this could help:

@bizoo

bizoo commented Dec 23, 2014

IMHO, a highly desirable feature.
As in SublimeText, automatic detection of a BOM for UTF files, or of UTF-8 without a BOM (by trying to decode the file as UTF-8), is enough.
If it's not a UTF file, open it with the default encoding.

IMHO, encoding-detector libraries lead to unpredictable results that are worse than the current behavior.
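A minimal sketch of the BOM, then UTF-8, then default approach described above. It is written in Python purely for illustration and is not Atom's or Sublime's code; the names `guess_encoding` and `read_text` and the `cp1252` default are assumptions made for the example.

```python
import codecs

# BOMs are checked longest-first so a UTF-32 LE BOM (FF FE 00 00)
# is not mistaken for a UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, default: str = "cp1252") -> str:
    """Return a codec name: BOM match, else UTF-8 if it decodes, else default."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")  # strict decode: any invalid sequence raises
        return "utf-8"
    except UnicodeDecodeError:
        return default  # hypothetical stand-in for the user's default setting


def read_text(data: bytes, default: str = "cp1252") -> str:
    """Decode with the guessed codec; the BOM-aware codecs strip the BOM."""
    return data.decode(guess_encoding(data, default), errors="replace")
```

Note that a UTF-16 file saved without a BOM is not recognized by this scheme: it either happens to decode as UTF-8 (with embedded NUL characters) or falls through to the default encoding.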

@benogle
Contributor

benogle commented Jul 1, 2015

We could add an option to allow for auto-detection as the default as described in atom/encoding-selector#24

@benogle
Contributor

benogle commented Jul 9, 2015

This would also be helpful for find/replace: atom/scandal#26

We'd need to bundle libicu, which looks to be about 18 MB.

@rugk

rugk commented Aug 9, 2015

Also note my problems with ANSI encoding. They should fit in this category.

@rogeriopradoj

Same problem for me here. When Microsoft SQL Server generates scripts from a database, it creates them in Unicode, but as UTF-16.

When Sublime tries to open one as UTF-8, it looks like this:

opera��o

... what should be:

operação

It is the same if I try to create files in ANSI.
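The ANSI case can be reproduced with a short Python sketch, assuming "ANSI" here means windows-1252 (`cp1252` in Python) and that the editor substitutes U+FFFD for bytes that are not valid UTF-8. The helper name `open_as_utf8` is made up for the example.

```python
def open_as_utf8(data: bytes) -> str:
    """Decode as UTF-8, replacing each invalid sequence with U+FFFD."""
    return data.decode("utf-8", errors="replace")


# "operação" saved as windows-1252: ç is byte 0xE7 and ã is byte 0xE3.
ansi_bytes = "operação".encode("cp1252")

# Neither 0xE7 nor 0xE3 is followed by a valid UTF-8 continuation byte,
# so each one turns into a replacement character.
print(open_as_utf8(ansi_bytes))  # opera��o
```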

@andresmendes

  • I use files in UTF-8 and windows-1252.
  • My default character set encoding is UTF-8.
  • I use the auto-encoding package to change the encoding automatically.

When I open a windows-1252 file with the words "aço" and "papelão" Atom changes to the right encoding (windows-1252) and the result is perfect:

aço papelão

But when I open a windows-1252 file with the word "operação" (similar to what @rogeriopradoj did) Atom changes the encoding to windows1251 and exhibits:

operaзгo

When the word is "operações" Atom changes the encoding to ibm855 and exhibits:

operaушes

Two special characters side by side (like çã or çõ) seem to confuse the automatic encoding detection.
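The garbled words above are exactly what the same bytes mean in the wrongly guessed code pages, which a short Python sketch can show. The helper name `misread` is made up for the example; `cp1252`, `cp1251` and `cp855` are Python's codec names for windows-1252, windows-1251 and ibm855.

```python
def misread(text: str, actual: str, guessed: str) -> str:
    """Encode text in its real encoding, then decode it with the guessed one."""
    return text.encode(actual).decode(guessed)


# ç and ã are bytes 0xE7 and 0xE3 in windows-1252; in windows-1251
# those same bytes are the Cyrillic letters з and г.
print(misread("operação", "cp1252", "cp1251"))   # operaзгo

# ç and õ are bytes 0xE7 and 0xF5; in ibm855 they are у and ш.
print(misread("operações", "cp1252", "cp855"))   # operaушes
```

Every byte involved is valid in all three single-byte encodings, so a detector has only letter statistics to go on, and a word with just two non-ASCII bytes gives it very little evidence.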

@LeonBlade

I might as well leave this here: I had a problem saving a UTF-16 LE file. The save defaulted to binary encoding, and the file wouldn't load properly afterwards. I had to re-save the file with another text editor just to fix it.

@4llan

4llan commented Apr 18, 2016

Sometimes I have to work with files with Western (ISO-8859-1) encoding.

  • Open the file: Atom selects UTF-8 by default (� everywhere)
  • "Auto Detect" changes the encoding to Windows 1251 (words like inúmeras and básico become inъmeras and bбsico)

It would be nice if Atom had smart encoding detection (I never had this problem in Sublime) and if the default encoding could be set to "Auto Detect" (atom/encoding-selector#24).

@damieng
Contributor

damieng commented Apr 23, 2016

Tracking this feature request under the package responsible for it: atom/encoding-selector#24

@damieng damieng closed this as completed Apr 23, 2016
@lock

lock bot commented Apr 9, 2018

This issue has been automatically locked since there has not been any recent activity after it was closed. If you can still reproduce this issue in Safe Mode then please open a new issue and fill out the entire issue template to ensure that we have enough information to address your issue. Thanks!

@lock lock bot locked and limited conversation to collaborators Apr 9, 2018
10 participants