
seanliang / converttoutf8


A Sublime Text 2 & 3 plugin for editing and saving files encoded in GBK, BIG5, EUC-KR, EUC-JP, Shift_JIS, etc.

Home Page: https://github.com/seanliang/ConvertToUTF8

License: MIT License

Python 100.00%

converttoutf8's Introduction

Description (for the Chinese version, see README.zh_CN.md)

With this plugin, you can edit and save files whose encodings are not currently supported by Sublime Text, especially those used by CJK users, such as GB2312, GBK, BIG5, EUC-KR and EUC-JP. ConvertToUTF8 supports both Sublime Text 2 and 3.

ConvertToUTF8

If you want to support this plugin, you can donate via Alipay or WeChat. Thanks! :)


Note

** Windows 7 (Sublime Text 3): When Windows DPI scaling is set higher than 100%, file names might not be displayed correctly. In that case, try adding "dpi_scale": 1 to the User Settings of Sublime Text.

** Linux (Sublime Text 2 & 3) and OSX (Sublime Text 3): You will need to install an extra plugin to make ConvertToUTF8 work properly: Codecs26 for Sublime Text 2 or Codecs33 for Sublime Text 3.

Installation

The recommended way to install ConvertToUTF8 is through Package Control, which can find, install and upgrade the plug-in.

Otherwise, you can download this repository as a zip file, unzip it, rename the new folder to ConvertToUTF8, and move it into the Packages folder of Sublime Text (you can find the Packages folder through the "Preferences > Browse Packages" menu entry in Sublime Text).

Your folder hierarchy should look like this:

(Image: folder hierarchy)

Configuration

See the ConvertToUTF8.sublime-settings file for details. Save your personal settings in a file named "ConvertToUTF8.sublime-settings" under the "User" folder. Project-specific settings (all except encoding_list and max_cache_size) can be set in the .sublime-project file, which can be opened via the "Project > Edit Project" menu.

  • encoding_list: list of encodings offered for selection when detection fails
  • reset_diff_markers: reset diff markers after converting (default: true)
  • max_cache_size: maximum encoding cache size, 0 means no cache (default: 100)
  • max_detect_lines: maximum detection lines, 0 means unlimited (default: 600)
  • preview_action: convert the file's content to UTF-8 when previewing it (default: false)
  • default_encoding_on_create: specifies the default encoding for newly created files (such as "GBK"); an empty value means using Sublime Text's "default_encoding" setting (default: "")
  • convert_on_load: convert the file's content to UTF-8 when it is loaded (default: true)
  • convert_on_save: convert the file's content from UTF-8 to its original (or specific) encoding when it is saved (default: true)
  • convert_on_find: convert the text in Find Results view to UTF-8 (default: false)
  • lazy_reload: save file to a temporary location, and reload it in background when switching to other windows or tabs (default: false)
  • confidence: the minimum detection confidence at which conversion is performed automatically (default: 0.95)
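
How confidence and encoding_list relate can be pictured with a small sketch. This is not the plugin's code: choose_encoding and the (encoding, confidence) pairs are hypothetical names used only for illustration, and the assumption is that a detector yields one confidence value per candidate encoding.

```python
def choose_encoding(candidates, threshold=0.95):
    """Return the best candidate encoding, or None if the user should choose.

    candidates: list of (encoding_name, confidence) pairs, confidence in [0, 1].
    threshold:  plays the role of the "confidence" setting (hypothetical mapping).
    """
    if not candidates:
        return None
    # Pick the candidate with the highest confidence.
    encoding, confidence = max(candidates, key=lambda pair: pair[1])
    # Below the threshold, give up on automatic conversion.
    return encoding if confidence >= threshold else None
```

When the function returns None, an editor would typically fall back to showing a selection list such as encoding_list.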

Usage

In most cases, this plug-in will take care of encoding issues automatically.

You can also use the "File > Set File Encoding to" menu entry to transform between different encodings. For example, you can open a UTF-8 file, and save it to GBK, and vice versa.
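
Outside Sublime Text, the same kind of transformation can be sketched with Python's standard library. This is only an illustration of what converting between encodings means, not what the plugin does internally; transcode_file is a hypothetical helper name.

```python
from pathlib import Path


def transcode_file(path, src_encoding, dst_encoding):
    """Rewrite a file in place from src_encoding to dst_encoding.

    Bytes are read and written directly so that line endings are preserved.
    Decoding and encoding are strict: a character that cannot be represented
    raises an exception before the file is overwritten.
    """
    text = Path(path).read_bytes().decode(src_encoding)
    Path(path).write_bytes(text.encode(dst_encoding))
```

For example, transcode_file("notes.txt", "utf-8", "gbk") rewrites a UTF-8 file as GBK, and swapping the last two arguments converts it back.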

Note:

  • if convert_on_save is set to false, the file will NEVER be saved to the selected encoding
  • please do not edit the file before the encoding detection process is finished
  • if the detection result is not accurate, please try either increasing the value of max_detect_lines or setting the encoding manually
  • due to an API limitation, when lazy_reload is set to true, quitting Sublime Text immediately after saving a file will cause the file to be saved as UTF-8; the correct content will be reloaded the next time Sublime Text starts

Q & A

  • Q: It is not working after installation, how do I fix it?

    A: Please try the following steps:

    1. Restart Sublime Text
    2. Make sure the plug-in folder is named "ConvertToUTF8" (skip this step if you install via "Package Control")
    3. See Note section above
    4. Disable other encoding related plug-ins
    5. Contact me
  • Q: Which encodings are supported?

    A: Any encoding supported by Python works. Encodings that Python does not support, such as EUC-TW, are not supported.
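
    A quick way to check whether a given Python interpreter knows an encoding is to look it up in the codec registry. The sketch below uses a regular Python installation; the Python embedded in Sublime Text may ship with fewer codecs, which is what the Note section above is about. is_supported is a hypothetical helper name.

```python
import codecs


def is_supported(encoding):
    """Return True if this Python interpreter has a codec for the encoding."""
    try:
        codecs.lookup(encoding)
    except LookupError:
        return False
    return True
```

    With a standard CPython build, is_supported("gbk") and is_supported("euc-kr") are true, while is_supported("euc-tw") is false.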

  • Q: Why does the content become a mess when the window is re-activated?

    A: This is caused by reloading and has been fixed. Please update ConvertToUTF8 to the latest version.

  • Q: Why does ST2 tell me that the file "Has changed on disk. Do you want to reload it?" when the window is re-activated?

    A: Same reason as above. Please choose "Cancel" if you have unsaved changes to the file.

  • Q: When saving the file, Sublime Text tells me the file is saved as UTF-8, why?

    A: Don't worry, the plug-in will convert your file to original encoding.

  • Q: My file was saved as UTF-8 and it's in a mess, how can I recover it?

    A: Please open the file and make sure its encoding is UTF-8, then choose the menu entry "File > Save with Encoding > Western (Windows 1252)", close and reopen this file.
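
    Why this procedure helps can be sketched in Python, under the assumption (implied by the answer but not stated) that the original bytes were read as Windows-1252 and then written out as UTF-8. restore_original_bytes is a hypothetical helper name. Note that Python's cp1252 codec leaves a few byte values undefined, so this sketch cannot restore every file.

```python
def restore_original_bytes(damaged_utf8, misread_as="cp1252"):
    """Undo one round of 'read with the wrong encoding, save as UTF-8'.

    damaged_utf8: the bytes of the file that was wrongly saved as UTF-8.
    misread_as:   the encoding the original bytes were mistaken for.
    Returns the original bytes, ready to be decoded with the real encoding.
    """
    # Step 1: read the damaged file as UTF-8, which yields the garbled text.
    garbled_text = damaged_utf8.decode("utf-8")
    # Step 2: encode the garbled text with the wrong encoding; each garbled
    # character maps back to the single original byte it came from.
    return garbled_text.encode(misread_as)
```

    Decoding the returned bytes with the file's real encoding (GBK, for example) then yields the intended text.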

Contact me

Please send me your questions or suggestions: sunlxy (at) yahoo.com or http://weibo.com/seanliang

converttoutf8's People

Contributors

allxiao, demon386, edwingeng, fichtefoll, fygul, gh640, hashy, michaelhl, nullnull, seanliang, yoonian


converttoutf8's Issues

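View position is not restored after saving

In a long file, the view jumps to the top of the file after saving. The cursor position is unchanged, because moving the cursor brings the view back to where it was. Is adjust_view perhaps not being called after saving?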

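Opening a file in ST3 from the right-click menu does not trigger the on_load event, so the file is not converted automatically

I use the following method, which circulates online, to open files in ST3 from the right-click menu: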

hkcr,"\shell\SublimeText3",,,"用 SublimeText3 打开"
hkcr,"\shell\SublimeText3\command",,,"""D:\Program Files (x86)\Sublime Text 3\sublime_text.exe"" ""%%1"" %%*"
hkcr,"Directory\shell\SublimeText3",,,"用 SublimeText3 打开"
hkcr,"Directory\shell\SublimeText3\command",,,"""D:\Program Files (x86)\Sublime Text 3\sublime_text.exe"" ""%%1"""

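However, files opened this way are not converted automatically, probably because the on_load event is not triggered.

Content still turns into garbled text when the window is re-activated

I already have the latest version of the plugin (installed today). After modifying a GBK file and saving it, I switch to another window (a browser, for example) and then back to ST2, and the file has turned into garbled text.

If I only open the file without modifying or saving it, switching windows does not cause the problem.

My ST2 is the newly released stable version.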

convertToUTF8 and 'goto anything'

With the ConvertToUTF8 package installed,

when I open the 'Goto Anything' overlay and move the cursor among (non-UTF-8) files,

the focused file is opened immediately and a tab is created.

The expected 'Goto Anything' behavior is:

the file should not be opened, only previewed, until the Enter key is pressed.

So I changed the on_load listener to on_activated in the .py file,

and then it works correctly, but the preview does not.

I think this is only a temporary workaround and not a good approach.

Sorry, my English is terrible.

I hope you can understand my messy sentences.

Thanks for the nice plugin.

Error Messages of BIG5-HKSCS

Due to the limitation of embedded Python with Sublime Text, ConvertToUTF8 might not work properly.

You have to install an extra plugin to solve this problem, please kindly send the debug information to sunlxy#yahoo.com to get it:
====== Debug Information ======
Version: 3047
Platform: osx
Arch: x64
Path: ['/Applications/Sublime Text.app/Contents/MacOS', '/Applications/Sublime Text.app/Contents/MacOS/python3.3.zip', '/Users/mac1/Library/Application Support/Sublime Text 3/Packages']
Encoding: BIG5-HKSCS

Caught AttributeError when opening the first file

After disabling and enabling the plugin, when I opened some file, I got an error in console:

Traceback (most recent call last):
  File "C:\Users\名字\AppData\Roaming\Sublime Text 3\Packages\ConvertToUTF8\ConvertToUTF8.py", line 50, in save_on_dirty
    sublime.set_timeout(self.save_on_dirty, 10000)
AttributeError: 'NoneType' object has no attribute 'set_timeout'

After that the plugin seems stuck and does not convert any file until I disable and enable it again.

Could you guide me to solve that?

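Undo has to be pressed twice to return to the unmodified state?

Open a file that gets converted (GBK, for example). Suppose I type one character and press undo once: the file is still marked as modified. I have to press undo once more. Is this normal? If the problem is caused by the plugin, is there any chance of fixing it?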

It failed to convert from EUC-KR to UTF8 on sublime text 2.

I get an "'EUC-KR' is not supported" message in a popup, and the following messages are shown in the editor.

Currently I'm using Sublime Text 2 'Unregistered' Version 2.0.1 Build 2217.

Due to the limitation of embedded Python with Sublime Text, ConvertToUTF8 might not work properly.

You have to install an extra plugin to solve this problem, please kindly send the debug information to sunlxy#yahoo.com to get it:
====== Debug Information ======
Version: 2217
Platform: linux
Arch: x32
Path: ['/opt/sublime_text2/lib/python26.zip', '/opt/sublime_text2/lib/python2.6', '/opt/sublime_text2/lib/python2.6/plat-linux2', '/opt/sublime_text2/lib/python2.6/lib-tk', '/opt/sublime_text2/lib/python2.6/lib-old', '/opt/sublime_text2/lib/python2.6/lib-dynload', '.', u'/home/jhkang/.config/sublime-text-2/Packages/Package Control/lib/all']
Encoding: EUC-KR

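Ctrl+R causes garbled text

I have an unmodified GBK file.
Pressing Ctrl+R to bring up the function list turns it into garbled text.
Using the shortcut Ctrl+P and then typing @ does not.

What is the reason?

ConvertToUTF8 error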

Due to the limitation of embedded Python with Sublime Text, ConvertToUTF8 might not work properly.

You have to install an extra plugin to solve this problem, please kindly send the debug information to sunlxy#yahoo.com to get it:
====== Debug Information ======
Version: 2221
Platform: linux
Arch: x64
Path: ['/usr/local/Sublime Text 2/lib/python26.zip', '/usr/local/Sublime Text 2/lib/python2.6', '/usr/local/Sublime Text 2/lib/python2.6/plat-linux2', '/usr/local/Sublime Text 2/lib/python2.6/lib-tk', '/usr/local/Sublime Text 2/lib/python2.6/lib-old', '/usr/local/Sublime Text 2/lib/python2.6/lib-dynload', '.', u'/home/jash/.config/sublime-text-2/Packages/Codecs26/lib', u'/home/jash/.config/sublime-text-2/Packages/Package Control/lib/all']
Encoding: GBK

choose default encoding for files created in Windows

Every time I create an empty txt file in Windows (by right-clicking an empty area in File Explorer and selecting "new", then "txt file") and open it in Sublime, Sublime always pops up an option panel asking me to select the encoding of this file (from GBK to UTF8).

Can I set a default so that I can avoid the pop-up menu? If that is impossible, could you please add this feature?

Thx!

For CudaText

Hi, would it be possible to provide this encoding/decoding for CudaText [uvviewsoft.com]?

Encoding auto-detection is wrong for files with the .json extension

OSX 10.9.3
SublimeText 2.0.2
ConvertToUTF8 1.2.5

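I found that when a .json file saved as UTF-8 without a BOM contains Chinese characters (within the first ten lines), it is treated as GBK during auto-detection when opened, and the conversion then produces garbled text. Can you confirm whether this is really a bug?
With a BOM there is no problem, but then JSON parsing elsewhere breaks.

Thanks for your great tool~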
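
One common heuristic for this situation, sketched here as an assumption and not as the plugin's actual detection logic, is to try a strict UTF-8 decode before falling back to another encoding, since text in a legacy CJK encoding is rarely valid UTF-8. decode_prefer_utf8 and its fallback parameter are hypothetical names.

```python
import codecs


def decode_prefer_utf8(data, fallback="gbk"):
    """Decode bytes as UTF-8 when they are valid UTF-8, else use the fallback.

    Returns a (text, encoding_used) pair.
    """
    # A UTF-8 BOM settles the question; strip it from the decoded text.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode(fallback), fallback
```

The heuristic is not foolproof: short byte sequences in a legacy encoding can occasionally be valid UTF-8 as well.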

Find in Files results are garbled

When using folder search (Find in Files), the search results are garbled.
Can this be fixed?

Sublime Text 3 Build 3049
ConvertToUTF8 1.2.6

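Add project-specific settings

Currently settings can only be global or per-user. When different projects (.sublime-project) need different encodings, there is no way to configure this.

Compatibility with GitGutter

ConvertToUTF8 is currently incompatible with GitGutter in GBK files.

I submitted a PR to GitGutter: jisaacks/GitGutter#202

If you have time, could you check whether the change is correct? Preliminary testing on my own machine shows no problems.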

How to keep "illegal multibyte sequence" (wrong encode character) when save

Dear seanliang,

I have a source file in SJIS encoding that contains some wrongly encoded characters (when saving the file, ConvertToUTF8 reports: illegal multibyte sequence), and ConvertToUTF8 removes them from the source. How can I keep them when saving?
I tried Sakura Editor and it did keep them, but I love Sublime Text.

Thanks and best regards,
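
To find where such bytes are before editing a file, a strict decode in plain Python reports the first offending span. This is only a diagnostic sketch run outside Sublime Text, not a way to make the plugin keep the bytes; first_invalid_span is a hypothetical helper name.

```python
def first_invalid_span(data, encoding="shift_jis"):
    """Return (start, end) byte offsets of the first undecodable span, or None.

    The offsets come from the UnicodeDecodeError raised by a strict decode.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as error:
        return (error.start, error.end)
    return None
```

Calling it on the raw bytes of a file, for example first_invalid_span(open("source.c", "rb").read()), shows whether the file contains any sequence that the codec rejects.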

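Question about a console message

Thanks for providing this powerful plugin. Conversion works correctly when I use it and the status bar shows the right encoding, but the console shows the following message. What causes it?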

Unable to auto detect encoding, using fallback encoding Western (Windows 1252)

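Cannot save a GBK file directly as UTF-8 on build 2217

On build 2217 for Mac, after opening a GBK file and setting the encoding to GBK, I choose to save it as UTF-8. The file is still saved in GBK. Only when I choose UTF-8 with BOM is it saved as UTF-8 with BOM.

If I then reopen the UTF-8 with BOM file, I can save it as UTF-8.

ConvertToUTF8 does not work on 64-bit CentOS; installing Codecs33 x64 has no effect

Message: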
====== Debug Information ======
Version: 3065-x64
Platform: Linux-2.6.32-358.el6.x86_64-x86_64-with-centos-6.4-Final
Path: ['/home/my/sublime_text_3', '/home/my/sublime_text_3/python3.3.zip', '/home/my/sublime_text_3/Data/Packages', '/home/my/sublime_text_3/Data/Packages/Codecs33/lib']
Encoding: GB2312

Batch jobs

Dear seanliang,
I have over a thousand files that need to be converted to EUC-JP.
How can I do that?
Does this plugin support converting multiple files?
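
For a one-off batch job like this, a short standalone Python script may be simpler than an editor plugin. The sketch below assumes all matching files share one known source encoding; convert_tree and its parameters are hypothetical names, and it should be tried on a copy of the files first.

```python
from pathlib import Path


def convert_tree(root, pattern, src_encoding, dst_encoding="euc_jp"):
    """Convert every file under root matching pattern to dst_encoding.

    Decoding and encoding are strict, and both happen before the file is
    opened for writing, so a file that fails raises an exception and is
    left untouched. Returns the list of converted paths.
    """
    converted = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        text = path.read_bytes().decode(src_encoding)
        path.write_bytes(text.encode(dst_encoding))
        converted.append(path)
    return converted
```

For example, convert_tree("src", "*.txt", "utf-8") rewrites every .txt file under src as EUC-JP.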

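Cursor position is inaccurate after conversion

Fix:

In the ConvertToUtf8Command class in ConvertToUTF8.py,

record the line number of the current selection at the start of the run method: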

class ConvertToUtf8Command(sublime_plugin.TextCommand):
    def run(self, edit, encoding=None, stamp=None, detect_on_fail=False):
        view = self.view

        pt = (0, 0)
        for region in view.sel():
            pt = view.rowcol(region.a)
            break

and jump back to the recorded line number at the end of the run method:

        encoding_cache.set(file_name, encoding)
        contents = contents.replace('\r\n', '\n').replace('\r', '\n')
        regions = sublime.Region(0, view.size())
        #sel = view.sel()
        #rs = [x for x in sel]
        vp = view.viewport_position()
        view.set_viewport_position((0, 0), False)
        view.replace(edit, regions, contents)
        #sel.clear()
        #for x in rs:
        #   sel.add(sublime.Region(x.a, x.b))
        view.set_viewport_position(vp, False)
        stamps[file_name] = stamp
        sublime.status_message('{0} -> UTF8'.format(encoding))

        view.run_command("goto_line", {"line": pt[0] + 1} )

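A file contains two encodings and cannot be switched freely

I have a PHP file that someone originally wrote in GBK and someone else later edited in UTF-8. It contains Chinese text,
so GBK and UTF-8 Chinese are mixed together, which makes it very inconvenient to read. In Notepad++ I can switch back and forth between encodings without any trouble.

ConvertToUTF8 cannot switch multiple times. Looking forward to an update.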
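
A file like this can sometimes be repaired outside the editor by decoding it line by line. The sketch below assumes each line is entirely in one encoding and tries UTF-8 before GBK; decode_mixed_lines is a hypothetical helper name, and this is not something the plugin does.

```python
def decode_mixed_lines(data, encodings=("utf-8", "gbk")):
    """Decode bytes line by line, trying each encoding in order per line.

    A line that no encoding accepts is decoded with the first encoding and
    its bad bytes are replaced with U+FFFD, so nothing is silently dropped.
    """
    lines = []
    for raw in data.splitlines(keepends=True):
        for encoding in encodings:
            try:
                lines.append(raw.decode(encoding))
                break
            except UnicodeDecodeError:
                continue
        else:
            lines.append(raw.decode(encodings[0], errors="replace"))
    return "".join(lines)
```

The result can then be saved in a single encoding. Lines whose GBK bytes happen to be valid UTF-8 would be misread, so the output should be reviewed.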

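The preview_action option has no effect in ST3

In ST2, if this option is false, a file is not converted while being previewed, until a tab is opened for it.
In ST3, when a file needs conversion, preview mode cannot be entered regardless of whether the option is true or false: a single click on the file in the sidebar opens a new tab and converts the file automatically.
I mainly use this to check encodings, because I sometimes miss the information in the status bar. If the preview shows garbled text, the problem is easier to spot.
PS: The option still seems to work in Goto Anything.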

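Version 1.2.3 saves GBK-encoded files as UTF-8

First of all, thank you very much for your plugin. Without it I could not use ST at all!

The bug is as follows:

I am on Mac 10.9 with ST2. With 1.2.2 there were no problems at all.

After the automatic upgrade to 1.2.3, I edited several files in the GB2312 (GBK) character set today. The files were detected and displayed correctly, but on save they were not written back in their original encoding as before; they were saved as UTF-8 instead.

I manually downloaded 1.2.2 and the problem went away.

Please look into this when you have time. If you need any further information or testing, please contact me.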

Codecs33 is installed but the error still appears

Error message

Oops! The file /Users/popomore/code/project/personalportal/app/core/service/src/main/resources/META-INF/spring/core-service.xml is detected as GB2312 which is not supported by your Sublime Text.

Please check whether it is in the list of Python's Standard Encodings (http://docs.python.org/library/codecs.html#standard-encodings) or not.

If yes, please install Codecs33 (https://github.com/seanliang/Codecs33/tree/osx) and restart Sublime Text to make ConvertToUTF8 work properly. If it is still not working, please kindly send the following information to sunlxy#yahoo.com:
====== Debug Information ======
Version: 3059-x64
Platform: Darwin-13.0.0-x86_64-i386-64bit
Path: ['/Applications/Sublime Text.app/Contents/MacOS', '/Applications/Sublime Text.app/Contents/MacOS/python3.3.zip', '/Users/popomore/Library/Application Support/Sublime Text 3/Packages', '/Users/popomore/Library/Application Support/Sublime Text 3/Installed Packages/Javascript Beautify.sublime-package/libs', '/Users/popomore/Library/Application Support/Sublime Text 3/Installed Packages/Javascript Beautify.sublime-package/libs/js-beautify/python']
Encoding: GB2312

Codecs33 is installed

Environment

macos 10.9
sublime 3

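Some Big5 characters are automatically changed to Big5-HKSCS encoding on save

After I modified some code and committed it, the Subversion history showed that some unrelated places had also been modified:

  • (U+FF0F) was changed to (U+2215)
  • (U+FF3C) was changed to (U+FE68)

Opening the same file in another editor confirms that the characters have indeed been replaced with different ones.

Can this problem be avoided by changing a setting?

Settings are at their defaults; the version is 1.2.3.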

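Popup setting

Could you add a setting so that when the confidence is above a certain value, the encoding with the highest confidence is used automatically instead of asking me to choose one?
Also, detection currently counts as successful only when the confidence is above 0.95. Could this value be moved into the settings?

Garbled text still appears when previewing a file by single-clicking it in the sidebar

When a file is opened directly, this plugin displays it correctly, and it works better than the GBK support plugin.
It would be perfect if GBK files were also displayed correctly in preview!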

sublime.TRANSIENT. Open the file as a preview only: it won't have a tab assigned it until modified

Cannot search for Chinese text

A file saved in GBK opens and displays correctly with the ConvertToUTF8 plugin, but searching with Ctrl+F does not work~

It doesn't work on ST3.

I've installed Package Control, ConvertToUTF8 and Codecs26, but ST3 fails to open files encoded in EUC-KR.

An "Encoding EUC-KR is not supported" pop-up appears.

An error appeared for no apparent reason when converting encodings today

Due to the limitation of embedded Python with Sublime Text, ConvertToUTF8 might not work properly.

You have to install an extra plugin to solve this problem, please kindly send the debug information to sunlxy#yahoo.com to get it:
====== Debug Information ======
Version: 3047
Platform: linux
Arch: x64
Path: ['/opt/sublime_text', '/opt/sublime_text/python3.3.zip', '/home/sergio/.config/sublime-text-3/Packages', '/home/sergio/.config/sublime-text-3/Installed Packages/Emmet.sublime-package', '/home/sergio/.config/sublime-text-3/Installed Packages/Emmet.sublime-package/emmet_completions', '/home/sergio/.config/sublime-text-3/Installed Packages/Emmet.sublime-package/emmet', '/home/sergio/.config/sublime-text-3/Installed Packages/JsFormat.sublime-package/libs', '/home/sergio/.config/sublime-text-3/Installed Packages/PyV8', '/home/sergio/.config/sublime-text-3/Installed Packages/PyV8/linux64-p3', '/home/sergio/.config/sublime-text-3/Installed Packages/PyV8/pyv8-linux64-p3']
Encoding: GBK

Encoding selection for empty files

In version 1.2.6, we must select an encoding every time we open an empty file.
In version 1.2.4, this prompt did not exist.
For the sake of simplicity, we would like the encoding for empty files to be fixed to UTF-8.

With the default save encoding set to GBK and convert_on_save set to true, the view flickers on save

Sublime Text 3
The plugin settings are as follows:
// Maximum size for encoding cache, 0 means no cache
"max_cache_size" : 0,
// Maximum lines to detect, 0 means unlimited
"max_detect_lines" : 0,
// Convert when previewing file: true or false
"preview_action" : true,
// Encoding for new file, empty means using sublime text's "default_encoding" setting
"default_encoding_on_create" : "GBK",
// Set this option to true will cause Sublime Text reload the saved file when losing focus
"lazy_reload": false,
// Convert in Find Results view
"convert_on_find": true,
// Convert when loading/saving a file
"convert_on_load" : true,
"convert_on_save" : true

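Then I set lazy_reload to true, and saving no longer flickers; the flicker only happens after switching away and losing focus. A friend who uses Sublime Text 2 does not have this problem.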
