Comments (15)
I think this is a fair point; I don't have anything against adding unicode. I went with ASCII because I was worried about compatibility with terminal emulators and system configurations that lack unicode support.
from rtv.
Fixed pull request in 47ad49a.
However, there still appear to be some issues with python3. For example,
$ python2 -m rtv -l http://www.reddit.com/r/LearnJapanese/comments/2ylsz9/request_japanese_audio_textbook_audiobook_or/
loads correctly, but
$ python3 -m rtv -l http://www.reddit.com/r/LearnJapanese/comments/2ylsz9/request_japanese_audio_textbook_audiobook_or/
crashes with the following error
Traceback (most recent call last):
  File "/usr/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/michael/Projects/rtv_project/rtv/__main__.py", line 7, in <module>
    main()
  File "/home/michael/Projects/rtv_project/rtv/main.py", line 84, in main
    page.loop()
  File "/home/michael/Projects/rtv_project/rtv/submission.py", line 28, in loop
    self.draw()
  File "/home/michael/Projects/rtv_project/rtv/page.py", line 143, in draw
    self._draw_content()
  File "/home/michael/Projects/rtv_project/rtv/page.py", line 194, in _draw_content
    attr = self.draw_item(subwindow, data, inverted)
  File "/home/michael/Projects/rtv_project/rtv/submission.py", line 95, in draw_item
    return self.draw_comment(win, data, inverted=inverted)
  File "/home/michael/Projects/rtv_project/rtv/submission.py", line 122, in draw_comment
    win.addnstr(row, 1, text, n_cols-1)
_curses.error: addnwstr() returned ERR
Passing --force-ascii fixes the crash, but I would like to figure out why some unicode characters appear to be messing with curses in python3.4.
from rtv.
That doesn't make a lot of sense, because this works: http://www.reddit.com/r/LearnJapanese/comments/2yonx8/is_there_any_difference_between_%E9%87%8D%E3%81%9F%E3%81%84_and_%E9%87%8D%E3%81%84/ and that has Japanese characters in it. However, I get the same issue on my computer.
from rtv.
I did some investigating into this! It looks like the problem is with how python3 curses.addnstr() interprets n. The expected behavior is that curses.addnstr(y, x, text, n) will print the first n characters of the text string onto the screen. Here's what's happening.
import curses
import locale
locale.setlocale(locale.LC_ALL, '')
def func(stdscr):
    stdscr.clear()
    text = 'a' * 5
    stdscr.addnstr(0, 0, text, 6)
    stdscr.addnstr(1, 0, text, 5)
    stdscr.addnstr(2, 0, text, 4)
    text = 'あ' * 5
    stdscr.addnstr(4, 0, text, 6)
    stdscr.addnstr(5, 0, text, 5)
    stdscr.addnstr(6, 0, text, 4)
    text = ('あ' * 5).encode('utf-8')
    stdscr.addnstr(8, 0, text, 6)
    stdscr.addnstr(9, 0, text, 5)
    stdscr.addnstr(10, 0, text, 4)
    stdscr.refresh()
    stdscr.getch()
    stdscr.getch()
if __name__ == '__main__':
    curses.wrapper(func)
- Passing unicode characters within ordinal range(128) works as expected: each character takes up one column of the terminal and up to n characters are printed.
- Passing unicode characters outside of ordinal range(128) prints up to n characters, but each character takes up two columns in the terminal. So unless we know exactly how much space each character takes, passing in a value for n is useless for any practical purpose.
- Passing utf-8 encoded bytes is the odd one: it looks like n counts the number of bytes passed in. The 'あ' character is 3 bytes long in utf-8, so setting n to 6 allows 6/3=2 full characters to be printed. Setting n below 6 cuts off the final character, and it is unclear what happens to the bytes of the partial character. This mode is also practically useless because it doesn't account for the width of each character on the screen.
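The three cases above come down to three different ways of measuring the same string. As a rough illustration outside of curses (this only shows what Python itself reports and says nothing about what curses does with a partial character), the code point count and the utf-8 byte count for the test string can be compared like this:

```python
# Plain-Python sketch, no curses involved.
text = 'あ' * 5
encoded = text.encode('utf-8')

print(len(text))     # number of code points: 5
print(len(encoded))  # number of utf-8 bytes: 15 (3 bytes per 'あ')

# Slicing the bytes at 6 lands on a character boundary.
print(encoded[:6].decode('utf-8'))  # ああ

# Slicing at 5 cuts the second character in half. Decoding with
# errors='ignore' silently drops the incomplete trailing bytes.
print(encoded[:5].decode('utf-8', errors='ignore'))  # あ
```

Neither number matches the 10 terminal columns that the five characters occupied in the test above, which is why n is hard to use in either mode.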
from rtv.
I don't know if there is a good way to handle this.
I didn't realize that some unicode characters take up more space than others. This will break how I am currently using textwrap to format paragraphs to fit on the page.
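To make the textwrap concern concrete, here is a minimal sketch of wrapping by terminal columns instead of by character count. The helpers char_width and wrap_columns are hypothetical names, not part of rtv or textwrap, and the sketch assumes that East Asian wide and fullwidth characters take two columns and everything else takes one. It breaks between characters rather than between words, so it is not a drop-in replacement for textwrap.

```python
import textwrap
import unicodedata


def char_width(ch):
    """Assumed column width: 2 for East Asian wide/fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1


def wrap_columns(text, width):
    """Split text into lines that each occupy at most `width` columns."""
    lines, current, used = [], [], 0
    for ch in text:
        w = char_width(ch)
        # Start a new line when this character would overflow the current one.
        if current and used + w > width:
            lines.append(''.join(current))
            current, used = [], 0
        current.append(ch)
        used += w
    if current:
        lines.append(''.join(current))
    return lines


text = 'あ' * 10
by_chars = textwrap.wrap(text, 10)   # counts characters, not columns
by_columns = wrap_columns(text, 10)  # counts assumed terminal columns
print(by_chars)
print(by_columns)
```

With a width of 10, textwrap keeps all ten characters on one line because it counts characters, while the column-based version splits them into two lines of five.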
from rtv.
Yes. I observed too that sometimes a unicode character breaks at the line-change.
from rtv.
Heads up, I changed the default encoding to ascii until this is resolved. 7db8c2f
from rtv.
Interesting reading on the subject http://stackoverflow.com/questions/3634627
and https://pypi.python.org/pypi/wcwidth/0.1.4
from rtv.
I've refactored the code to follow the "unicode sandwich" design method described here. This should make the problem easier to address in the future while considering both py2 and py3.
from rtv.
We used unicodedata.east_asian_width to help calculate column widths for my curses CSV viewer.
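A minimal sketch of that approach, assuming wide ('W') and fullwidth ('F') characters occupy two columns, combining marks occupy none, and everything else occupies one. The function names are made up for illustration and are not taken from rtv or the CSV viewer:

```python
import unicodedata


def char_width(ch):
    """Assumed terminal width of one character, in columns."""
    if unicodedata.combining(ch):
        return 0  # combining marks stack on the previous character
    return 2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1


def display_width(text):
    """Total assumed width of text, in columns."""
    return sum(char_width(ch) for ch in text)


def clip_to_columns(text, max_cols):
    """Return the longest prefix of text that fits in max_cols columns."""
    out, used = [], 0
    for ch in text:
        w = char_width(ch)
        if used + w > max_cols:
            break
        out.append(ch)
        used += w
    return ''.join(out)


print(display_width('a' * 5))
print(display_width('あ' * 5))
print(clip_to_columns('あ' * 5, 5))
```

A clipped string could then be passed to addstr() in place of relying on the n argument of addnstr(). The east_asian_width table also has an ambiguous ('A') category whose width depends on the terminal, which this sketch treats as one column.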
from rtv.
Hey guys! I just merged a large update to the master branch that will hopefully smooth out all of the unicode issues in the codebase. We are now using the kitchen python package, which provides a bunch of unicode-aware text formatting functions. I also set the default program mode to enable unicode, with the option to disable it with the --ascii flag.
@yskmt do you think you could help me test this out, or point me to some non-English subreddits? I would really appreciate it.
from rtv.
@michael-lazar Sure. Have you implemented it? Is it on master branch?
from rtv.
Yes, it is currently on master. Unicode mode should be turned on by default, so all you have to do is check out master and run it.
from rtv.
I haven't heard any objections so I'm closing this for now. If anybody discovers a problem, please open a new issue.
from rtv.