Comments (10)

hect0x7 commented on July 23, 2024

page is an iterable object, so a plain for loop over page is all you need.

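from jmcomic import JmSearchPage

def search():
    # `client` and `author` are assumed to be defined as in the opening post.
    # Site-wide search corresponds to main_tag=0.
    # Fetch the first page of results.
    page: JmSearchPage = client.search_site(author, page=1)

    # A plain for loop is enough to walk through the page.
    for aid, atitle, atags in page.iter_id_title_tag():
        print(aid, atitle, atags, sep=',')

    # Return the ids of every album on this page.
    return list(page.iter_id())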
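
The reason a plain `for` loop works is Python's iteration protocol: any object that defines `__iter__` can be looped over. The sketch below uses a stand-in class, `FakeSearchPage` (hypothetical, not part of jmcomic), with made-up ids and titles, only to illustrate the pattern. The real `JmSearchPage` may be implemented differently.

```python
from typing import Iterator, List, Tuple


class FakeSearchPage:
    """Hypothetical stand-in for a search result page (not the jmcomic class)."""

    def __init__(self, content: List[Tuple[str, str]]):
        self.content = content  # list of (album_id, title) pairs

    def __iter__(self) -> Iterator[Tuple[str, str]]:
        # Defining __iter__ is what makes `for ... in page` work.
        return iter(self.content)

    def iter_id(self) -> Iterator[str]:
        # Yield only the ids, mirroring the idea of page.iter_id() above.
        for album_id, _title in self.content:
            yield album_id


# Made-up sample data, for illustration only.
page = FakeSearchPage([("1001", "first title"), ("1002", "second title")])

for album_id, title in page:
    print(album_id, title, sep=",")

ids = list(page.iter_id())
```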

from jmcomic-crawler-python.

hect0x7 commented on July 23, 2024

"I also tried running the code with other keywords and got quite a few encoding errors."
Please send me the keywords that trigger the errors.

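jzl543098871 commented on July 23, 2024

I tried asking a chat assistant.
It told me to put this at the top of the script, but after I added it, everything in the Run window below turned into garbled text.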

import sys

if sys.stdout.encoding != 'utf-8':
    sys.stdout = open(sys.stdout.fileno(), mode='w', encoding='utf-8', buffering=1)

Here is part of the window output. Admittedly, the error is gone.

python.exe C:\Users\子夜\PycharmProjects\JM\搜索作者.py 
2023-12-03 17:34:56:銆恜lugin.invoke銆戣皟鐢ㄦ彃浠�: [login]
2023-12-03 17:34:56:銆恏tml銆慼ttps://18comic.vip/login
2023-12-03 17:34:56:銆恜lugin.login銆戠櫥褰曟垚鍔�
2023-12-03 17:34:56:銆恏tml銆慼ttps://18comic.vip/search/photos?main_tag=0&search_query=绋粯&page=1&o=mr&t=a
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/506940
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/431577
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/463953
2023-12-03 17:34:58:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [444178], 浣滆��: [銇姐倠銇°兗銇玗, 绔犺妭鏁�: [1], 鎬婚〉鏁�: [47], 鏍囬: [[銇姐倠銇°兗銇玗鐖嗕钩鎸併仸浣欍仚娆叉眰涓嶆簚銇汉濡绘按娉炽偆銉炽偣銉堛儵銈偪銉笺仺鍗遍櫤鏃ョó浠樸亼銉堛儸銉笺儖銉炽偘[incomplete]], 鍏抽敭璇�: ['CG', '宸ㄤ钩', '浜哄', '涔充氦', '涓枃']
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/photo/444178
2023-12-03 17:34:58:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [459602], 浣滆��: [銇°倱], 绔犺妭鏁�: [1], 鎬婚〉鏁�: [188], 鏍囬: [[銇°倱] 绋粯銇�! 銉椼儸銈� 銉椼儸銈� 銉椼儸銈� [DL鐗圿], 鍏抽敭璇�: ['宸ㄤ钩', '澶氭瘺', '鍌湢', '涓嚭', '寮锋毚', '閬庤啙瑗�', '鍑烘睏', '缇や氦', '鍠鏈�', '鏃ユ枃']
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/photo/459602
2023-12-03 17:34:59:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [469447], 浣滆��: [鏈ㄩ埓銈偙銉玗, 绔犺妭鏁�: [1], 鎬婚〉鏁�: [26], 鏍囬: [鎾ó姝愬悏妗戠殑JK鐢熷皬瀛㏒EX [鏈ㄩ埓浜� (鏈ㄩ埓銈偙銉�)] 绋粯銇娿仒銇曘倱銇甁K瀛愪綔銈奡EX [涓浗缈昏ǔ] [鐒′慨姝 [DL鐗圿], 鍏抽敭璇�: ['鐒′慨姝�', '闃块粦椤�', '鎬у嫆绱�', '鍏斿コ閮�', '铏曞コ', '钘ョ墿', '涓嚭', '鍏ц。', '寮锋毚', '鏍℃湇', '閬庤啙瑗�', '闆欓Μ灏�', '閫忚', '涓枃']
2023-12-03 17:34:59:銆恏tml銆慼ttps://18comic.vip/photo/469447

That is roughly what it looks like, and something still feels off.

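hect0x7 commented on July 23, 2024

Your console encoding is probably GBK, while Python's standard output is encoded as UTF-8. That is an issue with your local environment, so try adjusting your settings and see what works.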
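
The mismatch described here can be reproduced without a console. The sketch below encodes a log prefix as UTF-8 and then decodes the same bytes as GBK, which is what a GBK console effectively does. The sample string is taken from the log above, and the variable names are illustrative.

```python
original = "【html】https://18comic.vip/login"

# The bytes Python writes when stdout is forced to UTF-8.
utf8_bytes = original.encode("utf-8")

# What a GBK console shows when it interprets those bytes as GBK.
# errors="replace" stands in for any byte sequence that is not valid GBK.
garbled = utf8_bytes.decode("gbk", errors="replace")
print(garbled)

# For this sample no byte is lost, so the damage can be reversed.
recovered = garbled.encode("gbk").decode("utf-8")
```

The text is not corrupted in Python itself. It is only displayed with the wrong decoder, which is why making the console and Python agree on one encoding resolves it.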

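jzl543098871 commented on July 23, 2024

"I also tried running the code with other keywords and got quite a few encoding errors. Please send me the keywords that trigger the errors."

I am using exactly the code from the opening post. I simply replaced the original author name with '種付'.

Without the import snippet mentioned above, the error is as follows: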

Exception in thread Thread-80 (<lambda>):
Traceback (most recent call last):
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 48, in download_album
    dler.download_album(jm_album_id)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 59, in download_album
    self.download_by_album_detail(album, client)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 62, in download_by_album_detail
    self.before_album(album)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 168, in before_album
    super().before_album(album)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 8, in before_album
    jm_log('album.before',
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 279, in jm_log
    cls.executor_log(topic, msg)
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 6, in default_jm_logging
    print(f'{format_ts()}:【{topic}】{msg}')
UnicodeEncodeError: 'gbk' codec can't encode character '\u301c' in position 116: illegal multibyte sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Python\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 29, in <lambda>
    apply_each_obj_func=lambda aid: download_api(aid, option, downloader),
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 47, in download_album
    with new_downloader(option, downloader) as dler:
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 198, in __exit__
    jm_log('dler.exception',
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 279, in jm_log
    cls.executor_log(topic, msg)
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 6, in default_jm_logging
    print(f'{format_ts()}:【{topic}】{msg}')
UnicodeEncodeError: 'gbk' codec can't encode character '\u301c' in position 244: illegal multibyte sequence
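
The traceback comes down to `print` encoding the log line with the `gbk` codec, which has no mapping for U+301C (WAVE DASH), the character named in the error. A minimal reproduction that does not need jmcomic, with illustrative variable names:

```python
wave_dash = "\u301c"  # the character named in the traceback

try:
    wave_dash.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError as exc:
    # Same failure as in the traceback, triggered directly.
    gbk_ok = False
    print("gbk failed:", exc.reason)

# UTF-8 can represent the character without trouble.
utf8_bytes = wave_dash.encode("utf-8")

# With errors="replace" the codec substitutes "?" instead of raising.
fallback = wave_dash.encode("gbk", errors="replace")
print(utf8_bytes, fallback)
```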

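jzl543098871 commented on July 23, 2024

"Your console encoding is probably GBK, while Python's standard output is encoded as UTF-8. That is an issue with your local environment, so try adjusting your settings and see what works."

OK, this is probably a problem on my side. I will debug it myself. Thanks for your help.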

hect0x7 commented on July 23, 2024

This is caused by Python's standard output stream using the wrong encoding. Run the following code and check what it prints:

import sys
print(sys.stdout.encoding)

If the output is not UTF-8, insert the code below at the very top of your script:

import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
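
On Python 3.7 and later, `reconfigure` on a text stream is another way to switch the encoding without building a new wrapper. The helper name `ensure_utf8` below is hypothetical and used only for illustration. As the garbled Run window earlier in this thread shows, the output only displays correctly if the console itself also decodes the bytes as UTF-8.

```python
import sys


def ensure_utf8(stream):
    """Switch a text stream to UTF-8 if it uses another encoding.

    Hypothetical helper; relies on TextIOWrapper.reconfigure (Python 3.7+).
    Returns the stream's encoding after the attempt.
    """
    current = (getattr(stream, "encoding", None) or "").lower().replace("-", "")
    if current != "utf8" and hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8")
    return getattr(stream, "encoding", None)


if __name__ == "__main__":
    print(ensure_utf8(sys.stdout))
```

Streams that do not offer `reconfigure` are left untouched, so the helper is safe to call when stdout has been replaced by another object.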

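jzl543098871 commented on July 23, 2024

"Console encoding"

Thanks, it works now. One more question about the option settings: when using Clash as the proxy, what is wrong with the configuration below? With Clash running and a proxy node selected, but with the system proxy disabled, I still cannot connect to 18comic.vip.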

client:
  cache: true
  domain: [ 18comic.vip ]
  impl: html
  postman:
    meta_data:
      headers: null
      impersonate: chrome110
      proxies: { clash }

hect0x7 commented on July 23, 2024

client:
  domain: [ 18comic.vip ]
  impl: html
  postman:
    meta_data:
      proxies: clash # change this line

If Clash has the system proxy enabled, the configuration can be simplified:

client:
  domain: [ 18comic.vip ]
  impl: html
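
When the system proxy is off, requests have to reach Clash's local listening port directly, so checking that the port accepts connections can rule out one possible cause. The sketch below is a generic diagnostic and not part of jmcomic: `is_port_open` is a hypothetical helper, and `127.0.0.1:7890` is only a commonly used Clash default, so substitute the port shown in your own Clash settings.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not open".
        return False


if __name__ == "__main__":
    # Assumed address; check the port configured in your Clash client.
    print(is_port_open("127.0.0.1", 7890))
```

If this prints False, the proxy setting in the option file cannot work regardless of how it is written, because nothing is listening at that address.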

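jzl543098871 commented on July 23, 2024

Got it, thanks for the detailed explanation. I specifically want to avoid enabling the system proxy. I usually do not run a global proxy either, because otherwise I would have to set up separate rules for browsing NGA or Tieba.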
