Comments (10)
page is an iterable object, so you can loop over it directly with for.
def search():
    # Site-wide search (main_tag=0).
    # Search the first page.
    page: JmSearchPage = client.search_site(author, page=1)
    # Loop over the page with a plain for loop.
    for aid, atitle, atags in page.iter_id_title_tag():
        print(aid, atitle, atags, sep=',')
    # Return the IDs of every album on this page.
    return list(page.iter_id())
from jmcomic-crawler-python.
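I also tried running the code with some other keywords, and got quite a few encoding errors.
Please send me the keywords that cause the errors.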
from jmcomic-crawler-python.
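I tried asking an AI chat assistant.
It told me to start the script with this, but after I added it, everything in the Run window below turned into garbled text.
import sys
if sys.stdout.encoding != 'utf-8':
    sys.stdout = open(sys.stdout.fileno(), mode='w', encoding='utf-8', buffering=1)
Here is a snippet of the window. It is true that the error is gone, though.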
python.exe C:\Users\子夜\PycharmProjects\JM\搜索作者.py
2023-12-03 17:34:56:銆恜lugin.invoke銆戣皟鐢ㄦ彃浠�: [login]
2023-12-03 17:34:56:銆恏tml銆慼ttps://18comic.vip/login
2023-12-03 17:34:56:銆恜lugin.login銆戠櫥褰曟垚鍔�
2023-12-03 17:34:56:銆恏tml銆慼ttps://18comic.vip/search/photos?main_tag=0&search_query=绋粯&page=1&o=mr&t=a
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/506940
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/431577
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/463953
2023-12-03 17:34:58:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [444178], 浣滆��: [銇姐倠銇°兗銇玗, 绔犺妭鏁�: [1], 鎬婚〉鏁�: [47], 鏍囬: [[銇姐倠銇°兗銇玗鐖嗕钩鎸併仸浣欍仚娆叉眰涓嶆簚銇汉濡绘按娉炽偆銉炽偣銉堛儵銈偪銉笺仺鍗遍櫤鏃ョó浠樸亼銉堛儸銉笺儖銉炽偘[incomplete]], 鍏抽敭璇�: ['CG', '宸ㄤ钩', '浜哄', '涔充氦', '涓枃']
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/photo/444178
2023-12-03 17:34:58:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [459602], 浣滆��: [銇°倱], 绔犺妭鏁�: [1], 鎬婚〉鏁�: [188], 鏍囬: [[銇°倱] 绋粯銇�! 銉椼儸銈� 銉椼儸銈� 銉椼儸銈� [DL鐗圿], 鍏抽敭璇�: ['宸ㄤ钩', '澶氭瘺', '鍌湢', '涓嚭', '寮锋毚', '閬庤啙瑗�', '鍑烘睏', '缇や氦', '鍠鏈�', '鏃ユ枃']
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/photo/459602
2023-12-03 17:34:59:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [469447], 浣滆��: [鏈ㄩ埓銈偙銉玗, 绔犺妭鏁�: [1], 鎬婚〉鏁�: [26], 鏍囬: [鎾ó姝愬悏妗戠殑JK鐢熷皬瀛㏒EX [鏈ㄩ埓浜� (鏈ㄩ埓銈偙銉�)] 绋粯銇娿仒銇曘倱銇甁K瀛愪綔銈奡EX [涓浗缈昏ǔ] [鐒′慨姝 [DL鐗圿], 鍏抽敭璇�: ['鐒′慨姝�', '闃块粦椤�', '鎬у嫆绱�', '鍏斿コ閮�', '铏曞コ', '钘ョ墿', '涓嚭', '鍏ц。', '寮锋毚', '鏍℃湇', '閬庤啙瑗�', '闆欓Μ灏�', '閫忚', '涓枃']
2023-12-03 17:34:59:銆恏tml銆慼ttps://18comic.vip/photo/469447
It looks roughly like this, and something still feels off.
from jmcomic-crawler-python.
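Your console's encoding is probably GBK, while Python's standard output is encoded as UTF-8. That makes it a problem with your local environment, so try adjusting your settings and see.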
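The garbled log above is consistent with this diagnosis: UTF-8 bytes rendered by a console that decodes them as GBK. A minimal sketch of the effect, using only the standard library (the sample string follows the log format shown in this thread, and the variable names are illustrative):

```python
# Text as the library logs it.
original = '【html】https://18comic.vip/login'

# Python writes UTF-8 bytes, but a GBK console decodes those bytes as GBK.
utf8_bytes = original.encode('utf-8')
garbled = utf8_bytes.decode('gbk')
print(garbled)  # mojibake, the same kind of garbling as in the Run window

# The bytes themselves are intact: undoing the wrong decode recovers the text.
recovered = garbled.encode('gbk').decode('utf-8')
print(recovered == original)
```

So nothing is lost in the data itself. The two sides simply disagree about the encoding, which is why the fix belongs in the console or IDE settings rather than in the crawler.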
from jmcomic-crawler-python.
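I also tried running the code with some other keywords, and got quite a few encoding errors. Please send me the keywords that cause the errors.
I am using exactly the code from the original post. I simply replaced the original author name with '種付'.
Without the import snippet mentioned above, the error is as follows: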
Exception in thread Thread-80 (<lambda>):
Traceback (most recent call last):
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 48, in download_album
    dler.download_album(jm_album_id)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 59, in download_album
    self.download_by_album_detail(album, client)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 62, in download_by_album_detail
    self.before_album(album)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 168, in before_album
    super().before_album(album)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 8, in before_album
    jm_log('album.before',
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 279, in jm_log
    cls.executor_log(topic, msg)
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 6, in default_jm_logging
    print(f'{format_ts()}:【{topic}】{msg}')
UnicodeEncodeError: 'gbk' codec can't encode character '\u301c' in position 116: illegal multibyte sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Python\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 29, in <lambda>
    apply_each_obj_func=lambda aid: download_api(aid, option, downloader),
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 47, in download_album
    with new_downloader(option, downloader) as dler:
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 198, in __exit__
    jm_log('dler.exception',
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 279, in jm_log
    cls.executor_log(topic, msg)
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 6, in default_jm_logging
    print(f'{format_ts()}:【{topic}】{msg}')
UnicodeEncodeError: 'gbk' codec can't encode character '\u301c' in position 244: illegal multibyte sequence
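The failure in this traceback can be reproduced without the crawler. The sketch below assumes only the standard library. It encodes a short string containing U+301C (the character named in the error) with the gbk codec, then shows two lossy error handlers that avoid the exception. The string and variable names are illustrative.

```python
title = 'A\u301cB'  # U+301C WAVE DASH, the character named in the traceback

try:
    title.encode('gbk')  # strict mode, which is what print() uses on a GBK stdout
    strict_ok = True
except UnicodeEncodeError as exc:
    strict_ok = False
    print(exc)

# Lossy fallbacks that never raise:
replaced = title.encode('gbk', errors='replace')          # unencodable char becomes '?'
escaped = title.encode('gbk', errors='backslashreplace')  # becomes a \uXXXX escape
print(replaced, escaped)
```

This suggests the error depends on the characters in the album titles that get logged rather than on the search keyword itself: any title containing a character that GBK cannot represent would trigger it.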
from jmcomic-crawler-python.
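Your console's encoding is probably GBK, while Python's standard output is encoded as UTF-8. That makes it a problem with your local environment, so try adjusting your settings and see.
OK, this is probably a problem on my end. I will debug it myself. Thanks for the help.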
from jmcomic-crawler-python.
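This is caused by Python's standard output stream using the wrong encoding. Try the following code and see what it prints:
import sys
print(sys.stdout.encoding)
If the result is not UTF-8, insert the code below at the very top of your script:
import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')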
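As an alternative sketch (not the approach suggested in this thread), Python 3.7+ can switch an existing text stream to UTF-8 in place with `TextIOWrapper.reconfigure`, instead of wrapping `sys.stdout.buffer` again. The helper name `force_utf8` is hypothetical, and the in-memory stream only stands in for a GBK console.

```python
import io
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8 in place if it supports reconfigure()."""
    current = (getattr(stream, 'encoding', None) or '').lower().replace('-', '')
    if isinstance(stream, io.TextIOWrapper) and current != 'utf8':
        stream.reconfigure(encoding='utf-8')
    return stream


# Apply to the real stdout (a no-op if it is already UTF-8 or not a TextIOWrapper).
force_utf8(sys.stdout)

# Demo on an in-memory stream that starts out as GBK.
buffer = io.BytesIO()
demo = io.TextIOWrapper(buffer, encoding='gbk')
force_utf8(demo)
demo.write('\u301c')  # would raise UnicodeEncodeError if the stream were still GBK
demo.flush()
print(buffer.getvalue())
```

Either approach only changes what Python writes. If the console itself still decodes bytes as GBK, the output will look garbled, as reported earlier in this thread.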
from jmcomic-crawler-python.
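The console's encoding
Thanks, it works now. One more question, about the option settings: when using Clash as the proxy, what is wrong with the settings below? I start Clash and select a proxy node but do not enable the system proxy, and I still cannot connect to 18comic.vip.
client:
  cache: true
  domain: [ 18comic.vip ]
  impl: html
  postman:
    meta_data:
      headers: null
      impersonate: chrome110
      proxies: { clash }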
from jmcomic-crawler-python.
client:
  domain: [ 18comic.vip ]
  impl: html
  postman:
    meta_data:
      proxies: clash # change this line
If your Clash has the system proxy enabled, the configuration can be simplified:
client:
  domain: [ 18comic.vip ]
  impl: html
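If the connection still fails with the system proxy off, one quick thing to rule out is whether Clash's local proxy port is reachable at all. The sketch below is only a connectivity check and makes no claim about how the library resolves `proxies: clash`. It assumes Clash listens on 127.0.0.1:7890, a common default that you should verify in your own Clash settings. The helper name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumption: Clash's local proxy listens on 127.0.0.1:7890.
print(port_is_open('127.0.0.1', 7890))
```

A result of False would point at Clash (not running, or using a different port) rather than at the option file.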
from jmcomic-crawler-python.
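Got it, thanks for the detailed explanation. I specifically want to leave the system proxy off. I do not usually enable a global proxy either, because otherwise I would have to set up separate rules just to browse NGA or Tieba.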
from jmcomic-crawler-python.