ma6254 / fictiondown

678.0 12.0 137.0 470 KB

Novel download | novel scraping | Qidian | Biquge | export Markdown | export txt | convert to epub | ad filtering | automatic proofreading

License: GNU General Public License v3.0

Go 99.21% Makefile 0.79%
biquge qidian fiction novels spider crawler golang

fictiondown's People

Contributors

caitinggui, edmund-zhao, ma6254, mddct, weah2000

fictiondown's Issues

Memory overflow when running the program as administrator on Windows 10

FictionDown.exe -i .\一念永恒-耳根-起点中文网.FictionDown s -k 一念永恒 -p
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x18 pc=0x9acbec]

goroutine 1 [running]:
github.com/ma6254/FictionDown/site.Type1SearchAfter.func1(0xc00007a0c0, 0xc, 0x0, 0x0, 0x0, 0x0, 0x0)
/home/runner/work/FictionDown/FictionDown/site/sites.go:200 +0x24c
github.com/ma6254/FictionDown/site.Search(0xc00007a0c0, 0xc, 0xc000163980, 0xc00007a0c0, 0xc, 0xc0000a57a0, 0x4efadf)
/home/runner/work/FictionDown/FictionDown/site/site.go:238 +0x13c
main.glob..func6(0xc0000b8dc0, 0x100, 0xc0000b8dc0)
/home/runner/work/FictionDown/FictionDown/search.go:33 +0x7f
github.com/urfave/cli.HandleAction(0xa2c980, 0xb279f0, 0xc0000b8dc0, 0xc000163900, 0x0)
/home/runner/go/pkg/mod/github.com/urfave/[email protected]/app.go:490 +0xcf
github.com/urfave/cli.Command.Run(0xaf472e, 0x6, 0x0, 0x0, 0x1118610, 0x1, 0x1, 0xb00290, 0x12, 0x0, ...)
/home/runner/go/pkg/mod/github.com/urfave/[email protected]/command.go:210 +0x99d
github.com/urfave/cli.(*App).Run(0x1121fc0, 0xc0000ae000, 0x7, 0x8, 0x0, 0x0)
/home/runner/go/pkg/mod/github.com/urfave/[email protected]/app.go:255 +0x6b6
main.main()
/home/runner/work/FictionDown/FictionDown/main.go:87 +0x125

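Cannot download from Qidian

Hello, this is my first time using FictionDown. I want to use it to download "诡秘之主" and put it on my Kindle for a second read-through.

I have never used a Go program before and was afraid of making a mistake, so this is how I installed it:
1. Turned on v2rayNG as a proxy to get past the firewall
2. Downloaded and installed Go (.msi for amd64)
3. go env -w GO111MODULE=on
4. go env -w GOPROXY=https://goproxy.cn,direct
5. go get -v github.com/ma6254/FictionDown@latest

The installation appeared to succeed.

However, neither searching for the book nor giving the site URL to download it directly seems to work properly. Did I install it incorrectly?

Platform:
Windows 10 Enterprise LTSC 1809 (OS build 17763.1637)
go version go1.15.6 windows/amd64

2021.1.9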

dep ensure failed

$ dep ensure -v
(1/12) Wrote github.com/benbjohnson/phantomjs@master
(2/12) Wrote github.com/gofrs/[email protected]
(3/12) Wrote github.com/bmaupin/[email protected]
(4/12) Wrote golang.org/x/[email protected]
(5/12) Wrote github.com/go-yaml/[email protected]
(6/12) Wrote github.com/mattn/[email protected]
(7/12) Failed to write golang.org/x/net@master
(8/12) Failed to write golang.org/x/sys@master
(9/12) Failed to write gopkg.in/cheggaaa/[email protected]
(10/12) Failed to write github.com/antchfx/[email protected]
(11/12) Failed to write github.com/antchfx/xpath@master
(12/12) Failed to write github.com/urfave/[email protected]
grouped write of manifest, lock and vendor: error while writing out vendor tree: failed to write dep tree: failed to export golang.org/x/net: fatal: failed to unpack tree object 3a22650c66bd7f4fb6d1e8072ffd7b75c8a27898
: exit status 128

$ dep version
dep:
 version     : devel
 build date  : 
 git hash    : 
 go version  : go1.9.4
 go compiler : gc
 platform    : linux/amd64
 features    : ImportDuringSolve=false
$ go version
go version go1.11 linux/amd64

Building executables for multiple platforms

Flag --rm-dist has been deprecated, please use --clean instead
• starting release...
⨯ release failed after 0s error=yaml: unmarshal errors:
line 41: field replacements not found in type config.Archive

Error when converting to epub through pandoc on Windows

Environment

Software version: commit 1c10eae, tag: v0.1.3
Pandoc version:

PS C:\Users\mjc\git\FictionDown\release> pandoc -v
pandoc.exe 2.9.2
Compiled with pandoc-types 1.20, texmath 0.12.0.1, skylighting 0.8.3.2
Default user data directory: C:\Users\mjc\AppData\Roaming\pandoc
Copyright (C) 2006-2019 John MacFarlane
Web:  https://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.

Operating system: any Windows version

Steps to reproduce

PS C:\Users\mjc\git\FictionDown\release> .\FictionDown.exe -i .\诡秘之主-爱潜水的乌贼-笔趣阁1.FictionDown conv -f epub
2020/02/19 00:05:36 Loading cache file: .\诡秘之主-爱潜水的乌贼-笔趣阁1.FictionDown
2020/02/19 00:05:36 Start Conversion: Format:"epub" OutPath:"诡秘之主.epub"
2020/02/19 00:05:36 Save Cover Image: "C:\\Users\\mjc\\AppData\\Local\\Temp\\book_cover_126653631.jpg"
2020/02/19 00:05:40 中间文件转换完成: "诡秘之主.epub.md"
2020/02/19 00:05:40 调用Pandoc: "C:\\ProgramData\\chocolatey\\bin\\pandoc.exe" []string{"pandoc", "--epub-chapter-level", "2", "-o", "诡秘之主.epub", "诡秘之主.epub.md"}   
pandoc.exe: C:_cover_126653631.jpg: openBinaryFile: does not exist (No such file or directory)
exit status 1

Or

PS C:\Users\mjc\git\FictionDown\release> pandoc -o a.epub 诡秘之主.md
pandoc.exe: C:_cover_703999991.jpg: openBinaryFile: does not exist (No such file or directory)

Metadata section

title: 诡秘之主
description: |-
  蒸汽与机械的浪潮中,谁能触及非凡?历史和黑暗的迷雾里,又是谁在耳语?我从诡秘中醒来,睁眼看见这个世界:
  枪械,大炮,巨舰,飞空艇,差分机;魔药,占卜,诅咒,倒吊人,封印物……光明依旧照耀,神秘从未远离,这是一段“愚者”的传说。
creator: 爱潜水的乌贼
lang: zh-CN
cover-image: C:\Users\mjc\AppData\Local\Temp\book_cover_703999991.jpg

This is presumably caused by an inconsistency between the YAML implementations of Pandoc and go-yaml.

An issue has been filed with Pandoc: jgm/pandoc#6150
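
The error output above suggests that the backslashes in the `cover-image` path are lost before Pandoc opens the file. One possible workaround, sketched below, is to write the path with forward slashes before it goes into the metadata block. `pandocSafePath` is a hypothetical helper, not part of FictionDown, and the sketch assumes Pandoc on Windows accepts forward-slash paths.

```go
package main

import (
	"fmt"
	"strings"
)

// pandocSafePath rewrites Windows-style backslashes as forward slashes so the
// path holds no characters that a YAML or Markdown parser could treat as an
// escape. Hypothetical helper for illustration only.
func pandocSafePath(p string) string {
	return strings.ReplaceAll(p, `\`, "/")
}

func main() {
	// Path taken from the metadata section of this report.
	cover := `C:\Users\mjc\AppData\Local\Temp\book_cover_703999991.jpg`
	fmt.Printf("cover-image: %s\n", pandocSafePath(cover))
}
```

On Windows, `filepath.ToSlash` performs the same conversion; `strings.ReplaceAll` is used here so the sketch behaves identically on every platform.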

Add a rule

www.zhuishubang.com

Code

package site

import (
	"fmt"
	"io"
	"net/url"
	"strings"

	"github.com/antchfx/htmlquery"
	"github.com/ma6254/FictionDown/store"
	"golang.org/x/text/encoding/simplifiedchinese"
	"golang.org/x/text/transform"
)

type wwwZhuishubangCom struct {
}

func (b *wwwZhuishubangCom) BookInfo(body io.Reader) (s *store.Store, err error) {
	body = transform.NewReader(body, simplifiedchinese.GBK.NewDecoder())
	doc, err := htmlquery.Parse(body)
	if err != nil {
		return
	}

	s = &store.Store{}

	// Book Name
	node_title := htmlquery.Find(doc, `//div[@class="bookPhr"]/h2`)
	if len(node_title) == 0 {
		err = fmt.Errorf("No matching title")
		return
	}
	s.BookName = htmlquery.InnerText(node_title[0])

	// Description
	node_desc := htmlquery.Find(doc, `//*[@class="introCon"]/p`)
	if len(node_desc) == 0 {
		err = fmt.Errorf("No matching desc")
		return
	}
	s.Description = strings.Replace(
		htmlquery.OutputHTML(node_desc[0], false),
		"<br/>", "\n",
		-1)

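	// Author
	var author = htmlquery.Find(doc, `//div[@class="bookPhr"]/dl/dd`)
	if len(author) == 0 {
		// Guard against an index-out-of-range panic when the node is missing.
		err = fmt.Errorf("No matching author")
		return
	}
	s.Author = htmlquery.OutputHTML(author[0], false)

	// Contents
	node_content := htmlquery.Find(doc, `//div[@class="chapterCon"]/ul/li/a`)
	if len(node_content) == 0 {
		err = fmt.Errorf("No matching contents")
		return
	}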

	var vol = store.Volume{
		Name:     "正文",
		Chapters: make([]store.Chapter, 0),
	}

	// Walk the chapter links from last to first, so the stored order is the
	// reverse of the order in which they appear on the page.
	for idx := len(node_content) - 1; idx >= 0; idx-- {
		v := node_content[idx]
		chapterURL, err := url.Parse(htmlquery.SelectAttr(v, "href"))
		if err != nil {
			return nil, err
		}

		vol.Chapters = append(vol.Chapters, store.Chapter{
			Name: strings.TrimSpace(htmlquery.InnerText(v)),
			URL:  chapterURL.String(),
		})
	}
	s.Volumes = append(s.Volumes, vol)

	s.CoverURL = htmlquery.SelectAttr(htmlquery.FindOne(doc, `//*[@class="bookImg"]/img`), "src")

	return
}

func (b *wwwZhuishubangCom) Chapter(body io.Reader) ([]string, error) {
	body = transform.NewReader(body, simplifiedchinese.GBK.NewDecoder())
	doc, err := htmlquery.Parse(body)
	if err != nil {
		return nil, err
	}

	M := []string{}
	// Collect the text nodes of every paragraph in the article body.
	nodeContent := htmlquery.Find(doc, `//div[@class="articleCon"]/p/text()`)
	if len(nodeContent) == 0 {
		err = fmt.Errorf("No matching content")
		return nil, err
	}
	for _, v := range nodeContent {
		t := htmlquery.InnerText(v)
		t = strings.TrimSpace(t)

		switch t {
		case
			"本↘书↘首↘发↘追↘书↘帮↘http://m.zhuishubang.com/",
			"":
			continue
		}

		M = append(M, t)
	}

	return M, nil
}

Cannot read Qidian chapters; the content is empty

bookurl: https://book.qidian.com/info/1025813823/
bookname: 仙朝纪元
author: 西城冷月
coverurl: https://bookcover.yuewen.com/qdbimg/349573/1025813823/180
description: |-
  旧世之末,余火回光!
  龙蛇起陆的仙道盛景、缱绻多情的绝代佳人,春色绚烂下,是那腐朽的灰败。
  仙人在沉沦中徘徊,旧神在欲望中复苏……
  建仙朝、铸仙鼎,口含天宪,言出法随,叫那天地换个新纪元!
  这是一个幽幽长夜之内,一点星火乍起,煦照九天十地,三界六道……的故事。
tmap: []
volumes:
- name: 作品相关
  isvip: false
  chapters: []
- name: 潜龙勿用
  isvip: false
  chapters: []
- name: 潜龙勿用
  isvip: true
  chapters: []
- name: 见龙在田
  isvip: true
  chapters: []
- name: 终日乾乾
  isvip: true
  chapters: []

chromedp has been updated

After the chromedp update, method names changed, and nearly every call to chromedp no longer works.

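Thank you for your project; one small suggestion

q(≧▽≦q) Thank you for your project. It solved a real pain point for me.
One small suggestion, though: could you sign the files when you publish a release?
(Purpose:

  1. To protect your rights. Many unscrupulous people in China steal projects from GitHub, wrap them in a shell, and sell them for a fee...
  2. Have you ever used KMSpico? Its author published it on a foreign forum (blocked by the firewall), and many people then set up copycat sites and hid cryptomining ⛏ malware in the files... I hope you can sign the releases... ideally with an MD5 checksum too.)
    Finally, thank you once again, orz!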

example failed

Download the release and install phantomjs, then run the example:

$ ./FictionDown --url https://book.qidian.com/info/3249362 d 
2019/03/11 16:11:02 Init PhantomJS
2019/03/11 16:11:03 URL: "https://book.qidian.com/info/3249362"
2019/03/11 16:11:03 Close PhantomJS
2019/03/11 16:11:03 failed
$ phantomjs --version
1.9.8

runtime error: a runtime error occurs when searching the sites

R:\down\FictionDown_0.1.3_Windows_x86_64.tar>FictionDown s -k '赛博剑仙铁雨'
2021/09/04 00:37:18 搜索站点: 新八一中文网 https://www.81new.net/ 404 404 Not Found
2021/09/04 00:37:19 搜索站点: 结果: 0 笔趣阁1 https://www.biquge5200.cc/
2021/09/04 00:37:20 搜索站点: 结果: 0 起点中文网 https://www.qidian.com/
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x18 pc=0x9acbec]

goroutine 1 [running]:
github.com/ma6254/FictionDown/site.Type1SearchAfter.func1(0xc00002c340, 0x14, 0x0, 0x0, 0x0, 0x0, 0x0)
        /home/runner/work/FictionDown/FictionDown/site/sites.go:200 +0x24c
github.com/ma6254/FictionDown/site.Search(0xc00002c340, 0x14, 0xc00013d9e0, 0xc00002c340, 0x14, 0xc0000897a0, 0x4efadf)
        /home/runner/work/FictionDown/FictionDown/site/site.go:238 +0x13c
main.glob..func6(0xc000092f20, 0x0, 0xc000092f20)
        /home/runner/work/FictionDown/FictionDown/search.go:33 +0x7f
github.com/urfave/cli.HandleAction(0xa2c980, 0xb279f0, 0xc000092f20, 0xc00013d900, 0x0)
        /home/runner/go/pkg/mod/github.com/urfave/[email protected]/app.go:490 +0xcf
github.com/urfave/cli.Command.Run(0xaf472e, 0x6, 0x0, 0x0, 0x1118610, 0x1, 0x1, 0xb00290, 0x12, 0x0, ...)
        /home/runner/go/pkg/mod/github.com/urfave/[email protected]/command.go:210 +0x99d
github.com/urfave/cli.(*App).Run(0x1121fc0, 0xc000054100, 0x4, 0x4, 0x0, 0x0)
        /home/runner/go/pkg/mod/github.com/urfave/[email protected]/app.go:255 +0x6b6
main.main()
        /home/runner/work/FictionDown/FictionDown/main.go:87 +0x125

Software version: v0.1.3
Runtime environment: windows x64, linux x64
Network environment: overseas IP

Reproduces consistently
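
The trace shows a nil pointer dereference inside one site's search callback taking down the whole process. The code at `sites.go:200` is not shown in this report, so the following is only a sketch of one defensive pattern: run each site's search behind a `recover` so that a failing site yields an error instead of a crash. The `result` and `searchFunc` types and the `safeSearch` helper are all hypothetical and do not reflect the project's real API.

```go
package main

import "fmt"

// result is a hypothetical search hit.
type result struct {
	BookName string
	BookURL  string
}

// searchFunc is a hypothetical per-site search function.
type searchFunc func(keyword string) ([]result, error)

// safeSearch runs one site's search and converts a panic (for example a nil
// pointer dereference while parsing an unexpected page) into an error.
func safeSearch(site string, fn searchFunc, keyword string) (res []result, err error) {
	defer func() {
		if r := recover(); r != nil {
			res = nil
			err = fmt.Errorf("site %s: search panicked: %v", site, r)
		}
	}()
	return fn(keyword)
}

func main() {
	// Simulates a site whose parser dereferences a missing node.
	broken := func(string) ([]result, error) {
		var missing *result
		return []result{{BookName: missing.BookName}}, nil
	}
	if _, err := safeSearch("example-site", broken, "keyword"); err != nil {
		fmt.Println("skipped:", err)
	}
}
```

Recovering hides the underlying bug, so the more direct fix would be an explicit nil check at the failing line; the wrapper only keeps the other sites' results available in the meantime.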

Custom book sources

One feature that is not yet implemented is custom book sources; it would come in handy here.

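[Enhanced] Domain of 顶点小说网 (Dingdian novel site) has changed; the XPath needs no changes

www.booktxt.net now 301-redirects to www.ddxstxt8.com

  • The chapter list structure, XPath expressions, and so on need no changes
  • In https://github.com/ma6254/FictionDown/blob/35edca3576102a93f6c2a894e9b232155cbf92e5/sites/booktxt_net/main.go, the following patterns:
		Match: []string{
			`https://www\.booktxt\.net/\d+_\d+/*`,
			`https://www\.booktxt\.net/\d+_\d+/\d+\.html/*`,
			`http://www\.booktxt\.net/book/goto/id/\d+`,
		},

need to be replaced with the domain that the 301 redirect points to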
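
A minimal sketch of what the updated patterns might look like, assuming the redirect target keeps the same URL paths and schemes as the old domain (the report says only that the XPath and chapter structure are unchanged). The `newMatch` variable, the `matchesSite` helper, and the sample URLs are illustrative and not taken from the project.

```go
package main

import (
	"fmt"
	"regexp"
)

// newMatch mirrors the three patterns quoted above with only the host
// swapped to the redirect target. Assumption: paths and schemes are unchanged.
var newMatch = []*regexp.Regexp{
	regexp.MustCompile(`https://www\.ddxstxt8\.com/\d+_\d+/*`),
	regexp.MustCompile(`https://www\.ddxstxt8\.com/\d+_\d+/\d+\.html/*`),
	regexp.MustCompile(`http://www\.ddxstxt8\.com/book/goto/id/\d+`),
}

// matchesSite reports whether u matches any of the updated patterns.
func matchesSite(u string) bool {
	for _, re := range newMatch {
		if re.MatchString(u) {
			return true
		}
	}
	return false
}

func main() {
	// Sample URLs are made up for the demonstration.
	for _, u := range []string{
		"https://www.ddxstxt8.com/0_595/",
		"https://www.booktxt.net/0_595/",
	} {
		fmt.Println(u, matchesSite(u))
	}
}
```

Keeping the old `booktxt.net` patterns alongside the new ones would let previously saved book URLs continue to match.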
