kemono-scraper's People

Contributors

elvis972602, gnsfujiwara, mookau, mvpair

kemono-scraper's Issues

How to use?

My sincerest apologies for asking this under "Issues". I would love to use this tool, but I have no idea how to install or use it. Would it be possible to create a tutorial? All I really need is guidance on how to set it up. Sorry again!

"Cookie is empty"

Hi,
I can't seem to make cookies.txt work.

File cookies.txt contains:

kemono.party	FALSE	/	FALSE	1697535614	session	[token]
coomer.party	FALSE	/	FALSE	1697289572	session	[token]

I tried adding a dot at the beginning, changing the second FALSE to TRUE, and changing session to kemono_auth or commer_auth, because your example differs from what the cookie export produces.

Your example:

Cookie File: only needed if you want to download favorite creators or posts

--cookie PATH
cookie file, default is cookies.txt (values separated by whitespace)
syntax:

Domain Include subdomains Path Secure Expiry Name Value
.kemono.party FALSE / TRUE 1706755572 kemono_auth

You can get cookies easily by using the Chrome extension Get cookies.txt LOCALLY.

I get this error:
2023/09/17 12:37:06 cookie is empty

This happens regardless of whether I run:
.\kemono-scraper_no_cookies_detection.exe
.\kemono-scraper_no_cookies_detection.exe --cookie .\cookies.txt
I tried the full path to the cookies.txt file, backslashes, and quotes.

Config:

cookie: .\cookies.txt
content: true
banner: true
async: true
max-download-parallel: 5
fav-site: coomer
fav-creator: true
output: D:\kemono-scraper
template: "[<ks:service>] <ks:creator>/<ks:post>/<ks:filename><ks:extension>"
image-template: "[<ks:service>] <ks:creator>/<ks:post>/<ks:index><ks:extension>"
video-template: "[<ks:service>] <ks:creator>/<ks:post>/video/<ks:filename><ks:extension>"
retry: 10
retry-interval: 15
# proxy: socks5://proxy:1080

Here too, I tried the full path to the cookies.txt file, backslashes, and quotes.

I have to use the version with detection, and even that has to be pointed at a browser that is not currently running; otherwise I get an error like this:

2023/09/17 13:56:15 Error reading cookies: could not open database copy: could not read database file: 
open C:\Users\shodan\AppData\Local\Google\Chrome\User Data\Default\Network\Cookies: 
The process cannot access the file because it is being used by another process.

Am I doing something wrong? Could you add an option to pass the cookie token directly on the command line, for example?

Is it possible to run more processes at once? I thought max-download-parallel referred to how many posts or creators are scraped at once, but it seems to have an effect only when the post currently being scraped contains multiple files.
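
For the cookies.txt layout quoted in this issue (seven whitespace-separated fields per line), the following is a minimal Go sketch of a parser. It is not the scraper's own code: `parseCookieLines` is a hypothetical helper, and skipping `#` comment lines while accepting the `#HttpOnly_` prefix follows the usual Netscape cookie-file convention rather than anything stated in the README. It reports a line with fewer than seven fields instead of silently yielding no cookies, which is what happens with the README example line that ends at the cookie name.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// parseCookieLines parses cookies.txt-style text. Each data line holds:
// domain, include-subdomains, path, secure, expiry, name, value.
// Hypothetical helper for illustration only.
func parseCookieLines(text string) ([]*http.Cookie, error) {
	var cookies []*http.Cookie
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Exporters mark HttpOnly cookies with this prefix; it is not a comment.
		line = strings.TrimPrefix(line, "#HttpOnly_")
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// strings.Fields accepts tabs and spaces alike.
		f := strings.Fields(line)
		if len(f) < 7 {
			return nil, fmt.Errorf("expected 7 fields, got %d: %q", len(f), line)
		}
		expiry, err := strconv.ParseInt(f[4], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("bad expiry %q: %w", f[4], err)
		}
		cookies = append(cookies, &http.Cookie{
			Domain:  f[0],
			Path:    f[2],
			Secure:  strings.EqualFold(f[3], "TRUE"),
			Expires: time.Unix(expiry, 0),
			Name:    f[5],
			Value:   strings.Join(f[6:], " "),
		})
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	return cookies, nil
}

func main() {
	// The token below is a placeholder, not a real session value.
	sample := "# Netscape HTTP Cookie File\n" +
		".kemono.party\tFALSE\t/\tTRUE\t1706755572\tsession\tplaceholder-token\n"
	cookies, err := parseCookieLines(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range cookies {
		fmt.Printf("%s %s=%s secure=%v\n", c.Domain, c.Name, c.Value, c.Secure)
	}
}
```

Running a file through a check like this before passing it to --cookie shows whether each line really has all seven fields.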

has an error

Link: https://coomer.party/onlyfans/user/loisplz

panic: unmarshal post list error: invalid character '<' looking for beginning of value

goroutine 1 [running]:
github.com/elvis972602/kemono-scraper/kemono.(*Kemono).Start(0xc00015c120)
C:/Users/elvis/GolandProjects/Kemono-scraper/kemono/kemono.go:233 +0x5d7
main.main()
C:/Users/elvis/GolandProjects/Kemono-scraper/main/main.go:464 +0x3a65
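
The message "invalid character '<' looking for beginning of value" is what Go's encoding/json reports when the bytes it is given start with `<`, which usually means the server answered with an HTML page rather than JSON. A minimal sketch of a guard that turns this into a readable error instead of a panic follows; `decodeJSONBody` is a hypothetical helper, not a function from kemono-scraper, and the guess that the HTML is an error or anti-bot page is an assumption.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// decodeJSONBody unmarshals body into v, but first checks whether the
// response looks like HTML (first non-space byte is '<').
// Hypothetical helper for illustration only.
func decodeJSONBody(body []byte, v interface{}) error {
	trimmed := bytes.TrimSpace(body)
	if len(trimmed) > 0 && trimmed[0] == '<' {
		return fmt.Errorf("server returned HTML instead of JSON (%d bytes); possibly an error or anti-bot page", len(body))
	}
	return json.Unmarshal(trimmed, v)
}

func main() {
	var posts []map[string]interface{}

	// An HTML body is rejected with a descriptive error.
	err := decodeJSONBody([]byte("<!DOCTYPE html><html></html>"), &posts)
	fmt.Println("html:", err)

	// A JSON body decodes normally.
	err = decodeJSONBody([]byte(`[{"id": "1"}]`), &posts)
	fmt.Println("json:", err, len(posts))
}
```

Returning the error lets the caller retry or skip the creator rather than abort the whole run.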

Issue with getting cookies to work

After some finagling, I've gotten it to latch onto my cookie correctly, as best I can tell.

The relevant bit from the cmd window:
\cookies.sqlite --fav-site kemono --fav-creator 1
2023/07/02 10:42:17 searched 11115 files
2023/07/02 10:42:17 found cookie database
2023/07/02 10:42:17 searched 11115 files
2023/07/02 10:42:17 found local state file
2023/07/02 10:42:17 fetching favorite creators from kemono.party
2023/07/02 10:42:19 Error getting favorites: 401

But I have no idea how to get past that 401 error.
I'm having it pull the cookie directly from my Firefox profile folder (located through Firefox; cookies.sqlite shows it was last updated 1 minute ago).

It has worked wonderfully for downloading from a single creator at a time, though I was hoping to set it up to run by itself.

Thank you for the program!

Suggestion - add more tags in output path

Hi, I would like to use these tags in my output path.

  1. I want to add a 'creator_id' tag to the output path.
    I think this would make a path easier to find by the creator's ID if the creator's name has been changed.

  2. I want to add a 'revision' tag.
    The revision feature was already suggested by another user; I want it too, and I want to save all revisions.
    #31

Thank you.

Proxy errors

There seem to be some errors in the proxy handling. When I specify a proxy server, for example --proxy http://127.0.0.1:1080, I still encounter unreachable-host errors. With --proxy set, I observed a connection being established with kemono.party, but the program still reports various connection errors.

However, when I routed all network traffic through a virtual network adapter, the errors no longer occurred and all downloads proceeded normally. The virtual network adapter and the proxy server use the same upstream server. I suspect that some network connections are not covered by the proxy setting.

Here are some error messages I have encountered:

Error getting favorites: Get https://kemono.party/api/favorites?type=user: dial tcp 199.59.148.209:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

HTTP: EOF (I have forgotten the specific details)
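
The "dial tcp 199.59.148.209:443" in the first message suggests that this request was dialed directly rather than through the proxy. In Go, a proxy applies only to requests sent through an http.Client whose Transport has a Proxy function set, so any request made with a different client bypasses it. The sketch below shows the standard-library way to build such a client; `newProxyClient` and `proxyHostFor` are hypothetical names, and whether kemono-scraper builds its clients this way is not known from this thread.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newProxyClient returns a client that sends every request through proxy.
// Hypothetical helper for illustration only.
func newProxyClient(proxy string) (*http.Client, error) {
	u, err := url.Parse(proxy)
	if err != nil {
		return nil, fmt.Errorf("invalid proxy URL %q: %w", proxy, err)
	}
	if u.Host == "" {
		return nil, fmt.Errorf("proxy URL %q has no host", proxy)
	}
	return &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(u)},
	}, nil
}

// proxyHostFor reports which proxy host the client would use for target,
// or "" if the request would be sent directly.
func proxyHostFor(c *http.Client, target string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	tr, ok := c.Transport.(*http.Transport)
	if !ok || tr.Proxy == nil {
		return "", nil
	}
	pu, err := tr.Proxy(req)
	if err != nil || pu == nil {
		return "", err
	}
	return pu.Host, nil
}

func main() {
	client, err := newProxyClient("http://127.0.0.1:1080")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	host, err := proxyHostFor(client, "https://kemono.party/api/favorites?type=user")
	fmt.Println(host, err)

	// A zero-value client has no Transport set here, so nothing is reported.
	host, err = proxyHostFor(&http.Client{}, "https://kemono.party/api/favorites?type=user")
	fmt.Println(host == "", err)
}
```

If the favorites request and the download requests used separate clients, only the ones built like `newProxyClient` would honor --proxy, which would match the behavior described here.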

Bug: runtime error: slice bounds out of range [:-1]

When I downloaded this post: https://kemono.party/patreon/user/11324565/post/28226372

The program showed this error message:
panic: runtime error: slice bounds out of range [:-1]

goroutine 49 [running]:
main.main.func4({0x868, {0xc00080dc10, 0x8}, {{0x2fd1c238, 0xed71b00a6, 0x171dc80}}, {0xc00080dc20, 0xa}, {0xc00080dc30, 0x7}, ...}, ...)
C:/Users/elvis/GolandProjects/Kemono-scraper/main/main.go:319 +0xc2b
github.com/elvis972602/kemono-scraper/downloader.(*downloader).Download.func1()
C:/Users/elvis/GolandProjects/Kemono-scraper/downloader/downloader.go:309 +0x24d
created by github.com/elvis972602/kemono-scraper/downloader.(*downloader).Download
C:/Users/elvis/GolandProjects/Kemono-scraper/downloader/downloader.go:297 +0xd3
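
One common way to hit "slice bounds out of range [:-1]" in Go is slicing a string with the result of strings.LastIndex when the separator is absent, since LastIndex then returns -1. Whether that is what happens at main.go:319 is an assumption; the source is not shown in this thread. A minimal sketch of the guarded pattern, with the hypothetical helper `splitExt`:

```go
package main

import (
	"fmt"
	"strings"
)

// splitExt splits a file name into base and extension. It returns an empty
// extension instead of panicking when the name contains no dot.
// Hypothetical helper for illustration only.
func splitExt(name string) (base, ext string) {
	i := strings.LastIndex(name, ".")
	if i < 0 {
		// Without this check, name[:i] would panic with
		// "slice bounds out of range [:-1]".
		return name, ""
	}
	return name[:i], name[i:]
}

func main() {
	for _, name := range []string{"0.jpg", "attachment"} {
		base, ext := splitExt(name)
		fmt.Printf("%q -> base=%q ext=%q\n", name, base, ext)
	}
}
```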

Panic: interface conversion: interface {} is int, not float64

I followed the instructions to create the config.yaml below and put it in the same directory as the exe file.

The error appears when I try to run it. It works fine if I pass the options on the command line.

Also, the switches --extensionOnly and --extensionExcept in README.MD are incorrect. The program only recognizes --extension-only and --extension-except. It took me a long time to work out where the problem was after reading the error log.

Moreover, when I try to download large PSD files, it always ends with [Failed] context deadline exceeded and the files are broken. Try it with https://kemono.party/fanbox/user/273185/post/5358600

retry: 10
retry-interval: 30
with-prefix-number: true
panic: interface conversion: interface {} is int, not float64

goroutine 1 [running]:
main.setFlag()
	C:/Users/elvis/GolandProjects/colly/Kemono-scraper-latest/main/main.go:549 +0x1179
main.main()
	C:/Users/elvis/GolandProjects/colly/Kemono-scraper-latest/main/main.go:55 +0x9f
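
The panic message shows that a config value arrived as int while the code asserted float64. A bare type assertion such as `v.(float64)` panics on any other dynamic type; a type switch avoids that. The sketch below is a hedged illustration, and `toFloat64` is a hypothetical helper rather than the fix that was actually applied in main.go.

```go
package main

import "fmt"

// toFloat64 converts the numeric types a config loader may produce into
// float64 and returns an error for anything else instead of panicking.
// Hypothetical helper for illustration only.
func toFloat64(v interface{}) (float64, error) {
	switch n := v.(type) {
	case float64:
		return n, nil
	case float32:
		return float64(n), nil
	case int:
		return float64(n), nil
	case int64:
		return float64(n), nil
	default:
		return 0, fmt.Errorf("unsupported numeric type %T", v)
	}
}

func main() {
	// Values as they might come out of a parsed config file.
	config := map[string]interface{}{
		"retry":          10,
		"retry-interval": 30.0,
		"banner":         true,
	}
	for _, key := range []string{"retry", "retry-interval", "banner"} {
		f, err := toFloat64(config[key])
		fmt.Println(key, f, err)
	}
}
```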

Suggestion: Skip files

It seems the program downloads both the images and the .zip file that contains all of those images. I think downloading both is a waste of time, since they are the same thing in most cases. An option to download only one of them would be welcome.

How do I build this from source

I tried running go mod download and go build in /main.

I don't know if I'm doing this correctly, as I have no experience with Go.

kemono start failed: fetch creator list error: Get "https://kemono.party/api/v1/creators": EOF

Sorry for asking this. I'm new and would love to use this tool, but I have no idea what is causing this error.
I already tried this: "you can get cookies easily by using Chrome extension [Get cookies.txt LOCALLY]"
but it always fails to start:
"kemono start failed: fetch creator list error: Get "https://kemono.party/api/v1/creators": EOF"

The cookies txt looks like:

.kemono.su TRUE / FALSE 1733729884 _ddgid VfSCXXXXXXWcs9vD
.kemono.su TRUE / FALSE 1702280284 _ddgmark EJYULXXXXXXID3IbL
.kemono.su TRUE / FALSE 1702204684 _ddg5 ltLCXXXXXX4qHzcI4
.kemono.su TRUE / FALSE 1733729885 _ddg2 2utfTXXXXXX7W9RE
.kemono.su TRUE / FALSE 1733729889 _ddg1 HFKJO2VXXXXXXQCcVoYy
kemono.su FALSE / FALSE 1736667489 thumbSize 180
kemono.su FALSE / FALSE 1736663889 __PPU_puid 7308443XXXXXX194176
kemono.su FALSE / FALSE 0 bnState_1942468 {"impressions":1,"delayStarted":0}

where XXXXXX stands for some random characters

C:\Users\rumbba\Downloads\kemono>kemono-scraper.exe -link https://kemono.su/fanbox/user/938544
Downloading Kemono
fetching creator list...
2023/12/10 14:47:03 kemono start failed: fetch creator list error: Get "https://kemono.party/api/v1/creators": EOF

C:\Users\rumbba\Downloads\kemono>kemono-scraper_no_cookies_detection.exe -link https://kemono.su/fanbox/user/938544
Downloading Kemono
fetching creator list...
2023/12/10 14:47:32 kemono start failed: fetch creator list error: Get "https://kemono.party/api/v1/creators": EOF

C:\Users\rumbba\Downloads\kemono>kemono-scraper.exe -cookie kemono.su_cookies.txt
2023/12/10 14:48:21 creator is empty

C:\Users\rumbba\Downloads\kemono>kemono-scraper.exe -link https://kemono.su/fanbox/user/938544
Downloading Kemono
fetching creator list...
2023/12/10 14:48:24 kemono start failed: fetch creator list error: Get "https://kemono.party/api/v1/creators": EOF

C:\Users\rumbba\Downloads\kemono>kemono-scraper.exe -link https://kemono.su/fanbox/user/938544/post/2326356
Downloading Kemono
fetching creator list...
2023/12/10 14:50:37 kemono start failed: fetch creator list error: Get "https://kemono.party/api/v1/creators": EOF

I don't know if this is a stupid question, but I really need some guidance. Thanks a lot.

Suggestion - Alternative Kemono Domain Support

It is great to see the program getting better day by day.

According to the banner on Kemono, users are advised to bookmark a backup site on the .su domain in case something goes wrong with the main site.

I think it would be nice to support the backup site through a switch, in case the main site goes down.

Suggestion: add no-banner flag

Sometimes it is necessary to set banner: true in config.yaml while still overriding it occasionally with a flag. Currently this is impossible.

In common argparse practice, boolean flags are provided in pairs, --banner and --no-banner, so that a flag can override the config no matter which state was set there.

Would Kemono Scraper consider adding this? Thank you.
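
As one possible approach, here is a minimal sketch using Go's standard flag package, which already accepts `--banner=false` for boolean flags and can report which flags were set explicitly via `Visit`. Which flag library kemono-scraper actually uses is not stated in this thread, and `resolveBanner` is a hypothetical helper. The idea is that the config value is used only when the flag was not given on the command line.

```go
package main

import (
	"flag"
	"fmt"
)

// resolveBanner returns the effective banner setting: the command-line
// value if --banner was given explicitly, otherwise the config value.
// Hypothetical helper for illustration only.
func resolveBanner(args []string, configValue bool) (bool, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	banner := fs.Bool("banner", false, "download the creator banner")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	// Visit only calls back for flags that were actually set.
	explicit := false
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "banner" {
			explicit = true
		}
	})
	if explicit {
		return *banner, nil
	}
	return configValue, nil
}

func main() {
	// config.yaml says banner: true in all three cases.
	for _, args := range [][]string{
		{},
		{"--banner=false"},
		{"--banner"},
	} {
		v, err := resolveBanner(args, true)
		fmt.Println(args, v, err)
	}
}
```

With this pattern, `--banner=false` plays the role of the requested --no-banner without adding a second flag.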

ERROR:No creator information was retrieved

When I entered the same command about a month ago, I was able to download the file, but when I just ran it, I got the following result and was unable to download the file.
Can you tell me how to handle it?

D:\kemono-dl-main>python kemono-dl.py --cookies "cookie.txt" --from-file "fav_users.txt" --extract-links --icon --inline --banner --dms --filename-pattern "{username}({user_id}) - ({id}){title}\{username}({user_id}) - ({id}){title} {index}.{ext}"
WARNING:URL is not downloadable | https://kemono.su/fanbox/user/254423
WARNING:URL is not downloadable | https://kemono.su/fanbox/user/3316400
WARNING:URL is not downloadable | https://kemono.su/fanbox/user/2401221
WARNING:URL is not downloadable | https://kemono.su/fanbox/user/8980751
WARNING:URL is not downloadable | https://kemono.su/fanbox/user/14253294
ERROR:No creator information was retrieved. | exiting

Download posts from different periods

https://kemono.party/fanbox/user/2557134/post/5773097

Kemono seems to keep versions of a post from different periods. As the link above shows, attachments have been removed in newer versions. I would like to download all versions to ensure file integrity, and to fill in the historical versions without affecting my existing image database. Is there a unique numerical identifier or tag that distinguishes versions?

Downloading only the latest or only the oldest version is not appropriate, as some authors add new content while others delete content. All versions need to be saved, but different versions of a post may have the same name.

I have been thinking about this for a long time but haven't come up with a good naming scheme (one that does not affect the current database content, especially the folder hierarchy).

The current folder and file structure:

\[fanbox]noeyebrow
        \[20230421] [5773097] 🚀「明日もな」原寸JPG配布終了
                0.jpg
                content.html

How to update the database

I tried:

date-after: 20230531
update-after: 20230531

Whether these are written in the configuration file together or separately, new posts from June and July are not downloaded.

I would like incremental updates to the database: download new posts and re-download posts that have changed. How should I write the configuration file?

Do any of the settings conflict with each other? A clearer explanation is needed.
Does "--date" refer to the upload date of the post on the source website, and "--update" to the modification date of the post on Kemono? Can these two settings be used together, and what exactly is the effect?

Can't use the command: --fav-creator

this is my config.yaml:
cookie-browser: firefox
banner: true
fav-site: kemono
fav-creator: true
retry: 10
I had cookie.txt next to the .exe.
But when I double-click the exe, it prints:
2023/12/25 22:21:40 load cookie from cookies.txt
2023/12/25 22:21:40 fetching favorite creators from kemono.party
and then the .exe closes.
Is anything wrong? Please help me. Thank you.

Cannot skip existing files

Something is wrong: this problem has reappeared. This time I did not have any other software open.

The program repeatedly tries to write to the existing image number 0 and is rejected. This repeats many times until it succeeds after about ten minutes.

I found that the program was constantly creating .tmp files, and when I tried to delete the existing 0.jpg, I was told that the file was in use by "kemono-scraper.exe".

Error encountered in a post with no content

https://kemono.party/

The program reports an error and exits immediately: "panic: runtime error: index out of range [-1]"
According to the site, this type of post is caused by a blacklist, and I think such posts can simply be skipped.
(Does anyone know what a blacklist means here?)

https://kemono.party/
And in this post, the downloader downloads three extra txt files.

"An existing connection was forcibly closed by the remote host."
I have often encountered this error recently, and it resolves itself after a few minutes. Is the website rate-limiting downloads?

invalid character '<' looking for beginning of value

Downloading Coomer
fetching creator list...
panic: unmarshal creator list error: invalid character '<' looking for beginning of value

goroutine 1 [running]:
github.com/elvis972602/kemono-scraper/kemono.(*Kemono).Start(0xc000071ae0)
C:/Users/elvis/GolandProjects/Kemono-scraper/kemono/kemono.go:225 +0x111e
main.main()
C:/Users/elvis/GolandProjects/Kemono-scraper/main/main.go:484 +0x3ca5
