darealfreak / watcher-go
download and keep track of your favorite artists on multiple platforms
License: MIT License
If you are e.g. IP banned or blocked in a similar way, it would be beneficial to be able to skip modules.
Running only a single module without the parallel mode can be quite time consuming, so there is definitely a need for that.
the twitter module often tries to download already downloaded posts; most likely the retrieval order is reversed
Dependabot can't resolve your Go dependency files.
As a result, Dependabot couldn't update your dependencies.
The error Dependabot encountered was:
github.com/DaRealFreak/watcher-go/cmd/watcher: cannot find module providing package github.com/DaRealFreak/watcher-go/cmd/watcher
If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.
Due to using the full file path for all added images, the command exceeds the maximum command line length earlier than expected.
Change into the directory for the conversion so the full file paths are not needed.
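A minimal Go sketch of that change, assuming the conversion shells out to ImageMagick's convert: setting exec.Cmd.Dir lets the child process start inside the frame directory, so only the short relative names end up in the argument list (convertInDir and the example frame paths are hypothetical, not the project's actual API):

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// convertInDir builds the ImageMagick conversion command so it runs from
// inside the directory containing the frames; the argument list then only
// carries relative file names instead of full paths.
func convertInDir(dir string, frames []string, output string) *exec.Cmd {
	args := make([]string, 0, len(frames)+1)
	for _, frame := range frames {
		// strip the directory prefix, keeping only the file name
		args = append(args, filepath.Base(frame))
	}
	args = append(args, output)
	cmd := exec.Command("convert", args...)
	// exec.Cmd.Dir makes the child process start in dir,
	// so the relative names resolve correctly.
	cmd.Dir = dir
	return cmd
}

func main() {
	cmd := convertInDir(
		"/tmp/frames",
		[]string{"/tmp/frames/0001.png", "/tmp/frames/0002.png"},
		"animation.gif",
	)
	fmt.Println(cmd.Dir, cmd.Args)
}
```

Since only the file names are passed, the per-image cost in the argument list shrinks from the full path length to just the base name.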
sankaku has books; it would be neat to track the books of specific tags too and download them into separate folders
pixiv booth is not supported yet; it would be neat to support it too, though it is unclear whether we can retrieve it with the API
currently there is no option to enable or disable tracked items except by manually editing the database; this should get added
since proxy loops are now included in the base module, we should add a function to the module base to retrieve the next proxy
since the API now accepts start_date and end_date arguments, we can circumvent the 5000-result limit of the pixiv API
Dependabot couldn't parse the go.mod found at /go.mod.
The error Dependabot encountered was:
go: github.com/spf13/[email protected] requires
github.com/grpc-ecosystem/[email protected] requires
gopkg.in/[email protected]: invalid version: git fetch --unshallow -f origin in /opt/go/gopath/pkg/mod/cache/vcs/748bced43cf7672b862fbc52430e98581510f4f2c34fb30c0064b7102a68ae2c: exit status 128:
fatal: The remote end hung up unexpectedly
currently a random delay between 1.5 and 2.5 seconds is chosen.
the data harvesting check is most likely a leaky bucket too, since requests can take a different amount of time.
Instead of random sleep times, a leaky bucket should be implemented for the eh module, as is already done for pixiv.
A refill rate of 1.5s should be a good start for estimating the bucket size.
modules should be able to have a custom configuration (e.g. the animation conversion of pixiv to webp/gif/flif)
mainly this API functionality:
https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline
we still have to wait for the twitter developer account response, though
currently the OAuth2 clients are not getting backed up; this should get added
I forgot to remove the response structs when I removed the unused API functions; they should get removed too
currently modules get fully initialized twice:
1st time at startup for the cobra commands
2nd time for actually parsing the jobs
for debugging purposes it would help to differentiate between those two.
when a manga type work on pixiv isn't fully downloaded (e.g. 2 out of 5 pages), the item gets marked as complete nonetheless.
we may have to split illustration and manga downloads here
respecting the search type for pixiv: it is defined by the type argument
type:
mode:
since it also saves deleted galleries and lists previously updated galleries, it would be pretty nice to have
currently the add account and update account commands use "user" in one case and "username" in the other as the argument name; unifying them to "user" would be neat
since we only need the client request, there is no need for a webserver.
we can ignore the unresolved error and just retrieve the URL
currently it's always in the path where the app is executed; it would make sense to make this configurable too, since the configuration file location is configurable as well
Just iterating through the jump selection after the last selected item would halve the executed requests but be more error prone
more pixiv work types should become supported for download
since the UUID can't be compared with a greater/smaller comparison, deleting the newest work can cause all gallery items to be downloaded once again; using the published_time as unique ID could prevent that, since we can use a > comparison to detect new items
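A small sketch of the proposed comparison, with hypothetical galleryItem fields and made-up IDs/timestamps: the lexicographic order of random UUIDs carries no publish order, while an int64 published_time supports the > check directly.

```go
package main

import "fmt"

// galleryItem sketches the proposed change: use the publish timestamp
// (unix seconds) as the comparable identifier instead of a random UUID.
type galleryItem struct {
	UUID          string
	PublishedTime int64
}

// isNewer reports whether candidate was published after the stored
// reference item. This keeps working even if the newest stored item
// gets deleted upstream, because older items still compare as not-new.
func isNewer(candidate, reference galleryItem) bool {
	return candidate.PublishedTime > reference.PublishedTime
}

func main() {
	stored := galleryItem{UUID: "f0a1", PublishedTime: 1560000000}
	older := galleryItem{UUID: "0b2c", PublishedTime: 1550000000}
	newer := galleryItem{UUID: "9d3e", PublishedTime: 1570000000}
	// note: string comparison of the UUIDs would order these randomly
	fmt.Println(isNewer(older, stored), isNewer(newer, stored))
}
```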
the DA module currently fails with the return value "invalid_request (Request field validation failed.)" when parsing collections of favourites
for the eh module the generated image link is only valid for 3600 seconds.
with the random delay of up to 2.5 seconds per image, this time can get exceeded with as few as 1440 images (3600 / 2.5 = 1440).
the module should generate the image link in the download step to fix this
the identifier shouldn't be the client ID only but the client ID and access token, so it works with static tokens too
instead of implementing another wrapper around the default session, we could use the default session and add an http round tripper to manage the pixiv API headers, which would be much cleaner
since logrus is in maintenance mode, migrating the logging to zap would be nice to have
to directly sort by new updates, the pixiv module should also update the changed timestamp of the user directory
patreon as site would be pretty neat, patreon even offers a pretty good API
https://docs.patreon.com/#apiv2-oauth
stop reporting custom thrown errors; they are already handled and exist for a reason
proxy usage would be neat, optimally even module-specific proxy connections
with the new structure we don't get the file extension in the download anymore; we should use ImageMagick to check for image similarity
%IM%convert input1.png -resize 200x200 input1_scaled.png
%IM%convert input2.png -resize 200x200 input2_scaled.png
%IM%compare -subimage-search -metric mse input1_scaled.png input2_scaled.png NULL:
or for better performance but without coordinates of expected sub image:
%IM%convert input1.png -resize 200x200 input1_scaled.png
%IM%convert input2.png -resize 200x200 input2_scaled.png
%IM%compare -metric mse input1_scaled.png input2_scaled.png NULL:
While at it we should rename the downloaded file to identify and match the image format:
%IM%identify -format "%m" input2
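The identify step could be wired up in Go roughly like this; extensionForMagickFormat and renameToDetectedFormat are hypothetical helpers, and the format mapping only covers a few common cases:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// extensionForMagickFormat maps the output of `identify -format "%m"`
// (e.g. "PNG", "JPEG", "GIF") to a file extension.
func extensionForMagickFormat(format string) string {
	switch strings.ToUpper(strings.TrimSpace(format)) {
	case "JPEG":
		return ".jpg"
	case "PNG":
		return ".png"
	case "GIF":
		return ".gif"
	case "WEBP":
		return ".webp"
	default:
		return ""
	}
}

// renameToDetectedFormat asks ImageMagick for the real image format and
// renames the downloaded file accordingly (sketch; assumes identify is
// on the PATH and the file has a single frame).
func renameToDetectedFormat(path string) error {
	out, err := exec.Command("identify", "-format", "%m", path).Output()
	if err != nil {
		return err
	}
	ext := extensionForMagickFormat(string(out))
	if ext == "" {
		return fmt.Errorf("unknown image format: %q", out)
	}
	return os.Rename(path, path+ext)
}

func main() {
	fmt.Println(extensionForMagickFormat("PNG"))
}
```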
the endpoint throws a lot of internal server errors while the web interface works properly.
a fallback to the web interface in case the API endpoint throws an internal server error would be useful.
response on internal server error:
{
"error": "server_error",
"error_description": "Internal server error.",
"status": "error"
}
generally the application should skip the currently handled item on errors.
while it is great to see when something failed, it is quite tedious when you run the application and your internet connection is gone for a few minutes
currently missing from the CLI workflow and only configurable through environment variables so far:
e.g. 1671575/af7d69fce1 contains a new error message that a gallery is unavailable due to copyright, which is not caught yet
we should clear temp files after we're done using them. windows, for example, will clear them automatically after 10 days, but with very many downloads the size can escalate really quickly
translate the app url meta data to parse the endpoint:
document.querySelector('meta[property="da:appurl"]').content
https://www.deviantart.com/developers/app_links
relevant uris:
all other uris don't contain deviations or don't provide a proper sorting, so we can't update/track them properly
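For illustration, extracting that meta tag server-side could look like this in Go; the regexp approach and the sample app URL are assumptions (a real implementation might prefer an HTML parser):

```go
package main

import (
	"fmt"
	"regexp"
)

// appURLPattern extracts the content of the da:appurl meta tag, the same
// value the document.querySelector call above reads in the browser.
var appURLPattern = regexp.MustCompile(
	`<meta\s+property="da:appurl"\s+content="([^"]+)"`)

// extractAppURL returns the app URL from an HTML page, or "" when the
// meta tag is missing.
func extractAppURL(html string) string {
	if match := appURLPattern.FindStringSubmatch(html); match != nil {
		return match[1]
	}
	return ""
}

func main() {
	// sample tag shape based on the deviantart app links documentation
	html := `<meta property="da:appurl" content="DeviantArt://deviation/123ABC">`
	fmt.Println(extractAppURL(html))
}
```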
Currently, when the current item gets deleted/updated, the eh search update won't stop and it will try to add every gallery again.
This causes some delays from opening all pages (leaky bucket) again.
Check whether there is an incremental identifier visible to fix it, or use fallback items to minimize the chance of it happening
the search functionality currently does not work when the download limit has been reached, even though it shouldn't be affected by it
while having one proxy is already helpful, it would be even better if the proxies could be switched after hitting the limit on one.
that way it could loop through the proxies to bypass any imposed limitation.
add an option to use goroutines for parallel processing of modules.
since the session is not shared and the sqlite database connection waits for the other write process to finish, nothing speaks against it.
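The goroutine idea above could be sketched like this; module and run are placeholders for the real module interface:

```go
package main

import (
	"fmt"
	"sync"
)

// module is a placeholder for the real module interface.
type module struct{ name string }

func (m module) run() string { return "finished " + m.name }

// runModulesParallel runs each module's job processing in its own
// goroutine and waits for all of them. Each goroutine writes into its
// own result slot, so no extra locking is needed here; sessions are
// per-module and sqlite serializes writes, as noted above.
func runModulesParallel(modules []module) []string {
	var wg sync.WaitGroup
	results := make([]string, len(modules))
	for i, m := range modules {
		wg.Add(1)
		go func(i int, m module) {
			defer wg.Done()
			results[i] = m.run()
		}(i, m)
	}
	wg.Wait()
	return results
}

func main() {
	out := runModulesParallel([]module{{"pixiv"}, {"twitter"}, {"eh"}})
	fmt.Println(out)
}
```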
the API search is limited up to 5000 results
the web search does not impose any limits so a workaround would be needed.
currently when an error occurs, it is not directly visible which module it originated from
write a custom log formatter to set for the modules and use it on the error check
the twitter module currently ignores all videos
currently missing from the CLI workflow: