Comments (29)
Hi @smed79, the assorted repo was created when we migrated to the current infrastructure. I will leave it as it is until I have talked with Mitchell.
Will do asap @smed79 👍
from pyfunceble.
@anudeepND I know. You mentioned it in the issue I opened in your repo. No apologies needed, as I'm only trying to make you aware of the situation so you can do some cleaning. I'm not affected, as I'm loading your lists from the Ultimate-Hosts-Blacklist, specifically the clean.list, which is already filtered of anything possible (dead, invalid, whitelisted, and so on).
I would advise you to go by what is in the INVALID folder and see if you missed something, then check only the commits to that particular folder every week or so to see if anything new shows up, until @funilrys sorts this one out and leaves only the newcomers in the folder.
from pyfunceble.
Fix for @dnmTX's last report (#16 (comment)) introduced with 784ad72.
It is now in the dev version.
P.S.: As @Ultimate-Hosts-Blacklist now uses the master/stable version, it will be effective from the time this issue is reclosed automatically.
from pyfunceble.
@dnmTX @funilrys All these invalid domains came from a subdomain scanner I used months ago. I didn't take a closer look at the output, which is how those domains ended up on the final list. Sorry for that :)
I will take a closer look, line by line if possible, to remove them.
Edit: The domains mentioned by @dnmTX were removed on December 23rd (anudeepND/blacklist@1a8f70e) along with other misspelled domains.
from pyfunceble.
@anudeepND it doesn't get updated at present, i.e. the old (non-existent) entries are not removed so that only the new ones remain. That's why I told you, for now at least, to check the "COMMITS" to that folder and see if anything new is added. This is what this issue is about: @funilrys and I are debating what to do with all those domains that have already been removed from your original lists but still remain in the INVALID folder. So stand by, as @funilrys has the last word here.
from pyfunceble.
@anudeepND yes, it's the expected behavior 👍
from pyfunceble.
@dnmTX I will keep this closed until it proves not to be the case 😸
I have been monitoring it since I pushed it to the dev version 😉
For now, I can tell that it is working!
from pyfunceble.
Please create an issue or start an internal discussion at @Ultimate-Hosts-Blacklist, as @smed79 is responsible for the mentioned repositories! We only provide the infrastructure!
I'll wait a while to see if @smed79 responds to the discussion here. If he doesn't, I will.
from pyfunceble.
Hi,
propellerads revolving adservers added ==> domains.list
As for the "assorted" repo, I do not remember being the one who created it.
Thank you for notifying me.
from pyfunceble.
As for the "assorted" repo, I do not remember being the one who created it.
I searched my mailbox and found the conversations below about cliqz.com
https://git.io/fhZUB
Ultimate-Hosts-Blacklist/Ultimate.Hosts.Blacklist@fbad015
It's unclear to me what this repo will include, so I am not going to maintain it.
@funilrys I sent you a request via email about creating a repo with the purpose of blocking "popads revolving ad servers".
Thank you.
from pyfunceble.
@funilrys for some reason in smed79_propellerads_adservers there is no clean.list. Could be a bug.
from pyfunceble.
I agree with you, too many lists are confusing... 😕
They are relevant for ALL countries/locations, especially for users who visit streaming, torrent, or adult sites. The malvertising ad networks that you mentioned use rotating domains to try to escape ad blockers. For that reason, strict blocking is applied by EasyList for some sites (e.g. #p130918).
The getadmiral list has the purpose of blocking the anti-adblock wall https://vgy.me/1mqjc4.jpg
Some sites where it is used: https://ghostbin.com/paste/uzocj/raw
PS:
- I am planning to maintain another list which will block ad-maven.com revolving adservers.
- When I have some free time, I am going to merge the lists mentioned into one full list.
from pyfunceble.
@dnmTX: @dead-hosts uses the dev version by default, but members/maintainers can switch to the stable one if they want 😄
Sooo, yes, any changes here are there too 😉
@dead-hosts is actually the first place to use PyFunceble... You don't even have to think about how to use PyFunceble if your lists are tested at @dead-hosts 😄
from pyfunceble.
On the other side, if it becomes ACTIVE, we should include it in the official ACTIVE list and maybe at the same time in a new analytic section/directory.
from pyfunceble.
In 99.5% of the cases (based on my observations of that particular set of lists, anudeepND) they are indeed INVALID.
For example:
2 ml.pubnative.net
# there is a "2" and a space in front of the domain
url p.adsymptotic.com
# there is a "url" (weird, I know) and a space in front
Is there any chance that those two, and the rest, which are similar cases, will EVER become valid?
Your script is doing a great job finding them; the rest is up to you, either to dispose of them or to leave them in rotation, which is more overhead for the whole filtering process. Anyway, it takes two to three days to filter one list, so how about we start thinking about how to reduce that time?
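Entries like the two quoted above (a stray token and a space in front of the domain) can be caught by a purely syntactic pre-check before any network test runs. The following is a minimal sketch, not PyFunceble's own validation logic: the names `looks_like_domain` and `split_entries` and the simplified pattern are assumptions made for illustration.

```python
import re

# Hypothetical, simplified hostname pattern: dot-separated labels made of
# letters, digits, and hyphens, ending in an alphabetic TLD. Punycode TLDs
# and other edge cases are deliberately not handled in this sketch.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_DOMAIN_RE = re.compile(rf"^(?:{_LABEL}\.)+[A-Za-z]{{2,63}}$")


def looks_like_domain(line: str) -> bool:
    """Return True if the whole line is a single plausible domain name."""
    return bool(_DOMAIN_RE.match(line.strip()))


def split_entries(lines):
    """Separate plausible domains from entries that can never resolve."""
    plausible, malformed = [], []
    for raw in lines:
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comment lines
        (plausible if looks_like_domain(entry) else malformed).append(entry)
    return plausible, malformed
```

Entries that land in `malformed` contain characters, such as the inner space, that no retest can make valid, so they could be dropped from rotation instead of being retested on every cycle.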
from pyfunceble.
@dnmTX I have removed most of the domains present in the INVALID section, but they have not been removed from the INVALID section itself. (For example, I have removed domains ending with the invalid TLD .col.) How often does the INVALID section get updated?
from pyfunceble.
Fix introduced with 61b3bdd.
It is now in the dev version.
from pyfunceble.
Let me explain what has changed as of 13 minutes ago for everyone using the dev version, and later this week for everyone else using the stable version.
Problem
We were systematically generating outputs when retesting the content of the database subsystem. This caused some lists to have INVALID and INACTIVE elements which are no longer on the list.
Solution
I disabled the production of outputs (to file, not on screen) for every element which is still INACTIVE or INVALID and at the same time already registered in the database.
Concretely, that means that for now, if the system retests an element which is in the database, you'll get a friendly line on screen as in a normal test, but if the tested element is still INACTIVE or INVALID you'll not get anything in the generated data.
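The rule described in the solution can be written as a small decision function. This is only a sketch of the behaviour as stated in this comment, not the code introduced by the fix: the `Status` enum and the function name `should_write_to_file` are hypothetical.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "ACTIVE"
    INACTIVE = "INACTIVE"
    INVALID = "INVALID"


def should_write_to_file(status: Status, already_in_database: bool) -> bool:
    """Decide whether a tested element is written to the generated files.

    Screen output is not affected; this only governs file output.
    """
    still_down = status in (Status.INACTIVE, Status.INVALID)
    # A retested element that is still INACTIVE or INVALID and is already
    # registered in the database produces no file output.
    return not (still_down and already_in_database)
```

Under this rule, a newly found INVALID element is still written once, while later retests of the same element stay silent in the files until its status changes.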
Side note
Please be aware: if the algorithm/system/script changes because of something like #17 (sorry 😭), a new web practice, or a new RFC, some of those domains may become ACTIVE in the future.
If that is the case, we put the newly ACTIVE domain in the official {domains, json, hosts}/ACTIVE/* lists and at the same time we write it into the Analytic/SUSPICIOUS/* files so that you can keep track of what changes.
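As a rough illustration of that routing, the sketch below returns the directories an element would be written to after a retest. The directory names come from this comment; the function name `output_dirs`, the use of plain strings for statuses, and the absence of any path prefix are assumptions.

```python
from typing import List

# Directory names taken from the comment above; the real layout may add a
# prefix or differ between versions (assumption).
ACTIVE_DIRS = ["domains/ACTIVE", "json/ACTIVE", "hosts/ACTIVE"]
SUSPICIOUS_DIR = "Analytic/SUSPICIOUS"


def output_dirs(previous_status: str, new_status: str) -> List[str]:
    """Return the directories an element is written to after a retest."""
    if new_status != "ACTIVE":
        return []  # this sketch only covers the newly ACTIVE case
    dirs = list(ACTIVE_DIRS)
    if previous_status in ("INACTIVE", "INVALID"):
        # A formerly inactive or invalid element that comes back is also
        # recorded as suspicious so that the change can be tracked.
        dirs.append(SUSPICIOUS_DIR)
    return dirs
```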
If you use outputs from @dead-hosts or @Ultimate-Hosts-Blacklist it is not a problem, as they generate a clean.list which only contains the elements which are ACTIVE, but if you use PyFunceble as a "standalone" sub-system/script/module you should keep in mind that such changes can happen.
If there are any questions, please let me know.
Cheers,
Nissar
from pyfunceble.
Looks GOOD and...no...no questions 😉
from pyfunceble.
@funilrys I looked at the latest commit; the output doesn't contain an INVALID list, which means everything's good?
from pyfunceble.
I would advise keeping this one OPEN and monitoring it for a couple of weeks.
@anudeepND the filtering just started, so let's wait until it's done before drawing any conclusions.
How to check? Go to info.json; there is an entry there:
"currently_under_test":
1 = still filtering, 0 = done with the filtering
Usually takes 2 to 3 days.
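The same check can be scripted. This is a minimal sketch that assumes info.json is a JSON object containing the "currently_under_test" key mentioned above; the function name `is_still_filtering` and the tolerance for the flag being stored as a number, string, or boolean are assumptions.

```python
import json
from pathlib import Path


def is_still_filtering(info_path) -> bool:
    """Return True while the list is still being tested."""
    info = json.loads(Path(info_path).read_text(encoding="utf-8"))
    # Per the comment above: 1 = still filtering, 0 = done.
    # int() tolerates the flag being stored as 1, "1", or true (assumption).
    return int(info["currently_under_test"]) == 1
```

Calling `is_still_filtering("info.json")` on a local copy of the file tells you whether it is worth looking at the output folders yet.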
from pyfunceble.
OK... since @anudeepND has already removed all (I assume) of the invalid ones and there is no way to know if the new changes are working, I was monitoring a different list, justdomains_...., whose domains.list hasn't been updated for 29 days so far. In that list's INVALID folder there were two domains sitting there for a long time (they are also in the domains.list; sorry, but I don't remember the exact ones). Now the filtering has just finished and the INVALID folder is empty, which means that something is not right.
Those two domains should've shown up after each filtering because they were never removed from the domains.list, i.e. the original list.
from pyfunceble.
Here you go, I found them. Those two should've shown up in the INVALID folder after the last filtering:
On the UP side, the filtering cycle is much, much faster, the invalid domains are indeed filtered, and by the look of it ACTIVE/hosts doesn't get filled with duplicates anymore.
The downside is that if there are any invalid domains in the future, they will not be placed in the INVALID folder until the next filtering.
from pyfunceble.
@funilrys I'm not sure you are aware, but just to point it out:
This one has no domains in it 😶
This one has only two(2) 🤔
from pyfunceble.
Hi @dnmTX, I'm aware of that 😸
@Ultimate-Hosts-Blacklist is open to everybody who wants to have their own repository and at the same time be included in https://github.com/mitchellkrogza/Ultimate.Hosts.Blacklist 😄
Please create an issue or start an internal discussion at @Ultimate-Hosts-Blacklist, as @smed79 is responsible for the mentioned repositories! We only provide the infrastructure!
Cheers,
Nissar
from pyfunceble.
@smed79 I've been meaning to ask you: all your ads lists (admeasures_adservers, getadmiral.com, and so on), which countries/locations are they most relevant to? I'm trying to decide which ones to use but really don't need the "extra weight", if you know what I mean. Anything North America would be my first choice.
@funilrys sorry, I know it's not really relevant to the subject(s) here, but since it all gets mixed up in this issue anyway, I might as well ask away.
from pyfunceble.
@smed79 thank you for that informative answer. Looks like at one point or another they are all relevant to me (streaming, torrent, or adult sites), so I guess I'll load them all.
Thank you for all the lists you are providing and for the great job maintaining them 👍
When I have some free time, I am going to merge the lists mentioned into one full list.
That would be great; I guess I'd better wait till then.
from pyfunceble.
Yep, looks like we are on track here. @anudeepND check out the newcomers in your INVALID folder. 🙂
@funilrys are any of the latest changes/fixes applied to the Dead Hosts repo?
I was monitoring lightswitch05 and it was stuck on filtering for two days.
Wondering if I have to move there and start... "inspecting" 😋
from pyfunceble.
@dnmTX: @dead-hosts uses the dev version by default, but members/maintainers can switch to the stable one if they want 😄
PING @lightswitch05 !!!!
from pyfunceble.