Comments (12)

 avatar commented on May 24, 2024 2

Side note: it may require a fairly large rework, but we could use Arti (GitLab repo; lib.rs page) instead of proxying to a localhost Tor server. That would need less setup: a localhost Tor proxy requires Tor to be running in the background, while Arti implements all the logic itself, so nothing except the libreddit program needs to be running.

from libreddit.

seychelles111 avatar seychelles111 commented on May 24, 2024 1

Hey, I've utilized an idea to "educationally use" IPv6: #845

IMO, use one IPv6 address per user... maybe a strong server would be needed... idk


stulto avatar stulto commented on May 24, 2024

However...

* I'm not 100% sure that this wouldn't count as "abuse" of the Tor network, since it might generate quite a lot of traffic.

I feel that if the instance also hosted Tor guard and relay nodes, this shouldn't be an issue, and it would actually help other instances/peers on the network.

Though there's still the problem of exit nodes... maybe have 2/3 of connections go through the Tor network (1/3 through exit nodes and 1/3 through Reddit's hidden service) and the remaining 1/3 go through clearnet. That way there'd be equal compensation for the use of exit nodes, and adequate compensation for the guard and relay nodes.

Though it might be smart to make this a support layer that libreddit can sit atop instead of something baked directly into the project, as that would facilitate reuse in other privacy-respecting frontends. Say, a session-based connection scrambler, where every session is randomly distributed over Tor (maybe I2P) and clearnet, across a predetermined list of server names. "{old|www}.reddit.com" have their own hidden services, "{old|www}.redditto...wfj4ooad.onion", which could also be used for scraping (I'm of course assuming that old.reddit.com and www.reddit.com have independent rate limits as they're distinct servers, but that may be wrong).

Just a thought.
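
A rough sketch of the session-based split described above, assuming an even three-way distribution between Tor exit nodes, the hidden service, and clearnet. The `Route` enum and `route_for_session` are hypothetical names, not part of libreddit, and hashing the session ID is only one possible way to get a stable pseudo-random assignment:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// The three transports from the proposal (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Route {
    TorExit,
    TorHiddenService,
    Clearnet,
}

/// Map a session ID to a route. The same ID always gets the same route,
/// and IDs spread roughly evenly over the three routes.
fn route_for_session(session_id: &str) -> Route {
    let mut hasher = DefaultHasher::new();
    session_id.hash(&mut hasher);
    match hasher.finish() % 3 {
        0 => Route::TorExit,
        1 => Route::TorHiddenService,
        _ => Route::Clearnet,
    }
}
```

Keying on the session rather than on each request keeps one user's traffic on a single path for the lifetime of the session. A real implementation would still need to choose among the server names and decide what to do when a route is rate limited.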


seychelles111 avatar seychelles111 commented on May 24, 2024

Time to route different /64 IPv6 addresses - or limit by /48 addresses
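
For reference, a small sketch of what grouping by /64 or /48 means in practice. How Reddit actually buckets IPv6 clients is not known here; `ipv6_prefix` and `rate_limit_key` are hypothetical helpers that only show the masking:

```rust
use std::net::Ipv6Addr;

/// Zero every bit after the first `prefix_len` bits of `addr`.
/// `prefix_len` is clamped to 128.
fn ipv6_prefix(addr: Ipv6Addr, prefix_len: u32) -> Ipv6Addr {
    let len = prefix_len.min(128);
    // Shifting a u128 by 128 would overflow, so handle the empty prefix explicitly.
    let mask: u128 = if len == 0 { 0 } else { u128::MAX << (128 - len) };
    Ipv6Addr::from(u128::from(addr) & mask)
}

/// Hypothetical rate-limit key: every address in the same /48 shares one bucket.
fn rate_limit_key(addr: Ipv6Addr) -> Ipv6Addr {
    ipv6_prefix(addr, 48)
}
```

If a limit is applied per /48, rotating addresses inside one /64 (or across /64s in the same /48) lands in the same bucket, which is the distinction the comment above is pointing at.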


 avatar commented on May 24, 2024

I kind of dirty-patched Tor routing into libreddit with arti-hyper. It works, but it has a considerable delay because it creates a new Tor client for every request. That works great for anonymity and dodging 429s, but not so great for avoiding delays, so in production we'll probably want to store a global client and only reload it when we hit a 429.

src/client.rs:

diff --git a/src/client.rs b/src/client.rs
index 4c174cd..ca3eef6 100644
--- a/src/client.rs
+++ b/src/client.rs
@@ -1,24 +1,20 @@
+use arti_client::*;
+use arti_hyper::*;
 use cached::proc_macro::cached;
 use futures_lite::{future::Boxed, FutureExt};
-use hyper::client::HttpConnector;
-use hyper::{body, body::Buf, client, header, Body, Client, Method, Request, Response, Uri};
-use hyper_rustls::HttpsConnector;
+use hyper::{body, body::Buf, client, header, Body, Method, Request, Response, Uri};
 use libflate::gzip;
-use once_cell::sync::Lazy;
 use percent_encoding::{percent_encode, CONTROLS};
 use serde_json::Value;
 use std::{io, result::Result};
+use tls_api::{TlsConnector as TlsConnectorTrait, TlsConnectorBuilder};
+use tls_api_native_tls::TlsConnector;
 
 use crate::dbg_msg;
 use crate::server::RequestExt;
 
 const REDDIT_URL_BASE: &str = "https://www.reddit.com";
 
-static CLIENT: Lazy<Client<HttpsConnector<HttpConnector>>> = Lazy::new(|| {
-	let https = hyper_rustls::HttpsConnectorBuilder::new().with_native_roots().https_only().enable_http1().build();
-	client::Client::builder().build(https)
-});
-
 /// Gets the canonical path for a resource on Reddit. This is accomplished by
 /// making a `HEAD` request to Reddit at the path given in `path`.
 ///
@@ -75,7 +71,12 @@ async fn stream(url: &str, req: &Request<Body>) -> Result<Response<Body>, String
 	let uri = url.parse::<Uri>().map_err(|_| "Couldn't parse URL".to_string())?;
 
 	// Build the hyper client from the HTTPS connector.
-	let client: client::Client<_, hyper::Body> = CLIENT.clone();
+	let client: client::Client<_, hyper::Body> = {
+		let tor_client = TorClient::builder().bootstrap_behavior(BootstrapBehavior::OnDemand).create_unbootstrapped().unwrap();
+		let tls_connector = TlsConnector::builder().unwrap().build().unwrap();
+		let tor_connector = ArtiHttpConnector::new(tor_client, tls_connector);
+		hyper::Client::builder().build(tor_connector)
+	};
 
 	let mut builder = Request::get(uri);
 
@@ -129,7 +130,12 @@ fn request(method: &'static Method, path: String, redirect: bool, quarantine: bo
 	let url = format!("{}{}", REDDIT_URL_BASE, path);
 
 	// Construct the hyper client from the HTTPS connector.
-	let client: client::Client<_, hyper::Body> = CLIENT.clone();
+	let client: client::Client<_, hyper::Body> = {
+		let tor_client = TorClient::builder().bootstrap_behavior(BootstrapBehavior::OnDemand).create_unbootstrapped().unwrap();
+		let tls_connector = TlsConnector::builder().unwrap().build().unwrap();
+		let tor_connector = ArtiHttpConnector::new(tor_client, tls_connector);
+		hyper::Client::builder().build(tor_connector)
+	};
 
 	// Build request to Reddit. When making a GET, request gzip compression.
 	// (Reddit doesn't do brotli yet.)

Then I just added arti_client, arti_hyper, tls_api, and tls_api_native_tls to Cargo.toml.
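
A minimal sketch of the "global client, reload on 429" idea mentioned above, using only the standard library. `ReloadableClient` is a hypothetical wrapper, `C` stands in for the real client type (for example the hyper client built on `ArtiHttpConnector` in the diff), and `factory` stands in for the code that constructs it:

```rust
use std::sync::{Arc, RwLock};

/// Holds one shared client and rebuilds it on demand (hypothetical type).
struct ReloadableClient<C, F: Fn() -> C> {
    current: RwLock<Arc<C>>,
    factory: F,
}

impl<C, F: Fn() -> C> ReloadableClient<C, F> {
    /// Build the first client eagerly so later requests can reuse it.
    fn new(factory: F) -> Self {
        let current = RwLock::new(Arc::new(factory()));
        Self { current, factory }
    }

    /// Cheap handle to the current client: clones the Arc, not the client.
    fn get(&self) -> Arc<C> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Build a fresh client and swap it in; in-flight requests keep the old one.
    fn reload(&self) {
        let fresh = Arc::new((self.factory)());
        *self.current.write().unwrap() = fresh;
    }

    /// Feed in each response status; reloads only on 429 Too Many Requests.
    /// Returns true if a reload happened.
    fn on_status(&self, status: u16) -> bool {
        if status == 429 {
            self.reload();
            true
        } else {
            false
        }
    }
}
```

With something like this, `stream` and `request` would call `get()` instead of building a Tor client each time, and pass the response status to `on_status`. Whether a rebuilt Arti client actually ends up on a different circuit or exit is not shown by the sketch and would need checking against Arti itself.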


artemislena avatar artemislena commented on May 24, 2024

T.: There's also https://git.spec.cat/Nyaaori/libreddit, which uses Arti. It doesn't seem particularly slow either; we're currently using it for lr.artemislena.eu.


 avatar commented on May 24, 2024

@artemislena looks great :) thanks for sharing. maybe you could set up a github mirror and open a pull request here so your improvements are available to more people?


avincent98144 avatar avincent98144 commented on May 24, 2024


avincent98144 avatar avincent98144 commented on May 24, 2024


 avatar commented on May 24, 2024

@avincent98144 ...can you elaborate? i can't really tell what you're trying to say. if you meant to say Tanith's links lead nowhere, that's not true (at least not for me):

[two images attached]


avincent98144 avatar avincent98144 commented on May 24, 2024


artemislena avatar artemislena commented on May 24, 2024

T.: It's not our Forgejo instance, we didn't make the fork, and we don't have enough experience in Rust programming (or enough interest in programming in general) to do this ^^; I mean, sure, we could open a PR, but we can't provide any support for it beyond how to host it; for the container, mounting /data in a persistent location (owned by UID 1000 in the container) is recommended for faster startup.
@avincent98144 Idk, the Forgejo link works fine for us, but you can use https://send.artemislena.eu/download/08239cf210c1cd8f/#Qx11ppk5NmwGySBcsx-IRg to download a tarball of the repo (the link will expire after 100 downloads or 7 days).
