Implement upload solution (zimfarm, CLOSED)

openzim commented on June 7, 2024
Implement upload solution

from zimfarm.

Comments (18)

kelson42 commented on June 7, 2024

I have a simple idea: create a Docker image/container which simply handles the uploads (on the server side).

The file transfer technology can be discussed, but basically it's a server that accepts file uploads and has to be able to do two things:

  • Take uploads
  • Check user/pass against zimfarm API.

This could be quite easy to do with the FTP protocol, with for example:

automactic commented on June 7, 2024

The problem with FTP (same with rsync) is that the first time a worker uploads, a fingerprint verification prompt will appear.

kelson42 commented on June 7, 2024

@automactic FTP is user/pass based, nothing to do with any crypto proof or fingerprint. You just need the right user/pass.

automactic commented on June 7, 2024

Mmm... yes, you don't need to verify a key with FTP.

I don't know if WebDAV could be a good option.

kelson42 commented on June 7, 2024

@automactic I don't really care about the protocol as long as it works. WebDAV might work, but it's riskier than FTP, and I'm not sure you will get as easy a solution as you might have for FTP with twisted or pyftpdlib. In addition, there are far fewer WebDAV clients than FTP clients.

automactic commented on June 7, 2024

Yes, I will try to do this with pyftpdlib; it looks promising. One concern is that plain FTP does not encrypt the password, so anyone on the network path can see it. If we use FTPS, I believe the credentials would be encrypted as well.
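
As a rough illustration of the FTPS idea on the worker side, here is a minimal sketch using `ftplib.FTP_TLS` from the Python standard library. The function name `upload_file` and its parameters are made up for this example and are not part of zimfarm; certificate verification is left out of the sketch.

```python
import ftplib
import os


def upload_file(host, port, user, password, path, ftp_factory=ftplib.FTP_TLS):
    """Upload `path` over explicit FTPS and return the server's final reply.

    `ftp_factory` is a hypothetical hook so the function can be exercised
    without a real server.
    """
    ftps = ftp_factory()
    ftps.connect(host, port)
    try:
        # FTP_TLS.login() secures the control connection (AUTH TLS)
        # before USER/PASS are sent, so the credentials are encrypted.
        ftps.login(user, password)
        # Switch the data connection to TLS as well.
        ftps.prot_p()
        with open(path, "rb") as stream:
            return ftps.storbinary(f"STOR {os.path.basename(path)}", stream)
    finally:
        ftps.close()
```

On the server side, pyftpdlib would need its TLS-enabled handler and a certificate for this to work; that part is not shown here.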

kelson42 commented on June 7, 2024

@automactic good... I would recommend somehow using passwords with a short lifetime (maybe valid for only one use?). We can do that since the dispatcher distributes them and does the check.

automactic commented on June 7, 2024

It could also be a good idea to have the worker request a file transfer token from the dispatcher and send this token to the FTP server when uploading the file. Upon receiving the token, the FTP server verifies it with the dispatcher.

This way, the worker only needs to transfer its password once (to the dispatcher) during its lifetime. In addition, users cannot log in to the FTP server with other tools (since users cannot see the token).
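
The token idea, combined with the short-lived, single-use suggestion above, could look roughly like the sketch below. It is an in-memory illustration only: the class name `TokenStore`, its methods, and the 5-minute default lifetime are assumptions for this example, not the dispatcher's actual API.

```python
import secrets
import time


class TokenStore:
    """Hypothetical dispatcher-side store of single-use upload tokens."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl_seconds = ttl_seconds
        # token -> (worker name, expiry on the monotonic clock)
        self._tokens: dict[str, tuple[str, float]] = {}

    def issue(self, worker: str) -> str:
        """Called when an authenticated worker asks for an upload token."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (worker, time.monotonic() + self.ttl_seconds)
        return token

    def verify(self, worker: str, token: str) -> bool:
        """Called when the FTP server asks whether a login is valid."""
        # pop() removes the token, so it can be used at most once,
        # even when the check below fails.
        entry = self._tokens.pop(token, None)
        if entry is None:
            return False
        owner, expiry = entry
        return owner == worker and time.monotonic() < expiry
```

In this sketch the worker would send the token as its FTP password, and the FTP server would forward the username and token to the dispatcher for the `verify` call.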

automactic commented on June 7, 2024

For some reason I cannot connect to the FTP server once it is running in the container, but it works fine outside the container. The error log is below:

INFO:pyftpdlib:196.52.2.24:59014-[] FTP session opened (connect)
DEBUG:pyftpdlib:196.52.2.24:59014-[] -> 220 Welcome to Zimfarm Warehouse.
DEBUG:pyftpdlib:196.52.2.24:59014-[] <- USER admin
DEBUG:pyftpdlib:196.52.2.24:59014-[] -> 331 Username ok, send password.
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] <- PASS ******
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] -> 230 Hi, there!
INFO:pyftpdlib:196.52.2.24:59014-[admin] USER 'admin' logged in.
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] <- FEAT
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] -> 211 End FEAT.
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] <- OPTS UTF8 ON
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] -> 501 Invalid argument.
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] <- SYST
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] -> 215 UNIX Type: L8
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] <- PWD
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] -> 257 "/" is the current directory.
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] <- CWD /
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] -> 250 "/" is the current directory.
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] <- TYPE A
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] -> 200 Type set to: ASCII.
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] <- PASV
DEBUG:pyftpdlib:196.52.2.24:59014-[admin] -> 227 Entering passive mode (172,17,0,3,200,212).
DEBUG:pyftpdlib:[debug] call: close() (<FTPHandler(id=140131051023048, addr='196.52.2.24:59014', user='admin')>)
DEBUG:pyftpdlib:[debug] call: close() (<pyftpdlib.handlers.PassiveDTP listening 172.17.0.3:0 at 0x7f72cd171710>)
INFO:pyftpdlib:196.52.2.24:59014-[admin] FTP session closed (disconnect).

Any ideas?

kelson42 commented on June 7, 2024

@automactic I fully support #38 (comment). Regarding the transfer problem, you probably do not support FTP passive mode properly. Have a look at how FTP passive mode works (basically, you need to open additional ports).

kelson42 commented on June 7, 2024

Here is an example fix: stilliard/docker-pure-ftpd@da4dee5

automactic commented on June 7, 2024

I took a further look into the issue and read more about passive FTP. After receiving PASV, the server returns 227 Entering passive mode (172,17,0,3,109,116). The problem is that when the worker uploads files, the data transfer is directed to 172.17.0.3:28020, which is internal to the Docker network. We need a way to have the FTP server return the correct address.
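
To make the arithmetic explicit: the six numbers in a 227 reply are the four octets of the IP address followed by the high and low bytes of the port. A small sketch that decodes the reply quoted above (the helper name `parse_pasv_reply` is made up for illustration):

```python
import ipaddress
import re


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a reply like
    '227 Entering passive mode (172,17,0,3,109,116).'"""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    h1, h2, h3, h4, port_high, port_low = (int(g) for g in match.groups())
    # The port is sent as two bytes: high * 256 + low.
    return f"{h1}.{h2}.{h3}.{h4}", port_high * 256 + port_low


host, port = parse_pasv_reply("227 Entering passive mode (172,17,0,3,109,116).")
print(host, port)  # 172.17.0.3 28020
# A private address here means an outside client cannot reach the data port.
print(ipaddress.ip_address(host).is_private)  # True
```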

automactic commented on June 7, 2024

I need to set masquerade_address on the FTP server

kelson42 commented on June 7, 2024

@automactic OK, so does it work ?

automactic commented on June 7, 2024

Yes, it does, but to run the FTP server you need to set the external IP in the environment.
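
For reference, a minimal sketch of what such a setup could look like with pyftpdlib. The environment variable names `WAREHOUSE_EXTERNAL_IP` and `WAREHOUSE_PASSIVE_PORTS`, the default port range, and the hard-coded `admin` user are assumptions for this example and may differ from the actual warehouse code.

```python
import os


def warehouse_settings(env=os.environ) -> dict:
    """Read FTP settings from the environment (variable names are hypothetical)."""
    low, high = env.get("WAREHOUSE_PASSIVE_PORTS", "28000-28100").split("-")
    return {
        # Address advertised in 227 replies instead of the container's own IP.
        "masquerade_address": env["WAREHOUSE_EXTERNAL_IP"],
        # Ports the server may pick for passive data connections.
        "passive_ports": range(int(low), int(high) + 1),
    }


def build_server(settings: dict, root: str, port: int = 21):
    """Create a pyftpdlib server; `root` must be an existing directory."""
    # Imported here so warehouse_settings() works without pyftpdlib installed.
    from pyftpdlib.authorizers import DummyAuthorizer
    from pyftpdlib.handlers import FTPHandler
    from pyftpdlib.servers import FTPServer

    authorizer = DummyAuthorizer()
    authorizer.add_user("admin", "change-me", root, perm="elradfmw")
    FTPHandler.authorizer = authorizer
    FTPHandler.masquerade_address = settings["masquerade_address"]
    FTPHandler.passive_ports = settings["passive_ports"]
    return FTPServer(("0.0.0.0", port), FTPHandler)
```

Calling `build_server(warehouse_settings(), "/some/dir").serve_forever()` would start the server. The passive port range also has to be published by the container (for example with Docker's `-p 28000-28100:28000-28100`), otherwise the data connection is still unreachable from outside.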

kelson42 commented on June 7, 2024

dispatcher/warehouse should be moved to /. Considering that the warehouse probably does not run on the same server, and that in any case it does not depend on the dispatcher, it should not be part of it IMO.

kelson42 commented on June 7, 2024

@automactic Could you also merge your code to master so we can close that ticket?

automactic commented on June 7, 2024

The plain FTP server approach discussed in this issue is already implemented.
