GithubHelp home page GithubHelp logo

diyer22-forks / megfile Goto Github PK

View Code? Open in Web Editor NEW

This project forked from megvii-research/megfile

0.0 1.0 0.0 6.82 MB

Megvii FILE Library - Working with Files in Python

Home Page: http://megvii-research.github.io/megfile

License: Apache License 2.0

Makefile 0.21% Python 99.79%

megfile's Introduction

megfile - Megvii FILE library

Build Documents Codecov Latest version Support python versions License CII Best Practices

megfile provides a silky operation experience with different backends (currently including local file system and s3), which enable you to focus more on the logic of your own project instead of the question of "Which backend is used for this file?"

megfile provides:

  • Almost unified file system operation experience. Target path can be easily moved from local file system to s3.
  • Complete boundary case handling. Even the most difficult (or even you can't even think of) boundary conditions, megfile can help you easily handle it.
  • Perfect type hints and built-in documentation. You can enjoy the IDE's auto-completion and static checking.
  • Semantic version and upgrade guide, which allows you enjoy the latest features easily.

megfile's advantages are:

  • smart_open can open resources that use various protocols, including fs, s3, http(s) and stdio. Especially, reader / writer of s3 in megfile is implemented with multi-thread, which is faster than known competitors.
  • smart_glob is available on s3. And it supports zsh extended pattern syntax of [], e.g. s3://bucket/video.{mp4,avi}.
  • All-inclusive functions like smart_exists / smart_stat / smart_sync. If you don't find the functions you want, submit an issue.
  • Compatible with pathlib.Path interface, referring to S3Path and SmartPath.

Quick Start

Here's an example of writing a file to s3, syncing to local, reading and finally deleting it.

Functional Interface

from megfile import smart_open, smart_exists, smart_sync, smart_remove, smart_glob

# open a file in s3 bucket
with smart_open('s3://playground/megfile-test', 'w') as fp:
    fp.write('megfile is not silver bullet')

# test if file in s3 bucket exist
smart_exists('s3://playground/megfile-test')

# or in local file system
smart_exists('/tmp/playground/megfile-test')

# copy files or directories
smart_sync('s3://playground/megfile-test', '/tmp/playground/megfile-test')

# remove files or directories
smart_remove('s3://playground/megfile-test')

# glob files or directories in s3 bucket
smart_glob('s3://playground/megfile-?.{mp4,avi}')

# smart_open also support protocols like http / https
smart_open('https://www.google.com')

SmartPath Interface

from megfile.smart_path import SmartPath

path = SmartPath('s3://playground/megfile-test')
if path.exists():
    with path.open() as f:
        result = f.read(7)
        assert result == b'megfile'

Command Line Interface

$ megfile --help  # see what you can do

$ megfile ls s3://playground/
$ megfile ls -l -h s3://playground/

$ megfile cat s3://playground/megfile-test

$ megfile cp s3://playground/megfile-test /tmp/playground/megfile-test

Installation

PyPI

pip3 install megfile

You can specify megfile version as well

pip3 install "megfile~=0.0"

Build from Source

megfile can be installed from source

git clone [email protected]:megvii-research/megfile.git
cd megfile
pip3 install -U .

Development Environment

git clone [email protected]:megvii-research/megfile.git
cd megfile
pip3 install -r requirements.txt -r requirements-dev.txt

Configuration

Before using megfile to access files on s3, you need to set up authentication credentials for your s3 account using the AWS CLI or editing the file ~/.aws/config directly, see also: boto3 configuration & boto3 credentials.

$ aws configure
AWS Access Key ID [None]: accesskey
AWS Secret Access Key [None]: secretkey
Default region name [None]:
Default output format [None]:

# for aliyun oss only
$ aws configure set s3.addressing_style virtual
$ aws configure set s3.endpoint_url http://oss-cn-hangzhou.aliyuncs.com

$ cat ~/.aws/config
[default]
aws_secret_access_key = accesskey
aws_access_key_id = secretkey

s3 =
    addressing_style = virtual
    endpoint_url = http://oss-cn-hangzhou.aliyuncs.com

How to Contribute

  • We welcome everyone to contribute code to the megfile project, but the contributed code needs to meet the following conditions as much as possible:

    You can submit code even if the code doesn't meet conditions. The project members will evaluate and assist you in making code changes

    • Code format: Your code needs to pass code format check. megfile uses yapf as lint tool and the version is locked at 0.27.0. The version lock may be removed in the future

    • Static check: Your code needs complete type hint. megfile uses pytype as static check tool. If pytype failed in static check, use # pytype: disable=XXX to disable the error and please tell us why you disable it.

      Note : Because pytype doesn't support variable type annation, the variable type hint format introduced by py36 cannot be used.

      i.e. variable: int is invalid, replace it with variable # type: int

    • Test: Your code needs complete unit test coverage. megfile uses pyfakefs and moto as local file system and s3 virtual environment in unit tests. The newly added code should have a complete unit test to ensure the correctness

  • You can help to improve megfile in many ways:

    • Write code.
    • Improve documentation.
    • Report or investigate bugs and issues.
    • If you find any problem or have any improving suggestion, submit a new issuse as well. We will reply as soon as possible and evaluate whether to adopt.
    • Review pull requests.
    • Star megfile repo.
    • Recommend megfile to your friends.
    • Any other form of contribution is welcomed.

megfile's People

Contributors

bbtfr avatar diyer22 avatar hugoahoy avatar leavers avatar loveeatcandy avatar xyb avatar yujianpeng66 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.