stow's People

Contributors

agrimprasad, alrs, dahernan, darwayne, enghabu, ernesto-jimenez, jasonhancock, jasonsattler, jdtobe, jsteenb2, jtarchie, marbergq, marianina8, matryer, naysayer, piotrrojek, pisush, sbward, tamalsaha, urisimchoni, xercoy

stow's Issues

local/remote: Pull metadata keys out as constants

is_dir and path etc. should be exposed in the package as:

const (
    MetadataPath = "path"
    MetadataIsDir = "is_dir"
    MetadataPerm = "perm"
)

This formalises the metadata available in each implementation.
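As a minimal sketch of how such constants might be used, the snippet below reads a metadata map through the proposed keys. The map type and the isDir helper are assumptions made for illustration, not part of the stow API.

```go
package main

import "fmt"

// Metadata keys as proposed in the issue above.
const (
	MetadataPath  = "path"
	MetadataIsDir = "is_dir"
	MetadataPerm  = "perm"
)

// isDir reports whether the metadata marks the item as a directory.
// The map type is an assumption; a missing or non-bool value yields false.
func isDir(md map[string]interface{}) bool {
	v, ok := md[MetadataIsDir].(bool)
	return ok && v
}

func main() {
	md := map[string]interface{}{
		MetadataPath:  "/tmp/example",
		MetadataIsDir: true,
		MetadataPerm:  "drwxr-xr-x",
	}
	fmt.Println(md[MetadataPath], isDir(md))
}
```

With constants in place, a typo in a key becomes a compile error instead of a silent missing value.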

feature: stow server

A Stow service that exposes the file system via an HTTP interface. It would need an associated stowserver implementation (client).

  • Stow server
  • Stow server implementation (client)

Make common error types

Further down the road, let's implement some common error types, e.g.:

var (
    ErrWrCfg       = errors.New("Wrong config.")
    ErrNotFound    = errors.New("Element not found.")
    ErrNoMoreItems = errors.New("No more items.")
    ErrNoContent   = errors.New("No content.")
    ErrIOProb      = errors.New("Couldn't read/write.")
)

Every error that would be common for local and azure should be implemented in this package.
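One way callers could benefit from shared errors is sketched below. It assumes Go 1.13 or later for error wrapping; the lookup function is a hypothetical stand-in for a backend call, and the error message is illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound mirrors one of the sentinel errors proposed above.
var ErrNotFound = errors.New("not found")

// lookup is a hypothetical backend call that wraps the sentinel with context.
func lookup(items map[string]string, id string) (string, error) {
	v, ok := items[id]
	if !ok {
		return "", fmt.Errorf("lookup %q: %w", id, ErrNotFound)
	}
	return v, nil
}

// isNotFound lets callers test for the shared error regardless of backend.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	_, err := lookup(map[string]string{"a": "1"}, "b")
	fmt.Println(err, isNotFound(err))
}
```

Each implementation can add its own context to the message while callers still compare against a single value from the stow package.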

add doc.go

doc.go:

// Package stow provides an abstraction on cloud storage capabilities.
package stow

local/remote: one container

Might as well make the local and remote implementations expose a single container that contains everything. This would enable harvesting of files in the root, as well as simplify things.

Perhaps the single container could be called "All"?

implementation: Backblaze B2

I was looking into implementing a Backblaze B2 backend for Stow. There are some limitations:

  • B2 only returns a SHA1 checksum (https://www.backblaze.com/b2/docs/b2_get_file_info.html), not MD5. We might need to rename Item.MD5() to Item.Hash() and have it return the checksum plus a string representing the type of hash. Or maybe just a backend.HashType flag that could be set in each backend implementation.
  • B2 doesn't support listing files or buckets with a prefix. I'm wondering whether we should add some sort of "SupportsFeature" boolean to each implementation (for example b2.SupportsPrefixes), which could then be used in the test suite to test things like prefixes only on backends that support them. Adding the prefix support ourselves would be easy for buckets, but less so for files (not impossible, just a bit of work to identify the logic).
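A rough sketch of the "SupportsFeature" idea follows. The Features type, the backends table, and the values in it are hypothetical and only illustrate how a test suite might pick which groups of tests to run.

```go
package main

import "fmt"

// Features is a hypothetical per-backend capability set.
type Features struct {
	Prefixes bool   // backend can list with a prefix natively
	HashType string // e.g. "md5" or "sha1"
}

// backends maps a kind name to its capabilities; the values are illustrative.
var backends = map[string]Features{
	"local": {Prefixes: true, HashType: "md5"},
	"b2":    {Prefixes: false, HashType: "sha1"},
}

// testsFor returns the names of the test groups to run for a backend kind.
func testsFor(kind string) []string {
	tests := []string{"containers", "items"}
	if backends[kind].Prefixes {
		tests = append(tests, "prefixes")
	}
	return tests
}

func main() {
	for _, kind := range []string{"local", "b2"} {
		fmt.Println(kind, backends[kind].HashType, testsFor(kind))
	}
}
```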

add doc.go to each implementation

Add a doc.go file to each implementation with a small summary of what the package provides.

E.g.

// Package local provides a filesystem Location. Folders are containers, and 
// files are items.
package local

Can a location provide a name?

Can a location provide a suitable name?

Check the Azure, S3, Swift, etc. APIs to see whether they will (once dialled) be able to return a name.

Implement Remove* in azure

    ../../../stow/azure/config.go:44: cannot use l (type *location) as type stow.Location in return argument:
        *location does not implement stow.Location (missing RemoveContainer method)
    ../../../stow/azure/container.go:17: cannot use (*container)(nil) (type *container) as type stow.Container in assignment:
        *container does not implement stow.Container (missing RemoveItem method)
    ../../../stow/azure/location.go:36: cannot use container (type *container) as type stow.Container in return argument:
        *container does not implement stow.Container (missing RemoveItem method)
    ../../../stow/azure/location.go:52: cannot use container literal (type *container) as type stow.Container in assignment:
        *container does not implement stow.Container (missing RemoveItem method)

Set up Travis

Set up Travis CI for Stow and all implementations.

Add support for container and item metadata

Assumption: Most implementations provide a way to get and set metadata for items and containers.

Examples of metadata include custom values set by users or other tools and, in the local implementation, OS metadata.

optional prefix support

As per the discussion on #37 we should add the ItemsWithPrefix interface and update Walk to make use of it, like in @ernesto-jimenez's example.

So the Container interface loses the prefix argument from Items.

And we'll need a ContainersWithPrefix counterpart.
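The optional-interface approach could look roughly like the sketch below. The Container and ItemsWithPrefix types here are simplified stand-ins with assumed signatures (no cursors or errors), not the ones from the discussion on #37.

```go
package main

import (
	"fmt"
	"strings"
)

// Container is a simplified stand-in for stow.Container; Items takes no prefix.
type Container interface {
	Items() []string
}

// ItemsWithPrefix is the optional interface; its signature is an assumption.
type ItemsWithPrefix interface {
	ItemsWithPrefix(prefix string) []string
}

// itemsWithPrefix uses native prefix support when the container offers it
// and otherwise filters the full listing on the client side.
func itemsWithPrefix(c Container, prefix string) []string {
	if p, ok := c.(ItemsWithPrefix); ok {
		return p.ItemsWithPrefix(prefix)
	}
	var out []string
	for _, name := range c.Items() {
		if strings.HasPrefix(name, prefix) {
			out = append(out, name)
		}
	}
	return out
}

// plain is a backend without native prefix support.
type plain []string

func (p plain) Items() []string { return p }

// native is a backend that answers prefix queries itself.
type native struct{ plain }

func (n native) ItemsWithPrefix(prefix string) []string {
	return []string{prefix + "-from-backend"}
}

func main() {
	names := plain{"a/1", "a/2", "b/1"}
	fmt.Println(itemsWithPrefix(names, "a/"))
	fmt.Println(itemsWithPrefix(native{names}, "a/"))
}
```

The type assertion keeps the core interface small: backends that cannot filter natively implement nothing extra and get the client-side fallback.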

paging

Use cursor pattern for paging.
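For illustration, a cursor-based listing might look like the sketch below. The in-memory data, the page function, and the convention that an empty returned cursor means "no more pages" are all assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// cursorStart requests the first page. An empty cursor in a result means
// there are no more pages. Both conventions are assumptions of this sketch.
const cursorStart = ""

var all = []string{"a", "b", "c", "d", "e"}

// page returns up to count names starting at cursor, plus the next cursor.
func page(cursor string, count int) ([]string, string, error) {
	if count < 1 {
		return nil, "", fmt.Errorf("count must be positive, got %d", count)
	}
	start := 0
	if cursor != cursorStart {
		n, err := strconv.Atoi(cursor)
		if err != nil || n < 0 || n > len(all) {
			return nil, "", fmt.Errorf("bad cursor %q", cursor)
		}
		start = n
	}
	end := start + count
	if end >= len(all) {
		return all[start:], "", nil
	}
	return all[start:end], strconv.Itoa(end), nil
}

// collect walks every page until the cursor is exhausted.
func collect(count int) ([]string, error) {
	var out []string
	cursor := cursorStart
	for {
		items, next, err := page(cursor, count)
		if err != nil {
			return nil, err
		}
		out = append(out, items...)
		if next == "" {
			return out, nil
		}
		cursor = next
	}
}

func main() {
	fmt.Println(collect(2))
}
```

The cursor is opaque to the caller, so each backend is free to encode an offset, a marker key, or a continuation token in it.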

Stow mentions "Metafarm"

The Dockerfile mentions "Metafarm" and shouldn't. Stow will be open sourced eventually and should be free from any GM details.

Consistency between different adapters

I started to experiment with Stow and wanted to use the local file system adapter for testing. But I've noticed there are some small inconsistencies between the local file system adapter and the Google Cloud Storage adapter.
I've included two snippets to demonstrate the issue. I'm not sure whether this behaviour is intentional; fixing or changing it could potentially break other people's code.

This is the Google Cloud Storage example. Note that I don't prefix the path passed to container.Put or container.Item with the container name.

func main() {
	creds := `{}`
	projectID := ""

	config := stow.ConfigMap{
		"json":       creds,
		"project_id": projectID,
	}

	location, err := stow.Dial(google.Kind, config)
	if err != nil {
		panic(err)
	}
	defer location.Close()

	cs, _, _ := location.Containers("", stow.CursorStart, 10)
	for _, c := range cs {
		fmt.Printf("%v -> %v\n", c.Name(), c.ID())
	}
	container, _ := location.Container("niels-test")
	contents := "This is a new file stored in the cloud"
	r := strings.NewReader(contents)
	size := int64(len(contents))

	item, _ := container.Put("user/nielsstevens/index.json", r, size, nil)
	fmt.Printf("%+v", item)

	result, _ := container.Item("user/nielsstevens/index.json")
	fmt.Printf("%+v", result)
}

But when I try the same approach with the local file system, I need to prefix the paths passed to container.Item and container.Put with the config path and the container name, which is not how the GCS adapter behaves.

func main() {
	config := stow.ConfigMap{
		"path": "/tmp/",
	}

	location, _ := stow.Dial(local.Kind, config)
	defer location.Close()

	cs, _, _ := location.Containers("", stow.CursorStart, 10)
	for _, c := range cs {
		fmt.Printf("%v -> %v\n", c.Name(), c.ID())
	}
	container, _ := location.Container("niels-test")

	contents := "This is a new file stored in the cloud"
	r := strings.NewReader(contents)
	size := int64(len(contents))

	item, _ := container.Put("/tmp/niels-test/user/nielsstevens/index.json", r, size, nil)
	fmt.Printf("%+v", item)

	result, _ := container.Item("/tmp/niels-test/user/nielsstevens/index.json")
	fmt.Printf("%+v", result)
}
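One possible way to make the local adapter accept container-relative names, as the GCS adapter does, is sketched below. The itemPath helper is hypothetical and is not how the local package currently works; it uses the slash-based path package so the result is the same on every platform.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// itemPath resolves an item name against a container directory so that
// callers can pass "user/x/index.json" just as they would to a cloud adapter.
// Names that would land outside the container directory are rejected.
func itemPath(root, container, name string) (string, error) {
	base := path.Join(root, container)
	full := path.Join(base, name)
	if full != base && !strings.HasPrefix(full, base+"/") {
		return "", fmt.Errorf("item %q escapes container %q", name, container)
	}
	return full, nil
}

func main() {
	fmt.Println(itemPath("/tmp/", "niels-test", "user/nielsstevens/index.json"))
	fmt.Println(itemPath("/tmp/", "niels-test", "../../etc/passwd"))
}
```

Changing the accepted names this way would be a breaking change for code that already passes absolute paths, which is the compatibility concern raised above.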

command line tool

A command line tool for stow would be quite a powerful addition.

Examples:

stow list containers -kind azure -config key="value" key2="value2"
stow add item -container name -kind azure -config etc...
stow download item -container name -kind azure -config etc.... >> file.txt
cat file.txt | stow upload item -container name -kind azure -config etc...

  • It would go in github.com/graymeta/stow/cmd/stow
  • The command should rely on no state (which means entire config needs to be passed in each time)
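A stateless invocation could be parsed with the standard flag package, roughly as sketched below. This sketch assumes -config is repeated once per key=value pair, which differs slightly from the examples above, and the parse helper is hypothetical. strings.Cut needs Go 1.18 or later.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// configFlag collects repeated -config key=value pairs into a map.
type configFlag map[string]string

func (c configFlag) String() string { return fmt.Sprint(map[string]string(c)) }

// Set is called by the flag package once for every -config occurrence.
func (c configFlag) Set(s string) error {
	k, v, ok := strings.Cut(s, "=")
	if !ok || k == "" {
		return fmt.Errorf("want key=value, got %q", s)
	}
	c[k] = v
	return nil
}

// parse reads the options shared by every subcommand. Nothing is stored
// between runs, so the full configuration must be passed each time.
func parse(args []string) (kind, container string, cfg configFlag, err error) {
	cfg = configFlag{}
	fs := flag.NewFlagSet("stow", flag.ContinueOnError)
	fs.StringVar(&kind, "kind", "", "storage kind, e.g. azure")
	fs.StringVar(&container, "container", "", "container name")
	fs.Var(cfg, "config", "key=value pair (repeatable)")
	err = fs.Parse(args)
	return kind, container, cfg, err
}

func main() {
	kind, container, cfg, err := parse([]string{
		"-kind", "azure", "-container", "name",
		"-config", "key=value", "-config", "key2=value2",
	})
	fmt.Println(kind, container, cfg, err)
}
```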

Enum-like location types

Implement enum-like location types, so we don't use strings as location names.
E.g.:

type LocationType int

//go:generate stringer -type=LocationType
const (
    Azure LocationType = iota
    S3
    Local
)

// Config contains LocationType like Azure with properties,
// e.g. account number and key.
type Config struct {
    LocationType LocationType
    Properties   map[string]interface{}
}
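For readers unfamiliar with stringer, the sketch below shows roughly the kind of String method it would generate for LocationType, written by hand. ParseLocationType is a hypothetical helper added for illustration and is not part of the proposal.

```go
package main

import "fmt"

type LocationType int

const (
	Azure LocationType = iota
	S3
	Local
)

var locationNames = [...]string{"Azure", "S3", "Local"}

// String returns the constant's name, similar to what stringer generates.
func (t LocationType) String() string {
	if t < 0 || int(t) >= len(locationNames) {
		return fmt.Sprintf("LocationType(%d)", int(t))
	}
	return locationNames[t]
}

// ParseLocationType is a hypothetical helper that turns a string from a
// config file or command line back into a typed value.
func ParseLocationType(s string) (LocationType, error) {
	for i, name := range locationNames {
		if name == s {
			return LocationType(i), nil
		}
	}
	return 0, fmt.Errorf("unknown location type %q", s)
}

func main() {
	t, err := ParseLocationType("S3")
	fmt.Println(t, err)
}
```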

Make it compatible with Windows

It would be nice to be able to run the new architecture on every system. I know Windows isn't a target for us, but being able to develop natively on Windows would still be nice.
Although we use filepath.Join extensively, the tests don't pass on Windows (example from the local location connector):

C:\Users\Piotr\Desktop\gopath\src\github.com\graymeta\stow\local>go test -v
=== RUN   TestContainers
--- PASS: TestContainers (0.00s)
=== RUN   TestContainersPrefix
--- PASS: TestContainersPrefix (0.02s)
=== RUN   TestContainer
        local_test.go:136: file://C:%5CUsers%5CPiotr%5CDesktop%5Cgopath%5Csrc%5Cgithub.com%5Cgraymeta%5Cstow%5Clocal%5Ctestdata%5Cstow319373961%5Cthree != file://C:\Users\Piotr\Desktop\gopath\src\github.com\graymeta\stow\local\testdata\stow319373961\three
--- FAIL: TestContainer (0.01s)
=== RUN   TestNewContainer
--- PASS: TestNewContainer (0.00s)
=== RUN   TestCreateItem
--- PASS: TestCreateItem (0.01s)
=== RUN   TestItems
--- PASS: TestItems (0.00s)
=== RUN   TestByURL
--- PASS: TestByURL (0.00s)
=== RUN   TestItemReader
--- PASS: TestItemReader (0.00s)
FAIL
exit status 1
FAIL    github.com/graymeta/stow/local  0.071s

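This seems to be an issue with URL escaping: one side of the comparison has its backslashes percent-encoded as %5C and the other does not.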
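One possible direction, sketched below, is to convert backslashes to forward slashes before building the URL so that nothing gets encoded as %5C. The fileURL and urlPath helpers are hypothetical and the actual fix in the local package may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// fileURL builds a file:// URL from a native path. Backslashes are turned
// into forward slashes first so they are not percent-encoded as %5C.
func fileURL(p string) string {
	p = strings.ReplaceAll(p, `\`, "/")
	if !strings.HasPrefix(p, "/") {
		p = "/" + p // "C:/Users" becomes "/C:/Users"
	}
	u := url.URL{Scheme: "file", Path: p}
	return u.String()
}

// urlPath parses a file URL and returns its unescaped path component.
func urlPath(s string) (string, error) {
	u, err := url.Parse(s)
	if err != nil {
		return "", err
	}
	return u.Path, nil
}

func main() {
	s := fileURL(`C:\Users\Piotr\stow\local\testdata\three`)
	p, err := urlPath(s)
	fmt.Println(s, p, err)
}
```

Replacing every backslash is only safe for Windows-style paths; on other systems a backslash can be a legitimate file name character, so a real fix would probably convert only when running on Windows, for example with filepath.ToSlash.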

prepare for open-source

  • Configure Travis for build
  • Artwork (Stow logo)
  • Set a version number
  • Remove all test credentials

Support local servers

I could not find a way to use a local S3 server. It would be very nice to have support for local versions of services such as S3 as well.

E.g. minio.io, ceph/radosgw, ...

Tests on Travis CI fail with s3

Connection issue? Confirmed with @Xercoy, fails every time.

=== RUN   TestStow
SIGQUIT: quit
PC=0x457461 m=0
goroutine 0 [idle]:
runtime.futex(0x91ae10, 0x0, 0x0, 0x0, 0x7fff00000000, 0x42295e, 0x0, 0x0, 0x7fffc4361de8, 0x40eefb, ...)
    /home/travis/.gimme/versions/go/src/runtime/sys_linux_amd64.s:387 +0x21
runtime.futexsleep(0x91ae10, 0x7fff00000000, 0xffffffffffffffff)
    /home/travis/.gimme/versions/go/src/runtime/os_linux.go:45 +0x62
runtime.notesleep(0x91ae10)
    /home/travis/.gimme/versions/go/src/runtime/lock_futex.go:145 +0x6b
runtime.stopm()
    /home/travis/.gimme/versions/go/src/runtime/proc.go:1617 +0xad
runtime.findrunnable(0xc420023300, 0x0)
    /home/travis/.gimme/versions/go/src/runtime/proc.go:2049 +0x241
runtime.schedule()
    /home/travis/.gimme/versions/go/src/runtime/proc.go:2148 +0x14c
runtime.park_m(0xc4200d21a0)
    /home/travis/.gimme/versions/go/src/runtime/proc.go:2211 +0xa0
runtime.mcall(0x7fffc4361f70)
    /home/travis/.gimme/versions/go/src/runtime/asm_amd64.s:240 +0x5b
goroutine 1 [chan receive, 9 minutes]:
testing.(*T).Run(0xc4200d60c0, 0x783003, 0x8, 0x796188, 0xc42012fcf0)
    /home/travis/.gimme/versions/go/src/testing/testing.go:675 +0x2ea
testing.runTests.func1(0xc4200d60c0)
    /home/travis/.gimme/versions/go/src/testing/testing.go:830 +0x67
testing.tRunner(0xc4200d60c0, 0xc42012fdb0)
    /home/travis/.gimme/versions/go/src/testing/testing.go:637 +0x81
testing.runTests(0x796200, 0x9161a0, 0x4, 0x4, 0xc42012fe48)
    /home/travis/.gimme/versions/go/src/testing/testing.go:836 +0x299
testing.(*M).Run(0xc42054cef8, 0xc4200c2e20)
    /home/travis/.gimme/versions/go/src/testing/testing.go:771 +0x90
main.main()
    github.com/graymeta/stow/s3/_test/_testmain.go:60 +0xc6
goroutine 17 [syscall, 9 minutes, locked to thread]:
runtime.goexit()
    /home/travis/.gimme/versions/go/src/runtime/asm_amd64.s:2160 +0x1
goroutine 5 [IO wait]:
net.runtime_pollWait(0x7fcaed978fc8, 0x72, 0x3)
    /home/travis/.gimme/versions/go/src/runtime/netpoll.go:164 +0x59
net.(*pollDesc).wait(0xc42020daa8, 0x72, 0x8ebba0, 0x8e8458)
    /home/travis/.gimme/versions/go/src/net/fd_poll_runtime.go:75 +0x38
net.(*pollDesc).waitRead(0xc42020daa8, 0xc420100000, 0x1000)
    /home/travis/.gimme/versions/go/src/net/fd_poll_runtime.go:80 +0x34
net.(*netFD).Read(0xc42020da40, 0xc420100000, 0x1000, 0x1000, 0x0, 0x8ebba0, 0x8e8458)
    /home/travis/.gimme/versions/go/src/net/fd_unix.go:246 +0x181
net.(*conn).Read(0xc420190150, 0xc420100000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /home/travis/.gimme/versions/go/src/net/net.go:176 +0x70
crypto/tls.(*block).readFromUntil(0xc420330ab0, 0x7fcaed9300c8, 0xc420190150, 0x5, 0xc420190150, 0x0)
    /home/travis/.gimme/versions/go/src/crypto/tls/conn.go:481 +0x91
crypto/tls.(*Conn).readRecord(0xc4202d9500, 0x796a17, 0xc4202d9608, 0x0)
    /home/travis/.gimme/versions/go/src/crypto/tls/conn.go:583 +0xc4
crypto/tls.(*Conn).Read(0xc4202d9500, 0xc420101000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
    /home/travis/.gimme/versions/go/src/crypto/tls/conn.go:1120 +0x116
net/http.(*persistConn).Read(0xc420089200, 0xc420101000, 0x1000, 0x1000, 0x8f, 0x0, 0x0)
    /home/travis/.gimme/versions/go/src/net/http/transport.go:1292 +0x14b
bufio.(*Reader).fill(0xc4201feba0)
    /home/travis/.gimme/versions/go/src/bufio/bufio.go:97 +0x10b
bufio.(*Reader).ReadSlice(0xc4201feba0, 0xc42038460a, 0x2, 0x2, 0x2, 0x0, 0x0)
    /home/travis/.gimme/versions/go/src/bufio/bufio.go:338 +0xb5
net/http/internal.readChunkLine(0xc4201feba0, 0xc4201feba0, 0xc4203846b0, 0x2, 0x2, 0x2)
    /home/travis/.gimme/versions/go/src/net/http/internal/chunked.go:122 +0x34
net/http/internal.(*chunkedReader).beginChunk(0xc420384690)
    /home/travis/.gimme/versions/go/src/net/http/internal/chunked.go:48 +0x32
net/http/internal.(*chunkedReader).Read(0xc420384690, 0xc420274689, 0x577, 0x577, 0xc420544f40, 0xc420274600, 0x60b3b6)
    /home/travis/.gimme/versions/go/src/net/http/internal/chunked.go:93 +0x11d
net/http.(*body).readLocked(0xc4203e9c80, 0xc420274689, 0x577, 0x577, 0xc4205352c8, 0x42923d, 0x795c30)
    /home/travis/.gimme/versions/go/src/net/http/transfer.go:648 +0x61
net/http.(*body).Read(0xc4203e9c80, 0xc420274689, 0x577, 0x577, 0x0, 0x0, 0x0)
    /home/travis/.gimme/versions/go/src/net/http/transfer.go:640 +0xf6
net/http.(*bodyEOFSignal).Read(0xc4203e9d00, 0xc420274689, 0x577, 0x577, 0x0, 0x0, 0x0)
    /home/travis/.gimme/versions/go/src/net/http/transport.go:2011 +0xe9
bytes.(*Buffer).ReadFrom(0xc4205353b8, 0x8ea3e0, 0xc4203e9d00, 0xc4203f4000, 0x0, 0x200)
    /home/travis/.gimme/versions/go/src/bytes/buffer.go:179 +0x155
io/ioutil.readAll(0x8ea3e0, 0xc4203e9d00, 0x200, 0x0, 0x0, 0x0, 0x0, 0x0)
    /home/travis/.gimme/versions/go/src/io/ioutil/ioutil.go:33 +0x150
io/ioutil.ReadAll(0x8ea3e0, 0xc4203e9d00, 0xc4203e9d00, 0x8ea3e0, 0xc4203e9d00, 0xc4200300c0, 0x199)
    /home/travis/.gimme/versions/go/src/io/ioutil/ioutil.go:42 +0x3e
github.com/aws/aws-sdk-go/service/s3.buildGetBucketLocation(0xc4201bea80)
    /home/travis/gopath/src/github.com/aws/aws-sdk-go/service/s3/bucket_location.go:18 +0xb9
github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run(0xc4201bec30, 0xc4201bea80)
    /home/travis/gopath/src/github.com/aws/aws-sdk-go/aws/request/handlers.go:136 +0x87
github.com/aws/aws-sdk-go/aws/request.(*Request).Send(0xc4201bea80, 0xc4200300b8, 0xc4201bea80)
    /home/travis/gopath/src/github.com/aws/aws-sdk-go/aws/request/request.go:318 +0x463
github.com/aws/aws-sdk-go/service/s3.(*S3).GetBucketLocation(0xc420030098, 0xc4200300b8, 0x0, 0x0, 0x1)
    /home/travis/gopath/src/github.com/aws/aws-sdk-go/service/s3/api.go:1244 +0x4d
github.com/graymeta/stow/s3.(*location).Containers(0xc4200c3160, 0x0, 0x0, 0x0, 0x0, 0x64, 0xc420331f80, 0x7fcaed9d0000, 0x0, 0x71d9e0, ...)
    /home/travis/gopath/src/github.com/graymeta/stow/s3/location.go:70 +0x14f
github.com/graymeta/stow.WalkContainers(0x8efe60, 0xc4200c3160, 0x0, 0x0, 0x64, 0xc42018d100, 0x8e9d20, 0xc4203cda00)
    /home/travis/gopath/src/github.com/graymeta/stow/walk.go:63 +0x88
github.com/graymeta/stow/test.All(0xc4200d6180, 0x78166d, 0x2, 0x8eb420, 0xc420015800)
    /home/travis/gopath/src/github.com/graymeta/stow/test/test.go:238 +0x4569
github.com/graymeta/stow/s3.TestStow(0xc4200d6180)
    /home/travis/gopath/src/github.com/graymeta/stow/s3/stow_test.go:20 +0x109
testing.tRunner(0xc4200d6180, 0x796188)
    /home/travis/.gimme/versions/go/src/testing/testing.go:637 +0x81
created by testing.(*T).Run
    /home/travis/.gimme/versions/go/src/testing/testing.go:674 +0x2c0
goroutine 467 [select]:
net/http.(*persistConn).writeLoop(0xc420089200)
    /home/travis/.gimme/versions/go/src/net/http/transport.go:1680 +0x3b4
created by net/http.(*Transport).dialConn
    /home/travis/.gimme/versions/go/src/net/http/transport.go:1094 +0x7e2
goroutine 466 [select]:
net/http.(*persistConn).readLoop(0xc420089200)
    /home/travis/.gimme/versions/go/src/net/http/transport.go:1575 +0x9dc
created by net/http.(*Transport).dialConn
    /home/travis/.gimme/versions/go/src/net/http/transport.go:1093 +0x7bd
rax    0xca
rbx    0x0
rcx    0xffffffffffffffff
rdx    0x0
rdi    0x91ae10
rsi    0x0
rbp    0x7fffc4361db8
rsp    0x7fffc4361d70
r8     0x0
r9     0x0
r10    0x0
r11    0x286
r12    0xc4201feca8
r13    0x0
r14    0xc4201fec60
r15    0x0
rip    0x457461
rflags 0x286
cs     0x33
fs     0x0
gs     0x0
*** Test killed with quit: ran too long (10m0s).
FAIL    github.com/graymeta/stow/s3 600.005s

Support parallel streaming upload to s3

E.g. as implemented in https://github.com/rlmcpherson/s3gof3r

The other feature that isn’t available in most other S3 clients is pipeline support, which is made easy with Go’s reader and writer interfaces. This allows usage like
$ tar -czf - <my_dir/> | gof3r put -b <s3_bucket> -k <s3_object>
$ gof3r get -b <s3_bucket> -k <s3_object> | tar -zx
We use the command line tool at CodeGuard to transfer many terabytes into and out of S3 every day, tarring directories in parallel with the uploads and downloads.

Besides parallel upload in general, streaming upload is a really handy feature for big file transfers.
That project also has the added benefit of much more robust error handling and parallel upload/download.

I'm posting this here in the spirit of providing feedback, since e.g. parallel multipart uploading might influence the API and the required configuration if it is added later on.
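To make the API implications more concrete, here is a minimal sketch of streaming a reader to a backend in parallel parts. The uploadFunc type stands in for a backend "upload part" request and is hypothetical; it is not part of stow, s3gof3r, or the AWS SDK. A real multipart upload would also need to start and complete the upload and retry failed parts.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"sync"
)

// part is one chunk of the stream with its 1-based sequence number.
type part struct {
	num  int
	data []byte
}

// uploadFunc is a hypothetical stand-in for a backend "upload part" call.
type uploadFunc func(num int, data []byte) error

// streamUpload reads r in chunks of partSize bytes and hands them to
// `workers` goroutines. It returns the number of parts read from r.
func streamUpload(r io.Reader, partSize, workers int, upload uploadFunc) (int, error) {
	if partSize < 1 || workers < 1 {
		return 0, fmt.Errorf("partSize and workers must be positive")
	}
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	parts := make(chan part)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range parts {
				mu.Lock()
				failed := firstErr != nil
				mu.Unlock()
				if failed {
					continue // keep draining so the reader never blocks
				}
				if err := upload(p.num, p.data); err != nil {
					mu.Lock()
					if firstErr == nil {
						firstErr = err
					}
					mu.Unlock()
				}
			}
		}()
	}
	n := 0
	var readErr error
	for {
		buf := make([]byte, partSize)
		m, err := io.ReadFull(r, buf)
		if m > 0 {
			n++
			parts <- part{num: n, data: buf[:m]}
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			break // end of stream; the last part may be short
		}
		if err != nil {
			readErr = err
			break
		}
	}
	close(parts)
	wg.Wait()
	if readErr != nil {
		return n, readErr
	}
	return n, firstErr
}

// reassemble uploads s into memory through streamUpload and joins the parts
// back together in order, which makes the sketch easy to check.
func reassemble(s string, partSize, workers int) (string, int, error) {
	var mu sync.Mutex
	got := map[int]string{}
	n, err := streamUpload(strings.NewReader(s), partSize, workers, func(num int, data []byte) error {
		mu.Lock()
		defer mu.Unlock()
		got[num] = string(data)
		return nil
	})
	var b strings.Builder
	for i := 1; i <= n; i++ {
		b.WriteString(got[i])
	}
	return b.String(), n, err
}

func main() {
	fmt.Println(reassemble("hello world!", 5, 3))
}
```

Because the total size is not known up front, an API built this way could not require a size argument the way container.Put does in the snippets above, which is one example of how this feature might influence the interface.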
