graymeta / stow
Cloud storage abstraction package for Go
License: Apache License 2.0
Setup Travis CI for Stow and all implementations.
is_dir and path etc. should be exposed in the package as:
const (
MetadataPath = "path"
MetadataIsDir = "is_dir"
MetadataPerm = "perm"
)
This formalises the metadata available in each implementation.
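A minimal sketch of how calling code might use such constants, assuming metadata is a map[string]interface{} and that is_dir holds a bool. The isDir helper and the sample values are hypothetical and not part of Stow:

```go
package main

import "fmt"

// Keys mirror the constants proposed above.
const (
	MetadataPath  = "path"
	MetadataIsDir = "is_dir"
	MetadataPerm  = "perm"
)

// isDir is a hypothetical helper. It reports whether the metadata marks the
// item as a directory, and returns false when the key is missing or not a bool.
func isDir(md map[string]interface{}) bool {
	v, ok := md[MetadataIsDir].(bool)
	return ok && v
}

func main() {
	// Made-up metadata, shaped like what a local implementation might return.
	md := map[string]interface{}{
		MetadataPath:  "/tmp/example",
		MetadataIsDir: true,
		MetadataPerm:  "drwxr-xr-x",
	}
	fmt.Println(md[MetadataPath], isDir(md))
}
```

Using the constants instead of string literals means a typo in a key becomes a compile error rather than a silent miss.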
Unable to harvest from Azure containers with spaces.
As per the discussion on #37 we should add the ItemsWithPrefix interface and update Walk to make use of it, like in @ernesto-jimenez's example.
So the Container interface loses the prefix argument from Items.
And we'll need a ContainersWithPrefix counterpart.
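One way this could look, as a rough sketch only. The interfaces below are cut-down stand-ins, not the real Stow definitions, and the method signatures, listWithPrefix helper, and in-memory container are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// Item and Container are simplified stand-ins for the Stow interfaces.
type Item interface{ Name() string }

type Container interface {
	// Items no longer takes a prefix argument.
	Items(cursor string, count int) ([]Item, string, error)
}

// ItemsWithPrefix is the optional interface for containers that can filter
// by prefix natively (assumed signature).
type ItemsWithPrefix interface {
	ItemsWithPrefix(prefix, cursor string, count int) ([]Item, string, error)
}

// listWithPrefix uses native prefix support when the container offers it and
// otherwise filters one page client-side. In the fallback path a page may
// hold fewer than count matches.
func listWithPrefix(c Container, prefix, cursor string, count int) ([]Item, string, error) {
	if p, ok := c.(ItemsWithPrefix); ok {
		return p.ItemsWithPrefix(prefix, cursor, count)
	}
	items, next, err := c.Items(cursor, count)
	if err != nil {
		return nil, "", err
	}
	var out []Item
	for _, it := range items {
		if strings.HasPrefix(it.Name(), prefix) {
			out = append(out, it)
		}
	}
	return out, next, nil
}

// memItem and memContainer are a toy in-memory implementation without
// native prefix support. Items ignores cursor and count for brevity.
type memItem string

func (m memItem) Name() string { return string(m) }

type memContainer []memItem

func (c memContainer) Items(cursor string, count int) ([]Item, string, error) {
	out := make([]Item, 0, len(c))
	for _, it := range c {
		out = append(out, it)
	}
	return out, "", nil
}

func main() {
	c := memContainer{"a/1", "a/2", "b/1"}
	items, _, err := listWithPrefix(c, "a/", "", 10)
	if err != nil {
		panic(err)
	}
	for _, it := range items {
		fmt.Println(it.Name())
	}
}
```

A Walk helper could call something like listWithPrefix so that callers do not need to know which implementations support prefixes.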
Assumption: Most implementations provide a way to get and set metadata for items and containers.
Examples of metadata include custom stuff set by users or other tools, and in the local implementation, os metadata.
Along with existing functionality, we are going to add a virtual container called Root which represents the root folder itself.
Add a doc.go file to each implementation with a small summary of what the package provides.
E.g.
// Package local provides a filesystem Location. Folders are containers, and
// files are items.
package local
err 2016-07-21T13:12:27Z staged: ItemByURL(azure://.../Raw%20Untranscoded%20Footage/MVI_2853.MOV_map): wrong path
eg. as implemented in https://github.com/rlmcpherson/s3gof3r
The other feature that isn’t available in most other S3 clients is pipeline support, which is made easy with Go’s reader and writer interfaces. This allows usage like
$ tar -czf - <my_dir/> | gof3r put -b <s3_bucket> -k <s3_object>
$ gof3r get -b <s3_bucket> -k <s3_object> | tar -zx
We use the command line tool at CodeGuard to transfer many terabytes into and out of S3 every day, tarring directories in parallel with the uploads and downloads.
Besides parallel upload in general, streaming upload is a really handy feature for big file transfers.
This project also has the added benefit of much more robust error handling and parallel upload/download.
I'm posting this here in the spirit of providing feedback, since features such as parallel multipart uploading might influence the API and the configuration it requires if they are added later on.
A command line tool for stow would be quite a powerful addition.
Examples:
stow list containers -kind azure -config key="value" key2="value2"
stow add item -container name -kind azure -config etc...
stow download item -container name -kind azure -config etc.... >> file.txt
cat file.txt | stow upload item -container name -kind azure -config etc...
github.com/graymeta/stow/cmd/stow
Currently the Azure implementation only supports putting/uploading files smaller than 4MB. We should be able to upload files of arbitrary size.
see #55
see Removing credentials from each commit while retaining credit/authorship
It would be very nice to have a unified way to get upload and download progress. Any plans on adding this?
Stow logo.
Can a location provide a suitable name?
Check Azure, S3, Swift, APIs etc. to see if they will (once dialled) be able to return a name
Like in blob stores, nested local storage should show up like this:
parent
parent/child
parent/child/grandchild
parent/child2
instead of what we have now:
parent
Connection issue? Confirmed with @Xercoy, fails every time.
=== RUN TestStow
SIGQUIT: quit
PC=0x457461 m=0
goroutine 0 [idle]:
runtime.futex(0x91ae10, 0x0, 0x0, 0x0, 0x7fff00000000, 0x42295e, 0x0, 0x0, 0x7fffc4361de8, 0x40eefb, ...)
/home/travis/.gimme/versions/go/src/runtime/sys_linux_amd64.s:387 +0x21
runtime.futexsleep(0x91ae10, 0x7fff00000000, 0xffffffffffffffff)
/home/travis/.gimme/versions/go/src/runtime/os_linux.go:45 +0x62
runtime.notesleep(0x91ae10)
/home/travis/.gimme/versions/go/src/runtime/lock_futex.go:145 +0x6b
runtime.stopm()
/home/travis/.gimme/versions/go/src/runtime/proc.go:1617 +0xad
runtime.findrunnable(0xc420023300, 0x0)
/home/travis/.gimme/versions/go/src/runtime/proc.go:2049 +0x241
runtime.schedule()
/home/travis/.gimme/versions/go/src/runtime/proc.go:2148 +0x14c
runtime.park_m(0xc4200d21a0)
/home/travis/.gimme/versions/go/src/runtime/proc.go:2211 +0xa0
runtime.mcall(0x7fffc4361f70)
/home/travis/.gimme/versions/go/src/runtime/asm_amd64.s:240 +0x5b
goroutine 1 [chan receive, 9 minutes]:
testing.(*T).Run(0xc4200d60c0, 0x783003, 0x8, 0x796188, 0xc42012fcf0)
/home/travis/.gimme/versions/go/src/testing/testing.go:675 +0x2ea
testing.runTests.func1(0xc4200d60c0)
/home/travis/.gimme/versions/go/src/testing/testing.go:830 +0x67
testing.tRunner(0xc4200d60c0, 0xc42012fdb0)
/home/travis/.gimme/versions/go/src/testing/testing.go:637 +0x81
testing.runTests(0x796200, 0x9161a0, 0x4, 0x4, 0xc42012fe48)
/home/travis/.gimme/versions/go/src/testing/testing.go:836 +0x299
testing.(*M).Run(0xc42054cef8, 0xc4200c2e20)
/home/travis/.gimme/versions/go/src/testing/testing.go:771 +0x90
main.main()
github.com/graymeta/stow/s3/_test/_testmain.go:60 +0xc6
goroutine 17 [syscall, 9 minutes, locked to thread]:
runtime.goexit()
/home/travis/.gimme/versions/go/src/runtime/asm_amd64.s:2160 +0x1
goroutine 5 [IO wait]:
net.runtime_pollWait(0x7fcaed978fc8, 0x72, 0x3)
/home/travis/.gimme/versions/go/src/runtime/netpoll.go:164 +0x59
net.(*pollDesc).wait(0xc42020daa8, 0x72, 0x8ebba0, 0x8e8458)
/home/travis/.gimme/versions/go/src/net/fd_poll_runtime.go:75 +0x38
net.(*pollDesc).waitRead(0xc42020daa8, 0xc420100000, 0x1000)
/home/travis/.gimme/versions/go/src/net/fd_poll_runtime.go:80 +0x34
net.(*netFD).Read(0xc42020da40, 0xc420100000, 0x1000, 0x1000, 0x0, 0x8ebba0, 0x8e8458)
/home/travis/.gimme/versions/go/src/net/fd_unix.go:246 +0x181
net.(*conn).Read(0xc420190150, 0xc420100000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/home/travis/.gimme/versions/go/src/net/net.go:176 +0x70
crypto/tls.(*block).readFromUntil(0xc420330ab0, 0x7fcaed9300c8, 0xc420190150, 0x5, 0xc420190150, 0x0)
/home/travis/.gimme/versions/go/src/crypto/tls/conn.go:481 +0x91
crypto/tls.(*Conn).readRecord(0xc4202d9500, 0x796a17, 0xc4202d9608, 0x0)
/home/travis/.gimme/versions/go/src/crypto/tls/conn.go:583 +0xc4
crypto/tls.(*Conn).Read(0xc4202d9500, 0xc420101000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/home/travis/.gimme/versions/go/src/crypto/tls/conn.go:1120 +0x116
net/http.(*persistConn).Read(0xc420089200, 0xc420101000, 0x1000, 0x1000, 0x8f, 0x0, 0x0)
/home/travis/.gimme/versions/go/src/net/http/transport.go:1292 +0x14b
bufio.(*Reader).fill(0xc4201feba0)
/home/travis/.gimme/versions/go/src/bufio/bufio.go:97 +0x10b
bufio.(*Reader).ReadSlice(0xc4201feba0, 0xc42038460a, 0x2, 0x2, 0x2, 0x0, 0x0)
/home/travis/.gimme/versions/go/src/bufio/bufio.go:338 +0xb5
net/http/internal.readChunkLine(0xc4201feba0, 0xc4201feba0, 0xc4203846b0, 0x2, 0x2, 0x2)
/home/travis/.gimme/versions/go/src/net/http/internal/chunked.go:122 +0x34
net/http/internal.(*chunkedReader).beginChunk(0xc420384690)
/home/travis/.gimme/versions/go/src/net/http/internal/chunked.go:48 +0x32
net/http/internal.(*chunkedReader).Read(0xc420384690, 0xc420274689, 0x577, 0x577, 0xc420544f40, 0xc420274600, 0x60b3b6)
/home/travis/.gimme/versions/go/src/net/http/internal/chunked.go:93 +0x11d
net/http.(*body).readLocked(0xc4203e9c80, 0xc420274689, 0x577, 0x577, 0xc4205352c8, 0x42923d, 0x795c30)
/home/travis/.gimme/versions/go/src/net/http/transfer.go:648 +0x61
net/http.(*body).Read(0xc4203e9c80, 0xc420274689, 0x577, 0x577, 0x0, 0x0, 0x0)
/home/travis/.gimme/versions/go/src/net/http/transfer.go:640 +0xf6
net/http.(*bodyEOFSignal).Read(0xc4203e9d00, 0xc420274689, 0x577, 0x577, 0x0, 0x0, 0x0)
/home/travis/.gimme/versions/go/src/net/http/transport.go:2011 +0xe9
bytes.(*Buffer).ReadFrom(0xc4205353b8, 0x8ea3e0, 0xc4203e9d00, 0xc4203f4000, 0x0, 0x200)
/home/travis/.gimme/versions/go/src/bytes/buffer.go:179 +0x155
io/ioutil.readAll(0x8ea3e0, 0xc4203e9d00, 0x200, 0x0, 0x0, 0x0, 0x0, 0x0)
/home/travis/.gimme/versions/go/src/io/ioutil/ioutil.go:33 +0x150
io/ioutil.ReadAll(0x8ea3e0, 0xc4203e9d00, 0xc4203e9d00, 0x8ea3e0, 0xc4203e9d00, 0xc4200300c0, 0x199)
/home/travis/.gimme/versions/go/src/io/ioutil/ioutil.go:42 +0x3e
github.com/aws/aws-sdk-go/service/s3.buildGetBucketLocation(0xc4201bea80)
/home/travis/gopath/src/github.com/aws/aws-sdk-go/service/s3/bucket_location.go:18 +0xb9
github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run(0xc4201bec30, 0xc4201bea80)
/home/travis/gopath/src/github.com/aws/aws-sdk-go/aws/request/handlers.go:136 +0x87
github.com/aws/aws-sdk-go/aws/request.(*Request).Send(0xc4201bea80, 0xc4200300b8, 0xc4201bea80)
/home/travis/gopath/src/github.com/aws/aws-sdk-go/aws/request/request.go:318 +0x463
github.com/aws/aws-sdk-go/service/s3.(*S3).GetBucketLocation(0xc420030098, 0xc4200300b8, 0x0, 0x0, 0x1)
/home/travis/gopath/src/github.com/aws/aws-sdk-go/service/s3/api.go:1244 +0x4d
github.com/graymeta/stow/s3.(*location).Containers(0xc4200c3160, 0x0, 0x0, 0x0, 0x0, 0x64, 0xc420331f80, 0x7fcaed9d0000, 0x0, 0x71d9e0, ...)
/home/travis/gopath/src/github.com/graymeta/stow/s3/location.go:70 +0x14f
github.com/graymeta/stow.WalkContainers(0x8efe60, 0xc4200c3160, 0x0, 0x0, 0x64, 0xc42018d100, 0x8e9d20, 0xc4203cda00)
/home/travis/gopath/src/github.com/graymeta/stow/walk.go:63 +0x88
github.com/graymeta/stow/test.All(0xc4200d6180, 0x78166d, 0x2, 0x8eb420, 0xc420015800)
/home/travis/gopath/src/github.com/graymeta/stow/test/test.go:238 +0x4569
github.com/graymeta/stow/s3.TestStow(0xc4200d6180)
/home/travis/gopath/src/github.com/graymeta/stow/s3/stow_test.go:20 +0x109
testing.tRunner(0xc4200d6180, 0x796188)
/home/travis/.gimme/versions/go/src/testing/testing.go:637 +0x81
created by testing.(*T).Run
/home/travis/.gimme/versions/go/src/testing/testing.go:674 +0x2c0
goroutine 467 [select]:
net/http.(*persistConn).writeLoop(0xc420089200)
/home/travis/.gimme/versions/go/src/net/http/transport.go:1680 +0x3b4
created by net/http.(*Transport).dialConn
/home/travis/.gimme/versions/go/src/net/http/transport.go:1094 +0x7e2
goroutine 466 [select]:
net/http.(*persistConn).readLoop(0xc420089200)
/home/travis/.gimme/versions/go/src/net/http/transport.go:1575 +0x9dc
created by net/http.(*Transport).dialConn
/home/travis/.gimme/versions/go/src/net/http/transport.go:1093 +0x7bd
rax 0xca
rbx 0x0
rcx 0xffffffffffffffff
rdx 0x0
rdi 0x91ae10
rsi 0x0
rbp 0x7fffc4361db8
rsp 0x7fffc4361d70
r8 0x0
r9 0x0
r10 0x0
r11 0x286
r12 0xc4201feca8
r13 0x0
r14 0xc4201fec60
r15 0x0
rip 0x457461
rflags 0x286
cs 0x33
fs 0x0
gs 0x0
*** Test killed with quit: ran too long (10m0s).
FAIL github.com/graymeta/stow/s3 600.005s
I could not find a way to use a local S3 server. It would be very nice to also support local S3-compatible servers,
e.g. minio.io, ceph/radosgw, ...
We should not hard-code this.
../../../stow/azure/config.go:44: cannot use l (type *location) as type stow.Location in return argument:
*location does not implement stow.Location (missing RemoveContainer method)
../../../stow/azure/container.go:17: cannot use (*container)(nil) (type *container) as type stow.Container in assignment:
*container does not implement stow.Container (missing RemoveItem method)
../../../stow/azure/location.go:36: cannot use container (type *container) as type stow.Container in return argument:
*container does not implement stow.Container (missing RemoveItem method)
../../../stow/azure/location.go:52: cannot use container literal (type *container) as type stow.Container in assignment:
*container does not implement stow.Container (missing RemoveItem method)
The metadata should contain the target of the link as a link field.
To be used when items or containers are not found.
The Dockerfile mentions "Metafarm" and shouldn't. Stow will be open sourced eventually and should be free from any GM details.
Stow service that exposes the file system via an HTTP interface. Would need an associated stowserver implementation (client).
Here: https://github.com/graymeta/stow/blob/master/swift/item.go#L74
Using sync.Once (see this example) will mean the code is safe.
The remote implementation isn't ready for prime time, and may never belong in Stow. Remove it.
It would be nice to be able to run the new architecture on every system. I know Windows isn't the target for us, but the possibility of developing natively on Windows would still be nice.
Although we use filepath.Join extensively, tests aren't passing on Windows (example from the local location connector):
C:\Users\Piotr\Desktop\gopath\src\github.com\graymeta\stow\local>go test -v
=== RUN TestContainers
--- PASS: TestContainers (0.00s)
=== RUN TestContainersPrefix
--- PASS: TestContainersPrefix (0.02s)
=== RUN TestContainer
local_test.go:136: file://C:%5CUsers%5CPiotr%5CDesktop%5Cgopath%5Csrc%5Cgithub.com%5Cgraymeta%5Cstow%5Clocal%5Ctestdata%5Cstow319373961%5Cthree != file://C:\Users\Piotr\Desktop\gopath\src\github.com\graymeta\stow\local\testdata\stow319373961\three
--- FAIL: TestContainer (0.01s)
=== RUN TestNewContainer
--- PASS: TestNewContainer (0.00s)
=== RUN TestCreateItem
--- PASS: TestCreateItem (0.01s)
=== RUN TestItems
--- PASS: TestItems (0.00s)
=== RUN TestByURL
--- PASS: TestByURL (0.00s)
=== RUN TestItemReader
--- PASS: TestItemReader (0.00s)
FAIL
exit status 1
FAIL github.com/graymeta/stow/local 0.071s
Seems like an issue with URL escaping.
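One possible direction, sketched under the assumption that the mismatch comes from backslashes ending up in the URL path: convert the separators to forward slashes before building the URL. fileURL is a hypothetical helper, and strings.ReplaceAll is used instead of filepath.ToSlash only so that the example behaves the same on every OS:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// fileURL builds a file:// URL from a local path, converting Windows
// backslashes to forward slashes so they are not percent-encoded as %5C.
func fileURL(p string) *url.URL {
	p = strings.ReplaceAll(p, `\`, "/")
	if !strings.HasPrefix(p, "/") {
		p = "/" + p // "C:/Users" becomes "/C:/Users"
	}
	return &url.URL{Scheme: "file", Path: p}
}

func main() {
	u := fileURL(`C:\Users\Piotr\stow\three`)
	fmt.Println(u.String())
}
```

Comparing URLs built the same way on both sides of the test would then avoid the escaped versus unescaped difference shown above.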
I started to experiment with Stow and wanted to use the local file system adapter for testing. But I've noticed there are some small inconsistencies between the local file system and the Google Cloud Storage adapter.
I've included 2 snippets to demonstrate the issue. I'm not sure whether this behaviour is intentional, and fixing or changing it could potentially break other people's code.
This is the Google Cloud Storage example. You can see I don't prefix the path passed to container.Put or container.Item with the container name:
package main

import (
	"fmt"
	"strings"

	"github.com/graymeta/stow"
	"github.com/graymeta/stow/google"
)

func main() {
	creds := `{}`
	projectID := ""
	config := stow.ConfigMap{
		"json":       creds,
		"project_id": projectID,
	}
	location, err := stow.Dial(google.Kind, config)
	if err != nil {
		panic(err)
	}
	defer location.Close()
	// List the first page of containers.
	cs, _, _ := location.Containers("", stow.CursorStart, 10)
	for _, c := range cs {
		fmt.Printf("%v -> %v\n", c.Name(), c.ID())
	}
	container, _ := location.Container("niels-test")
	contents := "This is a new file stored in the cloud"
	r := strings.NewReader(contents)
	size := int64(len(contents))
	// The item name is relative to the container.
	item, _ := container.Put("user/nielsstevens/index.json", r, size, nil)
	fmt.Printf("%+v", item)
	result, _ := container.Item("user/nielsstevens/index.json")
	fmt.Printf("%+v", result)
}
But when I try the same approach with the local file system, I need to prefix the names passed to container.Item and container.Put with the config.Path and the container.Name, which the GCS adapter does not require.
package main

import (
	"fmt"
	"strings"

	"github.com/graymeta/stow"
	"github.com/graymeta/stow/local"
)

func main() {
	config := stow.ConfigMap{
		"path": "/tmp/",
	}
	location, _ := stow.Dial(local.Kind, config)
	defer location.Close()
	// List the first page of containers.
	cs, _, _ := location.Containers("", stow.CursorStart, 10)
	for _, c := range cs {
		fmt.Printf("%v -> %v\n", c.Name(), c.ID())
	}
	container, _ := location.Container("niels-test")
	contents := "This is a new file stored in the cloud"
	r := strings.NewReader(contents)
	size := int64(len(contents))
	// Here the item name has to include the config path and the container name.
	item, _ := container.Put("/tmp/niels-test/user/nielsstevens/index.json", r, size, nil)
	fmt.Printf("%+v", item)
	result, _ := container.Item("/tmp/niels-test/user/nielsstevens/index.json")
	fmt.Printf("%+v", result)
}
Can all the implementations provide this?
I was looking into implementing a Backblaze B2 backend for Stow. There are some limitations:
backend.HashType flag that could be set in each backend implementation.
b2.SupportsPrefixes) which could then be used in the test suite to only test things like prefixes on backends that support it. Adding the prefix support ourselves would be easy for buckets, but less so for files (not impossible, just a bit of work to identify the logic).
Implement enum-like location types, so we don't use strings as location names.
E.g.:
type LocationType int
//go:generate stringer -type=LocationType
const (
Azure LocationType = iota
S3
Local
)
// Config contains LocationType like Azure with properties,
// e.g. account number and key.
type Config struct {
LocationType LocationType
Properties map[string]interface{}
}
Amazon S3 implementation.
doc.go:
// Package stow provides an abstraction on cloud storage capabilities.
package stow
Remote implementation currently has no tests.
Further along the road, let's implement some common error types, e.g.:
var (
	// Error strings follow the Go convention: lower case, no trailing punctuation.
	ErrWrCfg       = errors.New("wrong config")
	ErrNotFound    = errors.New("element not found")
	ErrNoMoreItems = errors.New("no more items")
	ErrNoContent   = errors.New("no content")
	ErrIOProb      = errors.New("couldn't read/write")
)
Every error that is common to local and azure should be implemented in this package.
Might as well make local and remote implementations expose a single container that contains everything. This will enable harvesting of files in the root, as well as simplify things.
Perhaps the single container could be called "All"?
We should add context for cancellation and timeout support.
Put the All container for local at the TOP (i.e. first item).
We have a few root directories "test", "build", etc. Should we put providers inside a subdirectory or are we OK with them being alongside non-provider dirs?
Use cursor pattern for paging.
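A minimal sketch of the cursor pattern over an in-memory slice, assuming an empty cursor marks both the start and the end of the listing. The page function and its offset-based cursor encoding are illustrative only; real backends would return their own opaque cursors:

```go
package main

import (
	"fmt"
	"strconv"
)

// CursorStart is the cursor value that requests the first page.
const CursorStart = ""

// page returns up to count names starting at cursor, plus the cursor for
// the next page. An empty next cursor means there are no more pages.
func page(all []string, cursor string, count int) ([]string, string, error) {
	if count <= 0 {
		return nil, "", fmt.Errorf("count must be positive, got %d", count)
	}
	start := 0
	if cursor != CursorStart {
		n, err := strconv.Atoi(cursor)
		if err != nil || n < 0 || n > len(all) {
			return nil, "", fmt.Errorf("bad cursor %q", cursor)
		}
		start = n
	}
	end := start + count
	if end >= len(all) {
		return all[start:], "", nil
	}
	return all[start:end], strconv.Itoa(end), nil
}

func main() {
	all := []string{"a", "b", "c", "d", "e"}
	cursor := CursorStart
	for {
		names, next, err := page(all, cursor, 2)
		if err != nil {
			panic(err)
		}
		fmt.Println(names)
		if next == "" {
			break
		}
		cursor = next
	}
}
```

Callers treat the cursor as opaque and simply pass it back, which lets each backend map it to whatever its API provides.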