bbengfort / fluidfs

A highly consistent distributed filesystem built with FUSE

Home Page: http://www.fluidfs.com

License: MIT License

Makefile 0.53% Go 98.57% HTML 0.90%

fuse-filesystem replica distributed-systems

fluidfs's Introduction

FluidFS

A highly consistent distributed filesystem

For more information, please see the documentation: bbengfort.github.io/fluidfs.

Getting Started

The easiest way to get started with FluidFS is to download the source code and install it using the go get command as follows:

$ go get github.com/bbengfort/fluidfs
$ go install github.com/bbengfort/fluidfs/cmd/...

Two commands should now be in your $PATH: the fluidfs command, which runs the FluidFS server, and the fluid command, a client to the server. To configure FluidFS, create a folder in your home directory called .fluidfs and copy the fixtures/config-example.yml file to that folder.

$ mkdir ~/.fluidfs
$ cp fixtures/config-example.yml ~/.fluidfs/config.yml

The configuration file has many comments to guide you in the setup. Open the file for editing and ensure that at least the pid configuration is set to a number greater than 0. You can then start the FluidFS server as follows:

$ fluidfs start

This should run the server, but since no mount points have been defined, there will be no file system interaction. Define a mount point with the client:

$ fluid mount ~user ~/Fluid

This creates the ~user prefix (bucket) and mounts it to the directory Fluid in your home directory. Make sure that this directory exists! You may have to restart the FluidFS server for changes to take effect. Once done, you can cd into the ~/Fluid directory, create and modify files as needed, and FluidFS will track them.

To view the web interface for FluidFS, open it as follows:

$ fluid web

This will open the default browser to the web interface.

Binary Assets

The web interface for FluidFS is compiled as binary assets along with the fluidfs server. When adding new web interface functionality, ensure that the assets are rebuilt by using the following command:

$ go-bindata-assetfs assets/...

Make sure when you do so, you're in the fluid directory. Note that you'll have to change the package of the generated bindata_assetfs.go file, and potentially also handle the naming of several of the functions that fail the linter test. For more information on the binary assets, see: go-bindata-assetfs.

Development

The primary interface is a command line program that interacts directly with the fluid library. Note that cmd/fluid/main.go uses the CLI library rather than implementing console commands itself. Building from source is implemented using the included Makefile, which fetches dependencies and builds locally rather than to the $GOPATH:

$ make

There is an RSpec-style test suite that uses Ginkgo and Gomega. These tests can be run with the Makefile:

$ make test

Note that labels in the GitHub issues are defined in the blog post: How we use labels on GitHub Issues at Mediocre Laboratories.

The repository is set up in a typical production/release/development cycle as described in A Successful Git Branching Model. A typical workflow is as follows:

Select a card from the dev board - preferably one that is "ready" then move it to "in-progress".
Create a branch off of develop called "feature-[feature name]", work and commit into that branch.
```
 ~$ git checkout -b feature-myfeature develop
```

Once you are done working (and everything is tested) merge your feature into develop.

 ~$ git checkout develop
 ~$ git merge --no-ff feature-myfeature
 ~$ git branch -d feature-myfeature
 ~$ git push origin develop

Repeat. Releases will be routinely pushed into master via release branches, then deployed to the server.

Agile Board and Documentation

The development board can be found on Waffle:

https://waffle.io/bbengfort/fluidfs

The documentation can be built and served locally with mkdocs:

$ mkdocs serve

The latest version of the documentation is hosted with GitHub Pages and can be found at the project link: bbengfort.github.io/fluidfs. To build and publish the documentation, use the Makefile:

$ make publish

This will use the mkdocs gh-deploy command to build the site to the gh-pages branch and will push to origin.

About

FluidFS is a research project to create a distributed file system in user space (FUSE) that is highly consistent and reliable. It is meant as a Dropbox replacement, allowing direct synchronization between devices on a personal network rather than going through a cloud service.

Attribution

The image used in this README, "Atlanta - Georgia Aquarium" by Milos Kravcik is licensed under CC-BY-NC-ND 2.0.

fluidfs's Issues

Linearizable w/ timeouts?

Strong guarantees for correctness: linearizable w/ timeouts, is this still strong? E.g. if we implement a lease mechanism such that the leader can take away locks, is this still linearizable?

See #11 for second question.

Directory versioning

For getting back earlier directory versions, including files not present in current version.

Cache Invalidation

Make sure that FUSE invalidates the kernel cache so that all system calls go through our application.

Check the clockfs example in FUSE.

Create FS API

Create an API that both the workload generation client and the Fuse client can interact with -- the front-end of the FluidFS daemon process.

fstab

create an fstab (file system table) file to describe the mount points for fluidfs (allowing multiple mount points per local replica).

In previous systems we only have had a single mount point, but as FluidFS is getting more complex, I want to be able to allow for multiple mounts - particularly as we start setting up subquorums for different tag spaces.

Optimize Lookup (Double Lookup Problem)

Right now we store the entities on directories in a single Children mapping. So in order to do a lookup, we have to first get the entity from the namespace, then get the data from the prefix or version table.

This is unavoidable for files since we are maintaining versions for all the files. However, we could skip the double lookup for directories by storing ChildDirs and ChildFiles maps instead, and that way we'd know exactly where to look for child data.
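A minimal sketch of the split mapping, to make the idea concrete. The field names ChildDirs and ChildFiles come from the note above; the Dir type, the Table constants, and the use of string keys are assumptions for illustration and are not taken from the FluidFS source.

```go
package main

import "fmt"

// Table names the store that a child's data lives in (hypothetical type).
type Table string

const (
	PrefixTable  Table = "prefix"  // directory metadata
	VersionTable Table = "version" // file version metadata
)

// Dir is a hypothetical directory entity. Instead of one Children map it
// keeps two maps, so the kind of each child is known without a second read.
type Dir struct {
	Name       string
	ChildDirs  map[string]string // child name -> key in the prefix table
	ChildFiles map[string]string // child name -> key in the version table
}

// Lookup reports which table holds the named child and under which key.
func (d *Dir) Lookup(name string) (Table, string, bool) {
	if key, ok := d.ChildDirs[name]; ok {
		return PrefixTable, key, true
	}
	if key, ok := d.ChildFiles[name]; ok {
		return VersionTable, key, true
	}
	return "", "", false
}

func main() {
	root := &Dir{
		Name:       "/",
		ChildDirs:  map[string]string{"docs": "/docs"},
		ChildFiles: map[string]string{"foo.txt": "foo.txt@1"},
	}
	table, key, ok := root.Lookup("docs")
	fmt.Println(table, key, ok)
}
```

The trade-off is that name collisions between a file and a directory now have to be checked across two maps on create.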

Web Interface

Create the web interface for visualizing the status of FluidFS as well as for command and control purposes.

Losing fork thread

So a replica that loses in a fork continues on his merry way w/ a fork of the file system, using the DB buckets etc. Need to have this stored, and retrievable. Could potentially do this w/ a filter applied to name before the name DB bucket.

Logging

Create a logging system that has the following log levels:

DEBUG
INFO
WARNING
ERROR
FATAL

And add logging system to FluidFS.

Note the logging must be configurable from the YAML file.
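A rough sketch of what the levels and the configurable threshold could look like. The level names are the ones listed above; ParseLevel, Logger, and the idea that the YAML file carries the level as a plain string are assumptions, not the actual FluidFS logger.

```go
package main

import (
	"fmt"
	"strings"
)

// Level is a log severity, ordered from least to most severe.
type Level int

const (
	Debug Level = iota
	Info
	Warning
	Error
	Fatal
)

var levelNames = [...]string{"DEBUG", "INFO", "WARNING", "ERROR", "FATAL"}

// String returns the upper-case name of the level.
func (l Level) String() string {
	if l < Debug || l > Fatal {
		return "UNKNOWN"
	}
	return levelNames[l]
}

// ParseLevel converts a string, such as a value read from the YAML
// configuration, into a Level. Matching ignores case and outer whitespace.
func ParseLevel(s string) (Level, error) {
	name := strings.ToUpper(strings.TrimSpace(s))
	for i, n := range levelNames {
		if n == name {
			return Level(i), nil
		}
	}
	return Info, fmt.Errorf("unknown log level %q", s)
}

// Logger drops every message below its minimum level.
type Logger struct {
	Min Level
}

// Enabled reports whether a message at the given level would be written.
func (l *Logger) Enabled(level Level) bool { return level >= l.Min }

// Logf writes a formatted message to standard output if the level is enabled.
func (l *Logger) Logf(level Level, format string, args ...interface{}) {
	if !l.Enabled(level) {
		return
	}
	fmt.Printf("%s: %s\n", level, fmt.Sprintf(format, args...))
}

func main() {
	level, err := ParseLevel("warning") // pretend this came from config.yml
	if err != nil {
		fmt.Println(err)
		return
	}
	logger := &Logger{Min: level}
	logger.Logf(Info, "this message is filtered out")
	logger.Logf(Error, "could not mount %s", "~/Fluid")
}
```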

Project setup

Set up the GitHub project for FlowFS.

Write optimization

Current mechanism of writes, on close:

chunkify file (ready for anti-entropy)
create meta information
send meta-information to the leader
store cached version locally
begin consensus for the write.

Optimization:
We don't need to chunkify before sending the close to the leader; just send the version meta. Later we can instantiate the version either in the log or externally, possibly w/ anti-entropy of blobs.

autoip

Create a mechanism to detect the external IP address of the replica and to report it to other devices.

We should also have a mechanism to routinely check the external ip address so that mobile devices can maintain their external connections.

Hard link name (and link refactoring)

When a hard link is created the name isn't modified, meaning that there appear to be copies of the file and in order to remove the hard link, the referenced name must be removed, which is impossible to see.

Add Godeps

Make sure dependencies are vendored with Godeps.

Analyze blob storage performance

Answer the following questions:

Which is faster, blob storage in BoltDB or on disk?
How many files can be stored in a single directory before we lose performance?
What is the optimal disk structure for storing blobs?

Optimize hierarchical blob storage

File versioning

Create the ability to lookup historical version information from FluidFS.

Blob Replication

Create the anti-entropy blob replication mechanism.

Write Locks and Version Replication 1

Create version replication with write locks:

On READ the system requests a lock for the file.
On WRITE the system replicates the version and lock release.

Note that there are actually three modes of operation:

Best Effort
Transactions
Locks/Leases (Linearizability)

The procedure is documented as follows:

On a read access (open):

Read the latest version from the DB (check if locked, if so abort)
Request a lock from the leader (consensus occurs)
On lock commit, update metadata and return the file's blobs.

On a write access (close):

store the blobs and the write metadata into the cache (read your writes)
forward the remote write to the leader
if leader is unavailable, keep retrying; both version decision and lock release occur simultaneously
if leader rejects the write (e.g. the lock was removed before the write), mark the conflict on the file (similar to the Git method of marking merge conflicts).
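The open/close procedure above can be sketched with an in-memory stand-in for the leader. Everything here is an assumption for illustration: the Leader type, its method names, and the use of a mutex in place of consensus. Retries, caching, and blob handling are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Leader is a hypothetical in-memory stand-in for the leader. A mutex plays
// the role that consensus would play in the real system.
type Leader struct {
	mu       sync.Mutex
	holders  map[string]string // file name -> replica holding the lock
	versions map[string]int    // file name -> latest committed version
}

// NewLeader returns a leader with no locks and no versions.
func NewLeader() *Leader {
	return &Leader{holders: map[string]string{}, versions: map[string]int{}}
}

// Lock is called on open. It grants the lock unless another replica holds it.
func (l *Leader) Lock(file, replica string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if holder, ok := l.holders[file]; ok && holder != replica {
		return false // already locked, so the open aborts
	}
	l.holders[file] = replica
	return true
}

// Revoke takes the lock away, for example when a lease times out.
func (l *Leader) Revoke(file string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.holders, file)
}

// Write is called on close. The version decision and the lock release happen
// in one step. The write is rejected if the replica no longer holds the lock,
// in which case the caller would mark a conflict on the file.
func (l *Leader) Write(file, replica string) (int, bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	holder, ok := l.holders[file]
	if !ok || holder != replica {
		return l.versions[file], false
	}
	l.versions[file]++
	delete(l.holders, file)
	return l.versions[file], true
}

func main() {
	leader := NewLeader()
	fmt.Println(leader.Lock("foo.txt", "alpha"))  // alpha opens the file
	fmt.Println(leader.Lock("foo.txt", "bravo"))  // bravo tries while it is locked
	fmt.Println(leader.Write("foo.txt", "alpha")) // alpha closes the file
}
```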

Hard and Symlinks

Create hard and symlink functionality using FUSE.

Implement the following functions:

Link()
Symlink()
Readlink()

For now these are going to be implemented simply as files. Hardlinks created with Link are just going to be named references to the File pointer they refer to (See #42). Symlinks are just files whose data is the name of the target.

Analyze and build soft-state prefix caching

Can we use the key-* mechanism of BoltDB to list a directory's contents? E.g. the directory is the prefix, list everything below.

However, if we list "/" -- this will return every key in the database, so analyze the performance of prefix lookups vs. caching children on a per-key basis.
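To make the concern concrete, here is a small simulation of a prefix scan over sorted keys. It does not use BoltDB itself: a sorted string slice stands in for a bucket (BoltDB keeps the keys of a bucket in byte-sorted order), and scanPrefix and children are hypothetical helper names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// scanPrefix returns every key that starts with prefix. The keys must be
// sorted; the scan seeks to the first candidate and stops at the first miss.
func scanPrefix(keys []string, prefix string) []string {
	start := sort.SearchStrings(keys, prefix)
	var out []string
	for _, k := range keys[start:] {
		if !strings.HasPrefix(k, prefix) {
			break
		}
		out = append(out, k)
	}
	return out
}

// children keeps only the direct children of dir. It still has to walk every
// key under the prefix, which is the cost the note above worries about.
func children(keys []string, dir string) []string {
	if !strings.HasSuffix(dir, "/") {
		dir += "/"
	}
	var out []string
	for _, k := range scanPrefix(keys, dir) {
		rest := strings.TrimPrefix(k, dir)
		if rest != "" && !strings.Contains(rest, "/") {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []string{"/", "/a", "/a/b", "/a/b/c", "/ab", "/z"}
	sort.Strings(keys)
	fmt.Println(len(scanPrefix(keys, "/")), "keys visited to list /")
	fmt.Println(children(keys, "/"))
	fmt.Println(children(keys, "/a"))
}
```

Listing "/" visits all keys even though only a few are direct children, so the cost grows with the size of the whole tree rather than with the size of the directory.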

Update to FluidFS

Memory Free

Create a mechanism to free memory used by the file system when the file is no longer being used or read from.

Vim Versioning Bug

We are trying to create a version history so that we can serialize all versions in CTO fashion. But when using vim, it moves the original file (with the original version number) to a new file, then creates a new file (and therefore a new version history) with the new data.

I am trying to bump the version number, but the following is happening when I'm using vim:

Open new file foo.txt, write a line, save and close the file.
File is flushed, assigned version (1,1), iNode 1
Open file append a line, save and close the file.
iNode 1 with version (1,1) is moved to foo.txt~
iNode 2 is created as foo.txt (but no version history)
File is flushed, assigned version (1,1), iNode 2
iNode 1 (foo.txt~) is removed

Note that there are .swp, .swx, .swo files in there as well.

So basically our versioning system expects the namespace to not change out from under it, like it is in this case. Other bash programs do this too (mv orig to a backup and write a new version), and we wouldn't get linearizability as a result.

Also foo.txt~ is not flushed, so there is no record of it in the db and it wouldn't be sent to Raft (which is why it's taken me forever to find this bug).

Blob Chunking

Implement blob chunking with Rabin-Karp or some other chunking mechanism.
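As a starting point, a content-defined chunker built on a Rabin-Karp style rolling hash might look like the sketch below. All constants (window, base, prime, and the size limits) and the Chunk function name are assumptions chosen to keep the example small; real chunk sizes would be far larger.

```go
package main

import "fmt"

const (
	window  = 16         // number of trailing bytes covered by the rolling hash
	base    = 256        // radix of the polynomial hash
	prime   = 1000000007 // modulus of the polynomial hash
	avgSize = 64         // a boundary is expected roughly every avgSize bytes
	minSize = 32         // no chunk (except the last) is shorter than this
	maxSize = 256        // no chunk is longer than this
)

// Chunk splits data at positions where the rolling hash of the last window
// bytes hits a fixed pattern, subject to the minimum and maximum sizes.
// The returned chunks are sub-slices of data, not copies.
func Chunk(data []byte) [][]byte {
	// pow is base^(window-1) mod prime, the weight of the oldest byte.
	var pow uint64 = 1
	for i := 0; i < window-1; i++ {
		pow = (pow * base) % prime
	}

	var chunks [][]byte
	var hash uint64
	start := 0
	for i, b := range data {
		size := i - start + 1
		if size > window {
			// Slide the window: remove the byte that falls out of it.
			old := uint64(data[i-window])
			hash = (hash + prime - (old*pow)%prime) % prime
		}
		hash = (hash*base + uint64(b)) % prime

		if (size >= minSize && hash%avgSize == avgSize-1) || size >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
			hash = 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

func main() {
	data := make([]byte, 2000)
	for i := range data {
		data[i] = byte(i*i + 7*i)
	}
	for i, c := range Chunk(data) {
		fmt.Printf("chunk %d: %d bytes\n", i, len(c))
	}
}
```

Because the boundary test only looks at the last window bytes, an insertion near the start of a file tends to leave later boundaries where they were, which is what makes this kind of chunking attractive for replication.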

Metrics API

Create mechanism for collecting and logging metrics from the file system.

Note - this must be a call-through system (e.g. sends off goroutines that are analyzed asynchronously) and can be ignored or shut off on demand.
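One way to get the call-through behavior is a buffered channel with a non-blocking send, sketched below. This uses a single collector goroutine rather than one goroutine per event, which is a substitution on my part; the Metrics type and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Metrics counts named events without ever blocking the caller.
type Metrics struct {
	enabled bool
	events  chan string
	done    chan struct{}
	mu      sync.Mutex
	counts  map[string]int
}

// NewMetrics starts a collector goroutine when enabled is true. When it is
// false, every call on the returned value is a cheap no-op.
func NewMetrics(enabled bool, buffer int) *Metrics {
	m := &Metrics{enabled: enabled, counts: map[string]int{}}
	if !enabled {
		return m
	}
	m.events = make(chan string, buffer)
	m.done = make(chan struct{})
	go func() {
		defer close(m.done)
		for name := range m.events {
			m.mu.Lock()
			m.counts[name]++
			m.mu.Unlock()
		}
	}()
	return m
}

// Record queues an event. It returns false, without waiting, if metrics are
// disabled or the buffer is full (the event is dropped in that case).
func (m *Metrics) Record(name string) bool {
	if !m.enabled {
		return false
	}
	select {
	case m.events <- name:
		return true
	default:
		return false
	}
}

// Close stops the collector and waits until queued events are counted.
// Record must not be called after Close, and Close must be called once.
func (m *Metrics) Close() {
	if !m.enabled {
		return
	}
	close(m.events)
	<-m.done
}

// Count returns how many times the named event has been counted so far.
func (m *Metrics) Count(name string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.counts[name]
}

func main() {
	m := NewMetrics(true, 64)
	m.Record("open")
	m.Record("close")
	m.Record("open")
	m.Close()
	fmt.Println("open:", m.Count("open"), "close:", m.Count("close"))
}
```

Dropping events when the buffer is full keeps file system calls fast at the price of slightly undercounted metrics.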

Create fluidfs -- the fluid server daemon.

This is the process that is always on that mounts Fuse directories and listens for command line calls from the fluid config client. It will also run a simple web server for communication and integration.

HAWK Signed Messages

Implement HAWK signed messages for communication across the various replica servers.

Metadata versions

When changing the meta data of a file, e.g. Setattr, we need to get Raft confirmation and do a flush (metadirty=true).

Get Travis Working

Travis doesn't seem to be working, ensure it's up and running.

refactor chunker to use 64 bit uints

Right now there is a mix of int and uint64 in the chunker methods. Change this to make sure unsigned ints are used where necessary.

The only weirdness about this is the use of a negative number in fixed length chunking so that the first chunk gets set correctly.

Hosts definition

Create the ability to define other replicas on the network in an easy fashion, making sure that multiple addresses or address updating is allowed.

Domain Names

Purchase the fluidfs.com and flowfs.com domain names which are available.

FUSE Interaction

Create the FUSE components to interact with the file system.

Create Documentation

Create blob store mechanism.

Store blobs in a directory structure such that blobs are not in a single directory but rather in multiple directories based on the prefix of their hash.
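A sketch of how a blob path could be derived from the hash prefix. The choice of SHA-256, hex encoding, and two levels of two characters each are assumptions; the note above only says that directories are based on the prefix of the hash. BlobPath is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// BlobPath returns the location of a blob under root. The first two pairs of
// hex characters of the hash become nested directories, so blobs spread over
// many directories instead of piling up in one.
func BlobPath(root string, data []byte) string {
	sum := sha256.Sum256(data)
	name := hex.EncodeToString(sum[:])
	return filepath.Join(root, name[0:2], name[2:4], name)
}

func main() {
	fmt.Println(BlobPath("blobs", []byte("hello fluidfs")))
}
```

With two levels of two hex characters there are 65,536 leaf directories, which keeps each directory small even with millions of blobs.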

mount and umount commands

Create mount and umount commands that allow for a RESTful API.

Here's the deal, right now we just have a hacked together mount command on the CLI just to get things going. However, what we really want is mount and umount commands.

The MountHandler should be a RESTful endpoint:

GET /mounts -- list the mounts
POST /mounts -- add a new mount
GET /mount/uuid -- detail about a mount
PUT /mount/uuid -- update a mount
DELETE /mount/uuid -- umount

That way this command is usable from the Web API as well.
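The endpoint table above could be routed roughly as follows. The paths are copied as written in the list (/mounts for the collection, /mount/uuid for a single mount); mountAction, the action names, and the plain-text response are placeholders, not the real MountHandler.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// mountAction maps a method and path to an action name and a mount id.
// ok is false when the combination is not one of the five endpoints.
func mountAction(method, path string) (action, id string, ok bool) {
	switch {
	case path == "/mounts":
		switch method {
		case http.MethodGet:
			return "list", "", true
		case http.MethodPost:
			return "create", "", true
		}
	case strings.HasPrefix(path, "/mount/"):
		id = strings.TrimPrefix(path, "/mount/")
		if id == "" || strings.Contains(id, "/") {
			return "", "", false
		}
		switch method {
		case http.MethodGet:
			return "detail", id, true
		case http.MethodPut:
			return "update", id, true
		case http.MethodDelete:
			return "umount", id, true
		}
	}
	return "", "", false
}

// MountHandler is a placeholder handler that only echoes the chosen action.
func MountHandler(w http.ResponseWriter, r *http.Request) {
	action, id, ok := mountAction(r.Method, r.URL.Path)
	if !ok {
		http.Error(w, "unknown mount endpoint", http.StatusNotFound)
		return
	}
	fmt.Fprintf(w, "%s %s\n", action, id)
}

func main() {
	// Exercise the handler in-process instead of opening a port.
	req := httptest.NewRequest(http.MethodDelete, "/mount/1234", nil)
	rec := httptest.NewRecorder()
	MountHandler(rec, req)
	fmt.Print(rec.Body.String())
}
```

Keeping the routing decision in a pure function means the CLI and the web interface can share it and it can be tested without a running server.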

Create BoltDB interface

Create mechanism to interact with BoltDB including the following buckets:

Names (matches file names to version)
Versions (matches version numbers to meta data)
Prefix (storing directory metadata)

Chunker.Offset and Chunker.Stride

The fixed length chunker and the variable length chunker return different values from the Offset() method.

Fixed length returns the end index, where variable length returns the length of the current block.

Modify fixed length to return the end index.

Create LevelDB interface

See #14 -- create a LevelDB driver that implements the Database interface.

Organization refactor

Right now the project structure:

fluidfs
└── Godeps
└── cmd
|   ├── fluid
|   └── fluidfs
└── docs
└── fixtures
|   └── fluidfs-example.yml
└── fluid
|   ├── boltdb.go
|   ├── chunk.go
|   ├── config.go
|   ├── config_test.go
|   ├── db.go
|   ├── errors.go
|   ├── fluid.go
|   ├── fluid_suite_test.go
|   ├── leveldb.go
|   ├── logger.go
|   ├── logger_test.go
|   ├── utils.go
|   ├── version.go
|   └── version_test.go
└── vendor

is top heavy, holding a variety of files and packages that are related to the deployment of the application, but not source code for the package. For example the Godeps and vendor folders contain dependency information. Docs contains the documentation hosted on the GitHub Pages site, and fixtures contains files (e.g. configuration files) that initialize deployment.

The primary source code is in two places: the cmd directory, which holds the commands that compile to the FluidFS binaries, and the fluid directory, which holds the fluid library package.

However, several components including:

logger
config
db
chunk

Seem like they can go in their own packages, but I don't want to create an entire new GitHub repository for them. There are two options for refactoring:

create subpackages in github.com/bbengfort/fluidfs/fluid/subpackage
create subpackages in github.com/bbengfort/fluidfs/subpackage

Option #1 keeps all the source in a single top level directory, but makes the import paths longer. Option #2 seems like the "right" (more go idiomatic) mechanism, but creates confusion about which top level directories are packages and which are application files.

@keleher any thoughts?

Cursor: Scan and Keys

Implement the Cursor object as well as the Scan and Keys methods for the Database interface of:

BoltDB
LevelDB
Tests

Delete Entities

When we delete entities do we delete them from meta storage as well?

Raft Consensus

Implement Raft Consensus for FluidFS coordination and control.

Metainformation w/o blobs?

See #10

Second guarantee: Blobs and correctness.

Is it correct if we have metainformation w/o blobs?

chmod - change the mode
chown - change the owner
chgrp - change the group
count - count the number of files/directories matching the given pattern
du - summarize the disk space used
cat - stream the file from remote
tail - read the last few bytes from the remote
head - read the first few bytes from remote