microsoft / faster

Fast persistent recoverable log and key-value store + cache, in C# and C++.

Home Page: https://aka.ms/FASTER

License: MIT License

CMake 0.15% C++ 15.12% C 0.01% C# 83.78% Shell 0.03% PowerShell 0.44% JavaScript 0.39% HTML 0.09%
key-value-store hash-table persistent recoverable concurrent multi-threaded library logging indexing

faster's Introduction

Managing large application state easily, resiliently, and with high performance is one of the hardest problems in the cloud today. The FASTER project offers two artifacts to help tackle this problem.

  • FASTER Log is a high-performance concurrent persistent recoverable log, iterator, and random reader library in C#. It supports very frequent commit operations at low latency, and can quickly saturate disk bandwidth. It supports both sync and async interfaces, handles disk errors, and supports checksums.

  • FASTER KV is a concurrent key-value store + cache (available in C# and C++) that is designed for point lookups and heavy updates. FASTER supports data larger than memory by leveraging fast external storage (local or cloud). It also supports consistent recovery using a fast non-blocking checkpointing technique that lets applications trade off performance for commit latency.

Both FASTER KV and FASTER Log offer orders-of-magnitude higher performance than comparable solutions, on standard workloads. Start learning about FASTER, its unique capabilities, and how to get started at our official website:

aka.ms/FASTER

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

faster's People

Contributors

1u0, abioy, badrishc, chinkulkarni, darrenge, gunaprsd, hiteshmadan, iamcarbon, jahunter-m, jthelin, kkanellis, marius-klimantavicius, markpapadakis, matthewbrookes, mito-csod, peterfreiling, pradeepyadavmsft, qtcwt, quiye, rohankadekodi-msr, sajjadrahnama, sebastianburckhardt, sillycross, tedhartms, thiagot1, tli2, tornhoof, vazois, willsmythe, wilsonqin

faster's Issues

errno usage that should be under _DEBUG code

There are a few error conditions in src/environment/file_linux.cc that assign errno to a new int and then return without doing anything with it.

Those probably are there to see the errno with a debugger. I am assuming those should be behind an ifdef _DEBUG?

Like this:

    if(result == -1) {
#ifdef _DEBUG
      int error = errno;
#endif
      return Status::IOError;
    }

EDIT: Styling for the code block

Make the intro more palatable to business people

What differentiates FASTER are its cache-optimized index that achieves very high performance — up to 160 million operations per second when data fits in memory;

Where?

I too can make magic given enough resources. I propose making it clear in the README where this tool can add value, specifically in the context given by this issue's title.

C ABI

Are there any plans to add a C ABI? Or did I miss it and there is one already? Thanks!

.NET Foundation

Given that lots of Apache Foundation projects were mentioned in the paper's references, I wonder if this project should follow suit and be put under the .NET Foundation? Doing this would surely give the business world more confidence in this project, since we would know it will last longer... We all know how many projects out of MSR eventually disappear...

How to recover from snapshot after process restart?

Is it possible to recover FASTER state from snapshots after a process restart (crash)? All of the examples regarding recovery run in the same process and use the Guid that was returned by TakeFullCheckpoint. Does this mean that the Guid should be stored somewhere (e.g., a DB for durability) after taking a checkpoint, and then read during startup and used for recovery?
What if multiple threads have performed checkpoints: which Guid should be used for recovery?

Unity Implementation? C#

I've been trying to get this to work with Unity, but no matter what I do it doesn't seem to work. Is there any reason it would be incompatible? I have tried different methods from the examples (SumStore, and a slightly modified version of it), and it kept giving me the same problem:
It froze on startup. Sometimes it would recover and finish, but I usually had to kill the program. I could have reduced the amount of data, but the weird thing is that when it did occasionally finish, all that was left in the log files was a single folder, and that folder was also missing a lot of stuff.

After troubleshooting for a while I decided to write my own script instead.
I was using much less data in my script (length of 30), but I'm getting the same sort of issues.
Currently the info.dat file is missing from the index-checkpoints GUID folder, and the cpr-checkpoints GUID folders are completely empty. There were more things going wrong that I can't recall (they happened when I was using the pre-made examples), but the general gist is that I'm having a hard time getting it to work in Unity.

Is there any hope for me?
Thanks for your time :)

Update ReadMe to comment on relative performance difference between C# and C++

It would be useful to have some comments about the relative performance difference between the two.

  1. For instance, is C# simply a wrapper around the C++ version?
  2. Does C++ benefit from using memory-mapped files, or from using blittable types to write structures directly to disk, which may not be feasible in C#? If such techniques are used in C#, are Spans being used? Alternatively, if the underlying architecture is relatively neutral, with no significant difference between C++ and C#, that would also be valuable to know.

Add a reference to the roadmap in the Readme

Having a reference to the roadmap addresses several issues. First, it tells people you're not done and where the gaps are. Second, anyone interested in contributing can start engaging in high-level discussions, experimenting, etc.

TakeCheckpoint calls within/outside session

Issue

Currently the system accepts TakeCheckpoint and CompleteCheckpoint calls only within a session. Ideally, we should be able to issue these calls both from within a session and from outside one. The disadvantage of keeping a session open for the checkpoint thread is that this session is itself tracked by the FASTER checkpointing algorithm.

Proposal

First check if the thread that issues the checkpoint request is part of a session. Modify the complete checkpointing function to behave appropriately.

Unexpected token in release mode. C#

PS C:\Fuentes\Faster\cs\playground\SumStore\bin\x64\Release> .\SumStore.exe single populate

Unhandled Exception: System.Exception: Errors during code-gen compilation:
C:\Users\Diego\AppData\Local\Temp\FASTER\xn0h51qp\PersistentMemoryMalloc.cs(85,49): error CS1073: Unexpected token ','
at FASTER.core.Roslyn.FasterHashTableCompiler`7.GenerateFasterHashTableClass(Boolean persistGeneratedCode, Boolean optimizeCode, Int64 LogTotalSizeBytes, Double LogMutableFraction, Int32 LogPageSizeBits) in C:\Fuentes\Faster\cs\src\core\Codegen\FasterHashTableCompiler.cs:line 59
at FASTER.core.HashTableManager.GetFasterHashTable[TKey,TValue,TInput,TOutput,TContext,TFunctions,TIFaster](Int64 size, IDevice logDevice, IDevice objectLogDevice, String checkpointDir, Int64 LogTotalSizeBytes, Double LogMutableFraction, Int32 LogPageSizeBits, Boolean persistDll, Boolean optimizeCode) in C:\Fuentes\Faster\cs\src\core\Codegen\HashTableManager.cs:line 32
at SumStore.SingleThreadedRecoveryTest..ctor() in C:\Fuentes\Faster\cs\playground\SumStore\SingleThreadedRecoveryTest.cs:line 25
at SumStore.Program.Main(String[] args) in C:\Fuentes\Faster\cs\playground\SumStore\Program.cs:line 30
PS C:\Fuentes\Faster\cs\playground\SumStore\bin\x64\Release>

Recover from latest successful checkpoint

Issue

Currently FASTER requires the user to provide the appropriate checkpoint GUID for recovery. This is useful for advanced users who might want to do time travel and recover to an earlier checkpoint. But we need a simpler API that will simply recover from the latest checkpoint.

Proposal

This feature is simple to implement. When FASTER moves into the PERSISTENCE_CALLBACK phase (Checkpoint.cs), we can write the current checkpoint GUID into a separate file. During recovery we can always look into this file to recover the latest checkpoint.
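
A minimal C++ sketch of the idea in this proposal, assuming the checkpoint token can be represented as a string. The file name `latest` and the helpers `WriteLatestToken` and `ReadLatestToken` are hypothetical and are not part of FASTER; the proposal itself targets the C# code in Checkpoint.cs.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: records the token (GUID string) of the most recently
// completed checkpoint in `<dir>/latest`. The file name is an assumption.
inline bool WriteLatestToken(const std::string& dir, const std::string& token) {
  assert(!token.empty());
  const std::string tmp = dir + "/latest.tmp";
  const std::string dst = dir + "/latest";
  {
    std::ofstream out(tmp, std::ios::trunc);
    if (!out) return false;
    out << token << '\n';
    out.flush();  // flushes the stream only; this sketch does not fsync
    if (!out) return false;
  }
  // Write to a temporary file and rename it, so a reader never sees a
  // half-written token. On POSIX, rename replaces an existing destination;
  // other platforms may need a different call.
  return std::rename(tmp.c_str(), dst.c_str()) == 0;
}

// Hypothetical helper: reads the token back at startup. Returns false if no
// checkpoint has been recorded yet.
inline bool ReadLatestToken(const std::string& dir, std::string* token) {
  std::ifstream in(dir + "/latest");
  std::string line;
  if (!in || !std::getline(in, line) || line.empty()) return false;
  *token = line;
  return true;
}
```

Under these assumptions, the checkpoint path would call `WriteLatestToken` once the checkpoint is known to be persistent, and the recovery path would call `ReadLatestToken` and pass the result to the existing GUID-based recovery call.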

Detecting data corruption for checkpoint files

Issue

Currently, we rely on the Windows file system API to write, flush, and protect the files that hold important metadata about checkpoints and session recovery points. We need to be able to ensure that whatever data we read back during recovery is free from corruption.

Proposal

One way to do this is using checksum integrity verifiers. Simply, we can generate a hash for each file and ensure that the hash of the file being used for recovery is consistent.
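
A rough C++ sketch of such a check. The choice of 64-bit FNV-1a, the trailing 8-byte layout, and the helper names `WriteWithChecksum` and `ReadVerified` are all assumptions made for illustration; the proposal does not name a hash function or a file format.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>

// 64-bit FNV-1a over a byte buffer. Chosen here only because it is short;
// it detects accidental corruption, not deliberate tampering.
inline uint64_t Fnv1a64(const std::string& bytes) {
  uint64_t h = 0xcbf29ce484222325ULL;  // FNV offset basis
  for (unsigned char c : bytes) {
    h ^= c;
    h *= 0x100000001b3ULL;             // FNV prime
  }
  return h;
}

// Writes the payload followed by its checksum (8 bytes, little-endian).
inline bool WriteWithChecksum(const std::string& path, const std::string& payload) {
  std::ofstream out(path, std::ios::binary | std::ios::trunc);
  if (!out) return false;
  const uint64_t h = Fnv1a64(payload);
  out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
  for (int i = 0; i < 8; ++i) out.put(static_cast<char>((h >> (8 * i)) & 0xff));
  out.flush();
  return static_cast<bool>(out);
}

// Reads the file back; succeeds only if the stored checksum matches the
// checksum recomputed over the payload.
inline bool ReadVerified(const std::string& path, std::string* payload) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return false;
  std::string all((std::istreambuf_iterator<char>(in)),
                  std::istreambuf_iterator<char>());
  if (all.size() < 8) return false;
  uint64_t stored = 0;
  for (int i = 0; i < 8; ++i) {
    const unsigned char b = static_cast<unsigned char>(all[all.size() - 8 + i]);
    stored |= static_cast<uint64_t>(b) << (8 * i);
  }
  all.resize(all.size() - 8);
  if (Fnv1a64(all) != stored) return false;
  *payload = all;
  return true;
}
```

Recovery would then refuse to use any metadata file for which `ReadVerified` returns false, rather than trusting whatever bytes the file system hands back.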

Memory increasing with string type values

I coded some C++ tests based on the sum-store example but modified them a bit to handle string values.

It works great; however, memory consumption keeps growing until it eats all my RAM. Valgrind and Massif show that most of my RmwContexts live forever. Also, the async callbacks (deep copies) used for the contexts are never called.

Best regards & many thanks

Docker file

Is there a docker image to try this?
Thanks

Where does the data get stored?

I am just exploring this repo. None of the available samples stores data to the provided log file.

Is there a working sample that uses the log file?

Wrong value being potentially read?

I might be missing something, but consider the case of hash collision during read.

https://github.com/Microsoft/FASTER/blob/master/cc/src/core/faster.h#L1132

vs.

https://github.com/Microsoft/FASTER/blob/1ae31a5616342d5729b64c655bad97a374c21104/cs/src/core/Index/FASTER/FASTERImpl.cs#L1471

I'm not seeing how a miss due to a hash collision is handled there. It looks to me like the C# version does the correct check, but the C++ version assumes that there will always be a value.
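
To make the concern concrete, here is a toy C++ model of a read that handles collisions. It is not FASTER's code: `ToyStore`, `Record`, and the four-bucket index are invented for illustration. The point is only that a read has to compare the full key at each record in the chain and report "not found" when the chain ends without a match.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Toy model: the index maps a hash bucket to the newest record, and every
// record links to the previous record that landed in the same bucket.
struct Record {
  uint64_t key;
  std::string value;
  int64_t previous;  // position of the older record in the chain, -1 if none
};

struct ToyStore {
  static constexpr uint64_t kBuckets = 4;  // tiny on purpose, to force collisions
  std::vector<Record> log;
  std::unordered_map<uint64_t, int64_t> index;  // bucket -> newest record

  void Upsert(uint64_t key, const std::string& value) {
    const uint64_t bucket = key % kBuckets;
    auto it = index.find(bucket);
    const int64_t prev = (it == index.end()) ? -1 : it->second;
    log.push_back(Record{key, value, prev});
    index[bucket] = static_cast<int64_t>(log.size()) - 1;
  }

  // Walks the chain and compares full keys; a bucket hit alone is not enough.
  bool Read(uint64_t key, std::string* out) const {
    auto it = index.find(key % kBuckets);
    if (it == index.end()) return false;
    for (int64_t addr = it->second; addr != -1;
         addr = log[static_cast<size_t>(addr)].previous) {
      const Record& r = log[static_cast<size_t>(addr)];
      if (r.key == key) {
        *out = r.value;
        return true;
      }
    }
    return false;  // chain exhausted: same bucket, different key
  }
};
```

In this model, keys 1 and 5 share a bucket; reading 5 returns its own value, and reading 9 (same bucket, never inserted) returns false instead of another key's value.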

Python layer

Is a Python layer already implemented, or is there any plan to add one to FASTER, so that one can import it and work with it?

Shared Memory

Is it possible to use FASTER across processes? I mean allocating FASTER in shared memory, or setting it up to use shared memory as a device.

AccessViolationException when running benchmark

Cloned the repo, opened the C# solution, set the mode to Release, and ran the FASTER.benchmark project
(Ctrl+F5 in VS after marking it as the startup project).

I get the following output and the process dies with an access violation.

WARNING: Could not find YCSB directory, loading synthetic data instead
loaded 250000000 keys.
loaded 1000000000 txns.
loaded 1000000000 txns.
Executing setup.





Unhandled Exception: Unhandled Exception:Unhandled Exception:  Unhandled Exception: Unhandled Exception:
Unhandled Exception:
Unhandled Exception:
Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at FASTER.core.FasterKV.InternalUpsert(Key* key, Value* value, Context* userContext, PendingContext& pendingContext) in C:\Work\FASTER\cs\src\core\Index\FASTER\FASTERImpl.cs:line 371
   at FASTER.benchmark.FASTER_YcsbBenchmark.SetupYcsb(Int32 thread_idx) in C:\Work\FASTER\cs\benchmark\FasterYcsbBenchmark.cs:line 132
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

Issue with culture != InvariantCulture

Hi,

Just noticed a bug when running the ClassCache C# test in a culture that uses ',' as decimal separator:

Errors during code-gen compilation:
C:\Users####\AppData\Local\Temp\2\FASTER\gzwgtyln\PersistentMemoryMalloc.cs(91,49): error CS1073: Unexpected token ','

Adding
System.Threading.Thread.CurrentThread.CurrentCulture = System.Globalization.CultureInfo.InvariantCulture;
fixed the issue, so seems to be a culture issue.
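
The same class of bug can be reproduced outside C#. The sketch below is a C++ analogue, not the FASTER code generator: it assumes the generated source embeds a floating-point value as text, and it uses a custom `std::numpunct` facet (`CommaDecimal`, a name invented here) as a stand-in for a culture whose decimal separator is ','.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Stand-in for a culture whose decimal separator is ','. A custom facet is
// used so the example does not depend on which locales are installed.
struct CommaDecimal : std::numpunct<char> {
  char do_decimal_point() const override { return ','; }
};

inline std::locale CommaLocale() {
  // The locale object takes ownership of the facet.
  return std::locale(std::locale::classic(), new CommaDecimal);
}

// Formats `value` the way a code generator might when it pastes a numeric
// literal into source text, using the given locale.
inline std::string FormatLiteral(double value, const std::locale& loc) {
  std::ostringstream out;
  out.imbue(loc);
  out << value;
  return out.str();
}
```

With the comma locale, 0.9 is rendered as `0,9`, which a compiler reads as two tokens separated by an unexpected ','. Formatting with `std::locale::classic()` gives `0.9`, which plays the role that InvariantCulture plays in the C# workaround above.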

Can't install

I'm getting "Failed to add reference to 'adv-file-ops'.".
Is there anything I'm doing wrong? I wanted to add it to a 4.7.1 class library project.

This is the output:

Attempting to gather dependency information for package 'FASTER.core.1.0.0' with respect to project 'EventProcessor', targeting '.NETFramework,Version=v4.7.1'
Gathering dependency information took 744,2 ms
Attempting to resolve dependencies for package 'FASTER.core.1.0.0' with DependencyBehavior 'Lowest'
Resolving dependency information took 0 ms
Resolving actions to install package 'FASTER.core.1.0.0'
Resolved actions to install package 'FASTER.core.1.0.0'
Retrieving package 'FASTER.core 1.0.0' from 'nuget.org'.
Adding package 'FASTER.core.1.0.0' to folder 'C:\Users\v.hetzer\source\repos\EventSender\packages'
Added package 'FASTER.core.1.0.0' to folder 'C:\Users\v.hetzer\source\repos\EventSender\packages'
Install failed. Rolling back...
Package 'FASTER.core.1.0.0 : Microsoft.CodeAnalysis.CSharp.Scripting [2.8.2, ), System.Reflection.Emit.ILGeneration [4.3.0, ), System.Runtime.CompilerServices.Unsafe [4.5.1, )' does not exist in project 'EventProcessor'
Removing package 'FASTER.core.1.0.0 : Microsoft.CodeAnalysis.CSharp.Scripting [2.8.2, ), System.Reflection.Emit.ILGeneration [4.3.0, ), System.Runtime.CompilerServices.Unsafe [4.5.1, )' from folder 'C:\Users\v.hetzer\source\repos\EventSender\packages'
Removed package 'FASTER.core.1.0.0 : Microsoft.CodeAnalysis.CSharp.Scripting [2.8.2, ), System.Reflection.Emit.ILGeneration [4.3.0, ), System.Runtime.CompilerServices.Unsafe [4.5.1, )' from folder 'C:\Users\v.hetzer\source\repos\EventSender\packages'
Executing nuget actions took 1,73 sec
Failed to add reference to 'adv-file-ops'.
Please make sure that the file is accessible, and that it is a valid assembly or COM component.
Time Elapsed: 00:00:02.7011605
========== Finished ==========

(The package dependencies are there because I installed them.)

pthread_create not found when cmake

The cmake command under the build/Debug directory failed on my machine.

The cmake version I use is 3.5.1

Here is the CMakeError.log:

Run Build Command:"/usr/bin/make" "cmTC_d8de1/fast"
/usr/bin/make -f CMakeFiles/cmTC_d8de1.dir/build.make CMakeFiles/cmTC_d8de1.dir/build
make[1]: Entering directory '/home/ubuntu/FASTER_TEST/FASTER/cc/build/Debug/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_d8de1.dir/CheckSymbolExists.c.o
/usr/bin/cc -o CMakeFiles/cmTC_d8de1.dir/CheckSymbolExists.c.o -c /home/ubuntu/FASTER_TEST/FASTER/cc/build/Debug/CMakeFiles/CMakeTmp/CheckSymbolExists.c
Linking C executable cmTC_d8de1
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_d8de1.dir/link.txt --verbose=1
/usr/bin/cc CMakeFiles/cmTC_d8de1.dir/CheckSymbolExists.c.o -o cmTC_d8de1 -rdynamic
CMakeFiles/cmTC_d8de1.dir/CheckSymbolExists.c.o: In function `main':
CheckSymbolExists.c:(.text+0x16): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
CMakeFiles/cmTC_d8de1.dir/build.make:97: recipe for target 'cmTC_d8de1' failed
make[1]: *** [cmTC_d8de1] Error 1
make[1]: Leaving directory '/home/ubuntu/FASTER_TEST/FASTER/cc/build/Debug/CMakeFiles/CMakeTmp'
Makefile:126: recipe for target 'cmTC_d8de1/fast' failed
make: *** [cmTC_d8de1/fast] Error 2

File /home/ubuntu/FASTER_TEST/FASTER/cc/build/Debug/CMakeFiles/CMakeTmp/CheckSymbolExists.c:
/* */
#include <pthread.h>

int main(int argc, char** argv)
{
(void)argv;
#ifndef pthread_create
return ((int*)(&pthread_create))[argc];
#else
(void)argc;
return 0;
#endif
}

I also checked part of the output in the terminal:

-- Found PythonInterp: /usr/bin/python (found version "2.7.12")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes

which says that pthread.h is found but pthread_create is not.

What should I do to make cmake succeed?

Missing LICENSE file

The repo is missing the traditional LICENSE file that explicitly states license terms for the source code. I note there are comments in the source code referencing the MIT license. Please add a LICENSE file that makes the terms explicit. A sample MIT license file can be found at

https://opensource.org/licenses/MIT.

read KeyValues from disk

I want to know how to dump data to disk so that I can read the KV pairs directly when my application starts.
I wrote a sample test application, but it does not work.
It is a simple application: entering "i" inserts values, and entering "r" reads values.
I start the application and enter "i".
Then I restart the application and enter "r".
The keys are the same, but all reads return NOTFOUND.

class Program
    {
        const long keySpace = (1L << 14);
        const long numOps = (1L << 19);
        const long refreshInterval = (1L << 8);
        const long completePendingInterval = (1L << 10);
        const long checkpointInterval = (1L << 16);
        private readonly static IDevice logd = FASTERFactory.CreateLogDevice("C:/Temp/fdb/application.log");
        private readonly static IManagedFAST<MyKey, MyValue, MyInput, MyOutput, MyContext> dbf = FASTERFactory.Create
            <MyKey, MyValue, MyInput, MyOutput, MyContext, MyFunctions>
            (keySpace, logd, new MyFunctions(), LogMutableFraction: 0.1, LogPageSizeBits: 9, LogTotalSizeBytes: 512 * 16);
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");
            dbf.StartSession();
            string str = "";
            while (str  != "q")
            {
                if (str == "i")
                {
                    UpsertValues();
                }
                else if (str == "r")
                {
                    ReadValues();
                }
                str = Console.ReadLine();
            }
            dbf.CompletePending(true);
            dbf.StopSession();
        }
        public static void UpsertValues()
        {
            int i = 0;
            while (i < numOps)
            {
                var key = new MyKey { };
                key.value = "" + i;
                Console.WriteLine("write key: " + key.value);
                var value = new MyValue { };
                value.value = "hello" + i;
                dbf.Upsert(key, value, null, 0);
                ++i;
                if (i % completePendingInterval == 0)
                {
                    dbf.CompletePending(false);
                } else  if (i % refreshInterval == 0)
                {
                    dbf.Refresh();
                }
            }
            dbf.CompletePending(true);
        }
        public static void ReadValues()
        {
           int i = 0;
            while (i < numOps)
            {
                var key = new MyKey { value = "" + i };
                MyOutput output = new MyOutput { value = "" };
                //var status = dbf.Read(key, null, ref output, new MyContext { }, 0);
                var status = dbf.Read(key, null, ref output, null, 0);
                ++i;
                if (status == Status.PENDING)
                {
                    dbf.CompletePending(true);
                }
                else if (status == Status.ERROR || status == Status.NOTFOUND)
                {
                    Console.WriteLine("key: " + key.value + " status: " + status);
                    continue;
                }
                Console.WriteLine("find key: " + key.value + " value: " + output.value);
            }
        }
    }

Loading Large-scale Key-Value Pairs in Benchmarking

I tried to use the benchmark code in cc/ to test FASTER's performance under limited memory, but I found that the function "BlockAllocate" falls into an infinite loop because there is no checkpoint thread while the FASTER key-value store is being set up.

I tried adding a checkpoint thread in the setup_store function, using a separate checkpoint interval for loading. Its loop does not end until all upsert operations in thread_setup_store have finished.

After my modification, setup_store can load more key-value pairs even when memory is limited, but another problem appears: once I add a checkpoint in setup_store, the checkpoint thread in run_benchmark can no longer run normally. All I get is "Failed to start checkpoint". I checked the comment in "CheckPoint()", which says that another checkpoint is running. Is there any way I can finish the previous checkpoint thread?

Here is my code in setup_store:

    while(!load_done_) {
      if(current_time - last_checkpoint_time >= std::chrono::seconds(1)) {
        store->Checkpoint(nullptr, callback, token);
      }
    }

The flag load_done_ will be set to true after all upsert operations have finished in the thread_setup_store function.

At what level is locking enforced?

So, if I have a million records and there are threads constantly reading and writing all of them, at what level does FASTER locking work? Does it provide serializable consistency?

Would like example of using a string key in C++ please

I noticed that the paper says:

"Note that keys are not part of the Faster hash index, unlike many traditional designs, which provides two benefits:
• It reduces the in-memory footprint of the hash index, allowing
us to retain it entirely in memory.
• It separates user data and index metadata, which allows us to
mix and match the hash index with different record allocators."

And also the benchmark-dir/README.md says:

"The output of YCSB's "basic" driver is verbose. A typical line looks like:

INSERT usertable user5575651532496486335 [ field1='...' ... ]

To speed up file ingestion, our basic YCSB benchmark assumes that the input
file consists only of the 8-byte-integer portion of the key--e.g.:

5575651532496486335"

Does FASTER support string keys? And if so, is there any example C++ code showing string key manipulation?
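
For discussion, here is a sketch of what a string key type could look like in C++. It is hypothetical: the member names (`size`, `GetHash`, the comparison operators) are a guess at what a hash-indexed store needs from a key and are not copied from FASTER's headers, and the fixed 32-byte capacity and FNV-1a hash are arbitrary choices.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical fixed-capacity string key. Fixed capacity keeps the key a
// flat, trivially copyable block of bytes that can live inside a log record.
class StringKey {
 public:
  static constexpr uint32_t kCapacity = 32;

  explicit StringKey(const std::string& s) : length_(0) {
    // Reject keys that do not fit rather than truncating them silently,
    // since truncation would make distinct keys compare equal.
    assert(s.size() <= kCapacity);
    std::memset(data_, 0, sizeof(data_));
    length_ = static_cast<uint32_t>(s.size());
    std::memcpy(data_, s.data(), length_);
  }

  // Size in bytes of the key as stored.
  static constexpr uint32_t size() {
    return static_cast<uint32_t>(sizeof(StringKey));
  }

  // 64-bit FNV-1a over the key bytes.
  uint64_t GetHash() const {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (uint32_t i = 0; i < length_; ++i) {
      h ^= static_cast<unsigned char>(data_[i]);
      h *= 0x100000001b3ULL;
    }
    return h;
  }

  // Full-key comparison, needed to tell apart keys whose hashes collide.
  bool operator==(const StringKey& other) const {
    return length_ == other.length_ &&
           std::memcmp(data_, other.data_, length_) == 0;
  }
  bool operator!=(const StringKey& other) const { return !(*this == other); }

  std::string str() const { return std::string(data_, length_); }

 private:
  uint32_t length_;
  char data_[kCapacity];
};
```

A YCSB-style key such as `user5575651532496486335` fits in this layout. Whether FASTER's C++ templates accept a type shaped like this is exactly the question asked above.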

Compaction based garbage collection

Currently we support garbage collection based on expiry of keys, via truncation from the head of the log. This has extremely low overhead, and captures the cases where there is a limit on the TTL of all keys. However, there are cases where actual compaction of live keys is necessary. Thus, we need a mechanism to roll forward pages from the head, by determining the liveness of keys in the pages to be expired and reinserting live keys at the tail. Adding @gunaprsd.
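
The roll-forward idea can be sketched on a toy append-only log in C++. This is an illustration under simplifying assumptions (single-threaded, whole log in memory, one index entry per key); `ToyLog` and its members are invented names and do not reflect FASTER's hybrid log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Toy append-only log. A record's address is its position in `records`
// plus `begin`, the address of the oldest retained record.
struct ToyLog {
  struct Entry {
    uint64_t key;
    std::string value;
  };
  std::vector<Entry> records;
  uint64_t begin = 0;
  std::unordered_map<uint64_t, uint64_t> index;  // key -> address of newest version

  uint64_t Tail() const { return begin + records.size(); }

  void Upsert(uint64_t key, const std::string& value) {
    index[key] = Tail();
    records.push_back(Entry{key, value});
  }

  // Expires addresses [begin, until). A record is live only if the index
  // still points at it; live records are re-appended at the tail first.
  void Compact(uint64_t until) {
    assert(until >= begin && until <= Tail());
    const uint64_t count = until - begin;
    for (uint64_t i = 0; i < count; ++i) {
      const Entry e = records[static_cast<size_t>(i)];  // copy: push_back may reallocate
      auto it = index.find(e.key);
      if (it != index.end() && it->second == begin + i) Upsert(e.key, e.value);
    }
    records.erase(records.begin(),
                  records.begin() + static_cast<std::ptrdiff_t>(count));
    begin = until;
  }

  bool Read(uint64_t key, std::string* out) const {
    auto it = index.find(key);
    if (it == index.end() || it->second < begin) return false;
    *out = records[static_cast<size_t>(it->second - begin)].value;
    return true;
  }
};
```

In this model, overwritten versions are dropped for free during compaction, while a key whose only version sits near the head is copied to the tail before the head is truncated.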

C++ make test fails test #3 and test #4 never seems to end?!

Following the build instructions and running make worked without issue on my Ubuntu VM. However, running 'make test' didn't work so well:

$ make test
Running tests...
Test project /home/simon/FASTER/cc/build/Release
    Start 1: in_memory_test
1/5 Test #1: in_memory_test ...................   Passed    2.18 sec
    Start 2: malloc_fixed_page_size_test
2/5 Test #2: malloc_fixed_page_size_test ......   Passed   13.12 sec
    Start 3: paging_queue_test
3/5 Test #3: paging_queue_test ................***Exception: Other340.92 sec
    Start 4: recovery_queue_test

Is test #4 running 'forever' because test #3 failed?
How to debug the test #3 failure?
How long should make test normally take to run?
Is there a way to make it run faster as a kind of quick test after a code change?

Unable to reserve an epoch entry

Hi,

This might be related to the question asked by some other user about where data is stored. I don't see the log files getting created.

But the program runs and it was processing my computation. I am doing some analysis that generates a million records in a tree form. I am using FASTER to store both the records and the index (additional records added to work around the fact that FASTER doesn't have built-in indexing).

But I see the below exception after some time.

Unhandled Exception: System.AggregateException: One or more errors occurred. (Unable to reserve an epoch entry, try increasing the epoch table size (kTableSize)) ---> System.Exception: Unable to reserve an epoch entry, try increasing the epoch table size (kTableSize)
   at FASTER.core.LightEpoch.ReserveEntry(Int32 startIndex, Int32 threadId) in D:\a\1\s\cs\src\core\Epochs\LightEpoch.cs:line 325
   at FASTER.core.LightEpoch.ReserveEntryForThread() in D:\a\1\s\cs\src\core\Epochs\LightEpoch.cs:line 346
   at FASTER.core.FasterKV`6.InternalRefresh() in D:\a\1\s\cs\src\core\Index\FASTER\FASTERThread.cs:line 49
   at FASTER.core.FasterKV`6.InternalAcquire() in D:\a\1\s\cs\src\core\Index\FASTER\FASTERThread.cs:line 29
