Comments (19)

rfjakob commented on May 22, 2024

That would definitely be interesting. You can run the built-in benchmark using:

cd gocryptfs/internal/stupidgcm
go test -bench .

On my machine, I get this (StupidGCM = simple OpenSSL wrapper, GoGCM = built-in Go crypto):

Benchmark4kEncStupidGCM-2      50000         24774 ns/op     165.33 MB/s
Benchmark4kEncGoGCM-2          10000        120745 ns/op      33.92 MB/s
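
For anyone reading the numbers: the MB/s column follows directly from the ns/op column and the 4 kB block size. A minimal sketch of that arithmetic, assuming 4096 bytes per operation and 1 MB = 1e6 bytes (mbPerSec is a name made up for this example, not something from gocryptfs):

```go
package main

import "fmt"

// mbPerSec converts a benchmark's ns/op figure into MB/s for a given number
// of bytes processed per operation. Bytes per nanosecond equals GB/s, so the
// result is scaled by 1e3 to get MB/s (with 1 MB = 1e6 bytes).
func mbPerSec(bytesPerOp int, nsPerOp float64) float64 {
	return float64(bytesPerOp) / nsPerOp * 1e3
}

func main() {
	// ns/op values taken from the two benchmark lines above.
	fmt.Printf("StupidGCM: %.2f MB/s\n", mbPerSec(4096, 24774))
	fmt.Printf("GoGCM:     %.2f MB/s\n", mbPerSec(4096, 120745))
}
```

The two results should line up with the 165.33 MB/s and 33.92 MB/s reported above.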

My CPU does not have AES-NI:

cat /proc/cpuinfo 
[...]
model name  : Intel(R) Pentium(R) CPU G630 @ 2.70GHz
[...]
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave lahf_lm arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid xsaveopt

from gocryptfs.

lxp commented on May 22, 2024

My machine (i5-4690K) is still not fully idle, but I think the results are clear enough:

$ go test -bench .
PASS
Benchmark4kEncStupidGCM-4     200000          7123 ns/op     575.03 MB/s
Benchmark4kEncGoGCM-4         500000          2512 ns/op    1629.95 MB/s
ok      github.com/rfjakob/gocryptfs/internal/stupidgcm 2.867s
$ go test -bench .
PASS
Benchmark4kEncStupidGCM-4     200000          6949 ns/op     589.37 MB/s
Benchmark4kEncGoGCM-4         500000          2480 ns/op    1651.41 MB/s
ok      github.com/rfjakob/gocryptfs/internal/stupidgcm 2.803s
$ go test -bench .
PASS
Benchmark4kEncStupidGCM-4     200000          6985 ns/op     586.37 MB/s
Benchmark4kEncGoGCM-4         500000          2480 ns/op    1651.13 MB/s
ok      github.com/rfjakob/gocryptfs/internal/stupidgcm 2.813s

Results from the old openssl_benchmark.bash from v0.9:

$ ./openssl_benchmark.bash 
+ go test -bench=.
Benchmarking AES-GCM-256 with 4kB block size
testing: warning: no tests to run
PASS
BenchmarkGoEnc4K-4       1000000          1493 ns/op    2743.30 MB/s
BenchmarkGoDec4K-4       1000000          1481 ns/op    2764.83 MB/s
BenchmarkOpensslEnc4K-4   200000          7624 ns/op     537.24 MB/s
BenchmarkOpensslDec4K-4   100000         20524 ns/op     199.56 MB/s
ok      github.com/rfjakob/gocryptfs/openssl_benchmark  6.878s
$ ./openssl_benchmark.bash 
+ go test -bench=.
Benchmarking AES-GCM-256 with 4kB block size
testing: warning: no tests to run
PASS
BenchmarkGoEnc4K-4       1000000          1497 ns/op    2734.83 MB/s
BenchmarkGoDec4K-4       1000000          1487 ns/op    2754.54 MB/s
BenchmarkOpensslEnc4K-4   200000          7648 ns/op     535.54 MB/s
BenchmarkOpensslDec4K-4   100000         20577 ns/op     199.05 MB/s
ok      github.com/rfjakob/gocryptfs/openssl_benchmark  6.901s
$ ./openssl_benchmark.bash 
+ go test -bench=.
Benchmarking AES-GCM-256 with 4kB block size
testing: warning: no tests to run
PASS
BenchmarkGoEnc4K-4       1000000          1500 ns/op    2729.13 MB/s
BenchmarkGoDec4K-4       1000000          1490 ns/op    2747.32 MB/s
BenchmarkOpensslEnc4K-4   200000          7690 ns/op     532.61 MB/s
BenchmarkOpensslDec4K-4   100000         20579 ns/op     199.03 MB/s
ok      github.com/rfjakob/gocryptfs/openssl_benchmark  6.941s

I am not sure what causes the difference in Go crypto performance (but I also didn't look into the code).
What I also find interesting in the old benchmark is that OpenSSL decryption is significantly slower than encryption.

rfjakob commented on May 22, 2024

The old benchmarks use a 12-byte IV, which is Go's default. Since v0.7, gocryptfs actually uses 16 bytes and the new benchmarks reflect that.
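
To illustrate the difference in IV size: in Go's standard library, cipher.NewGCM gives the 12-byte default, while cipher.NewGCMWithNonceSize accepts other lengths. A rough sketch, assuming a 32-byte key and a 4 kB block as in the benchmarks (newGCM16 is a hypothetical helper for this example, not the gocryptfs code):

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// newGCM16 builds an AES-GCM AEAD that takes a 16-byte nonce instead of the
// 12-byte default that cipher.NewGCM would use.
func newGCM16(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCMWithNonceSize(block, 16)
}

func main() {
	key := make([]byte, 32)   // AES-256 key
	nonce := make([]byte, 16) // 16-byte IV
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	if _, err := rand.Read(nonce); err != nil {
		panic(err)
	}

	aead, err := newGCM16(key)
	if err != nil {
		panic(err)
	}

	plaintext := make([]byte, 4096)
	// Seal appends the 16-byte authentication tag to the ciphertext.
	ciphertext := aead.Seal(nil, nonce, plaintext, nil)
	decrypted, err := aead.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(aead.NonceSize(), len(ciphertext), bytes.Equal(plaintext, decrypted))
}
```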

rfjakob commented on May 22, 2024

In any case, the performance difference between Go and OpenSSL is huge. I will add autodetection that switches to Go crypto if AES-NI is available.

lxp commented on May 22, 2024

Ah okay, that explains it.
For me, the current situation is no problem, as I just use -openssl=false during mounting.
Yeah, autodetection was exactly what I wanted to recommend :)
I think the Go crypto code already does it. I am just not sure if it is easily accessible from outside.

lxp commented on May 22, 2024

I am rather new to Go. Do you know if there is an easy way to compile the benchmark as a binary?
Then I could also test it on one of the first Intel processors supporting AES-NI (Xeon E5620).
I know it has worse AES-NI performance than newer processors, but it would be interesting to know if Go crypto is still faster.

alphazo commented on May 22, 2024

Similar results here on an i5 core that has AES-NI instructions.

$ go test -bench .
PASS
Benchmark4kEncStupidGCM-4     200000          8815 ns/op     464.65 MB/s
Benchmark4kEncGoGCM-4         300000          3796 ns/op    1078.98 MB/s
ok      github.com/rfjakob/gocryptfs/internal/stupidgcm 3.147s

$ cat /proc/cpuinfo 
[...]
model name  : Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz
[...]
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts

rfjakob commented on May 22, 2024

@lxp Run go test -c to get the stupidgcm.test binary. The benchmark is then run using:

./stupidgcm.test -test.bench .

rfjakob commented on May 22, 2024

Ugh. Looks like it is going to be more complicated than checking for the "aes" flag.

$ go test -bench .
PASS
Benchmark4kEncStupidGCM-2     200000         10611 ns/op     385.99 MB/s
Benchmark4kEncGoGCM-2          30000         44999 ns/op      91.02 MB/s
ok      github.com/rfjakob/gocryptfs/internal/stupidgcm 4.429s

$ cat /proc/cpuinfo | grep -e "model name\|flags" | head -2
model name  : Intel Xeon E312xx (Sandy Bridge)
flags       : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm xsaveopt

$ go version
go version go1.5.1 linux/amd64
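
For reference, the plain flag check on its own could look roughly like the sketch below. It only inspects /proc/cpuinfo-style text, so, as the numbers above show, it cannot tell whether the Go version in use actually takes advantage of the instructions. hasCPUFlag is a hypothetical helper written for this example, not gocryptfs code:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hasCPUFlag reports whether the first "flags" line in /proc/cpuinfo-style
// text lists the given flag as a whole word (so "pae" does not match "aes").
func hasCPUFlag(cpuinfo, flag string) bool {
	scanner := bufio.NewScanner(strings.NewReader(cpuinfo))
	for scanner.Scan() {
		parts := strings.SplitN(scanner.Text(), ":", 2)
		if len(parts) != 2 || strings.TrimSpace(parts[0]) != "flags" {
			continue
		}
		for _, f := range strings.Fields(parts[1]) {
			if f == flag {
				return true
			}
		}
		return false
	}
	return false
}

func main() {
	// Shortened sample input; on Linux the real text would come from
	// reading /proc/cpuinfo.
	sample := "model name\t: Intel Xeon E312xx (Sandy Bridge)\n" +
		"flags\t\t: fpu de pse pae sse4_2 popcnt aes xsave avx\n"
	fmt.Println(hasCPUFlag(sample, "aes"))
}
```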

rfjakob commented on May 22, 2024

OK, here we go: Go seems to use the AES instructions starting with v1.6. This is on the same box as above.

$ ~/go/bin/go test -bench .
PASS
Benchmark4kEncStupidGCM-2     100000         16528 ns/op     247.81 MB/s
Benchmark4kEncGoGCM-2         300000          5014 ns/op     816.86 MB/s
ok      github.com/rfjakob/gocryptfs/internal/stupidgcm 3.603s

$ ~/go/bin/go version
go version go1.6.2 linux/amd64

alphazo commented on May 22, 2024

Hi guys, if you are interested, I ran some benchmarks on my desktop machine with a fresh SSD, comparing plain, gocryptfs (openssl on/off), encfs, securefs, truecrypt and dm-crypt.
Keep in mind that truecrypt and dm-crypt play in a different league, since they are not file-based encryption tools.
https://gist.github.com/alphazo/09a2e523e22e7aa00d491ab67678dd80

lxp commented on May 22, 2024

@rfjakob Thank you, I didn't expect such a simple solution :)
I compiled a version with Go 1.6 and used the same binary on all machines.
I think the benchmarks draw a pretty clear picture.
AES-NI + Go 1.6+ -> Go Crypto
Otherwise -> OpenSSL
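
Written out as code, the rule above might look like this. It is only a sketch of the decision as stated here, not the logic that ended up in gocryptfs; preferOpenSSL and parseGoVersion are names invented for the example, and the AES-NI check is passed in as a plain bool:

```go
package main

import (
	"fmt"
	"runtime"
)

// parseGoVersion extracts major and minor from strings such as "go1.6" or
// "go1.6.2". Development builds that do not follow this pattern return an error.
func parseGoVersion(v string) (major, minor int, err error) {
	_, err = fmt.Sscanf(v, "go%d.%d", &major, &minor)
	return major, minor, err
}

// preferOpenSSL applies the rule from the benchmarks: Go crypto only when the
// CPU has AES-NI and the Go version is 1.6 or newer, OpenSSL otherwise.
func preferOpenSSL(hasAESNI bool, goMajor, goMinor int) bool {
	goUsesAESNI := goMajor > 1 || (goMajor == 1 && goMinor >= 6)
	return !(hasAESNI && goUsesAESNI)
}

func main() {
	major, minor, err := parseGoVersion(runtime.Version())
	if err != nil {
		// Unknown version string: be conservative and assume an old Go.
		major, minor = 1, 0
	}
	// hasAESNI is hard-coded here; a real check would inspect the CPU.
	hasAESNI := true
	fmt.Println("openssl =", preferOpenSSL(hasAESNI, major, minor))
}
```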

$ go version
go version go1.6 linux/amd64

AES-NI

Skylake (Launch: Q3'15)

$ cat /proc/cpuinfo
model name  : Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm hwp hwp_notify hwp_act_window hwp_epp intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4     200000         10688 ns/op     383.22 MB/s
Benchmark4kEncGoGCM-4         300000          4073 ns/op    1005.57 MB/s

Haswell (Launch: Q2'14)

$ cat /proc/cpuinfo
model name  : Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt dtherm ida arat pln pts
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4     200000          6710 ns/op     610.43 MB/s
Benchmark4kEncGoGCM-4         500000          2422 ns/op    1690.86 MB/s

Ivy Bridge (Launch: Q2'12)

$ cat /proc/cpuinfo 
model name  : Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4     200000         14684 ns/op     278.94 MB/s
Benchmark4kEncGoGCM-4         300000          7792 ns/op     525.62 MB/s

Sandy Bridge (Launch: Q1'11)

$ cat /proc/cpuinfo 
model name  : Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-4     100000         19070 ns/op     214.78 MB/s
Benchmark4kEncGoGCM-4         200000         10981 ns/op     373.01 MB/s

Westmere (Launch: Q1'10)

$ cat /proc/cpuinfo 
model name  : Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm epb tpr_shadow vnmi flexpriority ept vpid dtherm ida arat
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-16        100000             18297 ns/op         223.85 MB/s
Benchmark4kEncGoGCM-16            200000              9579 ns/op         427.58 MB/s

no AES-NI

Ivy Bridge (Launch: Q1'13)

$ cat /proc/cpuinfo 
model name  : Intel(R) Pentium(R) CPU G2130 @ 3.20GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-2     100000         22691 ns/op     180.51 MB/s
Benchmark4kEncGoGCM-2          20000         92810 ns/op      44.13 MB/s

Nehalem (Launch: Q3'09)

$ cat /proc/cpuinfo 
model name  : Intel(R) Xeon(R) CPU           X3460  @ 2.80GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-8      50000         35247 ns/op     116.21 MB/s
Benchmark4kEncGoGCM-8          20000         92230 ns/op      44.41 MB/s

Core (Launch: Q1'08)

$ cat /proc/cpuinfo 
model name  : Intel(R) Core(TM)2 Duo CPU     E7400  @ 2.80GHz
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm
$ ./stupidgcm.test -test.bench .
PASS
Benchmark4kEncStupidGCM-2      30000         46697 ns/op      87.71 MB/s
Benchmark4kEncGoGCM-2          10000        194095 ns/op      21.10 MB/s

Maybe I will add two older AMD processors (without AES-NI) when I have time.

rfjakob commented on May 22, 2024

alphazo commented on May 22, 2024

@rfjakob Most gocryptfs operations outperformed encfs (even in standard mode) in the quick benchmark I posted earlier, so why is the rm operation a bit behind?

rfjakob commented on May 22, 2024

Hi @alphazo, I read your comparison with great interest, thank you! Yes, we are 15% behind EncFS for rm, hmm. To be honest, I'm not sure why. I'll have to profile this!

rfjakob commented on May 22, 2024

Autodetection has been added to master in 49b597f. The -openssl option now defaults to "auto" and can be overridden by passing true or false.

You can run "gocryptfs -debug -version" to see the result of the autodetection. I get:

$ ./gocryptfs -debug -version
openssl=true
gocryptfs v0.10-rc2-7-g49b597f-dirty; on-disk format 2; go-fuse a01ba14

because my CPU does not support AES-NI.
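
A three-state option like this can be resolved along the following lines. This is an illustrative sketch only and not the code from the commit; resolveOpenSSL is a hypothetical name, and the autodetection result is passed in as a bool:

```go
package main

import (
	"fmt"
	"strconv"
)

// resolveOpenSSL turns the value of an -openssl style option into a final
// decision. "auto" defers to the autodetection result; anything else must
// parse as a boolean and overrides the autodetection.
func resolveOpenSSL(opt string, autoPrefersOpenSSL bool) (bool, error) {
	if opt == "auto" {
		return autoPrefersOpenSSL, nil
	}
	v, err := strconv.ParseBool(opt)
	if err != nil {
		return false, fmt.Errorf("invalid -openssl value %q: want auto, true or false", opt)
	}
	return v, nil
}

func main() {
	// Pretend the autodetection found no AES-NI and therefore prefers OpenSSL.
	for _, opt := range []string{"auto", "true", "false", "maybe"} {
		useOpenSSL, err := resolveOpenSSL(opt, true)
		fmt.Println(opt, useOpenSSL, err)
	}
}
```

Note that strconv.ParseBool also accepts spellings such as "1" or "F", which may or may not be wanted for a command-line flag.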

lxp commented on May 22, 2024

Great! Thank you for integrating it so fast 👍
I added a Skylake CPU to my benchmark post above.
It looks good. On 4 AES-NI CPUs I get (not sure when I will be able to test it on Skylake):

$ ./gocryptfs -debug -version
openssl=false
gocryptfs v0.10-rc1-16-g4ad9d4e; on-disk format 2; go-fuse ed84134

On the 3 CPUs without AES-NI, I get:

$ ./gocryptfs -debug -version
openssl=true
gocryptfs v0.10-rc1-16-g4ad9d4e; on-disk format 2; go-fuse ed84134

I compiled again with Go 1.6 and all systems are running on amd64.

rfjakob commented on May 22, 2024

Great! Do you want to put the benchmarks into the wiki? Something like https://github.com/rfjakob/gocryptfs/wiki/CPU-Benchmarks ? I think it's valuable information and deserves some visibility.

Same thing for you, @alphazo! Maybe https://github.com/rfjakob/gocryptfs/wiki/Performance-Comparison ?

rfjakob commented on May 22, 2024

Released as v0.10-rc3.
