seagate / cortx-motr

CORTX Motr is a distributed object and key-value storage system targeting mass-capacity storage configurations. It is the core component of the CORTX storage system.

Home Page: https://github.com/Seagate/cortx

License: Apache License 2.0

GDB 0.01% Makefile 0.40% C 87.03% Shell 8.35% M4 0.27% Tcl 0.02% Python 2.75% Awk 0.03% Perl 0.65% Dockerfile 0.03% HTML 0.03% Ruby 0.12% Go 0.17% Jinja 0.02% Groovy 0.10%

cortx-motr's People

Contributors

abhisheksahaseagate, alfhad, andriytk, atulsdeshmukh2312, bdekvadiya, hessio, imvenkip, ivan-alekhin, johnbent, jugalpatil, kanchan-chaudhari, madhavemuri, max-seagate, mukul-seagate11, mukundkanekar, papan-singh, rajatpatil98, rkothiya, sanjognaik, sergey-shilov, shashank-parulekar, swapnil-seagate, swatid-seagate, t7ko-seagate, upendrapatwardhan, venkyos, vinoth2101, yanqingfu, yatin-mahajan, yeshpal-jain-seagate

cortx-motr's Issues

hsm: panic when reading composite object data from older extent

When the data of a composite object resides in two (or more) different extents, reading it makes the client app panic. For example, here is an object with two extents:

m0hsm> show 0x1122113:0x11221005
  - gen 2, tier 1, extents:  (writable)
  - gen 1, tier 2, extents: [0->0x3fffffff]
  - gen 0, tier 3, extents: [0->0x1ffffffff]

The 1 GB (gen 1) extent is in tier 2, and the older 8 GB (gen 0) extent is in tier 3.
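
For reference, the two sizes follow directly from the ranges in the `m0hsm` output. A minimal Python sketch, assuming the `[start->end]` ranges are inclusive byte offsets (the output does not state this explicitly); the dictionary labels are only illustrative:

```python
# Ranges copied from the `m0hsm show` output above.
# Assumption: both bounds are inclusive byte offsets.
EXTENTS = {
    "gen 1, tier 2": (0x0, 0x3FFFFFFF),
    "gen 0, tier 3": (0x0, 0x1FFFFFFFF),
}

GIB = 1 << 30


def extent_size(start, end):
    """Size in bytes of an inclusive [start, end] extent."""
    return end - start + 1


for name, (start, end) in EXTENTS.items():
    size = extent_size(start, end)
    print(f"{name}: {size} bytes = {size // GIB} GiB")
```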

Let's try to read 2 GB of this object:

$ mcp -prof 0x7000000000000001:0x4e9 -hax 172.18.1.33@o2ib:12345:1:1 -ep 172.18.1.33@o2ib:12345:4:1 -proc 0x7200000000000001:0x376 -v -bsz 128 -threads 32 -osz $((2*1024*1024)) 0x1122113:0x11221005 /dev/null
2021/01/21 16:34:39 mio.go:547: R: off=0 len=134217728 bs=4194304 gs=2097152 speed=790 (Mbytes/sec)
2021/01/21 16:34:39 mio.go:547: R: off=134217728 len=134217728 bs=4194304 gs=2097152 speed=1000 (Mbytes/sec)
2021/01/21 16:34:39 mio.go:547: R: off=268435456 len=134217728 bs=4194304 gs=2097152 speed=977 (Mbytes/sec)
2021/01/21 16:34:39 mio.go:547: R: off=402653184 len=134217728 bs=4194304 gs=2097152 speed=1057 (Mbytes/sec)
2021/01/21 16:34:39 mio.go:547: R: off=536870912 len=134217728 bs=4194304 gs=2097152 speed=1057 (Mbytes/sec)
2021/01/21 16:34:39 mio.go:547: R: off=671088640 len=134217728 bs=4194304 gs=2097152 speed=927 (Mbytes/sec)
2021/01/21 16:34:40 mio.go:547: R: off=805306368 len=134217728 bs=4194304 gs=2097152 speed=1015 (Mbytes/sec)
2021/01/21 16:34:40 mio.go:547: R: off=939524096 len=134217728 bs=4194304 gs=2097152 speed=1040 (Mbytes/sec)
motr[01659]:  87d0  FATAL  [lib/assert.c:50:m0_panic]  panic: (!m0_vec_is_empty(&ivec->iv_vec)) at segments_sort() (motr/io.c:133)  [git: sage-base-1.0-170-g89f7737] 
Motr panic: (!m0_vec_is_empty(&ivec->iv_vec)) at segments_sort() motr/io.c:133 (errno: 0) (last failed: none) [git: sage-base-1.0-170-g89f7737] pid: 1659  
/lib64/libmotr.so.1(m0_arch_backtrace+0x2f)[0x7fc238304c9f]
/lib64/libmotr.so.1(m0_arch_panic+0xf3)[0x7fc238304e83]
/lib64/libmotr.so.1(+0x3353a4)[0x7fc2382f53a4]
/lib64/libmotr.so.1(m0_obj_op+0x51d)[0x7fc23833294d]
/lib64/libmotr.so.1(+0x3830b9)[0x7fc2383430b9]
/lib64/libmotr.so.1(m0_obj_op+0x2f9)[0x7fc238332729]
./mcp(_cgo_25f94b45affb_Cfunc_m0_obj_op+0x31)[0x4ca6a1]

The first 1 GB of data is read fine, but reading the 2nd GB (which is located in another, older extent) panics.
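
The log lines up with that: the last read that completed ends exactly where the gen 1 extent ends. A quick check in Python, using only the `off=`/`len=` values from the log and the extent end from `m0hsm show` (the variable names are just for this sketch):

```python
READ_LEN = 134217728         # len= field of every read in the log
LAST_OK_OFFSET = 939524096   # off= field of the last read that completed
GEN1_LAST_BYTE = 0x3FFFFFFF  # end of the gen 1 extent from `m0hsm show`

# Offset the next read would have started at.
next_offset = LAST_OK_OFFSET + READ_LEN
completed_reads = next_offset // READ_LEN

print(f"completed reads: {completed_reads}")
print(f"next read starts at {next_offset:#x}")
print(f"first byte past gen 1 extent: {GEN1_LAST_BYTE + 1:#x}")
```

So the panic is hit on the first read that falls entirely outside the gen 1 extent.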

Expected behaviour: the 2nd GB of object data should be read from the 8 GB gen 0 extent located in tier 3 without any errors or panics.
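
To make the expected behaviour concrete, here is a rough sketch of how a read range could be split across generations. This is not how Motr implements composite layouts; it only assumes that the newest generation covering an offset wins and that ranges are inclusive. `resolve` and the `LAYERS` structure are hypothetical names for illustration:

```python
GIB = 1 << 30

# (generation, tier, [inclusive extents]), newest generation first.
# Values are taken from the `m0hsm show` output above.
LAYERS = [
    (2, 1, []),
    (1, 2, [(0x0, 0x3FFFFFFF)]),
    (0, 3, [(0x0, 0x1FFFFFFFF)]),
]


def resolve(offset, length, layers):
    """Split [offset, offset+length) into (gen, tier, offset, length) pieces."""
    pieces = []
    pos, end = offset, offset + length
    while pos < end:
        for i, (gen, tier, extents) in enumerate(layers):
            hit = next(((s, e) for s, e in extents if s <= pos <= e), None)
            if hit is None:
                continue
            upto = min(end, hit[1] + 1)
            # Stop early where a newer generation takes over again.
            for _, _, newer in layers[:i]:
                for start, _ in newer:
                    if pos < start < upto:
                        upto = start
            pieces.append((gen, tier, pos, upto - pos))
            pos = upto
            break
        else:
            raise ValueError(f"offset {pos:#x} is not covered by any extent")
    return pieces


for gen, tier, off, length in resolve(0, 2 * GIB, LAYERS):
    print(f"gen {gen}, tier {tier}: off={off:#x} len={length:#x}")
```

Under these assumptions a 2 GB read would be served as 1 GB from gen 1 (tier 2) followed by 1 GB from gen 0 (tier 3), with no empty segment list in between.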

Link to Venky fork?!?

The README is looking nice! However, it is linking to a Venky fork . . .
https://github.com/Seagate/cortx-motr/blob/dev/README.rst

Says this: "Refer Reading - list for complete information."

And then "Reading list" is a link to a file in Venky's fork: https://github.com/VenkyOS/cortx-motr/blob/dev/doc/reading-list.md and that file contains tons of URLs that return 404.

@VenkyOS, please clean all this up. There should not be links to your fork. You should get reading-list.md added to the cortx-motr repo and then remove all the links to files which can't be found.

@nikitadanilov @max-seagate , please feel free to provide links to files that the community can read to learn about motr and Venky can get them into github if they aren't already.
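
A small script could catch this kind of thing before it lands. A minimal sketch in Python, assuming the docs are plain text and that any GitHub link whose owner is not `Seagate` should be flagged; `links_outside_owner` is a hypothetical helper and the sample text just reuses the two URLs from this issue. It does not check for 404s, which would need network access:

```python
import re

# Matches GitHub URLs and captures owner and repository.
GITHUB_URL = re.compile(r"https://github\.com/([\w.-]+)/([\w.-]+)[^\s)\"'>]*")


def links_outside_owner(text, owner="Seagate"):
    """Return GitHub URLs in text whose owner differs from `owner`."""
    return [
        m.group(0)
        for m in GITHUB_URL.finditer(text)
        if m.group(1).lower() != owner.lower()
    ]


sample = """
README: https://github.com/Seagate/cortx-motr/blob/dev/README.rst
Reading list: https://github.com/VenkyOS/cortx-motr/blob/dev/doc/reading-list.md
"""

for url in links_outside_owner(sample):
    print("link to a fork:", url)
```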

44motr-rm-lock-cc-io object got corrupted

[root@ssc-vm-1061 cortx-motr]# ./scripts/m0 run-st 44motr-rm-lock-cc-io
----- run_st 44motr-rm-lock-cc-io -----
<< 44motr-rm-lock-cc-io >>
Motr RM lock CC_IO Test ...
n k p:2 1 4
vm.max_map_count = 30000000
motr_service_start: (N,K,P)=(3,2,20) nr_ios=4 multiple_pools=0
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00156093 s, 672 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00119249 s, 879 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00168613 s, 622 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00145523 s, 721 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00113161 s, 927 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00144263 s, 727 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00112496 s, 932 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00147225 s, 712 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00144883 s, 724 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00152308 s, 688 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00122139 s, 859 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00134393 s, 780 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00104338 s, 1.0 GB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00135974 s, 771 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00147772 s, 710 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00164317 s, 638 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00129957 s, 807 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00154573 s, 678 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00140293 s, 747 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0015702 s, 668 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00141796 s, 739 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00144047 s, 728 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00149196 s, 703 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00105669 s, 992 MB/s

[152:
{0x74| ((^t|1:0), 1, (11, 22), ^o|2:9, ^v|1:20, 1,
[1: "20 3 2"],
[1: ^n|1:2],
[1: ^S|1:6],
[3: ^o|1:9, ^o|20:1, ^o|2:9],
[1: ^p|1:0], [0])},
{0x70| ((^p|1:0), [3: ^o|1:9, ^o|20:1, ^o|2:9])},
{0x6e| ((^n|1:2), 16000, 2, 3, 2, [8: ^r|1:100, ^r|1:0, ^r|1:1, ^r|1:2, ^r|1:3, ^r|1:4, ^r|1:5, ^r|1:6])},

{0x72| ((^r|1:100), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:1", [1: ^s|1:101])},
{0x72| ((^r|1:0), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:900", [8: ^s|1:0, ^s|11:0, ^s|6:0, ^s|7:0, ^s|3:0, ^s|13:0, ^s|15:0, ^s|16:0])},
{0x72| ((^r|1:1), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:901", [8: ^s|1:1, ^s|11:1, ^s|6:1, ^s|7:1, ^s|3:1, ^s|13:1, ^s|15:1, ^s|16:1])},
{0x72| ((^r|1:2), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:902", [8: ^s|1:2, ^s|11:2, ^s|6:2, ^s|7:2, ^s|3:2, ^s|13:2, ^s|15:2, ^s|16:2])},
{0x72| ((^r|1:3), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:903", [8: ^s|1:3, ^s|11:3, ^s|6:3, ^s|7:3, ^s|3:3, ^s|13:3, ^s|15:3, ^s|16:3])},
{0x72| ((^r|1:4), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:800", [3: ^s|2:0, ^s|12:0, ^s|3:4])},
{0x72| ((^r|1:5), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:34:1",
[3: ^s|1:6, ^s|1:7, ^s|3:6])},
{0x72| ((^r|1:6), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:100",
[2: ^s|8:0, ^s|3:7])},
{0x73| ((^s|8:0), @M0_CST_CONFD, [1: "192.168.47.113@tcp:12345:33:100"], [0], [0])},
{0x73| ((^s|1:6), @M0_CST_HA, [1: "192.168.47.113@tcp:12345:34:1"], [0], [0])},
{0x73| ((^s|1:7), @M0_CST_FIS, [1: "192.168.47.113@tcp:12345:34:1"], [0], [0])},
{0x73| ((^s|1:101), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:1"], [0], [0])},

{0x73| ((^s|2:0), @M0_CST_MDS, [1: "192.168.47.113@tcp:12345:33:800"], [0], [0])},
{0x73| ((^s|12:0), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:800"], [0], [0])},
{0x73| ((^s|3:4), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:800"], [0], [0])},

{0x73| ((^s|1:0), @M0_CST_IOS, [1: "192.168.47.113@tcp:12345:33:900"], [0], [5: ^d|1:1, ^d|1:2, ^d|1:3, ^d|1:4, ^d|1:5])},
{0x73| ((^s|11:0), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|6:0), @M0_CST_SNS_REP, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|7:0), @M0_CST_SNS_REB, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|3:0), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|13:0), @M0_CST_CAS, [1: "192.168.47.113@tcp:12345:33:900"], [0], [1: ^d|20:0])},
{0x73| ((^s|15:0), @M0_CST_DIX_REP, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|16:0), @M0_CST_DIX_REB, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|1:1), @M0_CST_IOS, [1: "192.168.47.113@tcp:12345:33:901"], [0], [5: ^d|1:6, ^d|1:7, ^d|1:8, ^d|1:9, ^d|1:10])},
{0x73| ((^s|11:1), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|6:1), @M0_CST_SNS_REP, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|7:1), @M0_CST_SNS_REB, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|3:1), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|13:1), @M0_CST_CAS, [1: "192.168.47.113@tcp:12345:33:901"], [0], [1: ^d|20:1])},
{0x73| ((^s|15:1), @M0_CST_DIX_REP, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|16:1), @M0_CST_DIX_REB, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|1:2), @M0_CST_IOS, [1: "192.168.47.113@tcp:12345:33:902"], [0], [5: ^d|1:11, ^d|1:12, ^d|1:13, ^d|1:14, ^d|1:15])},
{0x73| ((^s|11:2), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|6:2), @M0_CST_SNS_REP, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|7:2), @M0_CST_SNS_REB, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|3:2), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|13:2), @M0_CST_CAS, [1: "192.168.47.113@tcp:12345:33:902"], [0], [1: ^d|20:2])},
{0x73| ((^s|15:2), @M0_CST_DIX_REP, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|16:2), @M0_CST_DIX_REB, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|1:3), @M0_CST_IOS, [1: "192.168.47.113@tcp:12345:33:903"], [0], [5: ^d|1:16, ^d|1:17, ^d|1:18, ^d|1:19, ^d|1:20])},
{0x73| ((^s|11:3), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|6:3), @M0_CST_SNS_REP, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|7:3), @M0_CST_SNS_REB, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|3:3), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|13:3), @M0_CST_CAS, [1: "192.168.47.113@tcp:12345:33:903"], [0], [1: ^d|20:3])},
{0x73| ((^s|15:3), @M0_CST_DIX_REP, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|16:3), @M0_CST_DIX_REB, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},

{0x73| ((^s|3:6), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:34:1"], [0], [0])},
{0x73| ((^s|3:7), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:100"], [0], [0])},
{0x64| ((^d|1:1), 0, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop1")},
{0x6b| ((^k|1:1), ^d|1:1, [1: ^v|1:10])},
{0x6a| ((^j|1:1), ^k|1:1, [0])},
{0x64| ((^d|1:2), 1, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop2")},
{0x6b| ((^k|1:2), ^d|1:2, [1: ^v|1:10])},
{0x6a| ((^j|1:2), ^k|1:2, [0])},
{0x64| ((^d|1:3), 2, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop3")},
{0x6b| ((^k|1:3), ^d|1:3, [1: ^v|1:10])},
{0x6a| ((^j|1:3), ^k|1:3, [0])},
{0x64| ((^d|1:4), 3, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop4")},
{0x6b| ((^k|1:4), ^d|1:4, [1: ^v|1:10])},
{0x6a| ((^j|1:4), ^k|1:4, [0])},
{0x64| ((^d|1:5), 4, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop5")},
{0x6b| ((^k|1:5), ^d|1:5, [1: ^v|1:10])},
{0x6a| ((^j|1:5), ^k|1:5, [0])},
{0x64| ((^d|1:6), 5, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop6")},
{0x6b| ((^k|1:6), ^d|1:6, [1: ^v|1:10])},
{0x6a| ((^j|1:6), ^k|1:6, [0])},
{0x64| ((^d|1:7), 6, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop7")},
{0x6b| ((^k|1:7), ^d|1:7, [1: ^v|1:10])},
{0x6a| ((^j|1:7), ^k|1:7, [0])},
{0x64| ((^d|1:8), 7, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop8")},
{0x6b| ((^k|1:8), ^d|1:8, [1: ^v|1:10])},
{0x6a| ((^j|1:8), ^k|1:8, [0])},
{0x64| ((^d|1:9), 8, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop9")},
{0x6b| ((^k|1:9), ^d|1:9, [1: ^v|1:10])},
{0x6a| ((^j|1:9), ^k|1:9, [0])},
{0x64| ((^d|1:10), 9, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop10")},
{0x6b| ((^k|1:10), ^d|1:10, [1: ^v|1:10])},
{0x6a| ((^j|1:10), ^k|1:10, [0])},
{0x64| ((^d|1:11), 10, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop11")},
{0x6b| ((^k|1:11), ^d|1:11, [1: ^v|1:10])},
{0x6a| ((^j|1:11), ^k|1:11, [0])},
{0x64| ((^d|1:12), 11, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop12")},
{0x6b| ((^k|1:12), ^d|1:12, [1: ^v|1:10])},
{0x6a| ((^j|1:12), ^k|1:12, [0])},
{0x64| ((^d|1:13), 12, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop13")},
{0x6b| ((^k|1:13), ^d|1:13, [1: ^v|1:10])},
{0x6a| ((^j|1:13), ^k|1:13, [0])},
{0x64| ((^d|1:14), 13, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop14")},
{0x6b| ((^k|1:14), ^d|1:14, [1: ^v|1:10])},
{0x6a| ((^j|1:14), ^k|1:14, [0])},
{0x64| ((^d|1:15), 14, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop15")},
{0x6b| ((^k|1:15), ^d|1:15, [1: ^v|1:10])},
{0x6a| ((^j|1:15), ^k|1:15, [0])},
{0x64| ((^d|1:16), 15, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop16")},
{0x6b| ((^k|1:16), ^d|1:16, [1: ^v|1:10])},
{0x6a| ((^j|1:16), ^k|1:16, [0])},
{0x64| ((^d|1:17), 16, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop17")},
{0x6b| ((^k|1:17), ^d|1:17, [1: ^v|1:10])},
{0x6a| ((^j|1:17), ^k|1:17, [0])},
{0x64| ((^d|1:18), 17, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop18")},
{0x6b| ((^k|1:18), ^d|1:18, [1: ^v|1:10])},
{0x6a| ((^j|1:18), ^k|1:18, [0])},
{0x64| ((^d|1:19), 18, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop19")},
{0x6b| ((^k|1:19), ^d|1:19, [1: ^v|1:10])},
{0x6a| ((^j|1:19), ^k|1:19, [0])},
{0x64| ((^d|1:20), 19, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop20")},
{0x6b| ((^k|1:20), ^d|1:20, [1: ^v|1:10])},
{0x6a| ((^j|1:20), ^k|1:20, [0])},
{0x53| ((^S|1:6), [1: ^a|1:6], [3: ^v|1:10, ^v|1:20, ^v|2:10])},
{0x61| ((^a|1:6), [1: ^e|1:7], [3: ^v|1:10, ^v|1:20, ^v|2:10])},
{0x65| ((^e|1:7), [1: ^c|1:8], [3: ^v|1:10, ^v|1:20, ^v|2:10])},
{0x63| ((^c|1:8), ^n|1:2, [24: ^k|1:1, ^k|1:2, ^k|1:3, ^k|1:4, ^k|1:5, ^k|1:6, ^k|1:7, ^k|1:8, ^k|1:9, ^k|1:10, ^k|1:11, ^k|1:12, ^k|1:13, ^k|1:14, ^k|1:15, ^k|1:16, ^k|1:17, ^k|1:18, ^k|1:19, ^k|1:20, ^k|20:0, ^k|20:1, ^k|20:2, ^k|20:3],
[3: ^v|1:10, ^v|1:20, ^v|2:10])},
{0x6f| ((^o|1:9), 0, [3: ^v|1:10, ^v|0x40000000000001:11, ^v|0x40000000000001:12])},
{0x76| ((^v|1:10), {0| (3, 2,
20,
[5: 0, 0, 0, 0, 2],
[1: ^j|1:21])})},
{0x76| ((^v|0x40000000000001:11), {1| (0, ^v|1:10, [5: 0, 0, 0, 0, 1])})},
{0x76| ((^v|0x40000000000001:12), {1| (1, ^v|1:10, [5: 0, 0, 0, 0, 2])})},
{0x6a| ((^j|1:21), ^S|1:6, [1: ^j|1:22])},
{0x6a| ((^j|1:22), ^a|1:6, [1: ^j|1:23])},
{0x6a| ((^j|1:23), ^e|1:7, [1: ^j|1:24])},
{0x6a| ((^j|1:24), ^c|1:8, [20: ^j|1:1, ^j|1:2, ^j|1:3, ^j|1:4, ^j|1:5, ^j|1:6, ^j|1:7, ^j|1:8, ^j|1:9, ^j|1:10, ^j|1:11, ^j|1:12, ^j|1:13, ^j|1:14, ^j|1:15, ^j|1:16, ^j|1:17, ^j|1:18, ^j|1:19, ^j|1:20])},
{0x6f| ((^o|2:9), 0, [1: ^v|2:10])},
{0x76| ((^v|2:10), {0| (4, 0, 4, [5: 0, 0, 0, 0, 1], [1: ^j|2:21])})},
{0x6a| ((^j|2:21), ^S|1:6, [1: ^j|2:22])},
{0x6a| ((^j|2:22), ^a|1:6, [1: ^j|2:23])},
{0x6a| ((^j|2:23), ^e|1:7, [1: ^j|2:24])},
{0x6a| ((^j|2:24), ^c|1:8, [4: ^j|2:1, ^j|2:6, ^j|2:11, ^j|2:16])},
{0x6a| ((^j|2:1), ^k|1:1, [0])},
{0x6a| ((^j|2:6), ^k|1:6, [0])},
{0x6a| ((^j|2:11), ^k|1:11, [0])},
{0x6a| ((^j|2:16), ^k|1:16, [0])},
{0x64| ((^d|20:0), 20, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop25")},
{0x6b| ((^k|20:0), ^d|20:0, [1: ^v|1:20])},
{0x6a| ((^j|20:100), ^k|20:0, [0])},
{0x64| ((^d|20:1), 21, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop26")},
{0x6b| ((^k|20:1), ^d|20:1, [1: ^v|1:20])},
{0x6a| ((^j|20:101), ^k|20:1, [0])},
{0x64| ((^d|20:2), 22, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop27")},
{0x6b| ((^k|20:2), ^d|20:2, [1: ^v|1:20])},
{0x6a| ((^j|20:102), ^k|20:2, [0])},
{0x64| ((^d|20:3), 23, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop28")},
{0x6b| ((^k|20:3), ^d|20:3, [1: ^v|1:20])},
{0x6a| ((^j|20:103), ^k|20:3, [0])},
{0x6f| ((^o|20:1), 0, [1: ^v|1:20])},
{0x76| ((^v|1:20), {0| (1, 1, 4,
[5: 0, 0, 0, 0, 1],
[1: ^j|20:1])})},
{0x6a| ((^j|20:1), ^S|1:6, [1: ^j|20:2])},
{0x6a| ((^j|20:2), ^a|1:6, [1: ^j|20:3])},
{0x6a| ((^j|20:3), ^e|1:7, [1: ^j|20:4])},
{0x6a| ((^j|20:4), ^c|1:8, [4: ^j|20:100, ^j|20:101, ^j|20:102, ^j|20:103])}]
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -T linux -e lnet:192.168.47.113@tcp:12345:35:1 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd && exec /var/cortx/cortx-motr/motr/m0d -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -f '<0x7200000000000001:6>' -T linux -e lnet:192.168.47.113@tcp:12345:33:100 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0d.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ha && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -T ad -e lnet:192.168.47.113@tcp:12345:35:1 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
lt-m0d: systemd notifications not allowed

Started
Press CTRL+C to quit.
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/mds1 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:800 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios1 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:900 -f '<0x7200000000000001:0>' -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios2 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:901 -f '<0x7200000000000001:1>' -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios3 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:902 -f '<0x7200000000000001:2>' -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios4 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:903 -f '<0x7200000000000001:3>' -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ha && exec /var/cortx/cortx-motr/motr/m0d -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -T ad -e lnet:192.168.47.113@tcp:12345:34:1 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc -f '<0x7200000000000001:5>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
lt-m0d: systemd notifications not allowed

Started
Press CTRL+C to quit.
Motr HA agent started.
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/mds1 && exec /var/cortx/cortx-motr/motr/m0d -T ad -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:800 -f '<0x7200000000000001:4>' -H 192.168.47.113@tcp:12345:34:1 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0d.log
lt-m0d: systemd notifications not allowed

Started
Press CTRL+C to quit.
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios1 && exec /var/cortx/cortx-motr/motr/m0d -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:900 -f '<0x7200000000000001:0>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios2 && exec /var/cortx/cortx-motr/motr/m0d -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:901 -f '<0x7200000000000001:1>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios3 && exec /var/cortx/cortx-motr/motr/m0d -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:902 -f '<0x7200000000000001:2>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios4 && exec /var/cortx/cortx-motr/motr/m0d -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:903 -f '<0x7200000000000001:3>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
Motr confd started.
Motr mdservices started.
lt-m0d: systemd notifications not allowed

lt-m0d: systemd notifications not allowed

lt-m0d: systemd notifications not allowed

lt-m0d: systemd notifications not allowed

Started
Press CTRL+C to quit.
Started
Press CTRL+C to quit.
Started
Press CTRL+C to quit.
Started
Press CTRL+C to quit.
Motr ioservices started.
motr service started
*** m0dixinit is omitted. Mkfs creates meta indices now.
Read obj while write/update is in process.
Delete obj while write/update is in process.
Delete obj while read is in process.
Test exclusivity among Readers and Writers
Test exclusivity among Writers
Binary files /var/motr/src_file1 and /var/motr/dest_file differ
Files differ, object got corrupted
=== pids of services: 3903 3896 3890 3886 3794 3702 3055 ===
Shutting down services one by one. mdservice is the last.
----- 3903 stopping--------lt-m0d: got signal 1
motr[03903]: 9cd0 ERROR [conf/rconfc.c:1199:rconfc_fail] rconfc: 0x7ffd23d40430, state M0_RCS_IDLE failed with -22
----- 3903 stopped --------
----- 3896 stopping--------lt-m0d: got signal 1
motr[03896]: 4d10 ERROR [conf/rconfc.c:1199:rconfc_fail] rconfc: 0x7ffdd05bb470, state M0_RCS_IDLE failed with -22
----- 3896 stopped --------
----- 3890 stopping--------lt-m0d: got signal 1
motr[03890]: c950 ERROR [conf/rconfc.c:1199:rconfc_fail] rconfc: 0x7ffdc84830b0, state M0_RCS_IDLE failed with -22
----- 3890 stopped --------
----- 3886 stopping--------lt-m0d: got signal 1
motr[03886]: 3670 ERROR [conf/rconfc.c:1199:rconfc_fail] rconfc: 0x7ffddb5a9dd0, state M0_RCS_IDLE failed with -22
----- 3886 stopped --------
----- 3794 stopping--------lt-m0d: got signal 1
----- 3794 stopped --------
----- 3702 stopping--------lt-m0d: got signal 1
motr[03702]: 8860 WARN [ha/link.c:1513:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.113@tcp:12345:33:800
motr[03702]: 8860 WARN [ha/link.c:1513:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.113@tcp:12345:33:900
----- 3702 stopped --------
----- 3055 stopping--------lt-m0d: got signal 1
----- 3055 stopped --------
m0tr 13943182 0
galois 22944 1 m0tr
lnet 586401 3 m0tr,ksocklnd
Motr services stopped.

Test log file available at /var/motr/motr_2020-10-24_21:57:36.log
Motr trace files are available at: /var/motr/motr
178.28user 23.14system 4:17.58elapsed 78%CPU (0avgtext+0avgdata 768680maxresident)k
254136inputs+4682744outputs (95major+2196853minor)pagefaults 0swaps

It seems the read/write path has a problem: the object read back differs from the source file.
Is this expected?
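
To narrow this down it would help to know where the two files start to differ. A minimal sketch in Python; `first_difference` is a hypothetical helper, not part of the test suite, and the demo below runs on temporary files rather than on `/var/motr/src_file1` and `/var/motr/dest_file`:

```python
import os
import tempfile


def first_difference(path_a, path_b, chunk=1 << 20):
    """Return the first byte offset where two files differ, or None if equal."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            ca, cb = a.read(chunk), b.read(chunk)
            if ca != cb:
                for i, (x, y) in enumerate(zip(ca, cb)):
                    if x != y:
                        return offset + i
                # One file is a prefix of the other.
                return offset + min(len(ca), len(cb))
            if not ca:
                return None
            offset += len(ca)


# Demo on synthetic data: flip one byte at offset 3000.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src_file1")
    dst = os.path.join(tmp, "dest_file")
    data = bytes(4096)
    corrupted = bytearray(data)
    corrupted[3000] = 0xFF
    with open(src, "wb") as f:
        f.write(data)
    with open(dst, "wb") as f:
        f.write(corrupted)
    demo_offset = first_difference(src, dst)

print("first differing offset:", demo_offset)
```

If the offset turns out to be aligned to the unit size or to one of the 1 MB writes, that would point at a particular writer rather than at random corruption.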

Improve links in developer's guide

The developer's guide (https://github.com/Seagate/cortx-motr/blob/main/doc/motr-developer-guide.md) has three main sections:

  1. A simple Cortx Motr application (object)
  2. A simple Cortx Motr application (index)
  3. More examples, utilities, and applications

Right at the bottom of the intro (right before the first section, A simple Cortx Motr application (object)), we should add:
This document contains the following sections:

  1. A simple Cortx Motr application (object)
  2. A simple Cortx Motr application (index)
  3. More examples, utilities, and applications

And have these be (relevant) links to jump directly to those sections.

In the third section (More examples, utilities, and applications), there are a bunch of references to files (e.g. motr/st/utils/touch.c) which aren't links. These should be (relative) links to jump directly to the files.
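
Both kinds of links can be generated mechanically. A rough Python sketch: `github_anchor` only approximates how GitHub derives heading anchors (lowercase, punctuation dropped, spaces turned into hyphens), so the result should be checked against the rendered page, and both helper names are made up for this example:

```python
import posixpath
import re

SECTIONS = [
    "A simple Cortx Motr application (object)",
    "A simple Cortx Motr application (index)",
    "More examples, utilities, and applications",
]


def github_anchor(title):
    """Approximate the anchor GitHub generates for a Markdown heading."""
    slug = re.sub(r"[^\w\s-]", "", title.lower())
    return "#" + re.sub(r"\s", "-", slug)


def relative_link(target, doc_path):
    """Path to `target` relative to the directory holding `doc_path`."""
    return posixpath.relpath(target, start=posixpath.dirname(doc_path))


# Table of contents for the intro.
for number, title in enumerate(SECTIONS, 1):
    print(f"{number}. [{title}]({github_anchor(title)})")

# Relative link for a file mentioned in the third section.
path = "motr/st/utils/touch.c"
link = relative_link(path, "doc/motr-developer-guide.md")
print(f"[{path}]({link})")
```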

FYI @huanghua78 and @andriytk and @VenkyOS

cortx-motr 51kem missing x86_64 kernel/module

Systemtap fails with semantic errors (missing kernel debuginfo), so why is the test status still SUCCESS?

[root@ssc-vm-0925 cortx-motr]# ./scripts/m0 run-st 51kem
----- run_st 51kem -----
<< 51kem >>
make -C /lib/modules/3.10.0-1062.el7.x86_64/build M=/home/670951/cortx-motr/scripts/systemtap/kem modules
make[1]: Entering directory `/usr/src/kernels/3.10.0-1062.el7.x86_64'
  CC [M]  /home/670951/cortx-motr/scripts/systemtap/kem/kemd.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /home/670951/cortx-motr/scripts/systemtap/kem/kemd.mod.o
  LD [M]  /home/670951/cortx-motr/scripts/systemtap/kem/kemd.ko
make[1]: Leaving directory `/usr/src/kernels/3.10.0-1062.el7.x86_64'
Inserting kemd.ko
Running Systemtap
semantic error: while resolving probe point: identifier 'kernel' at /usr/share/systemtap/tapset/linux/memory.stp:68:8
source: kernel.function("__handle_mm_fault@mm/memory.c").call
^

semantic error: missing x86_64 kernel/module debuginfo [man warning::debuginfo] under '/lib/modules/3.10.0-1062.el7.x86_64/build'

semantic error: while resolving probe point: identifier 'kernel' at :67:22
source: probe vm.pagefault = kernel.function("handle_mm_fault@mm/memory.c").call !,
^

semantic error: missing x86_64 kernel/module debuginfo [man warning::debuginfo] under '/lib/modules/3.10.0-1062.el7.x86_64/build'

semantic error: while resolving probe point: identifier 'vm' at /home/670951/cortx-motr/scripts/systemtap/kem/kemd.stp:137:7
source: probe vm.pagefault {
^

semantic error: no match

Missing separate debuginfos, use: debuginfo-install kernel-3.10.0-1062.el7.x86_64
Pass 2: analysis failed. [man error::pass2]
Number of similar error messages suppressed: 3.
Rerun with -v to see them.
Running KEM clients
Collecting kernel events...
Shutdown Systemtap
/home/670951/cortx-motr/scripts/systemtap/kem/kem_run.sh: line 43: kill: (32613) - No such process
Removing kemd.ko
make -C /lib/modules/3.10.0-1062.el7.x86_64/build M=/home/670951/cortx-motr/scripts/systemtap/kem clean
make[1]: Entering directory `/usr/src/kernels/3.10.0-1062.el7.x86_64'
  CLEAN   /home/670951/cortx-motr/scripts/systemtap/kem/.tmp_versions
  CLEAN   /home/670951/cortx-motr/scripts/systemtap/kem/Module.symvers
make[1]: Leaving directory `/usr/src/kernels/3.10.0-1062.el7.x86_64'
kem_run: test status: SUCCESS
121.88user 108.38system 1:17.40elapsed 297%CPU (0avgtext+0avgdata 314756maxresident)k
42800inputs+839600outputs (53major+1952793minor)pagefaults 0swaps
[root@ssc-vm-0925 cortx-motr]# uname -r
3.10.0-1062.el7.x86_64
[root@ssc-vm-0925 cortx-motr]# cd /lib/modules
[root@ssc-vm-0925 modules]# ls
3.10.0-1062.9.1.el7.x86_64 3.10.0-1062.el7.x86_64
[root@ssc-vm-0925 modules]# cd 3.10.0-1062.el7.x86_64
[root@ssc-vm-0925 3.10.0-1062.el7.x86_64]# ls
build modules.alias modules.builtin modules.dep.bin modules.modesetting modules.softdep source weak-updates
extra modules.alias.bin modules.builtin.bin modules.devname modules.networking modules.symbols updates
kernel modules.block modules.dep modules.drm modules.order modules.symbols.bin vdso
[root@ssc-vm-0925 3.10.0-1062.el7.x86_64]# cd build
[root@ssc-vm-0925 build]# ls
arch crypto firmware include ipc kernel Makefile mm net scripts sound tools virt
block drivers fs init Kconfig lib Makefile.qlock Module.symvers samples security System.map usr vmlinux.id
[root@ssc-vm-0925 build]# cd kernel
[root@ssc-vm-0925 kernel]# ls
bpf debug gcov Kconfig.freezer Kconfig.locks livepatch power time
cpu events irq Kconfig.hz Kconfig.preempt Makefile sched trace
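
I have not looked at how `kem_run.sh` computes its status, so this is only a sketch of the kind of check that seems to be missing: treat the run as failed when the Systemtap output contains a semantic error or a failed pass. The patterns are taken from the output above; `systemtap_failed` is a hypothetical name:

```python
import re

# Lines that indicate Systemtap did not actually start probing.
FAILURE_PATTERNS = [
    re.compile(r"^semantic error:"),
    re.compile(r"^Pass \d+: \w+ failed"),
]


def systemtap_failed(output):
    """True if any line of the captured output matches a failure pattern."""
    return any(
        pattern.search(line)
        for line in output.splitlines()
        for pattern in FAILURE_PATTERNS
    )


SAMPLE = """Inserting kemd.ko
Running Systemtap
semantic error: no match
Pass 2: analysis failed. [man error::pass2]
Running KEM clients
"""

status = "FAILURE" if systemtap_failed(SAMPLE) else "SUCCESS"
print("kem_run: test status:", status)
```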

cortx-motr 08spiel-sns-repair-quiesce m0t1fs read failed

SNS Repair done.
verifying ...
m0t1fs read failed
Failed: SNS repair failed..
/var/motr/root/sandbox.st-08spiel-sns-repair-quiesce
unmounting and cleaning..
=== pids of services: 2973 2966 2960 2956 2864 2772 2128 ===
Shutting down services one by one. mdservice is the last.
----- 2973 stopping--------lt-m0d: got signal 1
motr[02973]: 87e0 ERROR [conf/rconfc.c:1181:rconfc_fail] rconfc: 0x7fffb79def38, state M0_RCS_IDLE failed with -22
motr[02973]: c920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02973]: c920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02973]: c920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02973]: c920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
----- 2973 stopped --------
----- 2966 stopping--------lt-m0d: got signal 1
motr[02966]: f20 ERROR [conf/rconfc.c:1181:rconfc_fail] rconfc: 0x7ffc70317678, state M0_RCS_IDLE failed with -22
motr[02966]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02966]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02966]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02966]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
----- 2966 stopped --------
----- 2960 stopping--------lt-m0d: got signal 1
motr[02960]: 4f00 ERROR [conf/rconfc.c:1181:rconfc_fail] rconfc: 0x7ffcee29b658, state M0_RCS_IDLE failed with -22
motr[02960]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02960]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02960]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02960]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
----- 2960 stopped --------
----- 2956 stopping--------lt-m0d: got signal 1
motr[02956]: e00 ERROR [conf/rconfc.c:1181:rconfc_fail] rconfc: 0x7fff83e07558, state M0_RCS_IDLE failed with -22
motr[02956]: 9920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02956]: 9920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02956]: 9920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
motr[02956]: 9920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:1003
----- 2956 stopped --------
----- 2864 stopping--------lt-m0d: got signal 1
----- 2864 stopped --------
----- 2772 stopping--------lt-m0d: got signal 1
motr[02772]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:800
motr[02772]: d920 WARN [ha/link.c:1511:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.76@tcp:12345:33:900
----- 2772 stopped --------
----- 2128 stopping--------lt-m0d: got signal 1
----- 2128 stopped --------
m0tr 13920594 0
galois 22944 1 m0tr
lnet 586401 3 m0tr,ksocklnd
Motr services stopped.
Test log available at /var/motr/root/sandbox.st-08spiel-sns-repair-quiesce/motr_2020-08-21_00:28:54.log.
spiel-sns-repair-quiesce: FAILURE 1
183.60user 105.39system 4:43.16elapsed 102%CPU (0avgtext+0avgdata 830480maxresident)k
5852256inputs+10666112outputs (100major+2593141minor)pagefaults 0swaps

/var/motr/root/sandbox.st-08spiel-sns-repair-quiesce/motr_2020-08-21_00:28:54.log:

500+0 records in
500+0 records out
2048000 bytes (2.0 MB) copied, 5.10242 s, 401 kB/s
700+0 records in
700+0 records out
5734400 bytes (5.7 MB) copied, 0.00718364 s, 798 MB/s
700+0 records in
700+0 records out
5734400 bytes (5.7 MB) copied, 8.00079 s, 717 kB/s
300+0 records in
300+0 records out
4915200 bytes (4.9 MB) copied, 0.00545938 s, 900 MB/s
dd: error reading ‘/tmp/test_m0t1fs_21-08-2020_00:28:54/0:10002’: Input/output error
138+0 records in
138+0 records out
2260992 bytes (2.3 MB) copied, 1.36536 s, 1.7 MB/s
Unmounting file system ...
mount | grep m0t1fs
Cleaning up mount test directory...
Unmounting file system ...
mount | grep m0t1fs
Cleaning up mount test directory...

c0cat crashes when there are no extents already created by HSM tool

m0hsm> create 800:800 1
Composite object successfully created with id=0x320:0x320
m0hsm> show 800:800
  - gen 0, tier 1, extents:  (writable)
m0hsm> show 800:800
  - gen 0, tier 1, extents:  (writable)
m0hsm> show 800:800
  - gen 0, tier 1, extents:  (writable)
umanesan1@client-21~[1020]c0cat 800 800 ./tmp/out 1024 1073741824 -p
bw = 000.0000 MB/s                                                                                   [ 00/01 ]
motr[04013]:  fca0  FATAL  [lib/assert.c:50:m0_panic]  panic: (valid_subobj_cnt != 0) at composite_io_divide() (motr/composite_layout.c:738)  [git: sage-base-1.0-170-g89f7737] 
Motr panic: (valid_subobj_cnt != 0) at composite_io_divide() motr/composite_layout.c:738 (errno: 0) (last failed: none) [git: sage-base-1.0-170-g89f7737] pid: 4013  
/lib64/libmotr.so.1(m0_arch_backtrace+0x2f)[0x7f2db09e8c9f]
/lib64/libmotr.so.1(m0_arch_panic+0xf3)[0x7f2db09e8e83]
/lib64/libmotr.so.1(+0x3353a4)[0x7f2db09d93a4]
/lib64/libmotr.so.1(+0x382bf0)[0x7f2db0a26bf0]
/lib64/libmotr.so.1(m0_obj_op+0x2f9)[0x7f2db0a16729]
c0cat[0x405e72]
c0cat(c0appz_cat+0x217)[0x407db7]
c0cat(main+0x54e)[0x40464e]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f2db02f8555]
c0cat[0x4049e8]
Aborted

motr kernel unit tests report memory leaks

Hello-

The motr kernel unit tests (./scripts/m0 run-kut) report memory leaks. Is this explained or expected?

Thank you.

Details:

Jul 28 12:30:41 cortx-builder-vm1 kernel: #012Time: 105.15 sec, Mem: 369 MiB, Leaked: 13 MiB, Asserts: 36605691#012Unit tests status: SUCCESS

Full output: kut.txt

motr unit tests report memory leaks

ut.txt

Hello-

The motr unit tests report memory leaks. Are these explained or expected?

Thank you.

Details:
coroutine [Anatoliy] 0.00 sec 0 B
[ time: 9.08 sec, mem: 72 MiB, leaked: 62 MiB ]

sm-add [Nikita] 0.00 sec 332 KiB
[ time: 0.01 sec, mem: 38 MiB, leaked: 64 KiB ]

balloc 5.46 sec 1 GiB
[ time: 5.46 sec, mem: 1 GiB, leaked: 1 MiB ]

emap 0.81 sec 1 GiB
[ time: 165.85 sec, mem: 17 GiB, leaked: 7 MiB ]

bulkclient_test 16.00 sec 12 KiB
[ time: 16.00 sec, mem: 12 KiB, leaked: 56 B ]

server-restart-nomkfs [Egor] 0.77 sec 513 MiB
[ time: 36.50 sec, mem: 5 GiB, leaked: 8 MiB ]

m0_clovis_op_kick 0.00 sec 68 KiB
[ time: 2.05 sec, mem: 1 MiB, leaked: 12 KiB ]

clovis_obj_namei_cb_fini 0.00 sec 68 KiB
[ time: 0.09 sec, mem: 1 MiB, leaked: 24 KiB ]

m0_clovis_obj_op 0.00 sec 3 KiB
[ time: 0.00 sec, mem: 92 KiB, leaked: 2 KiB ]

pargrp_iomap_dgmode_recover 0.00 sec 4 KiB
[ time: 0.00 sec, mem: 153 KiB, leaked: 1 KiB ]

ioreq_dgmode_write 0.00 sec 7 KiB
[ time: 0.00 sec, mem: 166 KiB, leaked: 1 KiB ]

ioreq_fop_dgmode_read 0.00 sec 0 B
[ time: 0.00 sec, mem: 97 KiB, leaked: 1 KiB ]

clovis_sync_request_fop_send 0.00 sec 2 KiB
[ time: 0.00 sec, mem: 39 KiB, leaked: 36 KiB ]

m0_clovis_idx_service_config 0.00 sec 0 B
[ time: 0.00 sec, mem: 77 KiB, leaked: 992 B ]

lsf [Anatoliy] 0.01 sec 880 KiB
[ time: 51.97 sec, mem: 786 MiB, leaked: 299 KiB ]

composite_layer_idx_scan 0.00 sec 0 B
[ time: 0.00 sec, mem: 319 KiB, leaked: 5 KiB ]

cobfoms_utfini 0.40 sec 2 MiB
[ time: 1.46 sec, mem: 360 MiB, leaked: 12 KiB ]

dir-add-del 0.00 sec 528 B
[ time: 0.00 sec, mem: 66 KiB, leaked: 64 B ]

pver-find 0.00 sec 45 KiB
[ time: 0.00 sec, mem: 45 KiB, leaked: 24 B ]

glob-errors 0.00 sec 0 B
[ time: 0.00 sec, mem: 60 KiB, leaked: 96 B ]

validation 0.02 sec 601 KiB
[ time: 0.02 sec, mem: 601 KiB, leaked: 704 B ]

walk 0.00 sec 60 KiB
[ time: 0.00 sec, mem: 60 KiB, leaked: 64 B ]

user-concur-reb [Sergey] 0.89 sec 226 MiB
[ time: 14.28 sec, mem: 3 GiB, leaked: 4 MiB ]

dtx-fix 0.01 sec 4 KiB
[ time: 0.03 sec, mem: 21 KiB, leaked: 4 KiB ]

test_fd_mapping 0.50 sec 1 MiB
[ time: 0.52 sec, mem: 2 MiB, leaked: 344 B ]

fdmi-pd-fake-rec-reg 0.02 sec 48 KiB
[ time: 0.08 sec, mem: 240 KiB, leaked: 3 KiB ]

fdmi-sd-send-notif 0.02 sec 52 KiB
[ time: 0.04 sec, mem: 106 KiB, leaked: 3 KiB ]

filter-str-ops 0.00 sec 1 KiB
[ time: 0.00 sec, mem: 7 KiB, leaked: 72 B ]

delayed-unreachable 0.00 sec 2 KiB
[ time: 0.10 sec, mem: 7 MiB, leaked: 7 MiB ]

ha-usecase 0.02 sec 739 KiB
[ time: 1.72 sec, mem: 233 MiB, leaked: 441 KiB ]

layout-pdclust-instance-failure 0.00 sec 2 KiB
[ time: 0.00 sec, mem: 12 KiB, leaked: 1 KiB ]

client-server-bulk 11.23 sec 248 KiB
[ time: 29.66 sec, mem: 1 MiB, leaked: 64 KiB ]

mdservice-fini 0.58 sec 1 MiB
[ time: 1.08 sec, mem: 193 MiB, leaked: 194 KiB ]

reqh 0.89 sec 297 MiB
[ time: 0.89 sec, mem: 297 MiB, leaked: 126 KiB ]

creditor-reset 0.76 sec 232 MiB
[ time: 17.96 sec, mem: 4 GiB, leaked: 4 MiB ]

two-read-locks 0.77 sec 232 MiB
[ time: 6.95 sec, mem: 2 GiB, leaked: 5 MiB ]

pool-dix-rebalance 0.72 sec 92 MiB
[ time: 14.04 sec, mem: 1 GiB, leaked: 38 KiB ]
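To see which entries account for most of the leaked memory, the summary lines above can be parsed and ranked. This is only a minimal sketch: it assumes the `[ time: ..., mem: ..., leaked: ... ]` format shown in this report, and it labels each summary with the first word of the line preceding it (which in this excerpt may be a test name rather than a suite name). The function name `rank_leaks` is hypothetical.

```python
import re

# Binary unit multipliers as printed in the unit-test summary lines.
UNITS = {"B": 1, "KiB": 1 << 10, "MiB": 1 << 20, "GiB": 1 << 30}

# Matches e.g. "[ time: 9.08 sec, mem: 72 MiB, leaked: 62 MiB ]".
SUMMARY = re.compile(
    r"\[ time: [\d.]+ sec, mem: \d+ \w+, leaked: (\d+) (B|KiB|MiB|GiB) \]"
)


def rank_leaks(lines):
    """Return (label, leaked_bytes) pairs sorted by leaked bytes, largest first.

    The label is the first word of the non-empty line that precedes
    each summary line (an assumption about the report layout).
    """
    results = []
    last_name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = SUMMARY.search(line)
        if match:
            leaked = int(match.group(1)) * UNITS[match.group(2)]
            results.append((last_name, leaked))
        else:
            last_name = line.split()[0]
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = [
        "coroutine [Anatoliy] 0.00 sec 0 B",
        "[ time: 9.08 sec, mem: 72 MiB, leaked: 62 MiB ]",
        "",
        "bulkclient_test 16.00 sec 12 KiB",
        "[ time: 16.00 sec, mem: 12 KiB, leaked: 56 B ]",
    ]
    for name, leaked in rank_leaks(sample):
        print(name, leaked)
```

Feeding it the full ut.txt output should put the MiB-scale entries first, which may be a reasonable place to start investigating.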

m0d-confd process starts killing cpu after some time - connections leakage

On the SAGE cluster, the only configured confd process got stuck consuming 100% CPU. The m0trace log is full of lines like the following:

2662840296390  4941007.836.524400  7f7d94ff8710  DEBUG   rpc/formation2.c:585:frm_fill_packet_from_item_sources   conn: 0x7f7ad4413f10
2662840296391  4941007.836.524739  7f7d94ff8710  DEBUG   rpc/formation2.c:585:frm_fill_packet_from_item_sources   conn: 0x7f7ad440d170
2662840296392  4941007.836.525108  7f7d94ff8710  DEBUG   rpc/formation2.c:585:frm_fill_packet_from_item_sources   conn: 0x7f7ad4405790
2662840296393  4941007.836.525495  7f7d94ff8710  DEBUG   rpc/formation2.c:585:frm_fill_packet_from_item_sources   conn: 0x7f7ad43feaf0
2662840296394  4941007.836.525861  7f7d94ff8710  DEBUG   rpc/formation2.c:585:frm_fill_packet_from_item_sources   conn: 0x7f7ad43f7d70
2662840296395  4941007.836.526245  7f7d94ff8710  DEBUG   rpc/formation2.c:585:frm_fill_packet_from_item_sources   conn: 0x7f7ad43f0ef0
2662840296396  4941007.836.526584  7f7d94ff8710  DEBUG   rpc/formation2.c:585:frm_fill_packet_from_item_sources   conn: 0x7f7ad43ea170
2662840296397  4941007.836.526917  7f7d94ff8710  DEBUG   rpc/formation2.c:585:frm_fill_packet_from_item_sources   conn: 0x7f7ad43dd0f0

m0trace file - sage-confd-kills-cpu-m0trace-af6819c.36917.zip

Versions:

  • Motr - commit af6819c
  • Hare - commit 571ae39

The cluster configuration file - sage-proto-cdf.yaml.txt

Create artificial aging tool

It is very important to be able to test a system that is aged, fragmented, and almost full. There has been a lot of excellent academic research on this, for example:
https://research.cs.wisc.edu/adsl/Publications/impressions-fast09.pdf
https://www.usenix.org/system/files/hotstorage19-paper-conway.pdf

However, what I believe most people do when they want to study an aged file system is they write some data generation tool that basically just writes and deletes and writes and deletes real objects for a really long time until the system is aged and full. But with mass-capacity storage, this can take days or even weeks to age and fill a system. It works (at the end of the process you have an aged and full system that has appropriate metadata structures describing all of the data that has been put into the system and not deleted) but it is very slow to actually put all that data.

However, you maybe don't really need to actually put all that data; all you really need are the metadata structures describing that data. For example, using file system terms: at the end of the aging process you'll have a ton of inodes and a ton of data blocks but all you really need to study the full and aged system are the inodes whereas the data blocks themselves are not as important(*).

Instead of this very slow process, would it be possible to merely write a set of metadata structures that describe a system as it would look after being aged and fragmented? This could be done much much more quickly presumably.

(*) if you're studying compression or dedup or some other aspect of the system related to the data content, then yeah, the data blocks would be important. But for many aging/filling studies, they probably aren't.

cortx-motr 08spiel-sns-repair-quiesce seems to enter an infinite loop

The test has been running for more than three hours and is still running; it has also used more than 40 GB of disk space. System info: 8 vCPUs, 16 GB RAM, 8 x 100 GB disks. This behavior happens randomly.

Filesystem Size Used Avail Use% Mounted on
devtmpfs 7.8G 0 7.8G 0% /dev
tmpfs 7.8G 0 7.8G 0% /dev/shm
tmpfs 7.8G 1.0M 7.8G 1% /run
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/mapper/vg_sysvol-lv_root 25G 3.9G 20G 17% /
/dev/sda1 976M 119M 791M 14% /boot
/dev/mapper/vg_sysvol-lv_tmp 976M 12M 898M 2% /tmp
/dev/mapper/vg_sysvol-lv_var 406G 47G 343G 12% /var
/dev/mapper/vg_sysvol-lv_log 9.1G 121M 8.5G 2% /var/log
/dev/mapper/vg_sysvol-lv_audit 240M 4.2M 219M 2% /var/log/audit
ssc-nfs-home1.colo.seagate.com:/home/670951 1.5T 1.3T 214G 86% /home/670951
tmpfs 1.6G 0 1.6G 0% /run/user/670951
none 1.9T 671M 1.9T 1% /tmp/test_m0t1fs_16-08-2020_20:51:50
tmpfs 1.6G 0 1.6G 0% /run/user/0

From the source code:

spiel_wait_for_sns_repair()
{
echo $M0_SRC_DIR/utils/spiel/m0spiel $SPIEL_OPTS
    $M0_SRC_DIR/utils/spiel/m0spiel $SPIEL_OPTS <<EOF
import time
$SPIEL_FIDS_LIST

$SPIEL_RCONF_START

one_status = SpielSnsStatus()
ppstatus = pointer(one_status)
active = 0
while (1):
    active = 0
    rc = spiel.sns_repair_status(fids['pool'], ppstatus)
    print ("sns repair status responded servers: " + str(rc))
    for i in range(0, rc):
        print "status of ", ppstatus[i].sss_fid, " is: ", ppstatus[i].sss_state
        if (ppstatus[i].sss_state == 2) :
            print "sns is still active on ", ppstatus[i].sss_fid
            active = 1
    if (active == 0):
        break;
    time.sleep(3)

$SPIEL_RCONF_STOP
EOF
}

If active never goes back to 0 in the while loop, the loop runs forever. Maybe we need to add a time limit to the code? For example, if the test has not finished after one hour, break out of the loop.
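A possible shape for such a time limit is sketched below. It is not the actual test code: `poll` is a hypothetical callable standing in for the `spiel.sns_repair_status()` call plus the loop over `sss_state`, and the one-hour default is just the figure suggested above. The only value taken from the script is the state code 2, which the script treats as "still active".

```python
import time

ACTIVE = 2  # state value the test script treats as "sns is still active"


def wait_until_idle(poll, timeout_s=3600.0, interval_s=3.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until no server reports ACTIVE, or give up after timeout_s.

    poll() is a hypothetical callable returning a list of per-server
    state codes. Returns True if the repair became idle, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        states = poll()
        if all(state != ACTIVE for state in states):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status sequence: active twice, then idle.
    replies = iter([[2, 1], [2], [1, 1]])
    print(wait_until_idle(lambda: next(replies), timeout_s=10, interval_s=0))
```

On timeout the caller can mark the test as failed instead of looping forever. Using a monotonic clock keeps the deadline unaffected by wall-clock adjustments.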

For comparison, 08spiel-sns-repair ran OK:
spiel-sns-repair: test status: SUCCESS
620.99user 316.33system 11:08.12elapsed 140%CPU (0avgtext+0avgdata 1726428maxresident)k
23180440inputs+26713280outputs (562major+9612693minor)pagefaults 0swaps

Can/should we create Doxygen?

Hello @nikitadanilov ,

In https://github.com/Seagate/cortx-motr/pull/233/files/ba835da44d825a329c7c4694087f9e3b7ebea369..dd549c1eec91f9c8bae2d4be430547aea3957c08, I saw that you mentioned Doxygen. Are the motr source files set up for Doxygen? If so, are the developers currently using Doxygen auto-generated servers? If so, where? If it is not currently set up, or if it is set up but only internal, what do you think about having someone follow this guide and set up public Doxygen for us?

https://gist.github.com/francesco-romano/351a6ae457860c14ee7e907f2b0fc1a5

Thanks,

John

Improvements in m0crate utility (KV test run).

There are several areas for improvement:

  1. Correctness in output logs (for all log levels) of m0crate KV test run.
	a. The following parameters are not used by the m0crate KV test but are still displayed in the output logs.
		Before Fix:
			info: average size:          65536
			info: maximal size:          1048576
			info: block size:            0
		After Fix:
			These lines need not be displayed during a KV test run.
	b. The following parameter shows the wrong value.
		Before Fix:
			info: number of operations:  1000 (OP_COUNT: 32 was passed; the same value is always shown.)
		After Fix:
			info: number of operations:  32
	c. The following output reads like a percentage of ops, but it is actually
            the count of operations remaining.
		Before Fix:
			trace: dix: prcnt: [5, 4, 0, 0]		
		After Fix: 
			trace: dix: ops remaining: [5, 4, 0, 0]
	d.  The result should show details about the test run:
            Before Fix:
                            result: all, 0.133912
            After Fix:
                          result: Total, 0.133912 s
	                          Avg time per ops.
	                          PUT, <time taken by put ops> s
	                          GET, <time taken by get ops> s and so on.

  2. Allow specific KEY and VALUE sizes to be set for a KV test run; each size can be a number (int) or random. Currently:
    key_size=sizeof(struct m0_fid)
    value_size=RECORD_SIZE-key_size;
    Here RECORD_SIZE is a tunable parameter available in the .yaml file.
    There is no option to set a specific size for the key or the value.
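The current size calculation and the requested behaviour can be sketched as follows. This is an illustration only, not m0crate code: it assumes `sizeof(struct m0_fid)` is 16 bytes (two 64-bit fields), and `resolve_size` with its `"random"` keyword is a hypothetical way the new option could be interpreted.

```python
import random

M0_FID_SIZE = 16  # assumption: struct m0_fid holds two 64-bit fields


def current_sizes(record_size):
    """Sizes as derived today: the key is a fid, the value is the remainder."""
    key_size = M0_FID_SIZE
    value_size = record_size - key_size
    return key_size, value_size


def resolve_size(spec, max_size, rng=random):
    """Hypothetical option handling: spec is an int or the string "random"."""
    if spec == "random":
        return rng.randint(1, max_size)
    size = int(spec)
    if not 1 <= size <= max_size:
        raise ValueError(f"size {size} outside 1..{max_size}")
    return size


if __name__ == "__main__":
    print(current_sizes(4096))
    print(resolve_size(64, 128))
```

With RECORD_SIZE set to 4096, the current scheme always yields a 16-byte key and a 4080-byte value, which is why neither size can be tuned independently.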

ERROR [be/engine.c:312] when running example1.c with > 300 blocks

I got this error when running example1.c with more than 300 blocks. The original example1.c only reads and writes 2 blocks of data, but I want to test throughput, so I need to increase the data size.
The exact code that I ran is here (in my own repo):
https://github.com/daniarherikurniawan/cortx-motr-all/blob/master/motr/examples/example1.c

  motr[00757]:  b860  ERROR  [be/engine.c:312:be_engine_got_tx_open]  tx=0x7f8edc036928 engine=0x7ffef801e160 t_prepared=(268026,88716280) t_payload_prepared=131072 bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152
   motr[00751]:  c860  ERROR  [be/engine.c:312:be_engine_got_tx_open]  tx=0x7f91a0033aa8 engine=0x7ffe6eb79730 t_prepared=(302142,100088668) t_payload_prepared=131072 bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152
   motr[00757]:  b860  ERROR  [be/engine.c:312:be_engine_got_tx_open]  tx=0x7f8edc039338 engine=0x7ffef801e160 t_prepared=(378903,125676541) t_payload_prepared=131072 bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152
   motr[00764]:  e860  ERROR  [be/engine.c:312:be_engine_got_tx_open]  tx=0x7f96c403b208 engine=0x7ffc9ce02d30 t_prepared=(276217,89536799) t_payload_prepared=131072 bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152
   motr[00747]:  7860  ERROR  [be/engine.c:312:be_engine_got_tx_open]  tx=0x7f4fb80345c8 engine=0x7fffc29fdaf0 t_prepared=(299299,99140969) t_payload_prepared=131072 bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152
   motr[00751]:  c860  ERROR  [be/engine.c:312:be_engine_got_tx_open]  tx=0x7f91a0036508 engine=0x7ffe6eb79730 t_prepared=(393118,130415036) t_payload_prepared=131072 bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152
   motr[00764]:  e860  ERROR  [be/engine.c:312:be_engine_got_tx_open]  tx=0x7f96c403d5c8 engine=0x7ffc9ce02d30 t_prepared=(406927,131811159) t_payload_prepared=131072 bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152
   motr[00747]:  7860  ERROR  [be/engine.c:312:be_engine_got_tx_open]  tx=0x7f4fb8036fe8 engine=0x7fffc29fdaf0 t_prepared=(381746,126624240) t_payload_prepared=131072 bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152

With just 2 blocks of 4 KB each, everything runs well.
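One way to read these log lines is to compare `t_prepared` with `bec_tx_size_max` field by field. The sketch below does that for the values printed above. It assumes that each pair is (number of regions, total bytes) and that the error is reported when the prepared credit exceeds the configured maximum; neither assumption is confirmed in this report, and the helper names are hypothetical.

```python
import re


def parse_pair(name, text):
    """Extract 'name=(a,b)' from a log line as a tuple of ints."""
    match = re.search(re.escape(name) + r"=\((\d+),(\d+)\)", text)
    if match is None:
        raise ValueError(f"{name} not found")
    return int(match.group(1)), int(match.group(2))


def exceeded(text):
    """Return, per field, whether t_prepared is above bec_tx_size_max."""
    prepared = parse_pair("t_prepared", text)
    limit = parse_pair("bec_tx_size_max", text)
    return [p > m for p, m in zip(prepared, limit)]


if __name__ == "__main__":
    line = ("t_prepared=(268026,88716280) t_payload_prepared=131072 "
            "bec_tx_size_max=(262144,100663296) bec_tx_payload_max=2097152")
    print(exceeded(line))
```

For the first line in the log, the first field (268026) is above its limit (262144) while the second (88716280) is still below 100663296, so under these assumptions the count-like field is the one that overflows first as the number of blocks grows.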

LNet reporting drops.

Hello-

While operating under S3 load, a VM instance is logging these errors to the system log:

[ 435.699805] LNetError: 4944:0:(lib-ptl.c:190:lnet_try_match_md()) Matching packet from 12345-192.168.10.13@tcp, match 9007199254740992 length 69632 too big: 1048576 left, 65536 allowed
[ 435.699862] LNetError: 4944:0:(lib-ptl.c:190:lnet_try_match_md()) Skipped 45 previous similar messages
[ 494.313371] LNet: 4943:0:(lib-move.c:3961:lnet_parse_put()) Dropping PUT from 12345-192.168.10.13@tcp portal 2 match 9007199254740992 offset 0 length 69632: 4
[ 494.313376] LNet: 4943:0:(lib-move.c:3961:lnet_parse_put()) Skipped 79 previous similar messages
[ 499.821820] LNetError: 4943:0:(lib-ptl.c:190:lnet_try_match_md()) Matching packet from 12345-192.168.10.13@tcp, match 9007199254740992 length 69632 too big: 1046376 left, 65536 allowed
[ 499.821873] LNetError: 4943:0:(lib-ptl.c:190:lnet_try_match_md()) Skipped 75 previous similar messages

Is anybody familiar with this error? Could it be related to TCP frame size?

Thanks,
-Tim

It takes several minutes to stop the last confd during hctl shutdown

On a multi-node cluster (3+ server nodes) with 3+ confd processes configured, the hctl shutdown command gets stuck for several minutes when stopping the last confd process:

[vagrant@cmu motr]$ hctl shutdown
Stopping m0d@0x7200000000000001:0x1c (ios) at ssu1.local...
Stopping m0d@0x7200000000000001:0x42 (ios) at ssu2.local...
Stopping m0d@0x7200000000000001:0x65 (ios) at ssu3.local...
Stopping m0d@0x7200000000000001:0x88 (ios) at ssu4.local...
Stopping m0d@0x7200000000000001:0xab (ios) at ssu5.local...
Stopping m0d@0x7200000000000001:0xce (ios) at ssu6.local...
Stopping m0d@0x7200000000000001:0xf1 (ios) at ssu7.local...
Stopped m0d@0x7200000000000001:0x1c (ios) at ssu1.local
Stopped m0d@0x7200000000000001:0x42 (ios) at ssu2.local
Stopped m0d@0x7200000000000001:0xce (ios) at ssu6.local
Stopped m0d@0x7200000000000001:0xab (ios) at ssu5.local
Stopped m0d@0x7200000000000001:0xf1 (ios) at ssu7.local
Stopped m0d@0x7200000000000001:0x88 (ios) at ssu4.local
Stopped m0d@0x7200000000000001:0x65 (ios) at ssu3.local
Stopping m0d@0x7200000000000001:0x7 (confd) at cmu.local...
Stopping m0d@0x7200000000000001:0x19 (confd) at ssu1.local...
Stopping m0d@0x7200000000000001:0x3f (confd) at ssu2.local...
Stopped m0d@0x7200000000000001:0x19 (confd) at ssu1.local
Stopped m0d@0x7200000000000001:0x3f (confd) at ssu2.local

!!!!!!!!!!!!!!!!!!!   IT GETS STUCK HERE   !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Stopped m0d@0x7200000000000001:0x7 (confd) at cmu.local
Stopping hare-hax at client1.local...
Stopping hare-hax at cmu.local...
Stopping hare-hax at ssu1.local...
Stopping hare-hax at ssu2.local...
Stopping hare-hax at ssu3.local...
Stopping hare-hax at ssu4.local...
Stopping hare-hax at ssu5.local...
Stopping hare-hax at ssu6.local...
Stopped hare-hax at client1.local
Stopping hare-hax at ssu7.local...
Stopped hare-hax at ssu3.local
Stopped hare-hax at ssu6.local
Stopped hare-hax at ssu4.local
Stopped hare-hax at ssu5.local
Stopped hare-hax at ssu2.local
Stopped hare-hax at ssu1.local
Stopped hare-hax at ssu7.local
Stopped hare-hax at cmu.local
Stopping hare-consul-agent at client1.local...
Stopping hare-consul-agent at cmu.local...
Stopping hare-consul-agent at ssu1.local...
Stopping hare-consul-agent at ssu2.local...
Stopping hare-consul-agent at ssu3.local...
Stopping hare-consul-agent at ssu4.local...
Stopping hare-consul-agent at ssu5.local...
Stopping hare-consul-agent at ssu6.local...
Stopped hare-consul-agent at ssu4.local
Stopping hare-consul-agent at ssu7.local...
Stopped hare-consul-agent at ssu3.local
Stopped hare-consul-agent at ssu6.local
Stopped hare-consul-agent at ssu5.local
Stopped hare-consul-agent at client1.local
Stopped hare-consul-agent at ssu7.local
Stopped hare-consul-agent at cmu.local
Stopped hare-consul-agent at ssu1.local
Stopped hare-consul-agent at ssu2.local
Killing RC Leader at cmu.local... **ERROR**

motr-build Error

I am following the "Quick Start Guide" but got the error below when I ran "sudo ./scripts/install-build-deps":
image

Here is my kernel and Linux version:

[root@node-4 home]# uname -r
3.10.0-1062.el7.x86_64
[root@node-4 home]# lsb_release -r
Release:        7.7.1908

Thanks!

Unit tests failure while following the guide

I am following the guide at:

doc/Quick-Start-Guide.rst

When I try to run the unit tests, I see a possible assertion failure:

 tx_bulk-empty                                   1.43 sec   64 MiB
  tx_bulk-error_reg  motr[08257]:   570  FATAL  [lib/assert.c:50:m0_panic]  panic: (!done) at m0_be_tx_bulk_put() (be/tx_bulk.c:618)  [git: sage-base-1.0-107-g780ec01] /var/motr/m0ut/m0trace.8257
Motr panic: (!done) at m0_be_tx_bulk_put() be/tx_bulk.c:618 (errno: 4) (last failed: none) [git: sage-base-1.0-107-g780ec01] pid: 8257  /var/motr/m0ut/m0trace.8257

The relevant output and system info are:

cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
 cat /proc/modules | grep lnet
lnet 586401 2 ksocklnd, Live 0xffffffffc06ad000 (OE)
libcfs 415252 2 ksocklnd,lnet, Live 0xffffffffc062e000 (OE)
$ uname -r
3.10.0-1062.18.1.el7.x86_64

Compiling error with demo-fdmi/m0-instance program

For reference:
The FDMI+containers demonstration was done for Sage-1. The details and source code are available here:
https://jts.seagate.com/browse/EURND-91

The demo-fdmi code has two parts: an FDMI plugin and a clovis app. I refactored the clovis app to make it work with the open-source version. However, the FDMI plugin gives a few compile errors that I am not sure how to resolve. Some of the errors were simple, because they came from the new naming conventions in the open-source code: 'mero' changed to 'motr', and the 'clovis' name was removed completely. A few more involved changes in function calls and definitions, and I took care of those. However, a few functions have no counterpart in the open-source code. The compile errors below show those function calls:

make
gcc -c -g -std=gnu99 -Wall -Werror -Wno-attributes -Wno-unused-variable -Wno-unused-but-set-variable -D_REENTRANT -D_GNU_SOURCE -DM0_INTERNAL='' -DM0_EXTERN=extern -fno-strict-aliasing -fno-omit-frame-pointer -fno-common -fPIC -include config.h -I/home/rroy/cortx/cortx-motr -I/usr/src/lustre-client-2.12.4/lnet/include -I/usr/src/lustre-client-2.12.4/lustre/include -o src/main.o src/main.c
src/main.c: In function ‘inst_init’:
src/main.c:405:5: error: implicit declaration of function ‘m0_conf_fs_get’ [-Werror=implicit-function-declaration]
     rc = m0_conf_fs_get(m0_reqh2profile(reqh), m0_reqh2confc(reqh), &fs);
     ^
src/main.c: In function ‘init_containers’:
src/main.c:490:5: error: implicit declaration of function ‘m0_clovis_container_create’ [-Werror=implicit-function-declaration]
     m0_clovis_container_create(&small_cont, &ops[0]);
     ^
src/main.c: In function ‘main’:
src/main.c:606:9: error: implicit declaration of function ‘_init’ [-Werror=implicit-function-declaration]
         rc = _init(&m0c, &cfg, true);
         ^
src/main.c:610:9: error: implicit declaration of function ‘m0_default_layout_id’ [-Werror=implicit-function-declaration]
         default_layout = m0_default_layout_id(m0c);
         ^
src/main.c:613:9: error: implicit declaration of function ‘m0_realm_init’ [-Werror=implicit-function-declaration]
         m0_realm_init(&my_realm,
         ^
src/main.c:659:17: error: implicit declaration of function ‘m0_container_op’ [-Werror=implicit-function-declaration]
                 m0_container_op(target, M0_IC_PUT,
                 ^
cc1: all warnings being treated as errors
make: *** [src/main.o] Error 1

According to @nikitadanilov, these functions are gone due to interface refactoring and cleanup. I have attached the latest demo-fdmi code, which I have updated to match the open-source code:

updated_demo_fdmi.tar.gz

Improvements in Reading List

https://github.com/Seagate/cortx-motr/blob/main/doc/motr-design-doc-list.rst

@VenkyOS , please do the following:

  1. Add a Column for Title
  2. Replace the titles in the actual links with 'PDF', 'RST', 'GDOC', 'DOCX' or 'PPTX' or whatever. This will make the table much skinnier and more readable.
  3. Change the Column Headers to the following:
    'Sr. No', 'Source File', 'Source Line', 'Title', 'PDF', 'GitHub', 'Google', 'SharePoint'.

Also, there are some issues with the numbering I saw.

motr BE segment size should be configurable

The motr BE segment size should be configurable via the provisioner. This is required for CORTX to run on 4U100, where the metadata disks are around 300 GB. The same need may arise in the open-source community, where users may have different disk sizes.

Currently, physical setups use 5 TiB, which is the default in motr.conf:
/opt/seagate/cortx/motr/conf/motr.conf:MOTR_M0D_BESEG_SIZE=5497558138880
/etc/sysconfig/motr

The provisioner should have a mechanism to update these values for users, for example via cluster.sls.
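For reference, the default value is exactly 5 TiB expressed in bytes. A minimal sketch of how a smaller value could be rendered for the config file is shown below; the 256 GiB figure is an arbitrary illustration for a roughly 300 GB disk, not a recommendation, and `beseg_config_line` is a hypothetical helper, not part of the provisioner.

```python
GIB = 1 << 30
TIB = 1 << 40

# Default quoted above from /opt/seagate/cortx/motr/conf/motr.conf.
DEFAULT_BESEG_SIZE = 5497558138880


def beseg_config_line(size_bytes):
    """Render the config line for a BE segment size given in bytes."""
    if size_bytes <= 0:
        raise ValueError("BE segment size must be positive")
    return f"MOTR_M0D_BESEG_SIZE={size_bytes}"


if __name__ == "__main__":
    print(DEFAULT_BESEG_SIZE == 5 * TIB)
    # Illustrative value only: a segment that fits on a ~300 GB disk.
    print(beseg_config_line(256 * GIB))
```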

cc @johnbent @ipoddubnyy @ypise

ERROR [net/lnet/ulnet_core.c:674:nlx_core_dom_init] when running example1.c

I successfully started motr, and it gave me this output:

    MOTR is UP.
      Motr client config:

      HA_addr    : 155.98.36.46@tcp:12345:34:1
      Client_addr: 155.98.36.46@tcp:12345:33:1000
      Profile_fid: <0x7000000000000001:0>
      Process_fid: <0x7200000000000001:64>

      Now please use another terminal to run the example
      with the above command line arguments.
      Please add double quote "" around the arguments

So I ran example1.c with the following arguments:
./example1 "155.98.36.46@tcp:12345:34:1" "155.98.36.46@tcp:12345:33:1000" "<0x7000000000000001:0>" "<0x7200000000000001:64>" "1148576"

However, I got the error below:

motr[05880]:  2f30  ERROR  [net/lnet/ulnet_core.c:674:nlx_core_dom_init]  open("/dev/m0lnet", O_RDWR|O_CLOEXEC) failed: errno=13 (EACCES). Please check permissions.
error in m0_client_init: -13

I am running this on "dev" branch because the current "main" branch is broken. Please let me know what that error is about and where to debug.
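The return code -13 in the output is a negated errno value (EACCES, as the log line itself says). A small sketch for decoding such codes and for checking whether the current user can open a path for reading and writing is shown below. The helper names are hypothetical, and checking /dev/m0lnet this way only reflects the permission check the log message suggests, not a confirmed root cause.

```python
import errno
import os


def describe_rc(rc):
    """Translate a negative return code into (errno name, message)."""
    code = -rc
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def can_open_rw(path):
    """Report whether the current user has read and write access to path."""
    return os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    print(describe_rc(-13))
    print(can_open_rw("/dev/m0lnet"))
```

If `can_open_rw` returns False for the device, running the example as a user with access to it (or adjusting the device permissions) would be the first thing to try.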

Change name of branch please

@jadhavik1, I see you are working on a branch called 'seq_balloc_1mb_master'. This violates our naming rules. If you are still working on this branch please change the name to 'seq_balloc_1mb_main'. If the branch is no longer needed, then please either rename it or delete it.

Thanks!

cortx-motr ut net-lnet panic

./scripts/m0 run-ut failed:

fop-lock-ut
fop-lock-1reqh 0.00 sec 51 KiB
fop-lock-2reqh 0.00 sec 51 KiB
[ time: 0.54 sec, mem: 154 MiB, leaked: 0 B ]
fom-stats-ut
stats 0.00 sec 1 KiB
[ time: 0.25 sec, mem: 77 MiB, leaked: 0 B ]
net-bulk-if
net_bulk_if 0.02 sec 50 KiB
[ time: 0.02 sec, mem: 50 KiB, leaked: 0 B ]
net-bulk-mem
net_bulk_mem_buf_copy_test 0.00 sec 1 KiB
net_bulk_mem_tm_test 0.00 sec 352 B
net_bulk_mem_ep 0.00 sec 744 B
net_bulk_mem_failure_tests 0.00 sec 4 KiB
net_bulk_mem_ping_tests 7.00 sec 200 KiB
[ time: 7.00 sec, mem: 207 KiB, leaked: 0 B ]
net-lnet
net_lnet_fail 0.00 sec 288 B
net_lnet_tm_initfini 0.00 sec 232 B
net_lnet_tm_startstop 24.43 sec 6 KiB
net_lnet_msg 12.21 sec 5 KiB
net_lnet_buf_desc 12.20 sec 4 KiB
net_lnet_bulk 22.20 sec 5 KiB
net_lnet_sync 6.10 sec 4 KiB
net_lnet_timeout motr[06394]: 4290 FATAL [lib/assert.c:48:m0_panic] panic: Unit-test assertion failed: m0_atomic64_get(&test_timeout_ttb_called) >= 2 * timeout_secs at test_timeout_body() (net/lnet/ut/lnet_ut.c:2033) [git: cortx1.0-rc1-165-g1efc5498e] /var/motr/m0ut/m0trace.6394
Motr panic: Unit-test assertion failed: m0_atomic64_get(&test_timeout_ttb_called) >= 2 * timeout_secs at test_timeout_body() net/lnet/ut/lnet_ut.c:2033 (errno: 0) (last failed: none) [git: cortx1.0-rc1-165-g1efc5498e] pid: 6394 /var/motr/m0ut/m0trace.6394
/var/cortx/cortx-motr/motr/.libs/libmotr.so.1(m0_arch_backtrace+0x20)[0x7ff418209c20]
/var/cortx/cortx-motr/motr/.libs/libmotr.so.1(m0_arch_panic+0xe6)[0x7ff418209dd6]
/var/cortx/cortx-motr/motr/.libs/libmotr.so.1(+0x374004)[0x7ff4181f9004]
/var/cortx/cortx-motr/ut/.libs/libmotr-ut.so.0(+0x26b442)[0x7ff4198df442]
/var/cortx/cortx-motr/ut/.libs/libmotr-ut.so.0(+0x214406)[0x7ff419888406]
/var/cortx/cortx-motr/ut/.libs/libmotr-ut.so.0(+0x212e77)[0x7ff419886e77]
/var/cortx/cortx-motr/ut/.libs/libmotr-ut.so.0(+0x2131f0)[0x7ff4198871f0]
/var/cortx/cortx-motr/ut/.libs/libmotr-ut.so.0(m0_ut_run+0x25e)[0x7ff4198ded7e]
/var/cortx/cortx-motr/ut/.libs/lt-m0ut(main+0x1156)[0x4044d6]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7ff416983555]
/var/cortx/cortx-motr/ut/.libs/lt-m0ut[0x40462a]
/var/cortx/cortx-motr/utils/m0run: line 397: 6394 Aborted (core dumped) $(srcdir_path_of $binary) "$@"

However, running net-lnet alone did not fail:

[root@ssc-vm-0958 cortx-motr]# ./scripts/m0 run-ut -t net-lnet
----- run_ut -t net-lnet -----
net-lnet
net_lnet_fail 0.00 sec 288 B
net_lnet_tm_initfini 0.00 sec 232 B
net_lnet_tm_startstop 4.01 sec 6 KiB
net_lnet_msg 2.00 sec 5 KiB
net_lnet_buf_desc 2.00 sec 4 KiB
net_lnet_bulk 12.00 sec 5 KiB
net_lnet_sync 1.00 sec 4 KiB
net_lnet_timeout 5.50 sec 4 KiB
[ time: 26.54 sec, mem: 31 KiB, leaked: 0 B ]

Time: 26.54 sec, Mem: 31 KiB, Leaked: 0 B, Asserts: 1248
Unit tests status: SUCCESS
utime 0.158871 stime 0.075492 maxrss 44732 nvcsw 2716 nivcsw 27
minflt 10524 majflt 2 inblock 0 oublock 33584
rchar 95461 wchar 1540 syscr 234 syscw 56
read_bytes 0 write_bytes 17195008 cancelled_write_bytes 0

3+ confd cluster configuration does not work

On the SAGE prototype cluster with 3 confd instances:

$ ./c0cp 228395 11947215 /tmp/128M 16384 -f -x 2 -p -v
/home/users/jusers/tkachuk1/sage/.c0appz/c0cprc/client-23
c0cp: read_params():157: 6: name='HA_ENDPOINT_ADDR' value='172.18.1.23@o2ib:12345:1:1'
c0cp: read_params():157: 7: name='PROFILE_FID' value='0x7000000000000001:0x4e0'
c0cp: read_params():157: 9: name='M0_POOL_TIER1' value='0x6f00000000000001:0x44a '
c0cp: read_params():157: 10: name='M0_POOL_TIER2' value='0x6f00000000000001:0x457 '
c0cp: read_params():157: 11: name='M0_POOL_TIER3' value='0x6f00000000000001:0x472 '
c0cp: read_params():157: 13: name='LOCAL_ENDPOINT_ADDR0' value='172.18.1.23@o2ib:12345:4:1'
c0cp: read_params():157: 14: name='LOCAL_PROC_FID0' value='0x7200000000000001:0x176'
motr[01058]:  29e0  ERROR  [conf/confc.c:557:m0_confc_init_wait]  <! rc=-110 confc=0x1e290b8 confd_addr=172.18.1.2@o2ib:12345:2:1
motr[01058]:  2af0  ERROR  [conf/rconfc.c:1762:rconfc_conductor_connect]  <! rc=-110
motr[01058]:  2af0  ERROR  [conf/rconfc.c:1826:rconfc_conductor_iterate]  Failed to connect to confd_addr = 172.18.1.2@o2ib:12345:2:1 rc = -110
motr[01058]:  29e0  ERROR  [conf/confc.c:557:m0_confc_init_wait]  <! rc=-110 confc=0x1e290b8 confd_addr=172.18.1.3@o2ib:12345:2:1
motr[01058]:  2af0  ERROR  [conf/rconfc.c:1762:rconfc_conductor_connect]  <! rc=-110
motr[01058]:  2af0  ERROR  [conf/rconfc.c:1826:rconfc_conductor_iterate]  Failed to connect to confd_addr = 172.18.1.3@o2ib:12345:2:1 rc = -110
motr[01058]:  2af0  ERROR  [conf/rconfc.c:1819:rconfc_conductor_iterate]  <! rc=-2
motr[01058]:  2bd0  ERROR  [conf/rconfc.c:2571:rconfc_version_elected]  <! rc=-2 re-election started
motr[01058]:  2bd0  ERROR  [conf/rconfc.c:2574:rconfc_version_elected]  <! rc=1 herd_destroy() failed
motr[01058]:  2ba0  ERROR  [conf/rconfc.c:937:rlock_ctx_disconnect]  Failed to destroy rlock connection
motr[01058]:  4e00  FATAL  [lib/assert.c:50:m0_panic]  panic: (confc->cc_root != ((void *)0)) at m0_confc_root_open() (conf/helpers.c:226)  [git: sage-base-1.0-132-gaf6819c] 
Motr panic: (confc->cc_root != ((void *)0)) at m0_confc_root_open() conf/helpers.c:226 (errno: 4) (last failed: none) [git: sage-base-1.0-132-gaf6819c] pid: 1058  
/lib64/libmotr.so.1(m0_arch_backtrace+0x2f)[0x7f524b3ebc5f]
/lib64/libmotr.so.1(m0_arch_panic+0xf3)[0x7f524b3ebe43]
/lib64/libmotr.so.1(+0x334364)[0x7f524b3dc364]
/lib64/libmotr.so.1(m0_confc_root_open+0xf7)[0x7f524b36b0b7]
/lib64/libmotr.so.1(+0x35d9bf)[0x7f524b4059bf]
/lib64/libmotr.so.1(+0x3cfe3c)[0x7f524b477e3c]
/lib64/libmotr.so.1(m0_client_init+0x280)[0x7f524b407080]
./c0cp(c0appz_init+0x4d8)[0x406d98]
./c0cp(main+0x39b)[0x4040cb]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f524acfc555]
./c0cp[0x4048b1]
Aborted
$ sudo lctl ping 172.18.1.2@o2ib
12345-0@lo
12345-172.18.1.2@o2ib
12345-172.18.1.2@tcp
$ hctl status | grep confd
    [started]  confd      0x7200000000000001:0x24  172.18.1.2@o2ib:12345:2:1
    [started]  confd      0x7200000000000001:0x9   172.18.1.1@o2ib:12345:2:1
    [started]  confd      0x7200000000000001:0x3f  172.18.1.3@o2ib:12345:2:1
$ rpm -q cortx-motr cortx-hare
cortx-motr-1.0.0-1_gitaf6819c_3.10.0_1127.19.1.el7.x86_64
cortx-hare-1.0.0-1_git571ae39.el7.x86_64
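
The trace above mixes several numeric return codes (rc=-110, rc=-2). A minimal sketch for decoding them, assuming Motr return codes are negated Linux errno values; the sample lines are abbreviated copies of the log above, and the helper names `decode_rc` and `summarize_rcs` are made up for this illustration:

```python
import errno
import re

# Matches "rc=-110" as well as "rc = -110", as both forms appear in the trace.
RC_PATTERN = re.compile(r"\brc\s*=\s*(-?\d+)")


def decode_rc(rc: int) -> str:
    """Return the errno name for a negative return code, else the number as text."""
    if rc < 0:
        return errno.errorcode.get(-rc, str(rc))
    return str(rc)


def summarize_rcs(log_lines):
    """Yield (rc, name) for every rc=... token found in the given lines."""
    for line in log_lines:
        for match in RC_PATTERN.finditer(line):
            rc = int(match.group(1))
            yield rc, decode_rc(rc)


# Abbreviated lines taken from the trace above.
sample = [
    "ERROR  [conf/confc.c:557:m0_confc_init_wait]  <! rc=-110 confc=0x1e290b8",
    "ERROR  [conf/rconfc.c:1826:rconfc_conductor_iterate]  Failed to connect rc = -110",
    "ERROR  [conf/rconfc.c:1819:rconfc_conductor_iterate]  <! rc=-2",
]
for rc, name in summarize_rcs(sample):
    print(rc, name)
```

On Linux, 110 is ETIMEDOUT and 2 is ENOENT, so under that assumption the two confd connections timed out before the later "not found" error.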

ADDB 'how to' guide

@VenkyOS , please talk to @just-now and get any and all existing documentation about how to use ADDB. Then attempt to use that documentation to try to use ADDB yourself. Then please add a 'How to Use ADDB' page into this repo and please link it appropriately from existing documentation. Thanks!

Lustre network failed to start

I was trying to follow the Quick Start Guide, section "Building the Source Code". At the second step, running sudo lctl list_nids failed with this error: IOC_LIBCFS_GET_NI error 100: Network is down. The wiki says "... to check if the lustre network is functioning accurately..."; it would be great if it also provided some hints on how to debug when Lustre doesn't work properly.

I am running CentOS Linux release 7.7.1908 (Core) with kernel Linux 3.10.0-1062.12.1.el7.x86_64.

The Lustre installation seems to have completed successfully:

Transaction test succeeded
Running transaction
  Installing : lustre-client-devel-2.12.4-99.el7.x86_64                     1/1
  Verifying  : lustre-client-devel-2.12.4-99.el7.x86_64                     1/1

Installed:
  lustre-client-devel.x86_64 0:2.12.4-99.el7

Failed to install kernel-devel for the current kernel

Issue

Unable to set up S3 Server because one of the tasks failed (install kernel-devel for the current kernel).

Description

Following the Startup Guide > 1.2 Installing Dependencies, I encountered one failure after running ./init.sh -a.

Running uname -r gives me this:

[root@ip-172-31-33-177 dev]# uname -r
3.10.0-1062.12.1.el7.x86_64

Steps to reproduce

sudo su -

yum install -y git
yum install -y ansible
yum install -y python3
yum install -y python-pip
yum install -y epel-release

setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# ============= REBOOT VM =============

sudo su -

git clone --recursive https://github.com/Seagate/cortx-s3server.git -b main
cd cortx-s3server
git submodule update --init --recursive && git status

cd ./scripts/env/dev
./init.sh -a

# ============= ENTER PAT =============

Error Result

TASK [base-os : install kernel-devel for the current kernel]************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "No package matching 'kernel-devel-3.10.0-1062.12.1.el7.x86_64' found available, installed or updated", "rc": 126, "results": ["No package matching 'kernel-devel-3.10.0-1062.12.1.el7.x86_64' found available, installed or updated"]}
...ignoring
msg: No package matching 'kernel-devel-3.10.0-1062.12.1.el7.x86_64' found available, installed or updated
results: No package matching 'kernel-devel-3.10.0-1062.12.1.el7.x86_64' found available, installed or updated

PLAY RECAP **************************
localhost : ok=42 changed=29 unreachable=0 failed=1 skipped=5 rescued=0 ignored=1

Not sure what went wrong...
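
To narrow this down, it can help to read the PLAY RECAP counters separately from the task output. A small sketch, assuming the recap line format shown above; `parse_recap` and `kernel_devel_package` are hypothetical helper names, not part of Ansible or the S3 server scripts:

```python
import re


def parse_recap(line: str):
    """Split an Ansible PLAY RECAP line into (host, {counter: value})."""
    host, _, counters = line.partition(":")
    stats = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", counters)}
    return host.strip(), stats


def kernel_devel_package(kernel_release: str) -> str:
    """Name of the kernel-devel package matching a `uname -r` string."""
    return f"kernel-devel-{kernel_release}"


host, stats = parse_recap(
    "localhost : ok=42 changed=29 unreachable=0 failed=1 skipped=5 rescued=0 ignored=1"
)
print(host, "failed:", stats["failed"], "ignored:", stats["ignored"])
print(kernel_devel_package("3.10.0-1062.12.1.el7.x86_64"))
```

Note that the kernel-devel task is followed by "...ignoring", so it may be the one counted under ignored=1, in which case failed=1 would refer to a different task further down the output.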

Test hung or taking more time to pass while "run-st"

I built Motr successfully. I ran the unit tests (run-ut) and the kernel unit tests (run-kut), and both passed.
I then ran one system test, "52motr-singlenode-sanity", which also passed.
After that, when I started running all system tests (run-st), the run hung at the point shown in the screenshot.
[Screenshot: motr_system_test_hung]

I waited for more than 30 minutes, but the test did not proceed any further.
The last test title I could see before the hang was "06ham".

As per a comment from "Hua Huang" on Teams chat:
There might be some bug with this test. You can skip it for now. Usually, it takes about 5 hours to run all the ST tests.
But some of these test suites have known issues, so they are intentionally skipped. Please refer to ".xperior/testds/motr-single_tests.yaml". If some test cases are marked as "executor : Xperior::Executor::Skip", they are skipped.

System: CentOS 7.7.1908 x86_64 VM
Log attached: motr_system_test_hung_log.txt
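
To see which entries are currently marked as skipped, the YAML file can be scanned for the executor string quoted above. A rough sketch that relies only on that string; the sample excerpt, its ids, and the `Xperior::Executor::Other` name are hypothetical, and the real layout of motr-single_tests.yaml may differ:

```python
import re

# The only thing taken from the comment above is the executor string itself.
SKIP_PATTERN = re.compile(r"executor\s*:\s*Xperior::Executor::Skip")


def skipped_line_numbers(yaml_text: str):
    """Return 1-based line numbers whose executor is Xperior::Executor::Skip."""
    return [
        number
        for number, line in enumerate(yaml_text.splitlines(), start=1)
        if SKIP_PATTERN.search(line)
    ]


# Hypothetical excerpt used only to exercise the function.
sample = """\
- id: test-a
  executor : Xperior::Executor::Other
- id: test-b
  executor : Xperior::Executor::Skip
"""
print(skipped_line_numbers(sample))
```

The reported line numbers can then be looked up in the file to find the surrounding test ids.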

m0cp write need to create .img file first?

I followed https://github.com/Seagate/cortx/blob/main/doc/Cluster_Setup.md#16-perform-the-io

Do we need to create the .img file first?

[root@ssc-vm-1091 cortx-hare]# hctl status
Profile: 0x7000000000000001:0x22
Data pools:
0x6f00000000000001:0x23
Services:
localhost (RC)
[started] hax 0x7200000000000001:0x6 10.230.247.138@tcp:12345:1:1
[started] confd 0x7200000000000001:0x9 10.230.247.138@tcp:12345:2:1
[started] ioservice 0x7200000000000001:0xc 10.230.247.138@tcp:12345:2:2
[unknown] m0_client 0x7200000000000001:0x1c 10.230.247.138@tcp:12345:4:1
[unknown] m0_client 0x7200000000000001:0x1f 10.230.247.138@tcp:12345:4:2

[root@ssc-vm-1091 cortx-hare]# cortx/cortx-motr/motr/st/utils/m0cp -l 10.230.247.138@tcp:12345:4:1 -H 10.230.247.138@tcp:12345:1:1 \
 -p 0x7000000000000001:0x22 -P 0x7200000000000001:0x1c -o 12:10 \
 -s 1m -c 128 /var/misc/random.img -L 9

-bash: cortx/cortx-motr/motr/st/utils/m0cp: No such file or directory
[root@ssc-vm-1091 cortx-hare]# /var/cortx/cortx-motr/motr/st/utils/m0cp -l 10.230.247.138@tcp:12345:4:1 -H 10.230.247.138@tcp:12345:1:1 -p 0x7000000000000001:0x22 -P 0x7200000000000001:0x1c -o 12:10 -s 1m -c 128 /var/misc/random.img -L 9
m0_write failed! Object id 12:10rc = -1
[root@ssc-vm-1091 cortx-hare]# /var/cortx/cortx-motr/motr/st/utils/m0cp -l 10.230.247.138@tcp:12345:4:1 -H 10.230.247.138@tcp:12345:1:1 -p 0x7000000000000001:0x22 -P 0x7200000000000001:0x1c -o 14:10 -s 1m -c 128 /var/misc/random.img -L 9
m0_write failed! Object id 14:10rc = -1
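
For reference, a sketch of creating a source file of the right size up front. It assumes that m0cp reads its data from an existing local file, that -s is the block size and -c the block count, and that "1m" means 1 MiB, so the commands above would need at least 128 MiB of input. `make_random_file` is a made-up helper, and the demo writes a small file to a temporary directory rather than to /var/misc/random.img:

```python
import os
import tempfile


def make_random_file(path: str, block_size: int, block_count: int) -> int:
    """Write block_count blocks of block_size random bytes; return the file size."""
    with open(path, "wb") as out:
        for _ in range(block_count):
            out.write(os.urandom(block_size))
    return os.path.getsize(path)


# Small demo: 8 blocks of 4 KiB. For "-s 1m -c 128" the call would be
# make_random_file("/var/misc/random.img", 1024 * 1024, 128).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "random.img")
    print(make_random_file(demo_path, block_size=4096, block_count=8))
```

Whether a missing source file is what actually causes "m0_write failed" here is not confirmed by the output above.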

cortx-motr repetition run libconsole-ut Lockers table overflow

  1. ./scripts/m0 run-ut
  • SUCCESS
  2. ./scripts/m0 run-ut -n 1
  • SUCCESS
  3. ./scripts/m0 run-ut -n 2

libconsole-ut
yaml_basic_test 0.00 sec 0 B
input_test 0.00 sec 1 KiB
console_input_test 0.00 sec 3 KiB
output_test 0.00 sec 1 KiB
yaml_file_test 0.00 sec 0 B
yaml_parser_test 0.00 sec 0 B
yaml_root_get_test 0.00 sec 0 B
yaml_get_value_test 0.00 sec 0 B
conn_basic_test 1.48 sec 261 MiB
conn_success_test 0.38 sec 82 MiB
mesg_send_test motr[02839]: f960 FATAL [lib/assert.c:48:m0_panic] panic: Impossible at m0_lockers_allot() (lib/lockers.c:52) [git: cortx1.0-rc1-161-g2bdfa58f0] /var/motr/m0ut/m0trace.2839
Motr panic: Impossible at m0_lockers_allot() lib/lockers.c:52 (errno: 12) (last failed: none) [git: cortx1.0-rc1-161-g2bdfa58f0] pid: 2839 /var/motr/m0ut/m0trace.2839
Motr panic reason: Impossible happened! Lockers table overflow.
/home/670951/cortx-motr/motr/.libs/libmotr.so.1(m0_arch_backtrace+0x20)[0x7f8542da27a0]
/home/670951/cortx-motr/motr/.libs/libmotr.so.1(m0_arch_panic+0xe6)[0x7f8542da2956]
/home/670951/cortx-motr/motr/.libs/libmotr.so.1(+0x373b84)[0x7f8542d91b84]
/home/670951/cortx-motr/motr/.libs/libmotr.so.1(+0x3790c7)[0x7f8542d970c7]
/home/670951/cortx-motr/ut/.libs/libmotr-ut.so.0(test_lockers+0x1e)[0x7f85443f7cae]
/home/670951/cortx-motr/ut/.libs/libmotr-ut.so.0(m0_ut_run+0x25e)[0x7f8544477cfe]
/home/670951/cortx-motr/ut/.libs/lt-m0ut(main+0x1156)[0x4044d6]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f854151c555]
/home/670951/cortx-motr/ut/.libs/lt-m0ut[0x40462a]
0.39 sec 82 MiB
[ time: 2.26 sec, mem: 425 MiB, leaked: 0 B ]

Time: 1228.75 sec, Mem: 61 GiB, Leaked: 18 MiB, Asserts: 60690724
Unit tests status: SUCCESS
libm0-ut [Nikita]
0C 0.00 sec 0 B
atomic 0.17 sec 0 B
bitmap 0.00 sec 112 B
onwire-bitmap 0.00 sec 72 B
bob 0.00 sec 0 B
buf 0.00 sec 496 B
chan 0.80 sec 0 B
cookie 0.15 sec 48 B
finject [Dima] 0.00 sec 0 B
getopts 0.00 sec 168 B
hash 0.00 sec 648 B
list 0.00 sec 0 B
locality [Nikita] 0.06 sec 1 MiB
locality-chore [Nikita] 0.00 sec 88 KiB
lockers /home/670951/cortx-motr/utils/m0run: line 397: 2839 Aborted (core dumped) $(srcdir_path_of $binary) "$@"

  4. ./scripts/m0 run-ut -t libconsole-ut
  • SUCCESS

The abnormal behavior occurs for n > 1. It is not random; it reproduces consistently.

motr system test 44motr-rm-lock-cc-io Binary files /var/motr/src_file and /var/motr/dest_file2 differ Files differ, object got corrupted

[root@ssc-vm-1061 cortx-motr]# ./scripts/m0 run-st 44motr-rm-lock-cc-io
----- run_st 44motr-rm-lock-cc-io -----
<< 44motr-rm-lock-cc-io >>
Motr RM lock CC_IO Test ...
n k p:2 1 4
vm.max_map_count = 30000000
motr_service_start: (N,K,P)=(3,2,20) nr_ios=4 multiple_pools=0
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00171417 s, 612 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00102611 s, 1.0 GB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00197831 s, 530 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00176288 s, 595 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00205684 s, 510 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00176585 s, 594 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00105214 s, 997 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00143888 s, 729 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00125449 s, 836 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00162947 s, 644 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00153473 s, 683 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00136677 s, 767 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00104768 s, 1.0 GB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00122198 s, 858 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00161536 s, 649 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00138145 s, 759 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00170309 s, 616 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00128074 s, 819 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00133179 s, 787 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00128733 s, 815 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00149068 s, 703 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0016347 s, 641 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00149312 s, 702 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00142643 s, 735 MB/s

[152:
{0x74| ((^t|1:0), 1, (11, 22), ^o|2:9, ^v|1:20, 1,
[1: "20 3 2"],
[1: ^n|1:2],
[1: ^S|1:6],
[3: ^o|1:9, ^o|20:1, ^o|2:9],
[1: ^p|1:0], [0])},
{0x70| ((^p|1:0), [3: ^o|1:9, ^o|20:1, ^o|2:9])},
{0x6e| ((^n|1:2), 16000, 2, 3, 2, [8: ^r|1:100, ^r|1:0, ^r|1:1, ^r|1:2, ^r|1:3, ^r|1:4, ^r|1:5, ^r|1:6])},

{0x72| ((^r|1:100), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:1", [1: ^s|1:101])},
{0x72| ((^r|1:0), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:900", [8: ^s|1:0, ^s|11:0, ^s|6:0, ^s|7:0, ^s|3:0, ^s|13:0, ^s|15:0, ^s|16:0])},
{0x72| ((^r|1:1), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:901", [8: ^s|1:1, ^s|11:1, ^s|6:1, ^s|7:1, ^s|3:1, ^s|13:1, ^s|15:1, ^s|16:1])},
{0x72| ((^r|1:2), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:902", [8: ^s|1:2, ^s|11:2, ^s|6:2, ^s|7:2, ^s|3:2, ^s|13:2, ^s|15:2, ^s|16:2])},
{0x72| ((^r|1:3), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:903", [8: ^s|1:3, ^s|11:3, ^s|6:3, ^s|7:3, ^s|3:3, ^s|13:3, ^s|15:3, ^s|16:3])},
{0x72| ((^r|1:4), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:800", [3: ^s|2:0, ^s|12:0, ^s|3:4])},
{0x72| ((^r|1:5), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:34:1",
[3: ^s|1:6, ^s|1:7, ^s|3:6])},
{0x72| ((^r|1:6), [1:3], 0, 0, 0, 0, "192.168.47.113@tcp:12345:33:100",
[2: ^s|8:0, ^s|3:7])},
{0x73| ((^s|8:0), @M0_CST_CONFD, [1: "192.168.47.113@tcp:12345:33:100"], [0], [0])},
{0x73| ((^s|1:6), @M0_CST_HA, [1: "192.168.47.113@tcp:12345:34:1"], [0], [0])},
{0x73| ((^s|1:7), @M0_CST_FIS, [1: "192.168.47.113@tcp:12345:34:1"], [0], [0])},
{0x73| ((^s|1:101), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:1"], [0], [0])},

{0x73| ((^s|2:0), @M0_CST_MDS, [1: "192.168.47.113@tcp:12345:33:800"], [0], [0])},
{0x73| ((^s|12:0), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:800"], [0], [0])},
{0x73| ((^s|3:4), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:800"], [0], [0])},

{0x73| ((^s|1:0), @M0_CST_IOS, [1: "192.168.47.113@tcp:12345:33:900"], [0], [5: ^d|1:1, ^d|1:2, ^d|1:3, ^d|1:4, ^d|1:5])},
{0x73| ((^s|11:0), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|6:0), @M0_CST_SNS_REP, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|7:0), @M0_CST_SNS_REB, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|3:0), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|13:0), @M0_CST_CAS, [1: "192.168.47.113@tcp:12345:33:900"], [0], [1: ^d|20:0])},
{0x73| ((^s|15:0), @M0_CST_DIX_REP, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|16:0), @M0_CST_DIX_REB, [1: "192.168.47.113@tcp:12345:33:900"], [0], [0])},
{0x73| ((^s|1:1), @M0_CST_IOS, [1: "192.168.47.113@tcp:12345:33:901"], [0], [5: ^d|1:6, ^d|1:7, ^d|1:8, ^d|1:9, ^d|1:10])},
{0x73| ((^s|11:1), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|6:1), @M0_CST_SNS_REP, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|7:1), @M0_CST_SNS_REB, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|3:1), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|13:1), @M0_CST_CAS, [1: "192.168.47.113@tcp:12345:33:901"], [0], [1: ^d|20:1])},
{0x73| ((^s|15:1), @M0_CST_DIX_REP, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|16:1), @M0_CST_DIX_REB, [1: "192.168.47.113@tcp:12345:33:901"], [0], [0])},
{0x73| ((^s|1:2), @M0_CST_IOS, [1: "192.168.47.113@tcp:12345:33:902"], [0], [5: ^d|1:11, ^d|1:12, ^d|1:13, ^d|1:14, ^d|1:15])},
{0x73| ((^s|11:2), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|6:2), @M0_CST_SNS_REP, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|7:2), @M0_CST_SNS_REB, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|3:2), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|13:2), @M0_CST_CAS, [1: "192.168.47.113@tcp:12345:33:902"], [0], [1: ^d|20:2])},
{0x73| ((^s|15:2), @M0_CST_DIX_REP, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|16:2), @M0_CST_DIX_REB, [1: "192.168.47.113@tcp:12345:33:902"], [0], [0])},
{0x73| ((^s|1:3), @M0_CST_IOS, [1: "192.168.47.113@tcp:12345:33:903"], [0], [5: ^d|1:16, ^d|1:17, ^d|1:18, ^d|1:19, ^d|1:20])},
{0x73| ((^s|11:3), @M0_CST_ADDB2, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|6:3), @M0_CST_SNS_REP, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|7:3), @M0_CST_SNS_REB, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|3:3), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|13:3), @M0_CST_CAS, [1: "192.168.47.113@tcp:12345:33:903"], [0], [1: ^d|20:3])},
{0x73| ((^s|15:3), @M0_CST_DIX_REP, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},
{0x73| ((^s|16:3), @M0_CST_DIX_REB, [1: "192.168.47.113@tcp:12345:33:903"], [0], [0])},

{0x73| ((^s|3:6), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:34:1"], [0], [0])},
{0x73| ((^s|3:7), @M0_CST_RMS, [1: "192.168.47.113@tcp:12345:33:100"], [0], [0])},
{0x64| ((^d|1:1), 0, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop1")},
{0x6b| ((^k|1:1), ^d|1:1, [1: ^v|1:10])},
{0x6a| ((^j|1:1), ^k|1:1, [0])},
{0x64| ((^d|1:2), 1, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop2")},
{0x6b| ((^k|1:2), ^d|1:2, [1: ^v|1:10])},
{0x6a| ((^j|1:2), ^k|1:2, [0])},
{0x64| ((^d|1:3), 2, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop3")},
{0x6b| ((^k|1:3), ^d|1:3, [1: ^v|1:10])},
{0x6a| ((^j|1:3), ^k|1:3, [0])},
{0x64| ((^d|1:4), 3, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop4")},
{0x6b| ((^k|1:4), ^d|1:4, [1: ^v|1:10])},
{0x6a| ((^j|1:4), ^k|1:4, [0])},
{0x64| ((^d|1:5), 4, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop5")},
{0x6b| ((^k|1:5), ^d|1:5, [1: ^v|1:10])},
{0x6a| ((^j|1:5), ^k|1:5, [0])},
{0x64| ((^d|1:6), 5, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop6")},
{0x6b| ((^k|1:6), ^d|1:6, [1: ^v|1:10])},
{0x6a| ((^j|1:6), ^k|1:6, [0])},
{0x64| ((^d|1:7), 6, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop7")},
{0x6b| ((^k|1:7), ^d|1:7, [1: ^v|1:10])},
{0x6a| ((^j|1:7), ^k|1:7, [0])},
{0x64| ((^d|1:8), 7, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop8")},
{0x6b| ((^k|1:8), ^d|1:8, [1: ^v|1:10])},
{0x6a| ((^j|1:8), ^k|1:8, [0])},
{0x64| ((^d|1:9), 8, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop9")},
{0x6b| ((^k|1:9), ^d|1:9, [1: ^v|1:10])},
{0x6a| ((^j|1:9), ^k|1:9, [0])},
{0x64| ((^d|1:10), 9, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop10")},
{0x6b| ((^k|1:10), ^d|1:10, [1: ^v|1:10])},
{0x6a| ((^j|1:10), ^k|1:10, [0])},
{0x64| ((^d|1:11), 10, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop11")},
{0x6b| ((^k|1:11), ^d|1:11, [1: ^v|1:10])},
{0x6a| ((^j|1:11), ^k|1:11, [0])},
{0x64| ((^d|1:12), 11, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop12")},
{0x6b| ((^k|1:12), ^d|1:12, [1: ^v|1:10])},
{0x6a| ((^j|1:12), ^k|1:12, [0])},
{0x64| ((^d|1:13), 12, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop13")},
{0x6b| ((^k|1:13), ^d|1:13, [1: ^v|1:10])},
{0x6a| ((^j|1:13), ^k|1:13, [0])},
{0x64| ((^d|1:14), 13, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop14")},
{0x6b| ((^k|1:14), ^d|1:14, [1: ^v|1:10])},
{0x6a| ((^j|1:14), ^k|1:14, [0])},
{0x64| ((^d|1:15), 14, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop15")},
{0x6b| ((^k|1:15), ^d|1:15, [1: ^v|1:10])},
{0x6a| ((^j|1:15), ^k|1:15, [0])},
{0x64| ((^d|1:16), 15, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop16")},
{0x6b| ((^k|1:16), ^d|1:16, [1: ^v|1:10])},
{0x6a| ((^j|1:16), ^k|1:16, [0])},
{0x64| ((^d|1:17), 16, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop17")},
{0x6b| ((^k|1:17), ^d|1:17, [1: ^v|1:10])},
{0x6a| ((^j|1:17), ^k|1:17, [0])},
{0x64| ((^d|1:18), 17, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop18")},
{0x6b| ((^k|1:18), ^d|1:18, [1: ^v|1:10])},
{0x6a| ((^j|1:18), ^k|1:18, [0])},
{0x64| ((^d|1:19), 18, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop19")},
{0x6b| ((^k|1:19), ^d|1:19, [1: ^v|1:10])},
{0x6a| ((^j|1:19), ^k|1:19, [0])},
{0x64| ((^d|1:20), 19, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop20")},
{0x6b| ((^k|1:20), ^d|1:20, [1: ^v|1:10])},
{0x6a| ((^j|1:20), ^k|1:20, [0])},
{0x53| ((^S|1:6), [1: ^a|1:6], [3: ^v|1:10, ^v|1:20, ^v|2:10])},
{0x61| ((^a|1:6), [1: ^e|1:7], [3: ^v|1:10, ^v|1:20, ^v|2:10])},
{0x65| ((^e|1:7), [1: ^c|1:8], [3: ^v|1:10, ^v|1:20, ^v|2:10])},
{0x63| ((^c|1:8), ^n|1:2, [24: ^k|1:1, ^k|1:2, ^k|1:3, ^k|1:4, ^k|1:5, ^k|1:6, ^k|1:7, ^k|1:8, ^k|1:9, ^k|1:10, ^k|1:11, ^k|1:12, ^k|1:13, ^k|1:14, ^k|1:15, ^k|1:16, ^k|1:17, ^k|1:18, ^k|1:19, ^k|1:20, ^k|20:0, ^k|20:1, ^k|20:2, ^k|20:3],
[3: ^v|1:10, ^v|1:20, ^v|2:10])},
{0x6f| ((^o|1:9), 0, [3: ^v|1:10, ^v|0x40000000000001:11, ^v|0x40000000000001:12])},
{0x76| ((^v|1:10), {0| (3, 2,
20,
[5: 0, 0, 0, 0, 2],
[1: ^j|1:21])})},
{0x76| ((^v|0x40000000000001:11), {1| (0, ^v|1:10, [5: 0, 0, 0, 0, 1])})},
{0x76| ((^v|0x40000000000001:12), {1| (1, ^v|1:10, [5: 0, 0, 0, 0, 2])})},
{0x6a| ((^j|1:21), ^S|1:6, [1: ^j|1:22])},
{0x6a| ((^j|1:22), ^a|1:6, [1: ^j|1:23])},
{0x6a| ((^j|1:23), ^e|1:7, [1: ^j|1:24])},
{0x6a| ((^j|1:24), ^c|1:8, [20: ^j|1:1, ^j|1:2, ^j|1:3, ^j|1:4, ^j|1:5, ^j|1:6, ^j|1:7, ^j|1:8, ^j|1:9, ^j|1:10, ^j|1:11, ^j|1:12, ^j|1:13, ^j|1:14, ^j|1:15, ^j|1:16, ^j|1:17, ^j|1:18, ^j|1:19, ^j|1:20])} ,
{0x6f| ((^o|2:9), 0, [1: ^v|2:10])},
{0x76| ((^v|2:10), {0| (4, 0, 4, [5: 0, 0, 0, 0, 1], [1: ^j|2:21])})},
{0x6a| ((^j|2:21), ^S|1:6, [1: ^j|2:22])},
{0x6a| ((^j|2:22), ^a|1:6, [1: ^j|2:23])},
{0x6a| ((^j|2:23), ^e|1:7, [1: ^j|2:24])},
{0x6a| ((^j|2:24), ^c|1:8, [4: ^j|2:1, ^j|2:6, ^j|2:11, ^j|2:16])},
{0x6a| ((^j|2:1), ^k|1:1, [0])},
{0x6a| ((^j|2:6), ^k|1:6, [0])},
{0x6a| ((^j|2:11), ^k|1:11, [0])},
{0x6a| ((^j|2:16), ^k|1:16, [0])} ,
{0x64| ((^d|20:0), 20, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop25")},
{0x6b| ((^k|20:0), ^d|20:0, [1: ^v|1:20])},
{0x6a| ((^j|20:100), ^k|20:0, [0])},
{0x64| ((^d|20:1), 21, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop26")},
{0x6b| ((^k|20:1), ^d|20:1, [1: ^v|1:20])},
{0x6a| ((^j|20:101), ^k|20:1, [0])},
{0x64| ((^d|20:2), 22, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop27")},
{0x6b| ((^k|20:2), ^d|20:2, [1: ^v|1:20])},
{0x6a| ((^j|20:102), ^k|20:2, [0])},
{0x64| ((^d|20:3), 23, 4, 1, 4096, 596000000000, 3, 4, "/dev/loop28")},
{0x6b| ((^k|20:3), ^d|20:3, [1: ^v|1:20])},
{0x6a| ((^j|20:103), ^k|20:3, [0])},
{0x6f| ((^o|20:1), 0, [1: ^v|1:20])},
{0x76| ((^v|1:20), {0| (1, 1, 4,
[5: 0, 0, 0, 0, 1],
[1: ^j|20:1])})},
{0x6a| ((^j|20:1), ^S|1:6, [1: ^j|20:2])},
{0x6a| ((^j|20:2), ^a|1:6, [1: ^j|20:3])},
{0x6a| ((^j|20:3), ^e|1:7, [1: ^j|20:4])},
{0x6a| ((^j|20:4), ^c|1:8, [4: ^j|20:100, ^j|20:101, ^j|20:102, ^j|20:103])}]
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -T linux -e lnet:192.168.47.113@tcp:12345:35:1 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd && exec /var/cortx/cortx-motr/motr/m0d -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -f '<0x7200000000000001:6>' -T linux -e lnet:192.168.47.113@tcp:12345:33:100 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0d.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ha && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -T ad -e lnet:192.168.47.113@tcp:12345:35:1 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
lt-m0d: systemd notifications not allowed

Started
Press CTRL+C to quit.
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/mds1 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:800 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios1 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:900 -f '<0x7200000000000001:0>' -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios2 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:901 -f '<0x7200000000000001:1>' -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios3 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:902 -f '<0x7200000000000001:2>' -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios4 && exec /var/cortx/cortx-motr/utils/mkfs/m0mkfs -F -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:35:903 -f '<0x7200000000000001:3>' -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0mkfs.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ha && exec /var/cortx/cortx-motr/motr/m0d -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -T ad -e lnet:192.168.47.113@tcp:12345:34:1 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc -f '<0x7200000000000001:5>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
lt-m0d: systemd notifications not allowed

Started
Press CTRL+C to quit.
Motr HA agent started.
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/mds1 && exec /var/cortx/cortx-motr/motr/m0d -T ad -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:800 -f '<0x7200000000000001:4>' -H 192.168.47.113@tcp:12345:34:1 -c /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/confd/conf.xc |& tee -a m0d.log
lt-m0d: systemd notifications not allowed

Started
Press CTRL+C to quit.
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios1 && exec /var/cortx/cortx-motr/motr/m0d -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:900 -f '<0x7200000000000001:0>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios2 && exec /var/cortx/cortx-motr/motr/m0d -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:901 -f '<0x7200000000000001:1>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios3 && exec /var/cortx/cortx-motr/motr/m0d -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:902 -f '<0x7200000000000001:2>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
cd /var/motr/root/sandbox.st-44motr-rm-lock-cc-io/ios4 && exec /var/cortx/cortx-motr/motr/m0d -T ad -d disks.conf -D db -S stobs -A linuxstob:addb-stobs -w 20 -m 65536 -q 16 -N 100663296 -C 262144 -K 100663296 -k 262144 -e lnet:192.168.47.113@tcp:12345:33:903 -f '<0x7200000000000001:3>' -H 192.168.47.113@tcp:12345:34:1 |& tee -a m0d.log
Motr confd started.
Motr mdservices started.
lt-m0d: systemd notifications not allowed

lt-m0d: systemd notifications not allowed

lt-m0d: systemd notifications not allowed

lt-m0d: systemd notifications not allowed

Started
Press CTRL+C to quit.
Started
Press CTRL+C to quit.
Started
Press CTRL+C to quit.
Started
Press CTRL+C to quit.
Motr ioservices started.
motr service started
*** m0dixinit is omitted. Mkfs creates meta indices now.
Read obj while write/update is in process.
Delete obj while write/update is in process.
Delete obj while read is in process.
Test exclusivity among Readers and Writers
Binary files /var/motr/src_file and /var/motr/dest_file2 differ
Files differ, object got corrupted

=== pids of services: 3976 3966 3959 3955 3860 3768 3119 ===
Shutting down services one by one. mdservice is the last.
----- 3976 stopping--------lt-m0d: got signal 1
motr[03976]: cf80 ERROR [conf/rconfc.c:1182:rconfc_fail] rconfc: 0x7ffe676836e0, state M0_RCS_IDLE failed with -22
----- 3976 stopped --------
----- 3966 stopping--------lt-m0d: got signal 1
motr[03966]: fc80 ERROR [conf/rconfc.c:1182:rconfc_fail] rconfc: 0x7ffca15f63e0, state M0_RCS_IDLE failed with -22
----- 3966 stopped --------
----- 3959 stopping--------lt-m0d: got signal 1
motr[03959]: 2160 ERROR [conf/rconfc.c:1182:rconfc_fail] rconfc: 0x7ffd03e388c0, state M0_RCS_IDLE failed with -22
----- 3959 stopped --------
----- 3955 stopping--------lt-m0d: got signal 1
motr[03955]: 5990 ERROR [conf/rconfc.c:1182:rconfc_fail] rconfc: 0x7ffdc620c0f0, state M0_RCS_IDLE failed with -22
----- 3955 stopped --------
----- 3860 stopping--------lt-m0d: got signal 1
----- 3860 stopped --------
----- 3768 stopping--------lt-m0d: got signal 1
motr[03768]: c920 WARN [ha/link.c:1513:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.113@tcp:12345:33:800
motr[03768]: c920 WARN [ha/link.c:1513:ha_link_outgoing_fom_tick] rlk_rc=-110 endpoint=192.168.47.113@tcp:12345:33:900
----- 3768 stopped --------
----- 3119 stopping--------lt-m0d: got signal 1
----- 3119 stopped --------
m0tr 13922622 0
galois 22944 1 m0tr
lnet 586401 3 m0tr,ksocklnd
Motr services stopped.

Test log file available at /var/motr/motr_2020-09-10_21:59:39.log
Motr trace files are available at: /var/motr/motr
145.77user 19.98system 3:34.03elapsed 77%CPU (0avgtext+0avgdata 738268maxresident)k
237944inputs+4119232outputs (87major+1979821minor)pagefaults 0swaps

After a reboot, the error still occurs.
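
The test only reports that /var/motr/src_file and /var/motr/dest_file2 differ. A sketch for locating the first differing byte, which might help tell a short read from corrupted contents; `first_difference` is a hypothetical helper, not part of the Motr test scripts, and the demo runs on small temporary files:

```python
import os
import tempfile


def first_difference(path_a: str, path_b: str, chunk_size: int = 65536):
    """Return the offset of the first differing byte, or None if the files match."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the files differ in length.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None
            offset += len(chunk_a)


def write_bytes(path: str, data: bytes) -> str:
    """Helper for the demo: write data to path and return the path."""
    with open(path, "wb") as out:
        out.write(data)
    return path


with tempfile.TemporaryDirectory() as tmp:
    src = write_bytes(os.path.join(tmp, "src_file"), b"abcdef")
    dst = write_bytes(os.path.join(tmp, "dest_file2"), b"abcXef")
    print(first_difference(src, dst))
```

Running it against the two files left behind by the test would show whether the mismatch starts at offset 0 or somewhere inside the object.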

lnet failure on latest centos 7 kernel with Mellanox

I am trying CORTX on the latest CentOS 7.8.2003 with kernel 3.10.0-1127.19.1.el7.x86_64. I built the Lustre client on this kernel using https://github.com/Seagate/cortx-re/blob/master/scripts/release_support/build_lustre.sh together with the MLNX OFED release that supports RHEL 7.8: https://linux.mellanox.com/public/repo/mlnx_ofed/4.9-0.1.7.0/rhel7.8/x86_64/UPSTREAM_LIBS/

But when I try to start LNet, I get the errors below.

# lctl net up
LNET configure error 22: Invalid argument
Sep  2 13:26:27 xxxx-xxx kernel: LNetError: 172781:0:(api-ni.c:2533:lnet_startup_lndnet()) Can't load LND o2ib, module ko2iblnd, rc=256
Sep  2 13:26:29 xxxx-xxx kernel: ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
Sep  2 13:26:29 xxxx-xxx kernel: ko2iblnd: Unknown symbol ib_fmr_pool_unmap (err -22)
Sep  2 13:26:29 xxxx-xxx kernel: ko2iblnd: disagrees about version of symbol __ib_alloc_pd
Sep  2 13:26:29 xxxx-xxx kernel: ko2iblnd: Unknown symbol __ib_alloc_pd (err -22)
Sep  2 13:26:29 xxxx-xxx kernel: ko2iblnd: disagrees about version of symbol rdma_resolve_addr
# cat /etc/modprobe.d/lnet.conf
options lnet networks=o2ib(ienp175s0f0)  config_on_load=1  lnet_peer_discovery_disabled=1

Any suggestions? @chumakd @just-now @tgeerdes-SG
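
A "disagrees about version of symbol" message usually means the module was built against different symbol versions than the ones exported by the currently loaded kernel modules, for example a different RDMA stack. A small sketch for collecting the affected symbols from the kernel log, using abbreviated copies of the lines above; `mismatched_symbols` is a made-up helper name:

```python
import re

VERSION_MISMATCH = re.compile(r"(\w+): disagrees about version of symbol (\w+)")


def mismatched_symbols(log_lines):
    """Map each module name to the sorted symbols whose versions it disagrees about."""
    found = {}
    for line in log_lines:
        match = VERSION_MISMATCH.search(line)
        if match:
            module, symbol = match.groups()
            found.setdefault(module, set()).add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}


# Abbreviated lines taken from the kernel log above.
sample = [
    "kernel: ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap",
    "kernel: ko2iblnd: Unknown symbol ib_fmr_pool_unmap (err -22)",
    "kernel: ko2iblnd: disagrees about version of symbol __ib_alloc_pd",
    "kernel: ko2iblnd: disagrees about version of symbol rdma_resolve_addr",
]
print(mismatched_symbols(sample))
```

All three symbols in this log belong to the InfiniBand/RDMA core, which would be consistent with ko2iblnd having been built against one stack and loaded on top of another; that reading is an assumption, not something the log proves.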

New FDMI docs to be converted to be shared with community

@VenkyOS, here are some FDMI documents which are currently private to Seagate but should be migrated to GitHub. Can you please convert them to RST, migrate them into the doc directory, and add links to them in https://github.com/Seagate/cortx-motr/blob/dev/doc/reading-list.md?

Here is the private Seagate-only link where the files exist:
https://seagatetechnology.sharepoint.com/:f:/s/gteamdrv1/tdrive1224/EsBphdcDidhEpSlJk3fvsMYBynVzLA3d4p8lbPjnLoTsxQ?e=mTzD48

Thanks!

cortx-motr check-everything CONFIGURE_OPTS: unbound variable

[root@ssc-vm-0925 cortx-motr]# ./scripts/m0 check-everything
----- check_everything -----
./scripts/m0: line 158: CONFIGURE_OPTS: unbound variable

check-everything  =  (rebuild + dist-check + run-all) * 2
                     *NOTE* Please run this command before landing.
                     .
                     The command is executed twice (note '* 2' above):
                     the 2nd time with './configure --disable-m0-asserts'.
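
"Unbound variable" is the message bash prints when a script running with `set -u` (nounset) expands a variable that was never set. A minimal sketch of one common way around it, assuming `scripts/m0` does run with `set -u`: expand the variable with an empty default. `run_configure` is a hypothetical function for illustration and is not taken from `scripts/m0`.

```shell
#!/usr/bin/env bash
# Sketch only: run_configure is hypothetical, not code from scripts/m0.
# The function body is a subshell so that 'set -u' stays local to it.
run_configure() (
    set -u
    # A plain "$CONFIGURE_OPTS" would abort here with
    # "CONFIGURE_OPTS: unbound variable" when the variable is unset.
    # "${CONFIGURE_OPTS:-}" expands to an empty string instead.
    opts="${CONFIGURE_OPTS:-}"
    echo "./configure${opts:+ ${opts}}"
)

run_configure
```

Until the script is fixed, exporting an empty `CONFIGURE_OPTS` before calling `./scripts/m0 check-everything` may work around the failure, again assuming that variable is the only unset one.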

Problem: hax starts and stops rconfc frequently

Presently, fsStatsUpdater in hare frequently starts and stops rconfc in order to invoke the motr spiel command that fetches filesystem statistics, which are then updated in Consul.

Frequent rconfc start/stop cycles are inefficient when we need to invoke multiple spiel commands, e.g. to trigger SNS repair/rebalance.
A better way to handle this needs to be worked out.

reference:
#298

Pool-wise consumption statistics is needed

Currently, Motr reports the consumption statistics for the whole cluster only, which is useless in a multi-pool cluster configuration (like at SAGE):

$ hctl status --json
{
  "pools": [
    [
      "0x6f00000000000001:0x444",
      "tier1-nvme"
    ],
    [
      "0x6f00000000000001:0x451",
      "tier2-ssd"
    ],
    [
      "0x6f00000000000001:0x46c",
      "tier3-hdd"
    ]
  ],
  "profiles": [
    {
      "fid": "0x7000000000000001:0x4da",
      "name": "prof-0",
      "pools": [
        "tier1-nvme",
        "tier2-ssd",
        "tier3-hdd",
        null,
        null
      ]
    }
  ],
  "filesystem": {
    "stats": {
      "fs_free_seg": 51416442504,
      "fs_total_seg": 51545708832,
      "fs_free_disk": 203486464036864,
      "fs_avail_disk": 203486464036864,
      "fs_total_disk": 204053896232960,
      "fs_svc_total": 12,
      "fs_svc_replied": 12
    },
    "timestamp": 1608385030.93602,
    "date": "2020-12-19T14:37:10.936020"
  },

Here is the log message from hax, which periodically fetches the statistics from Motr using the m0_spiel_filesystem_stats_fetch() API:

Dec 19 14:46:33 client-22 hare-hax[24890]: 2020-12-19 14:46:33,762 [DEBUG] {fs-stats-updater} FS stats are as follows: FsStats(fs_free_seg=51416442504, fs_total_seg=51545708832, fs_free_disk=203486464036864, fs_avail_disk=203486464036864, fs_total_disk=204053896232960, fs_svc_total=12, fs_svc_replied=12)

Per-pool consumption statistics are needed.
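
To make the limitation concrete, here is a minimal sketch that derives used space from the hax log line above. It can only produce one cluster-wide number, because the FsStats fields carry no pool identifier. `fs_used_bytes` is a hypothetical helper name and not part of hare or Motr.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of hare/Motr): reads one FsStats log line on
# stdin and prints fs_total_disk - fs_free_disk in bytes.
fs_used_bytes() {
    local line free total
    IFS= read -r line
    free=$(printf '%s\n' "$line" | sed -n 's/.*fs_free_disk=\([0-9]*\).*/\1/p')
    total=$(printf '%s\n' "$line" | sed -n 's/.*fs_total_disk=\([0-9]*\).*/\1/p')
    # Fail if either field is missing from the line.
    [ -n "$free" ] && [ -n "$total" ] || return 1
    echo $(( total - free ))
}

# Typical use (assumes the hax journal is readable):
#   journalctl -u hare-hax | grep 'FS stats' | tail -n 1 | fs_used_bytes
```

For the line above this yields 567432196096 bytes used across tier1-nvme, tier2-ssd and tier3-hdd combined, with no way to tell how much of it sits in each tier.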

Repeating cas_fom_tick rc=-2 errors from mero-server

JIRA :
https://jts.seagate.com/browse/EOS-8035

Affected Version(s) :
Cortx-1.0.0-rc1, Cortx-1.0.0-rc2

Steps to reproduce :

  1. Deploy cluster
  2. Configure and run S3 I/O, either manually with the aws tool or with the cosbench tool.
  3. Check journalctl logs
  4. [To generate this issue] Send a request to a non-existing index

Error log:

May 06 17:44:43 sm29-r20.pun.seagate.com mero-server[232226]: mero[232226]: bb20 ERROR [cas/service.c:1147:cas_fom_tick] <! rc=-2

Effect:
S3 I/O works, however such errors are generated multiple times.
The Pacemaker resource does not go to the failed state, which is expected behavior.

motr[05167]: cbd0 ERROR [dix/req.c:794:dix_idxop_meta_update_ast_cb] All items are failed

I am essentially following the example at https://github.com/Seagate/cortx-motr/blob/main/motr/examples/example2.c
I am trying to create an index in almost exactly the same way as the example does. After I call m0_entity_create and launch the operation, I get the error in the title, and the launched operation returns an error and fails.

It might have something to do with how I've configured my motr client. Unfortunately, I don't know enough about how indexes work in motr.
