seagate / cortx-hare


CORTX Hare configures Motr object store, starts/stops Motr services, and notifies Motr of service and device faults.

Home Page: https://github.com/Seagate/cortx

License: Apache License 2.0

Makefile 2.65% Python 71.70% Dhall 9.82% Shell 12.28% C 3.42% Dockerfile 0.14%
cluster-management high-availability

cortx-hare's People

Contributors

andriytk, chumakd, d-nayak, gauravchaudhari02, gshipra, hessio, madhavemuri, max-seagate, mukul-seagate11, parikshit-dharmale, pavankrishnat, pujamudaliar, rahul27kumar, rajanikantchirmade, rkothiya, saumya-sunder, seagate-sarang-sawant, sergey-shilov, shailesh-vaidya, shreya-18, supritshinde, supriyachavan4398, suvratjoshi, swapnilgaonkar7, tshaffe1, vaibhavparatwar, venkuppu-chn, vinoth2101, vvv, ydb242


cortx-hare's Issues

Problem: /tmp/confd file creation mechanism does not work if confd restarts

There is one problem with using /tmp/confd: we delete the file after a successful bootstrap. So if I stop confd and try to start it again, hax and Consul both go crazy, as the RC leader cannot be elected in the absence of /tmp/confd.

At consul

    2019/08/28 06:17:24 [DEBUG] http: Request GET /v1/kv/leader (192.453µs) from=127.0.0.1:42250
    2019/08/28 06:17:24 [DEBUG] http: Request GET /v1/kv/leader (168.142µs) from=127.0.0.1:42252
    2019/08/28 06:17:24 [DEBUG] http: Request GET /v1/kv/leader (219.885µs) from=127.0.0.1:42254
    2019/08/28 06:17:24 [DEBUG] http: Request GET /v1/kv/leader (179.674µs) from=127.0.0.1:42256

At hax

hax.exception.HAConsistencyException: Could not get the leader from Consul
2019-08-28 06:17:26,005 [DEBUG] {qconsumer} Reply sent
2019-08-28 06:17:26,005 [DEBUG] {qconsumer} Waiting for the next message
2019-08-28 06:17:26,010 [DEBUG] {Dummy-1} Received entrypoint request from remote endpoint '172.28.128.4@tcp:12345:44:1', process fid = 0x7200000000000001:0x1. The request will be processed in another thread.
2019-08-28 06:17:26,206 [DEBUG] {qconsumer} Got something from the queue
2019-08-28 06:17:26,206 [DEBUG] {qconsumer} Processing entrypoint request from remote endpoint '172.28.128.4@tcp:12345:44:1', process fid 0x7200000000000001:0x1
2019-08-28 06:17:26,208 [DEBUG] {qconsumer} Starting new HTTP connection (1): 127.0.0.1:8500
2019-08-28 06:17:26,209 [DEBUG] {qconsumer} http://127.0.0.1:8500 "GET /v1/kv/leader HTTP/1.1" 200 108
2019-08-28 06:17:26,210 [ERROR] {qconsumer} Failed to get the data from Consul. Replying with EAGAIN error code.
Traceback (most recent call last):
  File "/data/hare/hare/hax/hax/halink.py", line 70, in send_entrypoint_request_reply
    sess = prov.get_leader_session()
  File "/data/hare/hare/hax/hax/util.py", line 65, in get_leader_session
    'Could not get the leader from Consul')

I think it is not very convenient to create /tmp/confd every time before restarting confd. We should instead use the appropriate CONF_HA_STATES to reflect the Consul service states:

  • M0_CONF_HA_PROCESS_STARTING -> warning,
  • M0_CONF_HA_PROCESS_STOPPED + M0_CONF_HA_SERVICE_FAILED -> critical (critical is a bit tricky), and
  • M0_CONF_HA_PROCESS_STARTED -> success.
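A minimal sketch of the proposed mapping, as a hypothetical Python helper. Consul's actual check states are passing/warning/critical, so "success" above corresponds to "passing"; treating unknown events as critical is an assumption of this sketch.

```python
# Hypothetical mapping of Motr HA process events to Consul check
# statuses, per the proposal above. Event names mirror Motr's
# m0_conf_ha_process_event enum; the fallback to "critical" is an
# assumption of this sketch.
EVENT_TO_CONSUL_STATUS = {
    "M0_CONF_HA_PROCESS_STARTING": "warning",
    "M0_CONF_HA_PROCESS_STARTED": "passing",   # "success" in Consul terms
    "M0_CONF_HA_PROCESS_STOPPED": "critical",
}

def consul_status(event: str) -> str:
    # Unknown events are treated conservatively as 'critical'.
    return EVENT_TO_CONSUL_STATUS.get(event, "critical")
```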

Follow-up from "rfc: bootstrap update - don't copy consul config files"

The following discussions from #116 should be addressed:

  • @vvv started a discussion: (+1 comment)

    s/the/a/

  • @vvv started a discussion: (+2 comments)

    Waiting for some hardcoded amount of time is a fragile solution. How difficult would it be, in your opinion, to wait for some reliable marker that Consul has started? E.g., some line in consul log or something like that.

    If both the time-based solution and a less fragile one are within our reach (i.e., are of comparable difficulty), we should prefer the less fragile solution.

    Otherwise I'm fine with the time-based solution.

  • @vvv started a discussion: (+1 comment)

    I'd suggest that we number the items properly. The spec will be more difficult to maintain; fortunately, this text is read more often than it is updated.

    Sometimes people want to read .md file in raw (not rendered) format. We can make their reading experience more pleasant.
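One of the discussions above suggests waiting for a reliable marker that Consul has started instead of a hard-coded delay. A minimal sketch, assuming plain polling of Consul's real /v1/status/leader HTTP endpoint; the retry policy and timeout are illustrative:

```python
# Hypothetical "reliable marker" check: poll Consul's status endpoint
# until the agent answers, instead of sleeping a fixed amount of time.
# /v1/status/leader is a real Consul API; everything else is a sketch.
import time
import urllib.request

def wait_for_consul(url: str = "http://127.0.0.1:8500",
                    timeout: float = 30.0) -> bool:
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(f"{url}/v1/status/leader") as r:
                if r.getcode() == 200:
                    return True
        except OSError:
            pass            # agent not up yet; retry
        time.sleep(0.5)
    return False
```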

Problem: The notion of “RC” and “RC leader” is confusing

The concept of Recovery Coordinator (RC) in Hare design is blurry and confusing. Halon terminology does not map directly to Hare design. The artificial “RC” alias hinders understanding.

Solution: Abandon the notion of “RC”. Use the terminology that is hard to misinterpret:

Term                    Obsoletes   Description
---------------------   ---------   ---------------------------------------------
Consul leader           RC leader   the re-electable entity that executes watches
Consul watch handlers   RC logic

Problem: M0_HA_MSG_STOB_IOQ events are not handled

Solution: hax shall handle M0_HA_MSG_STOB_IOQ events from Mero IO services by updating the Consul KV (e.g., put error information into io-errors/<sdev-fid> key).

Some Pacemaker script (maintained outside of Hare) will either poll Consul KV or create a Consul watch to monitor the above key.
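A hedged sketch of the proposed KV update. Consul's KV HTTP API (PUT /v1/kv/&lt;key&gt;) is real; the io-errors/&lt;sdev-fid&gt; key layout comes from the text above, and the payload fields are assumptions:

```python
# Hypothetical handling of an M0_HA_MSG_STOB_IOQ event: record the
# error under an io-errors/<sdev-fid> key in the Consul KV.
import json
import urllib.request

def make_io_error_request(sdev_fid: str, error_info: dict,
                          consul_url: str = "http://127.0.0.1:8500"):
    """Build the KV update; real code would pass it to urlopen()."""
    return urllib.request.Request(
        f"{consul_url}/v1/kv/io-errors/{sdev_fid}",
        data=json.dumps(error_info).encode(),
        method="PUT")
```

A Pacemaker-side poller or Consul watch would then read the same key back.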

Follow-up from "Added input type format (per discussion with Andrey)"

The following discussions from #49 should be addressed:

  • @andriy.tkachuk started a discussion:

    From /? No. From bq/ - maybe. But not atm.

  • @andriy.tkachuk started a discussion:

    add: "(provided by Consul)".

  • @andriy.tkachuk started a discussion: (+1 comment)

    Not "registered". It will just listen to HTTP POSTs from the Consul's service-type watch.

  • @andriy.tkachuk started a discussion:

    Service FID.

  • @andriy.tkachuk started a discussion:

    All these must be "passing" to regard the service Online on the node. Otherwise - it is Offline.

Problem: the purpose of the hax fids is unclear

When hax makes the m0_halon_interface_start() call, it has to pass three fid parameters:

  • process fid of the current process;
  • fid of the HA service run by the current process;
  • fid of the RM service run by the current process.

(The “current process” here is hax itself.)

hax is a mere bridge between Mero and Consul. It does not have a confc cache. It does not run any Mero services. Why would it need any Mero fids at all?

How are those fids used by Mero? What is their purpose?

Follow-up from "README.md: update with install.sh and bootstrap.sh usage"

The following discussions from #98 should be addressed:

  • @vvv started a discussion:

    1. s/mero/Mero/
    2. I don't think we should solidify dependency on patched Mero in the README file. That dependency will be removed soon(ish), @mandar.sawant is working on this.
  • @vvv started a discussion:

    Please remove “and is available by /usr/bin/python3” part.

    It's not needed for Python scripts with proper shebang.

    #!/usr/bin/env python3
    

    The corresponding stanza is being added to the 8/COSTY.

  • @vvv started a discussion:

    README.md is not a work of fiction. @andriy.tkachuk I propose that we remove ‘Multiple nodes’ section and write it when the functionality is actually there.

Follow-up from "Bootstrap scripts can start hax as a service"

The following discussions from #105 should be addressed:

Problem: hax.c doesn't compile with Mero master

$ (cd hax; make)
cpp -MM -MG -I../../mero -I/usr/include/python2.7 -I/usr/include/python2.7 -DM0_INTERNAL= -DM0_EXTERN=extern hax.c |\
 sed -r 's%^(.+)\.o:%./\1.d ./\1.o:%' >hax.d
cc -g -Werror -Wno-attributes -fPIC -I../../mero -I/usr/include/python2.7 -I/usr/include/python2.7 -DM0_INTERNAL= -DM0_EXTERN=extern  -c -o hax.o hax.c
hax.c: In function ‘init_halink’:
hax.c:465:5: error: assignment makes pointer from integer without a cast [-Werror]
  m0 = m0_halon_interface_m0_get(hc->hc_hi);
     ^
hax.c: In function ‘start’:
hax.c:498:11: error: passing argument 7 of ‘m0_halon_interface_start’ from incompatible pointer type [-Werror]
           link_is_disconnecting_cb, link_disconnected_cb);
           ^
In file included from hax.c:27:0:
../../mero/ha/halon/interface.h:348:5: note: expected ‘void (*)(struct m0_halon_interface *, struct m0_ha_link *, const struct m0_ha_msg *, uint64_t)’ but argument is of type ‘void (*)(struct m0_halon_interface *, struct m0_ha_link *, struct m0_ha_msg *, uint64_t)’
 int m0_halon_interface_start(struct m0_halon_interface *hi,
     ^
cc1: all warnings being treated as errors
make: *** [hax.o] Error 1

Problem: `(cd cfgen; make)` fails

$ (cd cfgen; make)
flake8  cfgen
cfgen:633:1: E302 expected 2 blank lines, found 1
cfgen:662:1: E302 expected 2 blank lines, found 1
cfgen:670:80: E501 line too long (104 > 79 characters)
cfgen:671:24: E128 continuation line under-indented for visual indent
cfgen:672:24: E128 continuation line under-indented for visual indent
cfgen:673:24: E128 continuation line under-indented for visual indent
cfgen:674:24: E128 continuation line under-indented for visual indent
cfgen:676:24: E128 continuation line under-indented for visual indent
cfgen:677:24: E128 continuation line under-indented for visual indent
make: *** [flake8] Error 1

Problem: hax sends EP request to itself

Only users of Mero configuration (those who have confc cache) should ever require entrypoint information.

hax does not cache Mero configuration; it is a mere bridge. For hax to send an entrypoint request to itself is wrong.

Workaround: instead of contacting Consul for information it doesn't need, hax should reply to the entrypoint request from itself with hard-coded data.

Proper (?) solution: modify “halon_interface” (or the hax implementation) so that the entrypoint request to itself is not sent.
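The workaround could look like the following sketch. The fid value, reply payload, and helper names are hypothetical; only the idea (short-circuit requests whose process fid equals hax's own) comes from the issue:

```python
# Hypothetical short-circuit for the self-entrypoint case: when the
# request comes from hax itself, reply with canned data instead of
# querying Consul.
HAX_PROCESS_FID = "0x7200000000000001:0x6"   # illustrative value

def entrypoint_reply(requester_fid: str) -> dict:
    if requester_fid == HAX_PROCESS_FID:
        # Request from ourselves: hard-coded reply, no Consul lookup.
        return {"rc": 0, "confds": [], "quorum": 0}
    return consul_entrypoint_lookup(requester_fid)

def consul_entrypoint_lookup(fid: str) -> dict:
    # Placeholder for the normal path (Consul leader/confd lookup).
    raise NotImplementedError
```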

Problem: Coding style guidelines are missing

There are no clearly documented guidelines for code style. This causes friction when individual coding styles don't match.

Solution: an RFC with guidelines for code style (Bash, Python, C), use of code formatters and linters (yapf, pylint/flake8, mypy), Python version, etc.

Follow-up from "cfgen/dhall/consul-conf.dhall: added (initial version)"

The following discussions from #100 should be addressed:

  • @vvv started a discussion:

    s/watch/Watch/

    The convention established in the Dhall community is to start type names with a capital letter, similarly to Haskell.

  • @vvv started a discussion:

    let dir = "/opt/seagate/consul"
    ...
        args = [ "${dir}/watch-service" ]
    
  • @vvv started a discussion:

    [nit] My bad, it should have been localIp. Overexposure to Python rewiress one'ss brainss. 🐍

    Dhall and Haskell code bases traditionally use camelCase. 🐫

  • @vvv started a discussion:

    [optional] This style is used more often:

          { id : Text
          , name : Text
          , address : Text
          , port : Natural
          , checks : List { args : List Text, interval : Text }
          }

    Examples:

    The good news is that we don't have to worry about formatting. Once we start using dhall format (soon-ish), the issue of formatting will disappear.

(Not really a) Problem: PC3 forbidding topic branches makes Maintainer's life harder

@konstantin.nekrasov wrote:

Shouldn't we remove or at least rephrase [6-8] to allow the Hare developers to push to the repository? Forking the whole repository can make sense in distributed open-source projects where there are many untrusted people. In our case this is just an additional barrier that buys nothing for us.

Good point. The need to keep one's fork in sync with the upstream “master” can indeed feel excessive for a Maintainer. (According to the terminology used in the RFC, Hare developers are “Maintainers”.)

I disagree that this master-only repository approach does not buy us
anything though...

Why new rules

The goal is to make this project attractive to outsiders. To make it
easy for people from other teams to contribute. PC3
sends a clear signal that participation is just a few clicks away, that
obstacles are actively hunted for and removed.

PC3 makes it easy for anyone to report a problem, propose a
solution, send a patch and expect that patch to be merged quickly.
Why quickly? Because it feels good for the contributor. It's
gratifying to realize that your actions have actual impact; this makes one
want to return and do more (free labour, yay!).

Isn't it risky to merge patches from “untrusted” people? Not at all. With git it is trivial to revert a bad patch. A less-than-ideal patch can just as well be improved by one of the subsequent patches.

This game is too much fun to be played by a chosen few. (Also, “few” does not scale.)

Stranger Things D&D

We are experimenting with a rule system for our software development game. If this particular rule system does not work, we'll throw it away and try another.

master-only repo

Cons:

  • necessity to add a second remote and keep its “master” updated
  • ? (there should be more, which I'm not aware of at the moment)

Pros:

  • anyone can contribute
  • no stale branches
  • welcoming community

There is one more pro. When Maintainers are required to walk in a Contributor's shoes, they'll find the inconveniences and smooth them out really soon. (Eating one's own dog food, etc.)


Solution: the absence of topic branches is a feature, not a problem. There is no need to change the specification.

Problem: It's not clear how hax should handle EP requests if there is no confd

hax hangs when no leader can be found in Consul.

From discussion with @vvv:

  1. If I remove /tmp/confd file, hax can't find the [RC] leader node — is this correct behavior?
  2. Since this information must be gathered while preparing the reply to an entrypoint request, the entrypoint request never gets a reply, which makes the hax process hang. How should hax process the entrypoint request if the confd service is down and hence no leader node can be found in Consul?
  3. ... or maybe we should not consider these error cases since this is a PoC?

@vvv:

Instead of hanging, hax should return an error to m0d.

@konstantin.nekrasov:

Instead of hanging, hax should return an error to m0d. How to do that properly?

/**
 * Sends entrypoint reply.
 *
 * @param req_id         request id received in the entrypoint_request_cb()
 * @param rc             return code for the entrypoint.
 *                       It's delivered to the user
 * @param confd_nr       number of confds
 * @param confd_fid_data array of confd fids
 * @param confd_eps_data array of confd endpoints
 * @param confd_quorum   confd quorum for rconfc. @see m0_rconfc::rc_quorum
 * @param rm_fid         Active RM fid
 * @param rp_eps         Active RM endpoint
 *
 * @note This function can be called from entrypoint_request_cb().
 */
void m0_halon_interface_entrypoint_reply(
                struct m0_halon_interface  *hi,
                const struct m0_uint128    *req_id,
                int                         rc,
                uint32_t                    confd_nr,
                const struct m0_fid        *confd_fid_data,
                const char                **confd_eps_data,
                uint32_t                    confd_quorum,
                const struct m0_fid        *rm_fid,
                const char                 *rm_eps);

Currently I always return rc == 0. Do I need to set a non-zero value to highlight an error?

@vvv:

So hax should send HA state update to local m0d-s, informing them of confd's failure. When m0d receives this notification, it will check if the quorum of confd-s is still online. [...]
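A minimal sketch of the non-hanging error path under discussion. This is not Motr's actual API: send_reply stands in for the m0_halon_interface_entrypoint_reply() binding declared above, and the payload shape is illustrative; only the "answer with a negative rc such as -EAGAIN instead of blocking" idea comes from the issue:

```python
# Hypothetical error path: when no Consul leader is found, reply to
# the entrypoint request with a negative rc so m0d retries later,
# rather than leaving the request unanswered.
import errno

def reply_entrypoint(req_id, leader, send_reply):
    if leader is None:
        # No confd/RC leader: tell m0d to retry later.
        send_reply(req_id, rc=-errno.EAGAIN, confds=[], quorum=0)
    else:
        send_reply(req_id, rc=0, confds=leader["confds"],
                   quorum=leader["quorum"])
```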

Problem: Do we need to handle M0_HA_MSG_EVENT_PROCESS and M0_HA_MSG_EVENT_SERVICE ha messages?

The following are the process event types:

enum m0_conf_ha_process_event {
        /**
         * The process is about to start. Usually this notification is sent
         * after connection to HA is established, but it may not be the first
         * m0_ha_msg sent from the process.
         */                             
        M0_CONF_HA_PROCESS_STARTING,
        /**                             
         * The process is fully started and its services can handle requests.
         */                             
        M0_CONF_HA_PROCESS_STARTED,
        /**                             
         * The process is about to stop. New connections to the services from
         * this process shouldn't be made after this notification is sent
         * (exception: if connections are required during the "stopping" phase).
         */                             
        M0_CONF_HA_PROCESS_STOPPING,             
        /**                             
         * Process is stopped. No new connections should be made after this
         * point. Usually this notification is sent just before process
         * disconnects from HA, but it may not be the last m0_ha_msg sent
         * from the process.
         */
        M0_CONF_HA_PROCESS_STOPPED,
};
        
/** Defines the source of the process event */
enum m0_conf_ha_process_type { 
        /** Source is not defined. Example: the source is a debugging tool. */
        M0_CONF_HA_PROCESS_OTHER,
        /** The event is sent from kernel (only m0t1fs can send this atm). */
        M0_CONF_HA_PROCESS_KERNEL,      
        /** The event is sent from m0mkfs */
        M0_CONF_HA_PROCESS_M0MKFS,
        /** The event is sent from m0d */
        M0_CONF_HA_PROCESS_M0D,
};

and the following are the service events:

enum m0_conf_ha_service_event {
        /**
         * Service is about to start. There is no point in connecting to the
         * service before this notification is sent.
         */
        M0_CONF_HA_SERVICE_STARTING,
        /** Service is started and it can handle requests. */
        M0_CONF_HA_SERVICE_STARTED,
        /**
         * Service is about to stop. New connections to the service shouldn't
         * be made after this notification is sent if the connections are not
         * a part of "stopping" phase.
         */
        M0_CONF_HA_SERVICE_STOPPING,
        /**
         * Service is stopped. There is no point in connecting to the service
         * after this notification is sent.
         */
        M0_CONF_HA_SERVICE_STOPPED,
        /**
         * Service failed during the starting phase. There is no point in
         * connecting to the service if it's failed.
         */
        M0_CONF_HA_SERVICE_FAILED,
};

@max-seagate.medved: Do we need the above information in Consul (for EES, maybe in order to report cluster status or broadcast a failure), or will that be handled implicitly as part of the Consul service infrastructure?

Does Mero have a smart version of `m0_thread_adopt`?

The code in http://gitlab.mero.colo.seagate.com/mero/hare/merge_requests/62 most probably fails because the communication with m0d is not preceded by an m0_thread_adopt() call.

Here is how it looks:

2019-08-05 10:18:43,907 [DEBUG] {Thread-1} Waiting
2019-08-05 10:18:43,912 [DEBUG] {MainThread} Starting new HTTP connection (1): 127.0.0.1:8500
2019-08-05 10:18:43,920 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/agent/self HTTP/1.1" 200 None
2019-08-05 10:18:43,925 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/catalog/service/hax HTTP/1.1" 200 334
2019-08-05 10:18:43,929 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/catalog/service/hax HTTP/1.1" 200 334
2019-08-05 10:18:43,934 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/agent/self HTTP/1.1" 200 None
2019-08-05 10:18:43,938 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/catalog/service/hax HTTP/1.1" 200 334
2019-08-05 10:18:43,942 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/agent/self HTTP/1.1" 200 None
2019-08-05 10:18:43,946 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/catalog/service/ha HTTP/1.1" 200 333
2019-08-05 10:18:43,950 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/agent/self HTTP/1.1" 200 None
2019-08-05 10:18:43,954 [DEBUG] {MainThread} http://127.0.0.1:8500 "GET /v1/catalog/service/rm HTTP/1.1" 200 333
2019-08-05 10:18:43,954 [DEBUG] {MainThread} Loading library from path: /home/720599/projects/hare/hax/hax/../libhax.so
Python object addr: 0x7f84b8a33940
Python object addr2: 0x7f84b8a33940
Returning: 0xd51d70
2019-08-05 10:18:44,073 [INFO] {MainThread} Start method is invoked from thread MainThread
Starting hax interface..
In entrypoint_request_cb
Module loaded? 1
Here - 1
Here - 2
2019-08-05 10:18:44,736 [DEBUG] {Thread-1} Got something from the queue
2019-08-05 10:18:44,736 [DEBUG] {Thread-1} Started processing entrypoint request from remote eps = '10.230.164.213@tcp:12345:45:1', process_fid = 0x7200000000000001:0x0
2019-08-05 10:18:44,741 [DEBUG] {Thread-1} Starting new HTTP connection (1): 127.0.0.1:8500
2019-08-05 10:18:44,744 [DEBUG] {Thread-1} http://127.0.0.1:8500 "GET /v1/kv/leader HTTP/1.1" 200 164
2019-08-05 10:18:44,749 [DEBUG] {Thread-1} http://127.0.0.1:8500 "GET /v1/session/info/91d858fc-ab70-c7ed-c025-9059265cb865 HTTP/1.1" 200 200
2019-08-05 10:18:44,754 [DEBUG] {Thread-1} http://127.0.0.1:8500 "GET /v1/catalog/service/confd HTTP/1.1" 200 335
2019-08-05 10:18:44,754 [DEBUG] {Thread-1} Pasing the entrypoint reply to hax.c layer
In m0_ha_entrypoint_reply_send
mero[24519]:  bb00  FATAL  [lib/assert.c:48:m0_panic]  panic: fatal signal delivered at unknown() (unknown:0)  [git: v1.4-256-g2cee927-dirty] /home/720599/projects/hare/hax/m0trace.24519
Mero panic: fatal signal delivered at unknown() unknown:0 (errno: 0) (last failed: none) [git: v1.4-256-g2cee927-dirty] pid: 24519  /home/720599/projects/hare/hax/m0trace.24519
Mero panic reason: signo: 11
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(m0_arch_backtrace+0x20)[0x7f84b26ace50]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(m0_arch_panic+0xe6)[0x7f84b26ad006]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(+0x35d714)[0x7f84b269c714]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(+0x36e058)[0x7f84b26ad058]
/lib64/libpthread.so.0(+0xf6d0)[0x7f84c03946d0]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(m0_thread_self+0xa)[0x7f84b26a37aa]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(m0_mutex_is_not_locked+0x12)[0x7f84b26a2cb2]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(m0_mutex_lock+0x1d)[0x7f84b26a2cdd]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(+0x33886e)[0x7f84b267786e]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(m0_ha_entrypoint_server_request_find+0x9)[0x7f84b2678a89]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(m0_ha_entrypoint_reply+0x8b)[0x7f84b267c51b]
/home/720599/projects/hare/hax/../../mero/mero/.libs/libmero.so.1(m0_halon_interface_entrypoint_reply+0x1a1)[0x7f84b26864a1]
/home/720599/projects/hare/hax/hax/../libhax.so(m0_ha_entrypoint_reply_send+0x7e)[0x7f84b34ca31d]
<...>

Is there a function in Mero which either invokes m0_thread_adopt() if the LTS is not initialized or simply does nothing (so the caller doesn't need to think about it)?

Design how Hare will monitor Mero processes

Initial input (from @mandar.sawant)

As per Max's input, we need to support process/service state transitions from the cluster-start perspective, due to the dependency between processes (e.g., ioservice depends on confd). We need to figure out how to support this: either by having process/service KV pairs with corresponding states, or by using Consul's service infrastructure. I think Consul's service infrastructure can be used: e.g., the bootstrap script can query Consul for the confd service status, and once that service has started successfully, subsequent Mero processes can be started.

Idea

Hare can leverage Consul's notion of services.

Some facts around the idea

  1. Consul uses health check scripts to learn whether a service is alive
  2. Hax can receive notifications of process/service status changes via the ha_link interface
  3. In reaction to those notifications, Hax can update values in the Consul KV
  4. The check scripts for Mero-level services can then simply look into the Consul KV
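Item 4 could be sketched as follows. Consul's script-check exit-code convention (0 = passing, 1 = warning, anything else = critical) is real; the state names come from Mero's m0_conf_ha_process_event enum, and the mapping itself is an assumption:

```python
# Hypothetical core of a Consul health-check script for a Mero-level
# service: map a process state (as stored in the Consul KV by hax)
# to Consul's script-check exit codes.
def check_exit_code(state: str) -> int:
    return {
        "M0_CONF_HA_PROCESS_STARTED": 0,    # passing
        "M0_CONF_HA_PROCESS_STARTING": 1,   # warning
    }.get(state, 2)                         # everything else: critical
```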

Problem: `consul-agent` service fails to start

Tried to start the consul-agent service on a VM, and it failed with the error:

Multiple private IPv4 addresses found. Please configure one with 'bind' and/or 'advertise'.

Followed these steps to start consul-agent:

[vagrant@cmu hare]$ sudo ./install.sh 
[vagrant@cmu hare]$ sudo systemctl start consul-agent
[vagrant@cmu hare]$ sudo systemctl status consul-agent
consul-agent.service - Consul agent for Hare
   Loaded: loaded (/data/hare/systemd/consul-agent.service; disabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Wed 2019-08-28 09:06:32 UTC; 4s ago
  Process: 4033 ExecStart=/usr/local/bin/consul agent -config-dir=/opt/seagate/consul -data-dir=/tmp/consul -bootstrap-expect=1 -ui (code=exited, status=1/FAILURE)
 Main PID: 4033 (code=exited, status=1/FAILURE)

Aug 28 09:06:32 cmu systemd[1]: consul-agent.service: main process exited, code=exited, status=1/FAILURE
Aug 28 09:06:32 cmu systemd[1]: Unit consul-agent.service entered failed state.
Aug 28 09:06:32 cmu systemd[1]: consul-agent.service failed.
Aug 28 09:06:32 cmu systemd[1]: consul-agent.service holdoff time over, scheduling restart.
Aug 28 09:06:32 cmu systemd[1]: start request repeated too quickly for consul-agent.service
Aug 28 09:06:32 cmu systemd[1]: Failed to start Consul agent for Hare.
Aug 28 09:06:32 cmu systemd[1]: Unit consul-agent.service entered failed state.
Aug 28 09:06:32 cmu systemd[1]: consul-agent.service failed.
[vagrant@cmu hare]$ sudo tail /var/log/messages 
Aug 28 09:06:32 cmu systemd: Starting Consul agent for Hare...
Aug 28 09:06:32 cmu consul: ==> Multiple private IPv4 addresses found. Please configure one with 'bind' and/or 'advertise'.
Aug 28 09:06:32 cmu systemd: consul-agent.service: main process exited, code=exited, status=1/FAILURE
Aug 28 09:06:32 cmu systemd: Unit consul-agent.service entered failed state.
Aug 28 09:06:32 cmu systemd: consul-agent.service failed.
Aug 28 09:06:32 cmu systemd: consul-agent.service holdoff time over, scheduling restart.
Aug 28 09:06:32 cmu systemd: start request repeated too quickly for consul-agent.service
Aug 28 09:06:32 cmu systemd: Failed to start Consul agent for Hare.
Aug 28 09:06:32 cmu systemd: Unit consul-agent.service entered failed state.
Aug 28 09:06:32 cmu systemd: consul-agent.service failed.
[vagrant@cmu hare]$ ifconfig 
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.2.15  netmask 255.255.255.0  broadcast 10.0.2.255
        inet6 fe80::5054:ff:fec0:42d5  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:c0:42:d5  txqueuelen 1000  (Ethernet)
        RX packets 1489  bytes 138507 (135.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1060  bytes 127396 (124.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.28.128.3  netmask 255.255.255.0  broadcast 172.28.128.255
        inet6 fe80::a00:27ff:fe24:55a0  prefixlen 64  scopeid 0x20<link>
        ether 08:00:27:24:55:a0  txqueuelen 1000  (Ethernet)
        RX packets 297  bytes 52633 (51.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 249  bytes 48684 (47.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 43  bytes 2514 (2.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 43  bytes 2514 (2.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[vagrant@cmu hare]$ 

The agent is able to start when one of the IPs is provided explicitly via -bind=<IP>.
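For example (a sketch based on the ExecStart line shown above, with the eth1 address from the ifconfig output; adjust the IP to your setup):

```shell
# Pin Consul to one private address so the agent can pick a bind IP.
sudo /usr/local/bin/consul agent -config-dir=/opt/seagate/consul \
     -data-dir=/tmp/consul -bootstrap-expect=1 -ui -bind=172.28.128.3
```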

Problem: `consul-kv.json` contains undocumented keys

The Consul KV initialisation file contains keys that are not documented in the 4/KV specification.

$ OUT=/tmp/cfgen.out
$ [[ -d $OUT ]] || mkdir $OUT
$ cfgen/cfgen -D cfgen/dhall -o $OUT <cfgen/_misc/singlenode.yaml
$ dhall-to-json --pretty <$OUT/consul-kv.dhall | grep key
    "key": "leader",
    "key": "epoch",
    "key": "fid",
    "key": "node/3/service/M0_CST_CONFD/6",
    "key": "node/3/service/M0_CST_CONFD/9",
    "key": "node/3/service/M0_CST_CONFD/19",
    "key": "node/3/service/M0_CST_CONFD/22",
    "key": "node/3/service/M0_CST_IOS/6",
    "key": "node/3/service/M0_CST_IOS/9",
    "key": "node/3/service/M0_CST_IOS/19",
    "key": "node/3/service/M0_CST_IOS/22",

Solution:

  • rename "fid" key to "fid_keygen" ("fid" is too general)
  • add "fid_keygen" to the spec
  • add "node/..." keys to the spec
  • upgrade the status of the spec to “draft”

elect-rc-leader: Do not depend on repository location

  • Let script work even if the repository is cloned to a path different from ~/hare.
  • Don't use exit in the conditional expression of a while loop: this makes the code more difficult to read.
  • The exit0 function is not actually needed; the code is simpler without this abstraction.
  • Use wider indentation (4 spaces instead of 2) — this makes the code flow easier to follow.

Problem: hax blocks while handling EP request

Callbacks from m0_halon_interface are propagated to the Python level. That is where hax does its real work: the processing of ha_link events and messages happens at the higher level. The problem is how the Python GIL interacts with such callbacks.

Since the callbacks are invoked from internal Mero threads, those threads are foreign to Python. The official Python docs say that when such a thread needs to access Python structures, the GIL must be acquired. Since the GIL is held for the whole time span while the entrypoint request is being processed (i.e., including all the communication with Consul), no other Python thread can become active. As a result, if such a callback requires retries or the network connectivity is bad, hax becomes unresponsive to all of the following external events:

  1. Signals (e.g., it won't stop at SIGINT, since in Python all signals are processed in the main thread only)
  2. Incoming HTTP requests (even if we make our HTTP server multithreaded, none of its threads will get any CPU time)
  3. Other callbacks from the Mero level (since the GIL is effectively a mutex already held by another thread, every other callback has to acquire it first)

This won't do for EES, which requires at least two m0d-s (IO services) to be running on the same node.

Solution:

  • make ha_link callbacks asynchronous. For instance, if an entrypoint request is received, there is no need to send the reply from the same thread.
  • add a mechanism which handles the received requests in another Python thread (so that GIL will be switching normally).

MR http://gitlab.mero.colo.seagate.com/mero/hare/merge_requests/62 already implements this idea:

  1. There is a concurrent queue at Python level for inter-thread communication
  2. The thread which receives an entrypoint request does nothing except register the request in that queue, then exits
    • So that GIL is released fast
    • and Mero thread doesn't wait long on IO operations
  3. There is a worker thread which reads the messages from that queue and is able to process them. EntrypointRequest is one of the messages it can handle.
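The scheme can be sketched with a plain queue.Queue and one worker thread. The EntrypointRequest name is taken from the text above; everything else (fields, shutdown sentinel) is illustrative:

```python
# Minimal sketch of the queue-based callback handling: the Mero-side
# callback only enqueues, so the GIL is released quickly and the
# Mero thread never blocks on IO; a worker thread does the real work.
import queue
import threading

class EntrypointRequest:
    def __init__(self, remote_ep, process_fid):
        self.remote_ep = remote_ep
        self.process_fid = process_fid

q = queue.Queue()
processed = []

def entrypoint_request_cb(remote_ep, process_fid):
    # Called from a Motr/Mero thread: enqueue and return immediately.
    q.put(EntrypointRequest(remote_ep, process_fid))

def worker():
    while True:
        msg = q.get()
        if msg is None:           # shutdown sentinel
            break
        processed.append(msg)     # real code would talk to Consul here
```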

