Superlance is a package of plugin utilities for monitoring and controlling processes that run under supervisor.
Please see docs/index.rst for complete documentation.
Superlance utilities for use with the Supervisor process control system
Home Page: http://supervisord.org
License: Other
Monitors a directory for file changes and sends an event. I did not see a suitable existing event type for this.
This will help in restarting processes after changes are detected in conf.d for applications. More specifically, this helps when processes run within containers and conf.d is mounted on a shared volume: when files are placed there, processes in other containers cannot be accessed because process namespaces are isolated.
How do I install superlance on Python 3 when supervisor only supports Python 2?
33fe43b4c05a root@~/src $ pip install superlance
Collecting superlance
Using cached https://files.pythonhosted.org/packages/14/87/d2b4fe1f9e7f97360e75e125cc03b2216a0ce5092034f203febc3818b7da/superlance-1.0.0-py2.py3-none-any.whl
Collecting supervisor (from superlance)
Using cached https://files.pythonhosted.org/packages/44/60/698e54b4a4a9b956b2d709b4b7b676119c833d811d53ee2500f1b5e96dc3/supervisor-3.3.4.tar.gz
Complete output from command python setup.py egg_info:
Supervisor requires Python 2.4 or later but does not work on any version of Python 3. You are using version 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0]. Please install using a supported version.
Is this possible? It doesn't seem like it, but almost everyone could benefit from it.
Hi,
When I build the doc from PyPI's 0.13 tarball, I see this:
superlance-0.13/docs/index.rst:44: WARNING: toctree contains reference to nonexisting document u'development'
and
copying static files... WARNING: html_static_path entry u'/var/tmp/portage/dev-python/superlance-0.13/work/superlance-0.13/docs/_static' does not exist
Do you think you could include the development.rst file and the _static directory (or just remove the relevant line from conf.py)?
Getting the following error:
bash > /usr/local/bin/fatalmailbatch
Traceback (most recent call last):
File "superlance/fatalmailbatch.py", line 78, in <module>
main()
File "superlance/fatalmailbatch.py", line 74, in main
fatal = FatalMailBatch.create_from_cmd_line()
File "/usr/local/lib/python2.7/dist-packages/superlance-0.8-py2.7.egg/superlance/process_state_email_monitor.py", line 77, in create_from_cmd_line
options = cls.get_cmd_line_options()
File "/usr/local/lib/python2.7/dist-packages/superlance-0.8-py2.7.egg/superlance/process_state_email_monitor.py", line 73, in get_cmd_line_options
return cls.validate_cmd_line_options(cls.parse_cmd_line_options())
File "/usr/local/lib/python2.7/dist-packages/superlance-0.8-py2.7.egg/superlance/process_state_email_monitor.py", line 61, in validate_cmd_line_options
parser.print_help()
NameError: global name 'parser' is not defined
fatalmailbatch, crashmailbatch & crashsms are all failing for the same reason, both from the command line and inside supervisord.
Running Python 2.7.3.
Got it to run using the global keyword on the parser variable, but I'm not sure how you want to resolve this.
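The NameError in the traceback is the usual symptom of a name created in one function and used in another. A minimal sketch of that pattern and one possible fix, passing the parser along explicitly rather than making it global. The class name, the method signatures, and the single option shown are illustrative assumptions, not superlance's actual code:

```python
from optparse import OptionParser


class EmailMonitorSketch(object):
    """Hypothetical stand-in for a command-line driven monitor class."""

    @classmethod
    def build_parser(cls):
        parser = OptionParser()
        parser.add_option("-t", "--toEmail", dest="to_email",
                          help="destination email address")
        return parser

    @classmethod
    def parse_cmd_line_options(cls, argv):
        # Return the parser together with the options so later steps can
        # reach it without relying on a global name.
        parser = cls.build_parser()
        options, _args = parser.parse_args(argv)
        return parser, options

    @classmethod
    def validate_cmd_line_options(cls, parser, options):
        # The parser is an explicit argument here, so print_help() works.
        if not options.to_email:
            parser.print_help()
            return None
        return options

    @classmethod
    def get_cmd_line_options(cls, argv):
        parser, options = cls.parse_cmd_line_options(argv)
        return cls.validate_cmd_line_options(parser, options)
```

With this shape, calling get_cmd_line_options([]) prints the help text and returns None instead of raising.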
Deprecation warnings are raised due to invalid escape sequences in Python 3.8. Below is a log of the warnings raised while compiling all the Python files. Using raw strings or escaping the backslashes will fix this issue.
find . -iname '*.py' | xargs -P 4 -I{} python -Walways -m py_compile {}
./superlance/tests/memmon_test.py:313: DeprecationWarning: invalid escape sequence \-
"""Let calc_rss() do its work on a fake process tree:
Hi,
I am trying to fix an error. I have seen similar errors reported, but they mainly talk about not using redirect_stderr=True.
For example here:
#55
I have this in my config file:
[eventlistener:mylistener2]
command=python3 /etc/supervisor/bin/listener.py
process_name=%(program_name)s_%(process_num)s
numprocs=1
events=PROCESS_STATE
autorestart=true
stderr_logfile=/var/log/supervisor/event-error.log
stdout_logfile=/var/log/supervisor/event.log
And this in my listener:
import sys

def write_stdout(s):
    # only eventlistener protocol messages may be sent to stdout
    sys.stdout.write(s)
    sys.stdout.flush()

def write_stderr(s):
    sys.stderr.write(s)
    sys.stderr.flush()

def main():
    while 1:
        # transition from ACKNOWLEDGED to READY
        write_stdout('READY\n')
        # read header line and print it to stderr
        line = sys.stdin.readline()
        write_stderr(line)
        # read event payload and print it to stderr
        headers = dict([ x.split(':') for x in line.split() ])
        data = sys.stdin.read(int(headers['len']))
        write_stderr(data)
        if headers["eventname"] == "PROCESS_STATE_STOPPING":
            write_stderr("Process state stopping...\n")
        # transition from READY to ACKNOWLEDGED
        write_stdout('RESULT 2\nOK')

if __name__ == '__main__':
    main()
I am testing it, and it seems to output "Process state stopping..." when processes stop either way, but every time I reload supervisor I get
ERRO pool mylistener2 event buffer overflowed, discarding event
in my logs.
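For reference, the header handling in the listener above can be isolated and tried without running supervisord. A minimal sketch, assuming the header line is a space-separated list of key:value tokens as the listener code expects. The function name and the sample header values are made up for illustration:

```python
def parse_header(line):
    """Split 'key:value key:value ...' into a dict of strings."""
    return dict(token.split(":", 1) for token in line.split())


# Hypothetical header line; only the key:value shape matters here.
sample = ("ver:3.0 server:supervisor serial:21 pool:mylistener2 "
          "poolserial:10 eventname:PROCESS_STATE_STOPPING len:71\n")
headers = parse_header(sample)
payload_length = int(headers["len"])
```

Splitting each token on the first colon only keeps values that themselves contain a colon intact.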
We are running many supervisor tasks. When the server has a problem, crashmail sends an email to the configured address. That's good, but if we cannot resolve the server issue immediately, crashmail sends out thousands of emails. Is there any option to limit the number of emails sent within a specific period?
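One possible shape for such a limit, as a sketch only: a small helper that allows at most a fixed number of emails per time window and could sit in front of whatever sends the mail. The class name, its parameters, and the idea of wiring it into crashmail are assumptions; no such option is described above.

```python
import time
from collections import deque


class EmailRateLimiter(object):
    """Allow at most max_emails within any sliding window of window_seconds."""

    def __init__(self, max_emails, window_seconds, clock=time.time):
        self.max_emails = max_emails
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.sent_times = deque()

    def allow(self):
        now = self.clock()
        # Forget sends that have fallen out of the window.
        while self.sent_times and now - self.sent_times[0] >= self.window_seconds:
            self.sent_times.popleft()
        if len(self.sent_times) < self.max_emails:
            self.sent_times.append(now)
            return True
        return False
```

A caller would check allow() before each send and drop or batch the message when it returns False.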
When authentication is used in the unix_http_server section and the same credentials are used everywhere necessary, such as in supervisorctl, memmon throws 401 Unauthorized when it tries to create the ServerProxy in the main method. supervisorctl works fine, though, and the inet_http_server is also set up. What could be the issue?
It seems to expect SUPERVISOR_USERNAME, SUPERVISOR_PASSWORD and SUPERVISOR_SERVER_URL to be present in the environment. Should we set them explicitly?
Despite usage help
-n -- optionally specify the name of the httpok process. This name will
be used in the email subject to identify which httpok process
restarted the process.
it seems that httpok doesn't support this option:
short_args="hp:at:c:b:s:m:g:d:eE"
long_args=[
"help",
"program=",
"any",
"timeout=",
"code=",
"body=",
"sendmail_program=",
"email=",
"gcore=",
"coredir=",
"eager",
"not-eager",
]
Is this a mistake or by design?
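The mismatch can be checked directly with getopt, using the option strings quoted above. A small sketch; the helper name is hypothetical:

```python
import getopt

SHORT_ARGS = "hp:at:c:b:s:m:g:d:eE"
LONG_ARGS = [
    "help", "program=", "any", "timeout=", "code=", "body=",
    "sendmail_program=", "email=", "gcore=", "coredir=",
    "eager", "not-eager",
]


def accepts(argv):
    """Return True if getopt parses argv with the option strings above."""
    try:
        getopt.getopt(argv, SHORT_ARGS, LONG_ARGS)
    except getopt.GetoptError:
        return False
    return True
```

Since the short option string contains no "n", accepts(["-n", "name"]) is False while accepts(["-p", "name"]) is True.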
It seems that the version which gets installed with pip install superlance is affected by #110. The issue was fixed over 2 years ago but it's still wasting hours of developers' productivity. Could you please release the patched version?
I'm wondering why httpok doesn't include an option to always try to restart even if the process is not in ProcessStates.RUNNING?
That would be my feature request: an option to do that (or to do it by default).
Do people use a different tool for that job?
My scenario is that a temporary configuration error meant that a key process couldn't start (startretries was 3, but the config was incorrect for around half an hour, so I'd have had to set that quite high to cover the half hour).
As I have an hourly httpok running, my assumption was that this would have attempted to restart the process each hour, and (say) 10 hours later the system could be back working without intervention.
I want to see if I'm missing some approach before submitting a PR.
In a test with the following program config:
[program:test-crashmail]
command = bash -c 'echo "$(date): TESTING..."; sleep 5; false'
autostart = false
autorestart = false
redirect_stderr = true
stdout_logfile = %(here)s/test-crashmail.log
startsecs=1
startretries = 0
I also have a crashmail listener:
[eventlistener:crashmail]
command=crashmail -a -o '[supervisord] ' -m 'root@localhost'
events=PROCESS_STATE
stdout_logfile = %(here)s/crashmail.log
redirect_stderr = true
I manually started the test-crashmail process (via the web UI) 4 times, leaving plenty of time between each.
In the debug logs of supervisord below, you can see that crashmail prints "unexpected exit, mailing" - supervisor sees this as an "UNKNOWN" state, and it never recovers (even after the mail is sent, and it prints "OKREADY").
No further process state events get sent to crashmail, because it's marked as not ready to receive them.
2013-03-08 09:45:43,112 DEBG fd 20 closed, stopped monitoring <POutputDispatcher at 23797200 for <Subprocess at 22786488 with name test-crashmail in state RUNNING> (stdout)>
2013-03-08 09:45:43,115 INFO exited: test-crashmail (exit status 1; not expected)
2013-03-08 09:45:43,115 DEBG received SIGCLD indicating a child quit
2013-03-08 09:45:43,119 DEBG event 23 sent to listener crashmail
2013-03-08 09:45:43,120 DEBG 'crashmail' stdout output:
unexpected exit, mailing
2013-03-08 09:45:43,122 DEBG crashmail: BUSY -> UNKNOWN (bad result line 'unexpected exit, mailing')
2013-03-08 09:45:43,123 DEBG rebuffering event 23 for pool crashmail (bufsize 0)
2013-03-08 09:46:11,038 INFO spawned: 'test-crashmail' with pid 10052
2013-03-08 09:46:11,059 DEBG 'test-crashmail' stdout output:
Fri Mar 8 09:46:11 EST 2013: TESTING...
2013-03-08 09:46:12,061 INFO success: test-crashmail entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2013-03-08 09:46:16,062 DEBG fd 20 closed, stopped monitoring <POutputDispatcher at 23930712 for <Subprocess at 22786488 with name test-crashmail in state RUNNING> (stdout)>
2013-03-08 09:46:16,065 INFO exited: test-crashmail (exit status 1; not expected)
2013-03-08 09:46:16,066 DEBG received SIGCLD indicating a child quit
2013-03-08 09:46:44,192 DEBG 'crashmail' stdout output:
Mailed:
To: root@localhost
Subject: [supervisord] : test-crashmail crashed at 2013-03-08 09:45:43,117
Process test-crashmail in group test-crashmail exited unexpectedly (pid 10042) from state RUNNINGRESULT 2
OKREADY
2013-03-08 09:47:16,550 INFO spawned: 'test-crashmail' with pid 10059
2013-03-08 09:47:16,569 DEBG 'test-crashmail' stdout output:
Fri Mar 8 09:47:16 EST 2013: TESTING...
2013-03-08 09:47:17,571 INFO success: test-crashmail entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2013-03-08 09:47:21,572 DEBG fd 20 closed, stopped monitoring <POutputDispatcher at 22965856 for <Subprocess at 22786488 with name test-crashmail in state RUNNING> (stdout)>
2013-03-08 09:47:21,574 INFO exited: test-crashmail (exit status 1; not expected)
2013-03-08 09:47:21,575 DEBG received SIGCLD indicating a child quit
2013-03-08 09:48:25,456 INFO spawned: 'test-crashmail' with pid 10063
2013-03-08 09:48:25,472 DEBG 'test-crashmail' stdout output:
Fri Mar 8 09:48:25 EST 2013: TESTING...
2013-03-08 09:48:26,475 INFO success: test-crashmail entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2013-03-08 09:48:30,476 DEBG fd 21 closed, stopped monitoring <POutputDispatcher at 23740856 for <Subprocess at 22786488 with name test-crashmail in state RUNNING> (stdout)>
2013-03-08 09:48:30,478 INFO exited: test-crashmail (exit status 1; not expected)
2013-03-08 09:48:30,479 DEBG received SIGCLD indicating a child quit
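The "bad result line" entry in the log above is what supervisor reports when the first thing it reads from a busy listener's stdout is not a result token. A simplified sketch of that framing, assuming the 'RESULT <length>' line followed by the body, as seen in the 'RESULT 2' / 'OK' output above. The function names are hypothetical and this is an illustration, not supervisor's actual parser:

```python
def frame_result(body):
    """Build a result token: 'RESULT <len>', a newline, then the body."""
    return "RESULT %d\n%s" % (len(body), body)


def looks_like_result_line(line):
    """Check that a line read from listener stdout has the form 'RESULT <int>'."""
    parts = line.strip().split()
    return len(parts) == 2 and parts[0] == "RESULT" and parts[1].isdigit()
```

Any free-form text that reaches stdout ahead of the token, such as 'unexpected exit, mailing', fails this check.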
After testing this some more, I realise that it's failing because I used redirect_stderr on the crashmail listener. I do this by habit for all of my own programs, since I don't want to have to track two log files. I didn't realise the listener protocol was based on stdout messages, and that redirect_stderr would break it. I guess there are a few ways around this:
redirect_stderr for listeners
When I was looking for a method to monitor processes (and kill rogue dev processes), I came across memmon from superlance.
I configured it this way:
[group:group]
programs = gunicorn,celerydb,celerycam
priority = 10
[eventlistener:memmon]
command=/.../web/env/bin/memmon -p group=200MB -m <adminemail>
events=TICK_60
serverurl = unix:///tmp/supervisor.sock
environment=SUPERVISOR_SERVER_URL='unix:///tmp/supervisor.sock',SUPERVISOR_USERNAME=user,SUPERVISOR_PASSWORD=123
Now that works fine apparently, but what happens after a while is this:
Traceback (most recent call last):
File "/.../web/env/bin/memmon", line 9, in <module>
load_entry_point('superlance==0.7', 'console_scripts', 'memmon')()
File "/.../web/env/local/lib/python2.7/site-packages/superlance/memmon.py", line 289, in main
memmon.runforever()
File "/.../web/env/local/lib/python2.7/site-packages/superlance/memmon.py", line 139, in runforever
data = shell(self.pscommand % pid)
File "/.../web/env/local/lib/python2.7/site-packages/superlance/memmon.py", line 80, in shell
return os.popen(cmd).read()
OSError: [Errno 12] Cannot allocate memory
Isn't that something memmon should use as a reason to kill off the group? (Or do I get it fundamentally wrong?)
Hi,
I've posted this question on Stack Overflow, but given that there aren't a lot of questions about superlance there, I decided to repost it here.
I'm trying to set up email sending when a process changes state in supervisord by using crashmail. Having no luck with the default sendmail program, which requires quite a lot of setup, I decided to go with a small Python script that sends email using SMTP.
This worked very well for the first state change (I did receive an email saying that the process state had changed) but stopped working afterward. I have tried changing different options in supervisord such as buffer_size or autorestart, but it has no effect.
Here is the script I use to trigger the supervisord state changes:
import time
from datetime import datetime

if __name__ == '__main__':
    print(">>>>> STARTING ...", flush=True)
    while True:
        print("sleep now:", datetime.utcnow(), flush=True)
        time.sleep(30)
        raise Exception("meo meo")
This is the script that sends email through Gmail. It sends whatever it reads from stdin.
#!/usr/bin/env python
import smtplib

def get_server():
    smtpserver = smtplib.SMTP('smtp.gmail.com:587')
    smtpserver.ehlo()
    smtpserver.starttls()
    smtpserver.login("[email protected]", "password")
    return smtpserver

if __name__ == '__main__':
    import sys
    data = sys.stdin.read()
    s = get_server()
    s.sendmail('[email protected]', ['[email protected]'], data)
    s.quit()
Here is my supervisord.conf
[eventlistener:crashmail]
command=crashmail -a -m [email protected] -s /home/ubuntu/mysendmail.py
events=PROCESS_STATE
buffer_size=102400
autorestart=true
Does anyone have any idea why?
Thanks!
Currently stderr and stdout logs are less than useless if you want to analyze them after a problem has happened because they don't include timestamps and you have no idea when a certain event took place.
Take, for instance, the piece of memmon error log I included in a previous issue (#70):
Checking groups app=1610612736
RSS of app:instance1 is 1614974976
Restarting app:instance1
RSS of app:instance2 is 1297457152
RSS of app:instance3 is 1477554176
Checking groups app=1610612736
RSS of app:instance1 is 318668800
RSS of app:instance2 is 1297506304
RSS of app:instance3 is 1477554176
Checking groups app=1610612736
RSS of app:instance1 is 164720640
RSS of app:instance2 is 1297575936
RSS of app:instance3 is 1477672960
Checking groups app=1610612736
RSS of app:instance1 is 340303872
RSS of app:instance2 is 1305280512
RSS of app:instance3 is 1477713920
Checking groups app=1610612736
RSS of app:instance1 is 166830080
RSS of app:instance2 is 1318711296
RSS of app:instance3 is 1477849088
Checking groups app=1610612736
RSS of app:instance1 is 337248256
RSS of app:instance2 is 1325903872
RSS of app:instance3 is 1477685248
The httpok error log of that process gives me no indication of when the restarts happened, so I can not associate this information with the other:
Restarting selected processes ['app:instance1']
app:instance1 is in RUNNING state, restarting
app:instance1 restarted
Restarting selected processes ['app:instance1']
app:instance1 is in RUNNING state, restarting
app:instance1 restarted
Restarting selected processes ['app:instance1']
app:instance1 is in RUNNING state, restarting
app:instance1 restarted
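A sketch of what timestamped output could look like. The helper name and the timestamp format are assumptions; memmon and httpok are only shown above writing plain messages:

```python
import sys
from datetime import datetime


def write_with_timestamp(message, stream=sys.stderr, now=None):
    """Write message to stream prefixed with a 'YYYY-MM-DD HH:MM:SS' timestamp."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    line = "%s %s\n" % (stamp, message)
    stream.write(line)
    stream.flush()
    return line
```

With a prefix like this, a "Restarting app:instance1" line in one log can be matched against the RSS readings in the other.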
Hey - thanks for maintaining this!
How do I install this under python 3? I tried pip:
Vs-Pro.local vgoklani@~ $ pip install superlance
Collecting superlance
Using cached https://files.pythonhosted.org/packages/14/87/d2b4fe1f9e7f97360e75e125cc03b2216a0ce5092034f203febc3818b7da/superlance-1.0.0-py2.py3-none-any.whl
Collecting supervisor (from superlance)
Using cached https://files.pythonhosted.org/packages/44/60/698e54b4a4a9b956b2d709b4b7b676119c833d811d53ee2500f1b5e96dc3/supervisor-3.3.4.tar.gz
Complete output from command python setup.py egg_info:
Supervisor requires Python 2.4 or later but does not work on any version of Python 3. You are using version 3.7.0 (default, Jun 28 2018, 07:39:16)
[Clang 4.0.1 (tags/RELEASE_401/final)]. Please install using a supported version.
I've been using supervisor with python3 and it's very stable :) but I need crashmail. Thanks!
Hello,
I have installed the superlance package for Python, but when I run supervisor on macOS I get the error
FATAL can't find command 'fatalmailbatch'
Supervisor is installed and working properly, but the fatalmailbatch event listener is not working. Please help me fix this issue.
Also, during the installation, I got this warning.
WARNING: The scripts echo_supervisord_conf, pidproxy, supervisorctl and supervisord are installed in
'/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.9/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The scripts crashmail, crashmailbatch, crashsms, fatalmailbatch, httpok and memmon are installed in
'/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.9/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Thanks
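The pip warnings above point at a likely cause: the scripts were installed into a directory that is not on PATH, so a bare fatalmailbatch command cannot be resolved. A small sketch for checking this from Python; the helper name is hypothetical:

```python
import os
import shutil


def find_command(name, extra_dirs=()):
    """Return the full path of name if found on PATH or in extra_dirs, else None."""
    search_path = os.pathsep.join([os.environ.get("PATH", "")] + list(extra_dirs))
    return shutil.which(name, path=search_path)
```

If find_command("fatalmailbatch") returns None but passing the bin directory named in the warning as extra_dirs finds it, then either adding that directory to PATH or using the full path in the command= line should let supervisor locate the script.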
I am trying to set up a memmon event listener to monitor process memory and restart the high-memory process inside one of our containers, whose base image is Alpine. I installed the latest version of pip and installed superlance with pip. I updated my supervisord config with the event listener. However, when I look at supervisorctl status, the memmon process is stuck in an indefinite STARTING loop. I stopped memmon from supervisorctl and started it manually from the command line inside the container. I get the error below:
Traceback (most recent call last):
File "/usr/bin/memmon", line 8, in <module>
sys.exit(main())
File "/usr/lib/python2.7/site-packages/superlance/memmon.py", line 417, in main
memmon.rpc = childutils.getRPCInterface(os.environ)
File "/usr/lib/python2.7/site-packages/supervisor/childutils.py", line 17, in getRPCInterface
return xmlrpclib.ServerProxy('http://127.0.0.1', getRPCTransport(env))
File "/usr/lib/python2.7/site-packages/supervisor/childutils.py", line 11, in getRPCTransport
return SupervisorTransport(u, p, env['SUPERVISOR_SERVER_URL'])
File "/usr/lib/python2.7/UserDict.py", line 40, in __getitem__
raise KeyError(key)
KeyError: 'SUPERVISOR_SERVER_URL'
Can anyone help me understand this error?
Below is the error from Supervisor logs:
Traceback (most recent call last):
File "/usr/bin/memmon", line 11, in <module>
sys.exit(main())
File "/usr/lib/python2.7/site-packages/superlance/memmon.py", line 418, in main
memmon.runforever()
File "/usr/lib/python2.7/site-packages/superlance/memmon.py", line 152, in runforever
infos = self.rpc.supervisor.getAllProcessInfo()
File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
verbose=self.__verbose
File "/usr/lib/python2.7/site-packages/supervisor/xmlrpc.py", line 519, in request
'' )
xmlrpclib.ProtocolError: <ProtocolError for 127.0.0.1/RPC2: 401 Unauthorized>
I also see that memmon startup breaks my supervisor web interface on [inet_http_server]: when I run curl -u username:password http://localhost:9001, I get a 500 error.
Any help is appreciated!
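The first traceback shows the lookup env['SUPERVISOR_SERVER_URL'] failing, which is consistent with memmon being started by hand in a shell where that variable is not set. A sketch of a pre-flight check; the function name is hypothetical, and treating the username and password as optional is an assumption based on the traceback only showing the URL lookup failing:

```python
import os

REQUIRED = ("SUPERVISOR_SERVER_URL",)
OPTIONAL = ("SUPERVISOR_USERNAME", "SUPERVISOR_PASSWORD")


def check_supervisor_env(env=None):
    """Return (missing_required, missing_optional) lists of variable names."""
    env = os.environ if env is None else env
    missing_required = [name for name in REQUIRED if not env.get(name)]
    missing_optional = [name for name in OPTIONAL if not env.get(name)]
    return missing_required, missing_optional
```

When the server requires authentication, missing credentials would be one thing to rule out for the 401 response in the second traceback.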
parse_cmd_line_options is somehow NOT called before validate_cmd_line_options
Traceback (most recent call last):
File "/usr/local/bin/fatalmailbatch", line 9, in <module>
load_entry_point('superlance==0.8', 'console_scripts', 'fatalmailbatch')()
File "/usr/local/lib/python2.7/dist-packages/superlance/fatalmailbatch.py", line 74, in main
fatal = FatalMailBatch.create_from_cmd_line()
File "/usr/local/lib/python2.7/dist-packages/superlance/process_state_email_monitor.py", line 77, in create_from_cmd_line
options = cls.get_cmd_line_options()
File "/usr/local/lib/python2.7/dist-packages/superlance/process_state_email_monitor.py", line 73, in get_cmd_line_options
return cls.validate_cmd_line_options(cls.parse_cmd_line_options())
File "/usr/local/lib/python2.7/dist-packages/superlance/process_state_email_monitor.py", line 61, in validate_cmd_line_options
parser.print_help()
NameError: global name 'parser' is not defined
I have a program that runs with multiple processes like this in my supervisord.conf
[program:firehose]
process_name=%(program_name)s_%(process_num)02d
command=php artisan doctrine:queue:work beanstalkd --queue=firehose --tries=5 --sleep=5 --delay=0 --daemon
directory=/var/app/current/
autostart=true
autorestart=true
numprocs=5
It looks like this in supervisorctl status:
firehose:firehose_00 RUNNING pid 26265, uptime 0:13:09
firehose:firehose_01 RUNNING pid 26264, uptime 0:13:09
firehose:firehose_02 RUNNING pid 26267, uptime 0:13:09
firehose:firehose_03 RUNNING pid 26266, uptime 0:13:09
firehose:firehose_04 RUNNING pid 26263, uptime 0:13:09
I have successfully caught these using memmon -a but I can't get memmon to monitor just the program (all 5 processes). I have tried
-p firehose=100MB
-g firehose=100MB
-p firehose:firehose_00=100MB <-- example trying to monitor just one
but none of these have worked. What am I missing here?
Hello,
I use supervisor to monitor four projects, and superlance is used in the configuration file to monitor one of the projects and send emails when it exits unexpectedly. But now every project that exits unexpectedly triggers an email. Can you help me?
It would be helpful for newcomers like me to find the link to the superlance documentation (https://superlance.readthedocs.io/en/latest/) in the README on GitHub and PyPI :)
Hi,
I installed superlance and configured crashsms. crashsms is running:
crashsms RUNNING pid 21683, uptime 0:10:06
This is what I added in supervisord.conf:
[eventlistener:crashsms]
command=crashsms --toEmail="test@email_to_sms_gateway.com" --subject="Testing" --smtpHost="smtp.mailgun.org" --userName="[email protected]" --password="xxxxxxx" --fromEmail="[email protected]"
events=PROCESS_STATE
So far no E-mail has been sent. How can I manually cause a PROCESS_STATE change so I can test it out ?
I stopped one of my processes with: sudo supervisorctl stop worker-1000, but that didn't trigger the event.
Any guidance will be appreciated. Thanks.
Right now memmon uses ps, which only gives you the memory usage of the top-level pid in the process tree. This isn't very accurate if you use a run script or the process being monitored spawns child processes.
Memmon is unable to start; the following error repeats in memmon-stderr---supervisor-IgIsou.log:
Checking programs yii-queue-worker=1073741824
Traceback (most recent call last):
File "/usr/local/bin/memmon", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/superlance/memmon.py", line 418, in main
memmon.runforever()
File "/usr/local/lib/python2.7/dist-packages/superlance/memmon.py", line 152, in runforever
infos = self.rpc.supervisor.getAllProcessInfo()
File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
verbose=self.__verbose
File "/usr/lib/python2.7/dist-packages/supervisor/xmlrpc.py", line 509, in request
self.connection.request('POST', handler, request_body, self.headers)
File "/usr/lib/python2.7/httplib.py", line 1042, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.7/httplib.py", line 1082, in _send_request
self.endheaders(body)
File "/usr/lib/python2.7/httplib.py", line 1038, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 882, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 844, in send
self.connect()
File "/usr/lib/python2.7/dist-packages/supervisor/xmlrpc.py", line 530, in connect
self.sock.connect(self.socketfile)
File "/usr/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
socket.error: [Errno 13] Permission denied
Supervisor version 3.3.1, Debian 9.
I would like the ability to customize the body of email messages that get sent from plugins like crashmail and memmon. Could a switch be added for this to those programs?
I can submit a pull request if this is approved.
Thanks!
We have the crashmail email go to different addresses based on an on-call rota, and at the moment we have to bounce it weekly to pick up the change. It would be nice to have it read the email address from a file just before sending instead. I'm not good with Python, but something like...
def mail(self, email, subject, msg):
    if os.path.isfile(self.email) and os.access(self.email, os.R_OK):
        with open(self.email,'r') as f:
            body = 'To: %s\n' % f.read()
    else:
        body = 'To: %s\n' % self.email
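A self-contained sketch of the same idea, kept separate from crashmail itself. The function name is hypothetical, and stripping whitespace from the file contents is an added assumption so that a trailing newline does not end up in the To: header:

```python
import os


def resolve_recipient(value):
    """Return the address to mail: file contents if value is a readable file, else value."""
    if os.path.isfile(value) and os.access(value, os.R_OK):
        with open(value, "r") as f:
            return f.read().strip()
    return value
```

Because the file is read on every call, changing its contents takes effect on the next email without restarting the listener.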
Hi,
While it looks like the package supports Python 3, only the git version does, none of the releases do (and there hasn't been one since 2016).
Could you please release a new version?
supervisor: 3.2.0
superlance: 1.0.0
configuration
[eventlistener:crashmail]
command=/usr/local/bin/crashmail -a -m [email protected] "Server %(host_node_name)s"
events=PROCESS_STATE_EXITED
Crashmail works ok for me but syslog keeps writing this message:
ERRO pool crashmail event buffer overflowed
I tried changing different settings such as buffer_size, but none of them worked.
Any idea?
Thanks!
Hello, I believe I've found a bug in the memmon plugin.
For the program argument in a memmon call, the documentation says it's possible to specify the group name to avoid ambiguity, with the format group_name:program_name. However, the documentation placed in the code suggests otherwise: process_name:group_name. In the end I figured out that the first variant is the correct one.
When I tried to use the memmon plugin for my program, I noticed that the following exception had been emitted to the eventlistener log:
Checking programs app:backend-0=83886080
RSS of app:backend-0 is 79663104
Traceback (most recent call last):
File "/home/vagrant/test-superlance/bin/memmon", line 11, in <module>
sys.exit(main())
File "/home/vagrant/test-superlance/lib/python2.7/site-packages/superlance/memmon.py", line 402, in main
memmon.runforever()
File "/home/vagrant/test-superlance/lib/python2.7/site-packages/superlance/memmon.py", line 169, in runforever
if rss > self.programs[name]:
KeyError: 'backend-0'
As far as I've understood, this is the line of code which throws an error because the program is being looked up without group name.
To reproduce the issue, I've written an extra test:
def test_runforever_tick_program_with_group(self):
    programs = {'foo:foo': 0 }
    groups = {}
    _any = None
    memmon = self._makeOnePopulated(programs, groups, _any)
    memmon.stdin.write('eventname:TICK len:0\n')
    memmon.stdin.seek(0)
    memmon.runforever(test=True)
    lines = memmon.stderr.getvalue().split('\n')
    self.assertEqual(len(lines), 4)
    self.assertEqual(lines[0], 'Checking programs foo:foo=0')
    self.assertEqual(lines[1], 'RSS of foo:foo is 2264064')
    self.assertEqual(lines[2], 'Restarting foo:foo')
    self.assertEqual(lines[3], '')
The fix is quite simple, I can make a pull request, if you like.
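One way the lookup could tolerate both spellings, as a sketch only. The function and argument names are hypothetical and this is not the actual memmon code; it just shows checking the group-qualified name before falling back to the bare process name:

```python
def limit_for(programs, group, name):
    """Return the configured limit for a process, or None if it is not listed.

    programs maps either 'group:name' or a bare 'name' to a byte limit.
    """
    namespec = "%s:%s" % (group, name)
    if namespec in programs:
        return programs[namespec]
    return programs.get(name)
```

With the configuration from the log, limit_for({'app:backend-0': 83886080}, 'app', 'backend-0') finds the limit instead of raising KeyError.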
Hello,
I tried installing superlance and running crashmail like this:
sudo apt-get install python-pip
sudo pip install superlance
Then I ran:
sudo nano /etc/supervisor/supervisord.conf
and added:
[eventlistener:crashmail]
command=/usr/local/bin/crashmail -a -m [email protected]
events=PROCESS_STATE
and I do not receive anything.
My crashmail file is:
#!/usr/bin/python
import re
import sys
from superlance.crashmail import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
Can you help me please ?
Thanks
Best regards
Ben
cmd --help should not return a non-zero exit status, for the various commands installed by superlance.
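For comparison, this is the behavior argparse gives by default: --help prints usage and exits with status 0, while an unknown option exits with status 2. A sketch with a made-up parser; the superlance commands are not stated above to use argparse:

```python
import argparse


def help_exit_status(argv):
    """Run a toy parser on argv and return the exit status it requests, or None."""
    parser = argparse.ArgumentParser(prog="cmd")
    parser.add_argument("-p", "--program", help="name of a program to watch")
    try:
        parser.parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return None
```

Distinguishing the two cases lets shell scripts and packaging checks treat a request for help as success.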
Help!
I stopped all my supervisord processes, obviously including the one that is causing it. I am still getting crashmail emails with a DIFFERENT pid every time.
At start I get two messages, and after that I get these errors:
2014-11-16 23:44:50,829 ERRO pool memmon event buffer overflowed, discarding event 62
2014-11-16 23:44:55,838 ERRO pool memmon event buffer overflowed, discarding event 63
2014-11-16 23:45:00,843 ERRO pool memmon event buffer overflowed, discarding event 64
2014-11-16 23:45:05,849 ERRO pool memmon event buffer overflowed, discarding event 65
2014-11-16 23:45:10,856 ERRO pool memmon event buffer overflowed, discarding event 66
2014-11-16 23:45:15,861 ERRO pool memmon event buffer overflowed, discarding event 67
2014-11-16 23:45:20,867 ERRO pool memmon event buffer overflowed, discarding event 68
2014-11-16 23:45:25,874 ERRO pool memmon event buffer overflowed, discarding event 69
2014-11-16 23:45:30,663 ERRO pool memmon event buffer overflowed, discarding event 70
2014-11-16 23:45:35,668 ERRO pool memmon event buffer overflowed, discarding event 71
2014-11-16 23:45:40,377 ERRO pool memmon event buffer overflowed, discarding event 72
2014-11-16 23:45:45,418 ERRO pool memmon event buffer overflowed, discarding event 73
settings:
[eventlistener:memmon]
command=memmon -p events=1MB
events=TICK_5,PROCESS_STATE
redirect_stderr=True
stdout_logfile=/tmp/memmon.log
It seems that httpok cannot restart whole group defined as fcgi-program
Say, if I have defined it like this:
[fcgi-program:test-fcgi]
...
Then "httpok -p test-fcgi:*" or "httpok -p test-fcgi" do not seem to restart the group.
But if I list all fast-cgi processes one by one:
-p test-fcgi:test-fcgi_20 -p test-fcgi:test-fcgi_21 and so on - then it seems to work as expected
It would be nice if httpok could restart the whole group/fcgi-program at once.
I have a Plone site with 3 busy Zope instances running under Supervisord; I have the memmon and httpok plugins configured, but it seems that sometimes httpok restarts an instance that was recently restarted by memmon.
Here is a piece of my memmon stderr log:
Checking groups app=1610612736
RSS of app:instance1 is 1614974976
Restarting app:instance1 <-- restarting
RSS of app:instance2 is 1297457152
RSS of app:instance3 is 1477554176
Checking groups app=1610612736
RSS of app:instance1 is 318668800 <-- going up
RSS of app:instance2 is 1297506304
RSS of app:instance3 is 1477554176
Checking groups app=1610612736
RSS of app:instance1 is 164720640 <-- down
RSS of app:instance2 is 1297575936
RSS of app:instance3 is 1477672960
Checking groups app=1610612736
RSS of app:instance1 is 340303872 <-- going up
RSS of app:instance2 is 1305280512
RSS of app:instance3 is 1477713920
Checking groups app=1610612736
RSS of app:instance1 is 166830080 <-- down
RSS of app:instance2 is 1318711296
RSS of app:instance3 is 1477849088
Checking groups app=1610612736
RSS of app:instance1 is 337248256 <-- going up
RSS of app:instance2 is 1325903872
RSS of app:instance3 is 1477685248
Checking groups app=1610612736
RSS of app:instance1 is 432963584 <-- stabilized
RSS of app:instance2 is 1325481984
RSS of app:instance3 is 1477685248
Checking groups app=1610612736
RSS of app:instance1 is 630874112
RSS of app:instance2 is 1325424640
RSS of app:instance3 is 1477685248
As you can see, instance1's RSS goes from around 300MB to 150MB a couple of times before stabilizing; this seems to me to be an indicator of httpok restarting the instance in the middle. Both plugins are configured to run at TICK_60.
I solved this on the initial start by waiting 10 minutes before starting to use httpok; we probably need another parameter to deal with that after the process is running.
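The extra parameter suggested above could be a grace period based on process uptime. The following is a minimal sketch under assumptions: the function name `started_recently` and the `grace_seconds` parameter are hypothetical and not part of httpok. It relies only on the `start` and `now` timestamps that supervisor's `getProcessInfo()` includes in its result.

```python
def started_recently(info, grace_seconds):
    """Return True if the process has been up for less than grace_seconds.

    `info` is a dict shaped like supervisor's getProcessInfo() result;
    only the 'start' and 'now' UNIX timestamps are used.
    """
    uptime = info["now"] - info["start"]
    return uptime < grace_seconds


# A process that memmon restarted 30 seconds ago, with a 10 minute grace period.
info = {"name": "instance1", "group": "app", "start": 1000, "now": 1030}
if started_recently(info, grace_seconds=600):
    print("skipping HTTP check, process is still warming up")
```

A check like this would let httpok skip an instance that another plugin restarted moments ago, instead of restarting it again while it is still starting up.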
The CrashMail class sets self.programs and self.any but never reads them. As a result, crashmail.py always behaves as though -a is specified.
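For illustration, the filter that is missing might look like the sketch below. It is not crashmail's code: the function `should_report` is hypothetical, and it assumes the event payload headers `processname`, `groupname` and `expected` that supervisor sends with PROCESS_STATE_EXITED events.

```python
def should_report(pheaders, programs, any_program):
    """Decide whether an exited process should trigger a mail.

    Hypothetical sketch: `programs` plays the role of self.programs and
    `any_program` the role of self.any (the -a flag).
    """
    if int(pheaders.get("expected", 0)):
        return False  # expected exits are not crashes
    if any_program:
        return True   # -a: report every unexpected exit
    name = pheaders["processname"]
    namespec = "%s:%s" % (pheaders["groupname"], name)
    # Without -a, only the programs named with -p are reported.
    return name in programs or namespec in programs


headers = {"processname": "foo", "groupname": "foo", "expected": "0"}
print(should_report(headers, programs=["bar"], any_program=False))
```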
This sounds unlikely but seems to be the case:
This commit: 0e6fb2d#diff-d3d6eafafd2e31ce38ebab9e08f156eaL180
changed the logic for the retry loop. retry_time is hardcoded to 10 in httpok, so if timeout is < 10,
range(self.timeout // (self.retry_time or 1) - 1 , -1, -1):
becomes
range(5 // 10 - 1 , -1, -1):
simplifies to
range(-1 , -1, -1):
which will never execute because it is [].
I attempted to write a test case but I'm not really skilled enough to dissect the http_ok test file.
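The arithmetic above can be checked in isolation. This small sketch only evaluates the quoted range() expression; the wrapper function `retry_schedule` is a hypothetical name and is not part of httpok.

```python
def retry_schedule(timeout, retry_time=10):
    """Evaluate the range() expression quoted above and return it as a list."""
    return list(range(timeout // (retry_time or 1) - 1, -1, -1))


print(retry_schedule(5))   # timeout shorter than retry_time
print(retry_schedule(30))  # timeout of three retry intervals
```

An empty list for a timeout below 10 means the loop body never runs, which is consistent with the behavior described in the report.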
crashmailbatch supports sending emails via the SMTP options, but not via sendmail. With crashmail it's the other way around. I'd like to use crashmailbatch with sendmail, but this is currently not supported.
It would be nice if all the commands extended 'ProcessStateEmailMonitor' and 'ProcessStateEmailMonitor' supported sendmail as the default. The behavior could then be overridden by the --smtp-* options.
I would like httpok to send SIGUSR1 before killing a Zope instance, to obtain a thread dump, so that I can figure out what was keeping the instance busy.
What would be a good way to specify this in configuration?
Would something like the following make sense?
diff --git a/superlance/httpok.py b/superlance/httpok.py
index 23682fc..3e7c6b1 100644
--- a/superlance/httpok.py
+++ b/superlance/httpok.py
@@ -280,6 +280,8 @@ class HTTPOk:
namespec, m.read()))
write('%s is in RUNNING state, restarting' % namespec)
try:
+ for signal in self.signals:
+ self.rpc.supervisor.signalProcess(namespec, signal)
self.rpc.supervisor.stopProcess(namespec)
except xmlrpclib.Fault as e:
write('Failed to stop process %s: %s' % (
Raised at https://lists.supervisord.org/pipermail/supervisor-users/2014-September/001520.html too.
(signalProcess is not yet in a supervisor release.)
Hi,
master contains a fix that would have prevented downtime for us twice by now. Can we have a new release? I'd also be happy to do it. My PyPI username is do3cc.
We have the following typical use case: Plone site with multiple instances running behind a web accelerator like Varnish, using memmon and httpok.
From time to time, memmon will need to restart an instance because of high memory consumption. Varnish has backend health check probes configured that test whether the instances are available, and these probes have to play very nicely with the instances because requests hitting the backend are typically slow; for instance:
probe healthcheck {
.interval = 10s; .request = "HEAD / HTTP/1.1"; .timeout = 3s;
}
As restarting a Plone instance is a time-consuming process (typically around 30 seconds), Varnish will not notice the instance is down for some time and will continue sending requests; when the instance comes back up, it is flooded with a lot of pending requests from Varnish.
The Varnish backend then behaves erratically for some time until the instance stabilizes.
What would I like to have? A hook to run a command before the instance is restarted and another after it is marked as running.
Then I would be able to configure something like this:
varnishadm backend.set_health instance1 sick
varnishadm backend.set_health instance1 auto
This could be useful in other use cases also.
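The requested hooks could be sketched roughly as follows. This is an assumption-laden illustration, not an existing memmon or httpok feature: `restart_with_hooks`, `pre_cmd` and `post_cmd` are hypothetical names, and `restart` stands for whatever callable performs the actual stop/start.

```python
import shlex
import subprocess


def restart_with_hooks(restart, pre_cmd=None, post_cmd=None,
                       run=subprocess.check_call):
    """Run pre_cmd, restart the process, then run post_cmd.

    Hypothetical sketch: `run` is injectable so the hook commands can be
    replaced in tests; by default they are executed without a shell.
    """
    if pre_cmd:
        run(shlex.split(pre_cmd))
    try:
        restart()
    finally:
        # Run the post hook even if the restart failed, so the backend
        # is not left marked as sick forever.
        if post_cmd:
            run(shlex.split(post_cmd))


# Dry run that records what would be executed instead of calling varnishadm.
calls = []
restart_with_hooks(
    lambda: calls.append("restart"),
    pre_cmd="varnishadm backend.set_health instance1 sick",
    post_cmd="varnishadm backend.set_health instance1 auto",
    run=calls.append,
)
print(calls)
```

Running the post hook in a `finally` block is a design choice of this sketch; a real implementation might prefer to wait until supervisor reports the process as RUNNING before calling it.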
Hello!
I would like to extend this application to handle alerts to slack channels. Is that fine?
Because the restart is not an atomic operation, around .001% of the time after memmon stops a process it throws an exception when trying to start it. This results in both memmon crashing and the process being restarted being left in a stopped state.
The error is often a broken pipe, but is sometimes: httplib.IncompleteRead: IncompleteRead(32644 bytes read, 35829 more expected)
which I believe means that
memmon.rpc = childutils.getRPCInterface(os.environ)
will need to be executed again and the start then retried.
Using: supervisor 3.1.1, memmon latest from git
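A retry along those lines might look like the sketch below. It is not memmon's code: `start_with_retry`, `make_rpc`, `attempts` and `delay` are hypothetical names, and it is written for Python 3, where the traceback's httplib.IncompleteRead corresponds to http.client.IncompleteRead (a subclass of HTTPException). It also does not handle the case where the first start actually succeeded and the retry is rejected by supervisor.

```python
import http.client
import time


def start_with_retry(make_rpc, namespec, attempts=3, delay=1.0):
    """Start a process, rebuilding the RPC proxy after transport errors.

    Hypothetical sketch: `make_rpc` is a callable returning a fresh proxy,
    for example lambda: childutils.getRPCInterface(os.environ).
    """
    last_error = None
    for _ in range(attempts):
        rpc = make_rpc()  # new connection for every attempt
        try:
            rpc.supervisor.startProcess(namespec)
            return True
        except (OSError, http.client.HTTPException) as exc:
            # Broken pipe (OSError) or IncompleteRead (HTTPException).
            last_error = exc
            time.sleep(delay)
    raise last_error
```

Raising the last error after the final attempt keeps the failure visible in the listener's log rather than silently leaving the process stopped.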
The following configuration
[inet_http_server]
# Required for memmon
port = 127.0.0.1:9001
[supervisord]
nodaemon=true
# [program:appserver]
# ...
# ...
# See http://superlance.readthedocs.org/en/latest/memmon.html Workaround for
# dealing with application server memory consumption. Restart 'appserver'
# program (see above) when it reaches a dangerous threshold.
[eventlistener:memmon]
command=memmon -p appserver=1400MB
events=TICK_60
redirect_stderr=true
stdout_logfile=%(ENV_AS_LOGDIR)s/mmstdout
stderr_logfile=%(ENV_AS_LOGDIR)s/mmstderr
causes memmon to fail (stack trace and stdout below)
Checking programs appserver=1468006400
Traceback (most recent call last):
File "/usr/local/bin/memmon", line 9, in <module>
load_entry_point('superlance==0.11', 'console_scripts', 'memmon')()
File "/usr/local/lib/python2.7/dist-packages/superlance/memmon.py", line 402, in main
memmon.runforever()
File "/usr/local/lib/python2.7/dist-packages/superlance/memmon.py", line 147, in runforever
infos = self.rpc.supervisor.getAllProcessInfo()
File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python2.7/xmlrpclib.py", line 1578, in __request
verbose=self.__verbose
File "/usr/local/lib/python2.7/dist-packages/supervisor/xmlrpc.py", line 475, in request
return u.close()
File "/usr/lib/python2.7/xmlrpclib.py", line 793, in close
raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault 1: 'UNKNOWN_METHOD'>