lago-project / lago-ost-plugin
Lago ovirt-system-tests plugin
License: GNU General Public License v2.0
This is the current stack trace.
We shouldn't show it to the user. Instead, we need to explain why the operation failed,
and what needs to be done in order to run it again successfully.
@ Stopping oVirt environment: ERROR (in 0:05:25)
Error occured, aborting
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 362, in do_run
self.cli_plugins[args.ovirtverb].do_run(args)
File "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in do_run
self._do_run(**vars(args))
File "/usr/lib/python2.7/site-packages/lago/utils.py", line 501, in wrapper
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/lago/utils.py", line 512, in wrapper
return func(*args, prefix=prefix, **kwargs)
File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 294, in do_ovirt_stop
prefix.virt_env.engine_vm().stop_all_hosts()
File "/usr/lib/python2.7/site-packages/ovirtlago/utils.py", line 148, in wrapped_func
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/ovirtlago/virt.py", line 446, in stop_all_hosts
testlib.assert_true_within(_host_is_maint, timeout=timeout)
File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 263, in assert_true_within
assert_equals_within(func, True, timeout, allowed_exceptions)
File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 237, in assert_equals_within
'%s != %s after %s seconds' % (res, value, timeout)
AssertionError: None != True after 300 seconds
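A minimal sketch of the idea, assuming a hypothetical HostsNotInMaintenanceError raised in place of the bare AssertionError, and a hypothetical run_cli wrapper around the command. Neither name exists in ovirtlago, and the hint text is only illustrative:

```python
import sys


class HostsNotInMaintenanceError(Exception):
    """Hypothetical error: hosts did not reach maintenance mode in time."""

    def __init__(self, timeout):
        super().__init__(
            'Hosts did not reach maintenance mode within %s seconds' % timeout
        )
        # Illustrative hint only; the real advice depends on the failure.
        self.hint = (
            'Check the engine for tasks still running on the hosts, '
            'then run the stop command again.'
        )


def run_cli(action, stream=sys.stderr):
    """Run `action`; on a known failure print an explanation, not a traceback."""
    try:
        action()
    except HostsNotInMaintenanceError as exc:
        stream.write('Error: %s\n' % exc)
        stream.write('What to do: %s\n' % exc.hint)
        return 1
    return 0
```

Unknown exceptions still propagate, so a full traceback stays available for real bugs while expected failures get a readable message.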
This option would allow force-stopping the environment even if moving the hosts to maintenance mode failed.
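One way such an option could behave, as a sketch only. The function name, the force parameter, and the two callables are assumptions for illustration, not the plugin's actual interface:

```python
def stop_environment(move_hosts_to_maintenance, stop_vms, force=False):
    """Stop the environment; with force=True, continue past a maintenance failure.

    Both callables are hypothetical stand-ins for the real stop steps.
    """
    try:
        move_hosts_to_maintenance()
    except Exception as exc:
        if not force:
            raise
        # With force, report the problem but keep shutting down.
        print('Warning: moving hosts to maintenance failed (%s); '
              'stopping anyway' % exc)
    stop_vms()
```

Without force the behaviour is unchanged: the failure propagates and the VMs are left running.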
junit.xml report isn't being generated when lago.do_collect raises an exception (ovirtlago.testlib:170).
This happens frequently on Fedora 25, where the SDK v4 is not installed by 'install_lago.sh'.
We can add it to the spec file, once it is officially released for FC25.
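For the junit.xml problem described above, a possible shape of the fix is to write the report in a finally block so that a failing log collection cannot skip it. This is a sketch under assumptions: the three callables are hypothetical stand-ins, not the real functions in ovirtlago.testlib.

```python
def run_test_with_report(run_test, collect_logs, write_report):
    """Write the test report even when log collection raises.

    All three callables are hypothetical stand-ins for the real steps.
    """
    result = run_test()
    try:
        collect_logs()
    finally:
        # Runs whether or not collect_logs() raised, so the report survives.
        write_report(result)
    return result
```

The collection error still propagates to the caller, so the failure is not hidden; it only stops preventing the report from being written.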
12:16:15 + cd /dev/shm/ost/deployment-basic-suite-master
12:16:15 + lago ovirt stop
12:16:15 @ Stopping oVirt environment:
12:16:15 # Stopping Engine VMs:
12:16:20 # Stopping Engine VMs: Success (in 0:00:05)
12:16:20 # Putting hosts in maintenance mode:
12:16:21 # Putting hosts in maintenance mode: ERROR (in 0:00:00)
12:16:21 @ Stopping oVirt environment: ERROR (in 0:00:06)
12:16:21 Error occured, aborting
12:16:21 Traceback (most recent call last):
12:16:21 File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 360, in do_run
12:16:21 self.cli_plugins[args.ovirtverb].do_run(args)
12:16:21 File "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in do_run
12:16:21 self._do_run(**vars(args))
12:16:21 File "/usr/lib/python2.7/site-packages/lago/utils.py", line 501, in wrapper
12:16:21 return func(*args, **kwargs)
12:16:21 File "/usr/lib/python2.7/site-packages/lago/utils.py", line 512, in wrapper
12:16:21 return func(*args, prefix=prefix, **kwargs)
12:16:21 File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 292, in do_ovirt_stop
12:16:21 prefix.virt_env.engine_vm().stop_all_hosts()
12:16:21 File "/usr/lib/python2.7/site-packages/ovirtlago/utils.py", line 145, in wrapped_func
12:16:21 return func(*args, **kwargs)
12:16:21 File "/usr/lib/python2.7/site-packages/ovirtlago/virt.py", line 390, in stop_all_hosts
12:16:21 host_service.deactivate()
12:16:21 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/services.py", line 30877, in deactivate
12:16:21 return self._internal_action(action, 'deactivate', None, headers, query, wait)
12:16:21 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 290, in _internal_action
12:16:21 return future.wait() if wait else future
12:16:21 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 53, in wait
12:16:21 return self._code(response)
12:16:21 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 287, in callback
12:16:21 self._check_fault(response)
12:16:21 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 125, in _check_fault
12:16:21 self._raise_error(response, body.fault)
12:16:21 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 109, in _raise_error
12:16:21 raise error
12:16:21 Error: Fault reason is "Operation Failed". Fault detail is "[Cannot switch Host to Maintenance mode. Host has asynchronous running tasks,
12:16:21 wait for operation to complete and retry.]". HTTP response code is 409.
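The fault message itself suggests waiting and retrying. A minimal retry loop could look like the sketch below; ConflictError is a stand-in defined here for whatever the SDK raises on an HTTP 409, and the attempt count and delay are arbitrary assumptions:

```python
import time


class ConflictError(Exception):
    """Stand-in for the error raised on an HTTP 409 response (assumption)."""


def deactivate_with_retry(deactivate, attempts=5, delay=10, sleep=time.sleep):
    """Call `deactivate`, retrying while the host reports running tasks."""
    for attempt in range(1, attempts + 1):
        try:
            return deactivate()
        except ConflictError:
            if attempt == attempts:
                raise  # give up after the last attempt
            sleep(delay)
```

In real code the except clause would need to inspect the actual SDK error and retry only on the 409 case.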
We currently require yum-utils, which is wrong.
The change should be made in lago-ovirt.spec.in.
The repository server is currently a very strange thing. It is something that lives in the Lago environment but:
LagoInitFile
lago start
lago stop
lago destroy
lago ovirt serve is the only lago command that creates a long-lived Lago process.
The above has been causing several issues:
lago ovirt serve
lago ovirt serve can keep running and prevent the same command from being run in a new lago environment on the same host.
What needs to be done IMO:
LagoInitFile
to specify that a local repo server should be available in the environment (it could be expanded in the future to enable running multiple servers and specifying that reposync should run at deploy, but let's not spend time on enhancements ATM).
lago start needs to be changed to start the server if asked for, lago stop to stop it, etc.
lago status should show the status of the repo server. To do that, the server should have a special URL defined that returns some status JSON. To be robust, the status command should probably always check that the server process is up before trying to query it over HTTP.
The lago ovirt serve command should be converted into a noop showing a deprecation warning with some instructions on how to add the server to the LagoInitFile.
Also, it might be useful to move the server code to its own separate Python file so that we don't have to have the whole Lago code base in memory just to serve some files over HTTP.
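The status URL and the "check the process first, then query over HTTP" idea can be sketched with the standard library alone. The /_status path, the JSON fields, and every function name below are assumptions made for illustration, not part of Lago:

```python
import json
import os
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

STATUS_PATH = '/_status'  # hypothetical status URL

# Ignore proxy settings when querying the local server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class RepoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == STATUS_PATH:
            body = json.dumps({'status': 'up', 'pid': os.getpid()}).encode()
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)  # file serving is omitted in this sketch

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def serve_in_background(port=0):
    """Start the server in a daemon thread; port 0 picks a free port."""
    server = HTTPServer(('127.0.0.1', port), RepoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def process_is_up(pid):
    """Return True if a process with this pid exists (POSIX signal 0 check)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def repo_server_status(pid, port, timeout=2):
    """Check the process first, then ask the server for its status JSON."""
    if not process_is_up(pid):
        return {'status': 'down'}
    url = 'http://127.0.0.1:%d%s' % (port, STATUS_PATH)
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode())
    except OSError:
        return {'status': 'unreachable'}
```

Because the handler depends only on the standard library, a server like this could live in its own file without importing the rest of the Lago code base.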
needed in order to provide the path to look for the tests output
it to solve issue:
https://ovirt-jira.atlassian.net/browse/OVIRT-2126
ovirt_cpu_map.yaml
.IvyBridge
and Intel SandyBridge Family)
Now that we collect the entire '/var/log' directory in ost-plugin, it would be better if each 'runtest' command collected only the logs for that test, instead of repeatedly collecting the entire directory. This would also reduce the size of the collected logs.
Currently it just fails with:
AttributeError: 'NoneType' object has no attribute 'start_all_hosts'
(moved from lago)
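The AttributeError above suggests that engine_vm() returned None. A guard along these lines would turn that into a readable message; this is a sketch, and both NoEngineVMError and require_engine_vm are hypothetical names:

```python
class NoEngineVMError(RuntimeError):
    """Hypothetical error for environments without an engine VM."""


def require_engine_vm(virt_env):
    """Return the engine VM, or fail with an explanation instead of
    an AttributeError on None."""
    engine = virt_env.engine_vm()
    if engine is None:
        raise NoEngineVMError(
            'No engine VM was found in this environment, '
            'so the oVirt hosts cannot be started.'
        )
    return engine
```

Callers would then use require_engine_vm(prefix.virt_env).start_all_hosts(...) rather than calling the method on the raw return value.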
Moved from lago-project/lago#548 to here:
@dron1 wrote:
I created a new image on http://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/ and was trying to use the image to run lago on my local machine.
When I ran 'lago init', the VMs failed to activate the hosts:
@ Starting oVirt environment: ERROR (in 0:00:58)
Error occured, aborting
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 360, in do_run
self.cli_plugins[args.ovirtverb].do_run(args)
File "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in do_run
self._do_run(**vars(args))
File "/usr/lib/python2.7/site-packages/lago/utils.py", line 501, in wrapper
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/lago/utils.py", line 512, in wrapper
return func(*args, prefix=prefix, **kwargs)
File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 254, in do_ovirt_start
prefix.virt_env.engine_vm().start_all_hosts(timeout=5 * 60)
File "/usr/lib/python2.7/site-packages/ovirtlago/utils.py", line 145, in wrapped_func
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/ovirtlago/virt.py", line 414, in start_all_hosts
api = self.get_api_v4(check=True)
File "/usr/lib/python2.7/site-packages/ovirtlago/virt.py", line 302, in get_api_v4
self._api_v4 = self._get_api(api_ver=4)
File "/usr/lib/python2.7/site-packages/ovirtlago/virt.py", line 282, in _get_api
raise RuntimeError('test api call failed')
RuntimeError: test api call failed
Looking further at the reason, I can see it's a CPU issue on the hosts:
May 21 18:45:34 dhcp-0-198 kernel: kvm [3247]: vcpu0 unhandled rdmsr: 0x345
May 21 18:45:34 dhcp-0-198 kernel: kvm [3247]: vcpu0 unhandled wrmsr: 0x680 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled rdmsr: 0x345
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x680 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x6c0 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x681 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x6c1 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x682 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x6c2 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x683 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x6c3 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3245]: vcpu0 unhandled wrmsr: 0x684 data 0
May 21 18:45:34 dhcp-0-198 kernel: kvm [3249]: vcpu0 unhandled rdmsr: 0x345
May 21 18:45:40 dhcp-0-198 kvm: 2 guests now active
May 21 18:45:40 dhcp-0-198 kvm: 1 guest now active
When I logged in to the oVirt engine VM, I could see that the hosts fail to activate because of a wrong CPU type.
This seems to happen because the CPU type was selected based on the hardware on which I created the image.
Hence, if I try to use the image in lago on any computer that has a different CPU, it will not be able to activate the oVirt hosts.
It'd be great to be able to run different suites concurrently. Right now there is a single lock, 'repolock', on the reposync process, rather than perhaps one per suite.
I assume it's because the reposync repository is also global, and not per suite...
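If the synced repository were also kept per suite, a per-suite lock could be sketched as below. This assumes a POSIX host (fcntl), and the lock directory and file naming are hypothetical; with a shared global repository a single lock would still be needed.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical default location for the lock files.
DEFAULT_LOCK_DIR = os.path.join(tempfile.gettempdir(), 'repolocks')


@contextmanager
def suite_repo_lock(suite_name, lock_dir=DEFAULT_LOCK_DIR):
    """Hold an exclusive lock that is specific to one suite."""
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, '%s.repolock' % suite_name)
    with open(path, 'w') as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks only for the same suite
        try:
            yield path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Two different suites take different lock files and can therefore run reposync at the same time, while two runs of the same suite still serialize.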
Running the lago demo tool with the following commands (after extracting the image) works:
lago init
lago ovirt start --with-vm
lago ovirt status
lago stop
But when running 'lago ovirt status' after the env is stopped, the command hangs. Eventually I had to press CTRL-C to stop it, and got this exception:
lago ovirt status
Error occured, aborting
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 325, in do_run
self.cli_plugins[args.ovirtverb].do_run(args)
File "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in do_run
self._do_run(**vars(args))
File "/usr/lib/python2.7/site-packages/lago/utils.py", line 501, in wrapper
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/lago/utils.py", line 512, in wrapper
return func(*args, prefix=prefix, **kwargs)
File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 211, in do_ovirt_status
prefix.virt_env.engine_vm().status()
File "/usr/lib/python2.7/site-packages/ovirtlago/utils.py", line 145, in wrapped_func
return func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/ovirtlago/virt.py", line 463, in status
api = self.get_api_v4(check=True)
File "/usr/lib/python2.7/site-packages/ovirtlago/virt.py", line 301, in get_api_v4
self._api_v4 = self._get_api(api_ver=4)
File "/usr/lib/python2.7/site-packages/ovirtlago/virt.py", line 281, in _get_api
raise RuntimeError('test api call failed')
RuntimeError: test api call failed
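One way to avoid the hang would be a quick reachability check with a short timeout before any API call is attempted. The sketch below is an assumption about how that could look; the function name, default port, and timeout are illustrative and not taken from ovirtlago:

```python
import socket


def engine_is_reachable(host, port=443, timeout=5):
    """Return True if a TCP connection to the engine succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

The status command could then report that the engine is down (for example because the environment is stopped) instead of waiting on the API.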
testlib.LogCollectorPlugin should be fixed. It should show an ERROR message when a test fails.
17:12:52 [basic-suit] @ Run test: 007_sd_reattach.py:
17:12:52 [basic-suit] nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
17:12:52 [basic-suit] # deactivate_storage_domain:
17:12:52 [basic-suit] * Collect artifacts:
17:13:19 [basic-suit] * Collect artifacts: Success (in 0:00:23)
17:13:19 [basic-suit] # deactivate_storage_domain: Success (in 0:00:24)
17:13:19 [basic-suit] # Results located at /dev/shm/ost/deployment-basic-suite-4.2/default/007_sd_reattach.py.junit.xml
17:13:19 [basic-suit] @ Run test: 007_sd_reattach.py: Success (in 0:00:24)
17:13:19 [basic-suit] Error occured, aborting
17:13:19 [basic-suit] Traceback (most recent call last):
17:13:19 [basic-suit] File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 362, in do_run
17:13:19 [basic-suit] self.cli_plugins[args.ovirtverb].do_run(args)
17:13:19 [basic-suit] File "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in do_run
17:13:19 [basic-suit] self._do_run(**vars(args))
17:13:19 [basic-suit] File "/usr/lib/python2.7/site-packages/lago/utils.py", line 505, in wrapper
17:13:19 [basic-suit] return func(*args, **kwargs)
17:13:19 [basic-suit] File "/usr/lib/python2.7/site-packages/lago/utils.py", line 516, in wrapper
17:13:19 [basic-suit] return func(*args, prefix=prefix, **kwargs)
17:13:19 [basic-suit] File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 99, in do_ovirt_runtest
17:13:19 [basic-suit] raise RuntimeError('Some tests failed')
17:13:19 [basic-suit] RuntimeError: Some tests failed
It seems that the server that we currently use is overloaded:
11:49:32 @ Deploy oVirt environment:
11:49:33 # Deploy environment:
11:49:33 * [Thread-2] Deploy VM lago-basic-suite-master-host-0:
11:49:33 * [Thread-3] Deploy VM lago-basic-suite-master-host-1:
11:49:33 * [Thread-4] Deploy VM lago-basic-suite-master-engine:
11:49:55 * [Thread-3] Deploy VM lago-basic-suite-master-host-1: Success (in 0:00:22)
11:49:57 Traceback (most recent call last):
11:49:57 File "/usr/lib64/python2.7/SocketServer.py", line 295, in _handle_request_noblock
11:49:57 self.process_request(request, client_address)
11:49:57 File "/usr/lib64/python2.7/SocketServer.py", line 321, in process_request
11:49:57 self.finish_request(request, client_address)
11:49:57 File "/usr/lib64/python2.7/SocketServer.py", line 334, in finish_request
11:49:57 self.RequestHandlerClass(request, client_address, self)
11:49:57 File "/usr/lib64/python2.7/SocketServer.py", line 651, in __init__
11:49:57 self.finish()
11:49:57 File "/usr/lib64/python2.7/SocketServer.py", line 710, in finish
11:49:57 self.wfile.close()
11:49:57 File "/usr/lib64/python2.7/socket.py", line 279, in close
11:49:57 self.flush()
11:49:57 File "/usr/lib64/python2.7/socket.py", line 303, in flush
11:49:57 self._sock.sendall(view[write_offset:write_offset+buffer_size])
11:49:57 error: [Errno 32] Broken pipe
I think that yum aborts the connection because the data arrives too slowly.
Another option is to configure yum to limit the number of connections or extend the timeout.
When installing ovirt-engine-sdk-python with pip, pycurl is also installed as a dependency.
The installation of pycurl can fail with the following error: ssl-backend-error-when-using...
We should explain in the docs how to solve this issue. The solution can be taken from ovirt-engine-sdk-python.
The directory is obviously a cache directory, so it should reside in /var/cache and not /var/lib.