Comments (13)
Regarding the SAI_PORT error (i.e. attr index 0 attr id 76 failed), the recently merged fix below should solve the issue:
https://review.openswitch.net/#/c/14562/
We need to check why opx-pas is failing. Can you please provide the output of "service opx-pas status"?
from opx-cps.
Here's opx-pas status from a recent example:
root@NST-OPX-TEST-002:~# service opx-pas status
● opx-pas.service - This PAS service is to initialize platform.
Loaded: loaded (/lib/systemd/system/opx-pas.service; enabled)
Active: active (running) since Tue 2018-02-27 08:37:03 UTC; 28min ago
Main PID: 613 (opx_pas_service)
CGroup: /system.slice/opx-pas.service
└─613 /usr/bin/opx_pas_service
Feb 27 08:37:11 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM PPID not programmed
Feb 27 08:37:11 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM part number not programmed
Feb 27 08:37:11 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM service tag not programmed
Feb 27 09:05:21 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM vendor name not programmed
Feb 27 09:05:21 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM product name not programmed
Feb 27 09:05:21 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM hardware revision not programmed
Feb 27 09:05:21 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM platform name not programmed
Feb 27 09:05:21 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM PPID not programmed
Feb 27 09:05:21 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM part number not programmed
Feb 27 09:05:21 NST-OPX-TEST-002 opx_pas_service[613]: [PAS:chassis_resp_set], Chassis EEPROM service tag not programmed
from opx-cps.
I have discovered that when CPS gets into this state, it can be recovered by service opx-ip restart.
However, there is no indication that opx-ip was unhappy:
root@NST-OPX-TEST-002:~# service opx-ip status
● opx-ip.service - IP address handler
Loaded: loaded (/lib/systemd/system/opx-ip.service; enabled)
Active: active (running) since Tue 2018-02-27 11:43:49 UTC; 2min 42s ago
Main PID: 5081 (python)
CGroup: /system.slice/opx-ip.service
└─5081 /usr/bin/python -u /usr/bin/base_ip.py
Feb 27 11:43:48 NST-OPX-TEST-002 systemd[1]: Starting IP address handler...
Feb 27 11:43:49 NST-OPX-TEST-002 systemd[1]: Started IP address handler.
Feb 27 11:44:36 NST-OPX-TEST-002 python[5081]: Attempting to add address 192.170.0.0/31 e101-001-0
Hope that helps.
from opx-cps.
I can now reliably reproduce this condition by using ansible to install a Debian package - any package, even one that does not exist! - on the switch.
On some other host (from which you will run ansible):
- create a hosts file containing the IP address of the OPX switch, e.g.:
echo 172.17.31.206 > hosts
- run this command:
ansible -i hosts all -m apt -a "deb=foo.deb state=present"
You should now find that attempts to configure IP addresses through CPS fail.
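The two steps above can be scripted. The following is only a sketch that wraps the exact commands quoted in the steps; the function names are hypothetical, and it assumes ansible is installed on the host that runs it and can reach the switch.

```python
import subprocess


def write_hosts_file(path, switch_ip):
    """Create the one-line ansible inventory described in the steps above."""
    with open(path, "w") as handle:
        handle.write(switch_ip + "\n")


def build_ansible_command(hosts_path):
    """Return the argv for the apt-module call quoted in the steps above."""
    return [
        "ansible", "-i", hosts_path, "all",
        "-m", "apt",
        "-a", "deb=foo.deb state=present",
    ]


def reproduce(hosts_path="hosts", switch_ip="172.17.31.206"):
    """Write the inventory, run ansible, and return its exit status."""
    write_hosts_file(hosts_path, switch_ip)
    # foo.deb does not exist, so the apt task itself is expected to fail;
    # per the report above, running it is still enough to trigger the problem.
    return subprocess.call(build_ansible_command(hosts_path))
```

Calling reproduce() from the directory where the hosts file should live performs both steps in order.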
It seems as though something happened during the ansible command that breaks the registration from /usr/bin/base_ip.py. When we attempt to configure IP addresses, trans_cb() in that file is simply not executed.
from opx-cps.
I could not reproduce the issue by following the above procedure, so I need more information to troubleshoot further.
from opx-cps.
Sure - what do you need?
from opx-cps.
Can you provide the logs related to the issue (error logs, registration break, CPS command failure, etc.)? That may help.
from opx-cps.
I really don't have any useful logs:
- I can tell that trans_cb() is not called because I added an additional print('hello') on entry to it - and when in the broken state we do not see that log
- All I get from configure_ip_address.py is this runtime exception
What else do you need? Are there likely to be other useful logs somewhere else? Perhaps you'll want to provide debug versions of some code, making additional logs?
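One way to get the same "was the handler entered?" signal without a bare print is a small logging decorator. This is only a sketch: the logger name, the decorator, and the stand-in trans_cb body are hypothetical, and the (methods, params) signature is an assumption about the real handler in base_ip.py.

```python
import functools
import logging

log = logging.getLogger("base_ip_trace")  # hypothetical logger name


def log_entry(func):
    """Log every call to func before running it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.info("entered %s", func.__name__)
        return func(*args, **kwargs)
    return wrapper


@log_entry
def trans_cb(methods, params):
    # Stand-in body: the real handler in base_ip.py does the address work.
    return True
```

If no "entered trans_cb" record appears while a configuration request is being made, the callback was never invoked, which is the same conclusion as the missing print output described above.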
from opx-cps.
Assuming you are able to reproduce the issue consistently, can you please check whether the change below fixes it? We have recently seen some random EPIPE errors, and the following code helps to avoid them.
You can make the change directly on the target (/usr/bin/base_ip.py).
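The attached diff is not reproduced in this thread, so the following is not that change. It is a rough illustration of the failure mode only, assuming the EPIPE comes from the handler writing to an output pipe whose reader has gone away. The helper names write_line and make_broken_pipe are hypothetical.

```python
import errno
import os


def write_line(fd, text):
    """Write text to fd; return False instead of raising if the reader is gone."""
    try:
        os.write(fd, (text + "\n").encode())
        return True
    except OSError as exc:
        if exc.errno == errno.EPIPE:
            # The other end of the pipe was closed: drop the message
            # rather than letting the exception abort the caller.
            return False
        raise


def make_broken_pipe():
    """Return the write end of a pipe whose read end is already closed."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    return write_fd
```

A plain write to the descriptor returned by make_broken_pipe() raises an OSError with errno EPIPE, whereas write_line() reports the failure and lets the caller continue. Routing messages through a logging facility instead of print is another way to avoid depending on the state of the output pipe.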
Diff: Diff.txt
File attached with the above change (please rename the file to base_ip.py):
from opx-cps.
The code that you provided tries to log the undefined variable vrf_name. As a result, no IP configuration succeeds at all. I guess you forgot to test this!
However, if I fix that so that we only try to log addr and name - then that does seem to solve the problem. So this fix is basically looking good.
I see that there are still a handful of uses of print in this code. Will you want to fix those too?
from opx-cps.
Great to know that the change fixes the issue. We will commit this change. We may not need to remove the other print calls.
I actually tested without restarting the opx-ip service, which is why it appeared to work :-) Sorry for that.
As you mentioned, this is the updated file: base_ip.py.txt
from opx-cps.
Fix in review
https://review.openswitch.net/#/c/14580/
from opx-cps.
Closing this; please reopen if there are any further issues.
from opx-cps.