ros / diagnostics
Packages related to gathering, viewing, and analyzing diagnostics data from robots.
Home Page: https://index.ros.org/p/diagnostics/
License: Other
Robot Operating System (ROS)
===============================================================================
ROS is a meta-operating system for your robot. It provides language-independent and network-transparent communication for a distributed robot control system.
Installation Notes
------------------
For full installation instructions, including system prerequisites and platform-specific help, see: http://wiki.ros.org/ROS/Installation
While looking for the tuple class, it somehow assumes that it should include <tr1/tuple>, although Clang implements C++11 with tuple being a part of the standard library (so it should be just <tuple>).
Sorry for not gathering more info from the header file for now, I will look into it when I have time.
The issue is a complicated one, but here goes.
I first noticed this issue when I saw that diagnostics messages from arm64 machines sometimes arrive, but only infrequently (between 0% and 10% of the time), and eventually the following message comes from the Diagnostic Aggregator running on amd64 and it appears all messages are dropped.
[ERROR] [1564529964.093868659 /diag_agg] [/tmp/binarydeb/ros-kinetic-roscpp-1.12.14/src/libros/transport_publisher_link.cpp:TransportPublisherLink::onMessageLength:175]: a message of over a gigabyte was predicted in tcpros. that seems highly unlikely, so I'll assume protocol synchronization is lost.
At first, I thought it was endianness, but all machines are little-endian. There are also C++ nodes which are able to communicate with each other properly on all machines. It also appears as though this only happens with the Diagnostic Updater, not with any other topic.
After this, I started running a test: just running roscore and a single test node with the following code.
#include <ros/ros.h>
#include <diagnostic_updater/diagnostic_updater.h>
#include <diagnostic_updater/update_functions.h>
int main(int argc, char** argv)
{
  ros::init(argc, argv, "updater_node");
  double rate = 10.;
  diagnostic_updater::Updater updater;
  diagnostic_updater::FrequencyStatus frequency_status(
    diagnostic_updater::FrequencyStatusParam(&rate, &rate)
  );
  updater.setHardwareID("none");
  updater.add(frequency_status);
  ros::Rate r(rate);
  while (ros::ok())
  {
    frequency_status.tick();
    updater.update();
    ros::spinOnce();
    r.sleep();
  }
}
Everything works fine with amd64, but on arm64, the above issues happen. Additionally, the node eventually crashes with std::bad_alloc. Here is the backtrace and the relevant message that was published (it looks like the error is in serialization):
#13 0x0000000000419084 in diagnostic_updater::Updater::publish (this=this@entry=0x7fffffecc0, status_vec=std::vector of length 1, capacity 1 = {...})
at /opt/ros/kinetic/include/diagnostic_updater/diagnostic_updater.h:547
547 publisher_.publish(msg);
(gdb) list
542 node_name_.substr(1) + std::string(": ") + iter->name;
543 }
544 diagnostic_msgs::DiagnosticArray msg;
545 msg.status = status_vec;
546 msg.header.stamp = ros::Time::now(); // Add timestamp for ROS 0.10
547 publisher_.publish(msg);
548 }
549
550 /**
551 * Publishes on /diagnostics and reads the diagnostic_period parameter.
(gdb) p msg
$7 = {header = {seq = 0, stamp = {<ros::TimeBase<ros::Time, ros::Duration>> = {sec = 1564612119, nsec = 323701618}, <No data fields>}, frame_id = ""}, status = std::vector of length 1, capacity 1 = {{level = 0 '\000', name = "updater_node_1564612117153179612: Frequency Status", message = "Desired frequency met", hardware_id = "none",
values = std::vector of length 7, capacity 7 = {{key = "Events in window", value = "22"}, {key = "Events since startup", value = "22"}, {key = "Duration of window (s)", value = "2.100290"}, {key = "Actual frequency (Hz)", value = "10.474742"}, {key = "Target frequency (Hz)", value = "10.000000"}, {
key = "Minimum acceptable frequency (Hz)", value = "9.000000"}, {key = "Maximum acceptable frequency (Hz)", value = "11.000000"}}}}}
(gdb) bt
#0 0x0000007fb7a04528 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x0000007fb7a059e0 in __GI_abort () at abort.c:89
#2 0x0000007fb7bde254 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#3 0x0000007fb7bdbdc4 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#4 0x0000007fb7bdbe10 in std::terminate() () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#5 0x0000007fb7bdc0d4 in __cxa_throw () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#6 0x0000007fb7bdc6d8 in operator new(unsigned long) () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#7 0x000000000040f840 in ros::serialization::serializeMessage<diagnostic_msgs::DiagnosticArray_<std::allocator<void> > > (message=...) at /opt/ros/kinetic/include/ros/serialization.h:795
#8 0x000000000040d60c in boost::_bi::list1<boost::reference_wrapper<diagnostic_msgs::DiagnosticArray_<std::allocator<void> > const> >::operator()<ros::SerializedMessage, ros::SerializedMessage (*)(diagnostic_msgs::DiagnosticArray_<std::allocator<void> > const&), boost::_bi::list0> (f=<optimized out>, a=<synthetic pointer>...,
this=<optimized out>) at /usr/include/boost/function/function_template.hpp:129
#9 boost::_bi::bind_t<ros::SerializedMessage, ros::SerializedMessage (*)(diagnostic_msgs::DiagnosticArray_<std::allocator<void> > const&), boost::_bi::list1<boost::reference_wrapper<diagnostic_msgs::DiagnosticArray_<std::allocator<void> > const> > >::operator() (this=<optimized out>) at /usr/include/boost/bind/bind.hpp:893
#10 boost::detail::function::function_obj_invoker0<boost::_bi::bind_t<ros::SerializedMessage, ros::SerializedMessage (*)(diagnostic_msgs::DiagnosticArray_<std::allocator<void> > const&), boost::_bi::list1<boost::reference_wrapper<diagnostic_msgs::DiagnosticArray_<std::allocator<void> > const> > >, ros::SerializedMessage>::invoke (
function_obj_ptr=...) at /usr/include/boost/function/function_template.hpp:138
#11 0x0000007fb7efc8b0 in ros::TopicManager::publish(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, boost::function<ros::SerializedMessage ()> const&, ros::SerializedMessage&) () from /opt/ros/kinetic/lib/libroscpp.so
#12 0x000000000040fe14 in ros::Publisher::publish<diagnostic_msgs::DiagnosticArray_<std::allocator<void> > > (this=this@entry=0x7fffffee88, message=...) at /usr/include/c++/8/new:169
#13 0x0000000000419084 in diagnostic_updater::Updater::publish (this=this@entry=0x7fffffecc0, status_vec=std::vector of length 1, capacity 1 = {...}) at /opt/ros/kinetic/include/diagnostic_updater/diagnostic_updater.h:547
#14 0x000000000040c5ac in diagnostic_updater::Updater::force_update (this=0x7fffffecc0) at /opt/ros/kinetic/include/diagnostic_updater/diagnostic_updater.h:440
#15 diagnostic_updater::Updater::update (this=0x7fffffecc0) at /opt/ros/kinetic/include/diagnostic_updater/diagnostic_updater.h:390
#16 main (argc=<optimized out>, argv=<optimized out>) at /ws/src/test/src/updater_node.cpp:20
It crashes after ~30 seconds, but seems to do so more quickly if several of these nodes are running.
As specified in the title, this only happens with GCC optimization level 3 (compiling with -O3 or CMAKE_BUILD_TYPE=Release). I tried with -O2 and it appears to work fine.
I doubt this can be easily fixed and I'm not 100% sure if the issue is within this repo or ros_comm, but any ideas would be greatly appreciated. We really would like to use -O3 throughout our code for improved performance.
The current implementation of diagnostic_updater::Updater reads the parameter diagnostic_period from the parameter server without checking if it exists. If it does not exist, the getParamCached method returns 0.0 and update will run every time.
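One way to guard against the missing parameter is to fall back to an explicit default and to reject non-positive values. The sketch below uses a plain dictionary as a stand-in for the parameter server; `resolve_period` and `DEFAULT_PERIOD` are hypothetical names, and the 1.0 s default is an assumption made for illustration, not necessarily the value diagnostic_updater uses.

```python
DEFAULT_PERIOD = 1.0  # seconds; assumed default, for illustration only


def resolve_period(params, key="diagnostic_period", default=DEFAULT_PERIOD):
    """Return a usable update period from a dict standing in for the parameter server."""
    value = params.get(key)
    # Missing, non-numeric, or non-positive values fall back to the default,
    # so the updater never ends up running on every call.
    if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
        return default
    return float(value)
```

With this shape, an absent parameter and a parameter explicitly set to 0.0 are both handled the same way, which may or may not be the desired policy.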
Though very rarely, I've seen /diagnostics_agg messages like the one below. Look at "level": the level of /Devices/IMU is 2 (i.e. ERROR), but all of its sub-devices show level 0 (OK).
(At Willow, you may be able to reproduce with prl).
level: 2
name: /Devices/IMU
message: Expected 4, found 3
hardware_id: ''
values:
-
key: imu_node: Calibration Status
value: Gyro is calibrated
-
key: imu_node: Frequency Status
value: Desired frequency met
-
key: imu_node: IMU Status
value: IMU is running
-
level: 0
name: /Devices/IMU/Calibration Status
message: Gyro is calibrated
hardware_id: Inertia-Link_4200-4132
values:
-
key: X bias
value: -0.0105521
-
key: Y bias
value: -0.0115087
-
key: Z bias
value: 0.00218787
-
level: 0
name: /Devices/IMU/Frequency Status
message: Desired frequency met
hardware_id: Inertia-Link_4200-4132
values:
-
key: Events in window
value: 503
-
key: Events since startup
value: 4802787
-
key: Duration of window (s)
value: 5.029089
-
key: Actual frequency (Hz)
value: 100.018110
-
key: Target frequency (Hz)
value: 100.000000
-
key: Minimum acceptable frequency (Hz)
value: 95.000000
-
key: Maximum acceptable frequency (Hz)
value: 105.000000
-
level: 0
name: /Devices/IMU/IMU Status
message: IMU is running
hardware_id: Inertia-Link_4200-4132
values:
-
key: Device
value: /etc/ros/sensors/imu
-
key: TF frame
value: imu_link
-
key: Error count
value: 0
-
key: Excessive delay
value: 0
diagnostic from debian ros-groovy-diagnostic-aggregator/precise uptodate 1.7.7-0precise-20121205-0830-+0000
As part of our Python 3 migration, the add_analyzer script has come up as a nuisance which would be nice to avoid. Given that the rest of this package is C++, how would we feel about rewriting the script to be a small binary and parameterizing the service name?
The main proviso is that it would still need to shell out to rosparam load path/to/my.yaml /namespace to perform the YAML loading, so the startup cost of running that Python program would be paid, but not the memory cost of a long-running Python process.
If there's interest in this, I can send a PR.
Memory size should probably not be a major consideration in this, but there is a slight saving: the Python process at idle has a vsize of around 1MB, while the C++ version at idle is 400kb.
Hey Austin,
now with #99 in, can you please do a new release for kinetic and melodic?
Thanks a lot!
Jochen
Looking at the TimeStampStatus and FrequencyStatus code (and testing with our hardware), if you use TopicDiagnostic to diagnose a slow topic (0.3 Hz in our case), the diagnostic fails.
In the time windows in which a message has arrived, it correctly reports the frequency and everything else, but in the time windows in which no message has arrived, the diagnostic reports ERROR with "No events recorded." and "No data since last update."
This is obviously true, but it is not an error state.
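One possible way to handle slow topics is to treat an empty window as an error only when the window is long enough that at least one message was expected. The following is a minimal sketch of that idea, not the FrequencyStatus implementation; `classify_window` and its arguments are hypothetical.

```python
OK, WARN, ERROR = 0, 1, 2  # numeric levels as in diagnostic_msgs/DiagnosticStatus


def classify_window(events, window_s, min_freq_hz):
    """Classify one observation window of a topic with a minimum expected rate."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    expected = window_s * min_freq_hz  # messages expected within this window
    if events == 0:
        # For a 0.3 Hz topic and a 1 s window, expected < 1, so silence is normal.
        return OK if expected < 1.0 else ERROR
    actual_hz = events / window_s
    return OK if actual_hz >= min_freq_hz else WARN
```

For the 0.3 Hz topic from the report, a 1 s window with no events would be classified as OK, while a 10 s window with no events would still be flagged.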
AnalyzerGroup is using deprecated pluginlib code:
/home/jbohren/versioned/ros/maintain_catkin/ws/src/diagnostics/diagnostic_aggregator/src/analyzer_group.cpp: In member function ‘virtual bool diagnostic_aggregator::AnalyzerGroup::init(std::string, const ros::NodeHandle&)’:
/home/jbohren/versioned/ros/maintain_catkin/ws/src/diagnostics/diagnostic_aggregator/src/analyzer_group.cpp:122:62: warning: ‘T* pluginlib::ClassLoader<T>::createClassInstance(const string&, bool) [with T = diagnostic_aggregator::Analyzer, std::string = std::basic_string<char>]’ is deprecated (declared at /opt/ros/groovy/include/pluginlib/class_loader_imp.h:236) [-Wdeprecated-declarations]
Currently http://docs.ros.org/groovy/api/diagnostic_updater/html/index.html links to the example.cpp file from rosconsole.
The reference to example.cpp is not unique, since other packages which this package depends on also contain a file with that name. Therefore you need to reference the file with its relative location from the manifest.dox file: https://github.com/ros/diagnostics/blob/groovy-devel/diagnostic_updater/mainpage.dox#L25
Example uses of these classes can be found in \ref src/example.cpp.
I just got this running sensors_monitor.py on a ROS kinetic / Ubuntu Xenial machine:
[ERROR] [1507561973.045543]: Unable to process lm-sensors data
[ERROR] [1507561973.047346]: Traceback (most recent call last):
File "/etc/robot/ros/lib/diagnostic_common_diagnostics/sensors_monitor.py", line 189, in monitor
for sensor in parse_sensors_output(get_sensors()):
File "/etc/robot/ros/lib/diagnostic_common_diagnostics/sensors_monitor.py", line 156, in parse_sensors_output
s = parse_sensor_line(line)
File "/etc/robot/ros/lib/diagnostic_common_diagnostics/sensors_monitor.py", line 109, in parse_sensor_line
[sensor.name, sensor.type] = name.rsplit(" ",1)
ValueError: need more than 1 value to unpack
The output of the sensors command is:
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +36.0°C (high = +84.0°C, crit = +100.0°C)
Core 0: +33.0°C (high = +84.0°C, crit = +100.0°C)
Core 1: +34.0°C (high = +84.0°C, crit = +100.0°C)
Core 2: +32.0°C (high = +84.0°C, crit = +100.0°C)
Core 3: +30.0°C (high = +84.0°C, crit = +100.0°C)
nct6106-isa-0290
Adapter: ISA adapter
in0: +0.72 V (min = +0.00 V, max = +1.74 V)
in1: +1.66 V (min = +0.00 V, max = +2.04 V)
in2: +3.41 V (min = +0.00 V, max = +4.08 V)
in3: +3.33 V (min = +0.00 V, max = +4.08 V)
in4: +0.63 V (min = +0.00 V, max = +2.04 V)
in5: +1.66 V (min = +0.00 V, max = +2.04 V)
in6: +1.70 V (min = +0.00 V, max = +2.04 V)
in7: +3.07 V (min = +0.00 V, max = +4.08 V)
in8: +2.03 V (min = +0.00 V, max = +4.08 V)
fan1: 4545 RPM (min = 0 RPM)
fan2: 2760 RPM (min = 0 RPM)
fan3: 0 RPM (min = 0 RPM)
SYSTIN: +38.0°C (high = +0.0°C, hyst = +0.0°C) ALARM
(crit low = +127.0°C, crit = +127.0°C) sensor = thermal diode
AUXTIN: -12.0°C (high = +80.0°C, hyst = +75.0°C)
(crit low = +127.0°C, crit = +127.0°C) sensor = thermal diode
PECI Agent 0: +36.0°C (high = +80.0°C, hyst = +75.0°C)
(crit low = +127.0°C, crit = +127.0°C)
PECI Agent 1: +0.0°C (high = +80.0°C, hyst = +75.0°C)
(crit low = +127.0°C, crit = +127.0°C)
PCH_CHIP_TEMP: +0.0°C
PCH_CPU_TEMP: +0.0°C
intrusion0: ALARM
beep_enable: disabled
Looks like the parser cannot read some lines.
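The traceback points at `name.rsplit(" ", 1)`, which yields a single element for names without a space, such as `in0` or `SYSTIN` in the output above. A tolerant split could look like the sketch below. `split_sensor_name` is a hypothetical helper, and returning an empty type for such names is an assumption about what the rest of the script can accept.

```python
def split_sensor_name(name):
    """Split 'Core 0' into ('Core', '0'); names without a space get an empty type."""
    parts = name.rsplit(" ", 1)
    if len(parts) == 2:
        return parts[0], parts[1]
    # Single-token names such as 'in0' or 'SYSTIN' would otherwise raise ValueError.
    return name, ""
```

Whether an empty type is meaningful downstream depends on how the script classifies sensors, so this only removes the crash and does not decide how such lines should be reported.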
The struct stores pointers to min_freq_ and max_freq_, which the FrequencyStatus class accesses under scoped_locks that protect only the object itself, not the data that can be altered through the struct.
Also, the pointers are not checked before being dereferenced, but I guess this is less important, since it would most likely cause problems at start-up, whereas the thread-safety issue could cause problems at any point in the lifetime of the application.
I tried a modified version of the official self_test example, but a pure virtual call occurs during the destruction of the self_test object.
The modified main function is:
185│ main(int argc, char** argv)
186│ {
187│ ros::init(argc, argv, "my_node");
188│
189│ MyNode *n = new MyNode();
190│
191│ //n.spin();
192│
193├> delete n;
194│
195│ return(0);
196│ }
The problem occurs at line 193.
(gdb)
pure virtual method called
terminate called without an active exception
Program received signal SIGABRT, Aborted.
0x00007ffff16e7cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff16e7cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ffff16eb0d8 in __GI_abort () at abort.c:89
#2 0x00007ffff1ff36b5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff1ff1836 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007ffff1ff1863 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007ffff1ff233f in __cxa_pure_virtual () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007ffff4715adc in ros::ServicePublication::drop() () from /opt/ros/indigo/lib/libroscpp.so
#7 0x00007ffff4782d20 in ros::ServiceManager::unadvertiseService(std::string const&) () from /opt/ros/indigo/lib/libroscpp.so
#8 0x00007ffff4728fe4 in ros::ServiceServer::Impl::unadvertise() () from /opt/ros/indigo/lib/libroscpp.so
#9 0x00007ffff4729087 in ros::ServiceServer::Impl::~Impl() () from /opt/ros/indigo/lib/libroscpp.so
#10 0x00007ffff4729662 in boost::detail::sp_counted_impl_p<ros::ServiceServer::Impl>::dispose() () from /opt/ros/indigo/lib/libroscpp.so
#11 0x00007ffff4729569 in ros::ServiceServer::~ServiceServer() () from /opt/ros/indigo/lib/libroscpp.so
#12 0x00000000004264ee in self_test::TestRunner::~TestRunner (this=0x6583d0) at /opt/ros/indigo/include/self_test/self_test.h:68
#13 0x0000000000423dee in MyNode::~MyNode (this=0x6583d0) at /home/krz/catkin_ws/src/ethon/ethon_node/src/rower_test.cpp:41
#14 0x0000000000422efa in main (argc=1, argv=0x7fffffffd9b8) at /home/krz/catkin_ws/src/ethon/ethon_node/src/rower_test.cpp:193
Hi,
In the callback function registered with diagnostic_updater::Updater::add, if I assign a value to diagnostic_updater::DiagnosticStatusWrapper &stat, as shown below, the hardware ID and task name will be empty in rqt_robot_monitor.
I have this problem on Ubuntu 14.04 with ROS Indigo.
Example:
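#include <ros/ros.h>
#include <diagnostic_updater/diagnostic_updater.h>

class DummyClass
{
public:
  DummyClass()
  {
    my_stat.setHardwareID("none");
    my_stat.add("test", 1);
  }

  void produce_diagnostics(diagnostic_updater::DiagnosticStatusWrapper &stat)
  {
    stat = my_stat;  // replaces every field of stat with the stored copy
  }

  diagnostic_updater::DiagnosticStatusWrapper my_stat;
};

int main(int argc, char** argv)
{
  ros::init(argc, argv, "dummy_node");  // node name chosen for this example
  DummyClass dc;
  ros::NodeHandle nh;
  diagnostic_updater::Updater updater;
  updater.add("Method updater", &dc, &DummyClass::produce_diagnostics);
  while (nh.ok())
  {
    ros::Duration(0.1).sleep();
    updater.update();
  }
  return 0;
}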
diagnostic_aggregator initializes its analyzers from rosparam and cannot add new analysis rules after initialization. Support for adding rules at runtime would be very useful.
As it stands, if we want to add a new rule, we need to run another aggregator and robot_monitor with a remapping such as /diagnostics -> /diagnostics_perception.
I'm using ROS Indigo (Debian packages) on Ubuntu 14.04 and I'm trying to use the discard_stale parameter for a number of GenericAnalyzers. However, even when an analyzer has the discard_stale parameter set to true, the Stale status message appears and persists in the rqt_robot_monitor display.
I have attached a screenshot of rqt_robot_monitor showing the Stale state of the (non-existent) Base Controller. And here is my diagnostics.yaml file:
pub_rate: 1.0 # Optional
base_path: '' # Optional, prepended to all diagnostic output
analyzers:
  pub_frequency:
    type: GenericAnalyzer
    path: 'Pub Frequency'
    discard_stale: true
    timeout: 5.0
    discard_stale: true
    contains: 'freq'
  sensors:
    type: GenericAnalyzer
    path: 'Sensors'
    discard_stale: true
    timeout: 5.0
    contains: '_sensor'
  joints:
    type: GenericAnalyzer
    path: 'Joints'
    discard_stale: true
    timeout: 1.0
    regex: '.*_joint$'
  base_controller:
    type: GenericAnalyzer
    path: 'Base Controller'
    discard_stale: true
    timeout: 1.0
    contains: 'base_controller'
hydro: Cannot load information on name: diagnostic_updater, distro: hydro, which means that it is not yet in our index. Please see this page for information on how to submit your repository to our index.
The groovy CMakeLists.txt of diagnostic_updater does not call the catkin_python_setup() macro.
The hydro version does call the catkin_python_setup() macro.
It looks like all of the dependencies for diagnostics are available in Melodic, so it would be great to get this released. Thanks in advance!
Can you please create a new branch for ROS 2 development, so that we can open PRs for the packages migrated to ROS 2?
I am not sure, if this is a real issue or something I am doing wrong.
My config is as follows:
pub_rate: 1.0
base_path: ""
analyzers:
  lasers:
    type: GenericAnalyzer
    path: Laser
    find_and_remove_prefix: 'Laser'
What I see in my robot_monitor is that all information about laser is published inside the '/' namespace.
When I then change the config to this (please note the change from Laser to laser):
pub_rate: 1.0
base_path: ""
analyzers:
  lasers:
    type: GenericAnalyzer
    path: Laser
    find_and_remove_prefix: 'laser'
Everything gets published as expected: the category is called Laser and all messages are trimmed of the laser prefix.
Can anybody confirm that?
The executable files run_selftest and selftest_rostest mentioned in http://wiki.ros.org/self_test should be installed in the CMakeLists.txt, so that they can be used via rosrun.
Hi @trainman419 and diagnostics maintainers,
As you may know the next ROS release Lunar Loggerhead is around the corner 🎉
Is it possible to release diagnostics on ROS Lunar? Being a low-level package, this is currently preventing many repositories from being released.
If you don't have time to make a new release, please release the current kinetic version into Lunar by running bloom-release diagnostics -r lunar -t lunar --new-track.
Thanks!
When adding diagnostics at runtime using either the script manual_diag.py given here http://wiki.ros.org/diagnostics/Tutorials/Adding%20Analyzers%20at%20Runtime or add_analyzers, the aggregator fails to properly update the group and leaves the added diagnostics under /Other/, even after waiting for some time. add_analyzers reports that the service call succeeded.
Looking at the logs, the aggregator gives:
[ WARN] [1519235533.166823508]: Bond for namespace /startup_analyzers was broken
[ WARN] [1519235533.171578828]: Broken bond tried to remove an analyzer which didn't exist.
I don't see any reasons why as the node adding diags (/startup_analyzers) is still running and I can see /diagnostics_agg/bond pub and sub when rostopic info. I'm running all of this on localhost.
The problem can be reproduced when following the tutorial http://wiki.ros.org/action/fullsearch/diagnostics/Tutorials/Adding%20Analyzers%20at%20Runtime?action=fullsearch&context=180&value=linkto%3A%22diagnostics%2FTutorials%2FAdding+Analyzers+at+Runtime%22#Overview at least on Kinetic@Ubuntu 16.04
There is only one node adding diags. Any idea or is that directly a Bond issue ?
The HeaderlessTopicDiagnostic object registers a callback with the Updater class that is not removed when the HeaderlessTopicDiagnostic goes out of scope. The next update() then leads to a segfault.
Here is code to reproduce the issue:
using namespace diagnostic_updater;
Updater updater;
double min_freq{0.}, max_freq{std::numeric_limits<double>::infinity()};
auto diag = std::make_unique<TopicDiagnostic>( "topic_name", updater,
FrequencyStatusParam(&min_freq, &max_freq, 0, 5), TimeStampStatusParam(0, 1));
updater.force_update(); // ok
diag.reset();
updater.force_update(); // segfault
I'm not sure about the Python version, but the diagnostics_updater in C++ crashes with std::runtime_error "Time is out of dual 32-bit range" if the diagnostic_period parameter is < 1.
The reason lies in diagnostics_updater::update_diagnostic_period(), where ros::Duration overflows if period_ < old_period:
https://github.com/ros/diagnostics/blob/groovy-devel/diagnostic_updater/include/diagnostic_updater/diagnostic_updater.h#L520
This bug cost me a few hours some years ago when I first ran into it (and didn't get to the bottom of it). It just cost me a few more, when a colleague reviewing my code that reads DiagnosticStatus messages in my custom DiagnosticAnalyzer asked why the name of the message I was matching to was missing the first character.
The problem is at line 542 of diagnostic_updater.h. It appears there was an original intent to remove the leading "/" from the node name, which provides the default initializer for the first part of the message name. However, if I provide a custom initializer name for the diagnostic_updater, line 542 removes its first character.
One can ask, "why specify the name"? No special reason - I didn't realize the normal procedure is to just instantiate the updater without arguments and it would choose the node name for the first part of diagnostic_updater message names. There's no documentation that says how the message name is generated. Using the default constructor initializers will be my fix.
This is not a major bug, but at a minimum it ought to be documented. I think it's probably too dangerous to change the behavior: it would break the hack I previously put in, which adds a space in front of the updater name. A less dangerous fix might be to change the default initializer to ros::this_node::getName().substr(1) and remove the .substr at line 542, but that would still break hacks that pre-padded the name.
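The difference between unconditionally dropping the first character and dropping only a leading "/" can be illustrated with a small sketch. This is a Python model of the string handling only, not the C++ code in diagnostic_updater.h; `status_prefix` is a hypothetical name.

```python
def status_prefix(updater_name):
    """Drop one leading '/' if present; leave custom names untouched."""
    # An unconditional slice such as updater_name[1:] would turn "custom" into "ustom".
    return updater_name[1:] if updater_name.startswith("/") else updater_name
```

As noted above, a change along these lines would alter the names produced for anyone who pre-padded the updater name, so it is not a drop-in fix.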
It is disappointing that there's no tutorial - only an example.cpp that is quite unprofessional in its variable/method names & data (Ex. stat.add("Stupidicity of this updater", 1000.);). I would be willing to turn example.cpp into a C++ tutorial, but would welcome a reviewer, or maybe a python co-contributor. example.cpp does seem to have most of the words you'd want in a wiki tutorial. This is where we could explain the bug/feature in the constructor. Also, I've seen somewhere where compilable code had tutorial text embedded in it such that you could check that it builds and also have it serve as a tutorial, but I don't remember where.
Thoughts?
Aggregated diagnostics can be viewed with the rqt_robot_monitor application. This is great, but it requires launching RQT in an X Window session. Instead, the rosdiagnostic command lets you quickly view the active diagnostics on any robot without requiring the heavyweight X Window libraries.
Executing catkin_make -DCATKIN_ENABLE_TESTING=0 fails with:
Linking CXX executable (...)/diagnostics_ws/devel/lib/diagnostic_aggregator/analyzer_loader
/usr/bin/ld: cannot find -lgtest
collect2: error: ld returned 1 exit status
make[2]: *** [(...)/diagnostics_ws/devel/lib/diagnostic_aggregator/analyzer_loader] Error 1
diagnostic_aggregator in version 1.8.4 needs the gtest library, but this is only discovered for linking when testing is enabled in catkin.
The Travis builds of this repo currently seem to fail on:
ImportError: "from catkin_pkg.package import parse_package" failed: No module named catkin_pkg.package
This Python package is in python-catkin-pkg, which is a dependency of python-catkin, which is installed in the .travis file.
Currently there are 2 PRs failing on that error:
Another recent PR does not fail on this error but fails already earlier: #70
The present behaviour is to only publish the aggregated diagnostics at the fixed rate (default 1Hz):
I believe that it would be better to trigger an immediate publish of the aggregated topic when a diagnostic transitions to WARN or ERROR. We have some error reporting mechanisms that currently have to subscribe to /diagnostics, and could instead subscribe to /diagnostics_agg if we knew that a) no error reports would be missed, and b) new error reports would be passed through immediately.
Thoughts?
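The proposed trigger amounts to detecting a level transition per diagnostic item. A minimal sketch of that check follows; it is not aggregator code, `needs_immediate_publish` is a hypothetical name, and treating any change into WARN or ERROR (including ERROR to WARN) as a trigger is an assumption about the intended policy.

```python
OK, WARN, ERROR, STALE = 0, 1, 2, 3  # levels as in diagnostic_msgs/DiagnosticStatus


def needs_immediate_publish(previous_level, new_level):
    """True when an item has just entered WARN or ERROR from a different level."""
    return new_level in (WARN, ERROR) and new_level != previous_level
```

An item that stays at ERROR across updates would not trigger again, so the fixed-rate publish would still carry the steady-state information.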
Will you be able to release into Indigo soon? Looks like most of the hardware drivers are blocked here by diagnostic_updater.
Thanks!
I'm running some pre-release tests and noticed it is missing for some reason.
http://packages.ros.org/ros-shadow-fixed/ubuntu/pool/main/r/ros-kinetic-diagnostic-aggregator/
specifically,
amd64.deb for xenial
Some tests were not building properly (linker issue). I fixed that in a69d762. But some of the tests fail:
[ERROR] [1381942884.706085587]: No analyzers initialzed in AnalyzerGroup /analyzer_loader/analyzers
/data/code/hydro_catkin_ws/src/diagnostics/diagnostic_aggregator/test/analyzer_loader.cpp:56: Failure
Value of: analyzer_group.init(path, nh)
Actual: false
Expected: true
[ FAILED ] AnalyzerLoader.analyzerLoading (355 ms)
-- run_tests.py: execute commands
/usr/bin/cmake -E make_directory /data/code/hydro_catkin_ws/build/test_results/diagnostic_analysis
/usr/bin/nosetests -P --process-timeout=60 /data/code/hydro_catkin_ws/src/diagnostics/diagnostic_analysis/test/bag_csv_test.py --with-xunit --xunit-file=/data/code/hydro_catkin_ws/build/test_results/diagnostic_analysis/nosetests-test.bag_csv_test.py.xml
E
======================================================================
ERROR: Failure: ImportError (No module named diagnostic_analysis.exporter)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nose/loader.py", line 390, in loadTestsFromName
addr.filename, addr.module)
File "/usr/lib/python2.7/dist-packages/nose/importer.py", line 39, in importFromPath
return self.importFromDir(dir_path, fqname)
File "/usr/lib/python2.7/dist-packages/nose/importer.py", line 86, in importFromDir
mod = load_module(part_fqname, fh, filename, desc)
File "/data/code/hydro_catkin_ws/src/diagnostics/diagnostic_analysis/test/bag_csv_test.py", line 51, in <module>
from diagnostic_analysis.exporter import LogExporter
ImportError: No module named diagnostic_analysis.exporter
-------------------- >> begin captured logging << --------------------
rospy.topics: INFO: topicmanager initialized
--------------------- >> end captured logging << ---------------------
----------------------------------------------------------------------
Ran 1 test in 0.001s
FAILED (errors=1)
[ RUN ] DiagnosticUpdater.testFrequencyStatus
/data/code/hydro_catkin_ws/src/diagnostics/diagnostic_updater/test/diagnostic_updater_test.cpp:141: Failure
Value of: stat[1].level
Actual: '\x1' (1)
Expected: 0
within max frequency but reported error
/data/code/hydro_catkin_ws/src/diagnostics/diagnostic_updater/test/diagnostic_updater_test.cpp:142: Failure
Value of: stat[2].level
Actual: '\x1' (1)
Expected: 0
within min frequency but reported error
[ FAILED ] DiagnosticUpdater.testFrequencyStatus (521 ms)
[ RUN ] DiagnosticUpdater.testTimeStampStatus
/data/code/hydro_catkin_ws/src/diagnostics/diagnostic_updater/test/diagnostic_updater_test.cpp:166: Failure
Value of: stat[2].level
Actual: '\x1' (1)
Expected: 0
now not accepted
[ FAILED ] DiagnosticUpdater.testTimeStampStatus (0 ms)
and finally, self_test/no_id_selftest never ends... so I could not run the other tests...
As seen on ROS answers: http://answers.ros.org/question/199874/no-definition-of-libsensors4-dev-for-os-osx/
Doesn't build on OSX because libsensors4-dev isn't available on OSX.
@mitchellwills thoughts on how to fix this? It's easy enough to make the build process skip the libsensors node if libsensors isn't available, but we'd still have to figure out how to specify the dependency properly, because ROS doesn't support the notion of system-specific or optional dependencies.
Thoughts?
To determine the update frequency, the diagnostic_updater divides by the time since the last update here: https://github.com/ros/diagnostics/blob/indigo-devel/diagnostic_updater/include/diagnostic_updater/update_functions.h#L174
If the time has not changed since then (this happened to me when playing a rosbag with use_sim_time), we divide by zero.
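A guard for the zero-length window could look like the sketch below. It models the arithmetic only; `window_frequency` is a hypothetical name, and returning `None` for an empty window is one possible choice that the caller would then have to handle.

```python
def window_frequency(events, window_start_s, now_s):
    """Events per second over [window_start_s, now_s], or None if the window is empty."""
    window = now_s - window_start_s
    if window <= 0.0:
        # Simulated time may not have advanced between two updates.
        return None
    return events / window
```

The caller could, for example, skip the frequency check for that cycle instead of reporting a bogus value.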
This is necessary for diagnostic updater to work right in a managed ROS 2 node.
Similar to what was done in ros2/geometry2#108
Hi,
Can you update the hydro release of the self_test CMake so that it checks whether Catkin testing is enabled?
I need to cross-compile self_test for usb_cam on Angstrom, but I need it to skip the tests.
Thanks!
Given the sum total of issues #16, #24, #27 and #28, it sounds like the way testing nodes are exported to downstream packages needs to change.
The best proposed solution I've heard (thanks @wjwwood ! ) is to install the sources for the analyzer_loader and selftest_rostest, and provide an explicit set of cmake macros which will compile and run them as needed. Once that's written, the docs will need to be updated; at least http://wiki.ros.org/self_test , http://wiki.ros.org/diagnostics/Tutorials/Creating%20a%20Diagnostic%20Analyzer and http://wiki.ros.org/diagnostic_aggregator
I've exhausted my budget of employer-funded time to work on this for the next few months. If someone needs this fixed urgently, they'll have to provide a pull request.
The GenericAnalyzer processes startswith before slashes are replaced with spaces, and remove_prefix after the slashes are replaced. This causes issues with the find_and_remove_prefix parameter: the name is matched using the slash, but the prefix is then not removed, because the slash does not match the space.
For example, suppose a node is in a namespace, has the name /ns/mynode, and generates a status message with the full name ns/mynode: myname.
You would then expect the following to match and remove the prefix, but it does not:
find_and_remove_prefix: 'ns/mynode: '
Instead you must use:
startswith: 'ns/mynode: '
remove_prefix: 'ns mynode: '
This can be demonstrated by either viewing the /diagnostics_agg topic or using the rqt_gui plugin
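The ordering problem can be modeled with a short Python sketch. This is a simplified model of the behavior described in this report, not the aggregator's actual implementation; the helper names `process_name` and `find_and_remove_prefix` are hypothetical.

```python
def process_name(name, startswith=None, remove_prefix=None):
    """Model the processing order described above.

    Returns the cleaned name, or None if the status does not match.
    """
    # Step 1: matching happens on the raw name, slashes still present.
    if startswith is not None and not name.startswith(startswith):
        return None
    # Step 2: slashes are replaced with spaces.
    cleaned = name.replace("/", " ")
    # Step 3: the prefix is stripped from the already-cleaned name.
    if remove_prefix is not None and cleaned.startswith(remove_prefix):
        cleaned = cleaned[len(remove_prefix):]
    return cleaned


def find_and_remove_prefix(name, prefix):
    """Same prefix used for both matching and removal."""
    return process_name(name, startswith=prefix, remove_prefix=prefix)


name = "ns/mynode: myname"
print(find_and_remove_prefix(name, "ns/mynode: "))
print(process_name(name, startswith="ns/mynode: ", remove_prefix="ns mynode: "))
```

In this model the first call matches but leaves the prefix in place, while the second call (the workaround) strips it.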
See https://github.com/ros/rosdistro/blob/master/releases/groovy.yaml#L71
If it's old or just wrong repo, I can pull request. Just need a confirmation.
see
ros/catkin#264
catkin_python_setup() and corresponding setup.py are missing
For an advanced usage of the AnalyzerGroup class, I need to reimplement some of its public methods. The big problem is that reimplementing its virtual methods without access to the main class member, std::vector<boost::shared_ptr<Analyzer> > analyzers_, doesn't really make sense and is more or less impossible.
Is there any specific reason for defining analyzers_ as private and not protected (or for not providing a getter function for it)?
Currently, as a quick fix, I reimplemented the addAnalyzer method and stored the analyzers in an additional std::vector<boost::weak_ptr<Analyzer>>.
When add_analyzer exits, it uses the bond mechanism to shut down the bond, which triggers unloading of the analyzers on the aggregator side.
It looks like the bondpy mechanism, however, isn't reliably ensuring that the aggregator gets triggered. This results in an error message when you reload analyzers that were never unloaded:
[ERROR] [WallTime: 1472626077.441982] add_analyzers did not add any analyzers to diagnostic aggregator: Requested load from namespace /diagnostics/navi_common_diagnostic_analyzers which is already in use
This is difficult to reproduce with small tests. Right now I'm only getting it on a robot with a lot of software running, and even there it does not occur 100% of the time.
Is there any way to, or interest in, adding key/value pairs to diagnostic status messages, with the pairs specified in the analyzer YAML file? I basically want to "tag" a group of diagnostic messages and use that information in some later processing.
Something like the following:
analyzers:
  battery:
    type: ...
    path: ...
    find_and_remove_prefix: ...
    tags:
      - key1: value1
      - key2: value2
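As a rough illustration of what such tagging could do, here is a Python sketch that appends configured tags to a status's key/value list. Everything in it is hypothetical: the `tags` parameter does not exist in diagnostic_aggregator, and plain dicts stand in for the DiagnosticStatus and KeyValue messages.

```python
def apply_tags(status, tags):
    """Return a copy of `status` with each tag appended as a key/value pair.

    status: dict with a "values" list of {"key": ..., "value": ...} entries
    tags:   list of one-entry mappings, as in the YAML proposal above
    """
    values = list(status.get("values", []))
    for tag in tags:
        for key, value in tag.items():
            # Stored as strings here, since this sketch models string key/value pairs.
            values.append({"key": key, "value": str(value)})
    tagged = dict(status)
    tagged["values"] = values
    return tagged


status = {"name": "battery", "values": [{"key": "charge", "value": "87"}]}
tags = [{"key1": "value1"}, {"key2": "value2"}]
print(apply_tags(status, tags))
```

Later processing could then filter on the added keys without knowing which analyzer produced the message.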
Is there interest in additional sensor type support for the sensor_monitor node? I have an implementation of a C++ libsensors-based node (https://github.com/RIVeR-Lab/computer_sensors) that publishes diagnostic data. It also supports types other than temp and fan (it does not currently support voltage explicitly) by adding their properties. Using the API seems like it would be more reliable than parsing the output of sensors.
Would something like this be accepted in a pull request, or would I be better off releasing a standalone package?
With a configuration like this
analyzers:
  my_stale_item:
    type: diagnostic_aggregator/GenericAnalyzer
    path: Nonexistent1
    find_and_remove_prefix: 'nonexistent1'
  my_stale_item_that_should_be_ignored:
    type: diagnostic_aggregator/GenericAnalyzer
    path: Nonexistent2
    find_and_remove_prefix: 'nonexistent2'
    discard_stale: true
    timeout: 5.0
and with no messages published on /diagnostics, I would expect "Nonexistent1" to show up as stale and "Nonexistent2" to not show up in /diagnostics_agg.
However, both are always present in /diagnostics_agg.
Here's a minimal launch file to reproduce:
<launch>
  <node pkg="diagnostic_aggregator" type="aggregator_node" name="diagnostic_aggregator">
    <rosparam>
      analyzers:
        my_stale_item:
          type: diagnostic_aggregator/GenericAnalyzer
          path: Nonexistent1
          find_and_remove_prefix: 'nonexistent1'
        my_stale_item_that_should_be_ignored:
          type: diagnostic_aggregator/GenericAnalyzer
          path: Nonexistent2
          find_and_remove_prefix: 'nonexistent2'
          discard_stale: true
          timeout: 5.0
    </rosparam>
  </node>
  <node name="$(anon monitor)" pkg="rqt_robot_monitor" type="rqt_robot_monitor"/>
</launch>
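The expectation stated in this report can be written down as a small Python model. It encodes only what the reporter expects to see on /diagnostics_agg, not how the aggregator is implemented; the function `expected_aggregate`, its arguments, and the 5.0 second default timeout are assumptions made for illustration.

```python
OK = 0
STALE = 3  # numeric level of diagnostic_msgs/DiagnosticStatus.STALE


def expected_aggregate(analyzers, last_seen, now, default_timeout=5.0):
    """Return {path: level} as the reporter expects it to be published.

    analyzers: mapping of path -> config dict (may hold discard_stale, timeout)
    last_seen: mapping of path -> time of the last matching message
    """
    report = {}
    for path, cfg in analyzers.items():
        timeout = cfg.get("timeout", default_timeout)
        seen = last_seen.get(path)
        is_stale = seen is None or (now - seen) > timeout
        if is_stale and cfg.get("discard_stale", False):
            continue  # expected: a discarded entry is not reported at all
        report[path] = STALE if is_stale else OK
    return report


analyzers = {
    "Nonexistent1": {},
    "Nonexistent2": {"discard_stale": True, "timeout": 5.0},
}
# No messages were ever published on /diagnostics.
print(expected_aggregate(analyzers, last_seen={}, now=10.0))
```

Under this model only "Nonexistent1" is reported, at the STALE level, which is the behavior the report says is missing.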
As seen on ROS answers: http://answers.ros.org/question/185155/error-in-tutorial-creating-a-diagnostic-analyzer/
It isn't possible to follow this tutorial: http://wiki.ros.org/diagnostics/Tutorials/Creating%20a%20Diagnostic%20Analyzer because the test node isn't built or included in the binary package.
The export_csv.py and sparse_csv.py nodes are not installed with the apt builds of the diagnostic_analysis package.
The C++ diagnostic_updater does not accept a node handle in the constructor so it cannot be used correctly in a nodelet.
The same applies to the self_test package (the constructor takes a node handle, but then it is ignored).
Both packages should take a node handle and a private node handle so that names can be resolved correctly in a nodelet.