Deadlock in tf2's buffer_core (geometry2, closed)

wjwwood commented on August 18, 2024
Deadlock in tf2's buffer_core

from geometry2.

Comments (6)

wjwwood commented on August 18, 2024

The problem function is tf2::BufferCore::walkToTopParent, which is private and only called in two places:

So I made this diff:

diff --git a/tf2/src/buffer_core.cpp b/tf2/src/buffer_core.cpp
index 3151ebb..aef66b2 100644
--- a/tf2/src/buffer_core.cpp
+++ b/tf2/src/buffer_core.cpp
@@ -354,7 +354,7 @@ int BufferCore::walkToTopParent(F& f, ros::Time time, CompactFrameID target_id,
       {
         std::stringstream ss;
         ss << "The tf tree is invalid because it contains a loop." << std::endl
-           << allFramesAsString() << std::endl;
+           << allFramesAsStringNoLock() << std::endl;
         *error_string = ss.str();
       }
       return tf2_msgs::TF2Error::LOOKUP_ERROR;
@@ -404,7 +404,7 @@ int BufferCore::walkToTopParent(F& f, ros::Time time, CompactFrameID target_id,
       {
         std::stringstream ss;
         ss << "The tf tree is invalid because it contains a loop." << std::endl
-           << allFramesAsString() << std::endl;
+           << allFramesAsStringNoLock() << std::endl;
         *error_string = ss.str();
       }
       return tf2_msgs::TF2Error::LOOKUP_ERROR;

With this change, walkToTopParent never calls a function that takes the lock, since every caller of walkToTopParent already holds it.

This seems to work around my issue, but the other option might be to use a recursive lock?
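
For comparison, here is a minimal sketch of the recursive-lock alternative. It uses std::recursive_mutex from the standard library rather than Boost, and FrameStore, addFrame, and loopError are hypothetical stand-ins for illustration, not tf2's actual code.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for tf2::BufferCore; not the real class.
class FrameStore {
public:
  void addFrame(const std::string& name) {
    assert(!name.empty());
    std::lock_guard<std::recursive_mutex> lock(mutex_);
    frames_.push_back(name);
  }

  // Takes the lock on its own, like allFramesAsString() in the diff above.
  std::string allFramesAsString() const {
    std::lock_guard<std::recursive_mutex> lock(mutex_);
    std::string out;
    for (const std::string& f : frames_) {
      out += "Frame " + f + "\n";
    }
    return out;
  }

  // Holds the lock, then calls another member that locks again.
  // With a non-recursive mutex this nested lock on the same thread is
  // the self-deadlock described in this issue; a recursive mutex lets
  // the owning thread lock again.
  std::string loopError() const {
    std::lock_guard<std::recursive_mutex> lock(mutex_);
    return "The tf tree is invalid because it contains a loop.\n" +
           allFramesAsString();
  }

private:
  mutable std::recursive_mutex mutex_;
  std::vector<std::string> frames_;
};
```

A recursive mutex removes the hang without renaming any calls, but it also hides which functions expect the lock to be held already, so the NoLock approach in the diff keeps that contract more explicit.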

wjwwood commented on August 18, 2024

I just ran into a second one, here is the backtrace:

0   __lll_lock_wait /lib/x86_64-linux-gnu/libpthread.so.0   0   0x7f093804f89c  
1   _L_lock_858 /lib/x86_64-linux-gnu/libpthread.so.0   0   0x7f093804b065  
2   pthread_mutex_lock  /lib/x86_64-linux-gnu/libpthread.so.0   0   0x7f093804aeba  
3   boost::mutex::lock  mutex.hpp   52  0x7f093b363167  
4   boost::unique_lock<boost::mutex>::lock  locks.hpp   412 0x7f093b364aac  
5   boost::unique_lock<boost::mutex>::unique_lock   locks.hpp   290 0x7f093b3649e5  
6   tf2::BufferCore::allFramesAsString  buffer_core.cpp 809 0x7f0932258ede  
7   tf2::BufferCore::getLatestCommonTime    buffer_core.cpp 920 0x7f0932259469  
8   tf2::BufferCore::walkToTopParent<tf2::TransformAccum>   buffer_core.cpp 302 0x7f0932261fff  
9   tf2::BufferCore::lookupTransform    buffer_core.cpp 549 0x7f0932258052  
10  tf::Transformer::lookupTransform    tf.cpp  243 0x7f093733acb1  
11  tf::Transformer::transformPose  tf.cpp  497 0x7f093733c525  
12  rviz::FrameManager::transform   frame_manager.cpp   250 0x7f093b3612c2  
13  rviz::FrameManager::getTransform    frame_manager.cpp   215 0x7f093b360e47  
14  rviz::TFLinkUpdater::getLinkTransforms  tf_link_updater.cpp 59  0x7f093b3f5baa  
15  rviz::Robot::update robot.cpp   733 0x7f093b3f1511  
16  rviz::RobotModelDisplay::update robot_model_display.cpp 219 0x7f08eebedcba  
17  rviz::DisplayGroup::update  display_group.cpp   234 0x7f093b35b24a  
18  rviz::VisualizationManager::onUpdate    visualization_manager.cpp   315 0x7f093b44448f  
19  rviz::VisualizationManager::qt_static_metacall  moc_visualization_manager.cxx   65  0x7f093b45ecfb  
20  QMetaObject::activate(QObject*, QMetaObject const*, int, void**)    /usr/lib/x86_64-linux-gnu/libQtCore.so.4    0   0x7f0935fb3281  
... <More>              

Looks like it is still coming from walkToTopParent, but this time the deadlock is occurring in getLatestCommonTime, which also calls allFramesAsString.

wjwwood commented on August 18, 2024

getLatestCommonTime is more of a problem because it is called from both public and private functions...

wjwwood commented on August 18, 2024

Though both the public and private uses have this comment above them:

    ros::Time latest_time;
    // TODO: This is incorrect, but better than nothing.  Really we want the latest time for
    // any of the frames
    getLatestCommonTime(req.target_id, req.source_id, latest_time, 0);

tfoote commented on August 18, 2024

getLatestCommonTime is private; _getLatestCommonTime is public and deprecated. I'll add the lock to it and call the no-lock variant inside the private method.
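
The split described here (lock in the public entry point, no locking in the private helper) could look roughly like the sketch below. FrameTimes, setLatest, and lookupLatest are hypothetical, the signatures are simplified and do not match tf2, and taking the minimum of two stamps is only a placeholder for the real common-time computation.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for tf2::BufferCore; real signatures differ.
class FrameTimes {
public:
  void setLatest(const std::string& frame, double stamp) {
    assert(!frame.empty());
    std::lock_guard<std::mutex> lock(mutex_);
    latest_[frame] = stamp;
  }

  // Public wrapper: takes the lock, then delegates to the private helper.
  bool _getLatestCommonTime(const std::string& a, const std::string& b,
                            double& out) const {
    std::lock_guard<std::mutex> lock(mutex_);
    return getLatestCommonTime(a, b, out);
  }

  // Another public entry point that already holds the lock and can
  // reuse the helper without locking a second time.
  bool lookupLatest(const std::string& a, const std::string& b,
                    double& out) const {
    std::lock_guard<std::mutex> lock(mutex_);
    return getLatestCommonTime(a, b, out);
  }

private:
  // Precondition: the caller holds mutex_. This function never locks.
  bool getLatestCommonTime(const std::string& a, const std::string& b,
                           double& out) const {
    auto ia = latest_.find(a);
    auto ib = latest_.find(b);
    if (ia == latest_.end() || ib == latest_.end()) {
      return false;
    }
    out = std::min(ia->second, ib->second);  // placeholder computation
    return true;
  }

  mutable std::mutex mutex_;
  std::map<std::string, double> latest_;
};
```

Because only public members lock and private members assume the lock is held, no call path can try to take the non-recursive mutex twice on the same thread.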

wjwwood commented on August 18, 2024

+1
