LTO-related compilation problems (slim, open issue)

messerlab commented on June 11, 2024
LTO-related compilation problems

from slim.

Comments (28)

molpopgen commented on June 11, 2024

molpopgen commented on June 11, 2024

molpopgen commented on June 11, 2024

> Also someone on Red Hat Enterprise Linux Server release 6.6 (Santiago), above. Seems like a pretty mixed bag. I don't know why it bites particular people and not others.

What may be happening is that more errors are being seen on CentOS/RHEL because they are commonly deployed on clusters. The Debian family (including Ubuntu and Pop OS) seems to find its home primarily on desktop/laptop/workstation setups, which may be more rare.

signalogic commented on June 11, 2024

For anyone who arrives here after hours of scouring Stack Overflow and other forums: on Ubuntu 16.04 with ldd 2.23 or 2.24 we had to turn off LTO (i.e. not use -flto in the compile flags), or else we would see linker messages like "hidden symbol `our_sym' in /tmp/ccuYFfn5.ltrans4.ltrans.o is referenced by DSO". We tried changing the link order, applying __attribute__((visibility("default"))) to specific symbols, -Wl,-export_symbol, etc. We continuously test on gcc versions from 4.6.4 to 11.3, and so far only these versions of ldd have this issue (it happened with gcc 6.2, 6.5, and 7.4).

molpopgen commented on June 11, 2024

Not enough info. That is an old distro, so we need the compiler version. You may also be able to test on Travis?

bhaller commented on June 11, 2024

OK. I've alerted the user who reported the problem to the existence of this Github issue; hopefully he will reply here with more details. Thanks.

jasongbragg commented on June 11, 2024

This might be an obscure issue that will affect very few. If so, sorry to be a pest! It affected one machine I tried, and did not affect another.

The machine where SLiM 3 did not compile was running Ubuntu 16.04 (my error, sorry), with the following compiler info:

$ g++ --version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

On another machine, running Ubuntu 14.04 and c++ 4.8.4, everything compiled.

Some of the compiler errors that were observed:
/usr/bin/ar: CMakeFiles/gsl.dir/gsl/cblas/xerbla.c.o: plugin needed to handle lto object
/usr/bin/ar: CMakeFiles/gsl.dir/gsl/cblas/dgemv.c.o: plugin needed to handle lto object
/usr/bin/ar: CMakeFiles/gsl.dir/gsl/cblas/dtrmv.c.o: plugin needed to handle lto object

/usr/bin/ranlib: xerbla.c.o: plugin needed to handle lto object
/usr/bin/ranlib: dgemv.c.o: plugin needed to handle lto object
/usr/bin/ranlib: dtrmv.c.o: plugin needed to handle lto object

The following post seemed to replicate the errors, and suggested a solution, which seemed to work:
https://stackoverflow.com/questions/39236917/using-gccs-link-time-optimization-with-static-linked-libraries

bhaller commented on June 11, 2024

@molpopgen It's interesting that the errors seem to be in the GSL code. SLiM includes the GSL files that it needs within its own project; it does not have an external link dependency on the GSL. However, perhaps on Jason's 16.04 machine it was trying to link against his installed GSL somehow, and that didn't have the LTO compilation support that was needed? That's just a wild guess, I don't really understand any of this. :-> Anyway, since Jason is the only person who has reported this problem, and it's on a machine that he says is unusually configured, punting might be reasonable. If the fix makes sense to you, though, and seems harmless, then possibly it would make sense to take it...?

molpopgen commented on June 11, 2024

bhaller commented on June 11, 2024

OK, I'll do that for now; it can always be reopened if someone else encounters this issue.

bhaller commented on June 11, 2024

Hi @molpopgen. Another user has reported the same issue, so I'm reopening this issue. This user is on Red Hat Enterprise Linux Server release 6.6 (Santiago), using g++ 7.3.0 – a different platform and a different compiler version than the previous user. His compile produced the same "plugin needed to handle lto object" errors as for the other user, and ultimately failed. He reports that more or less the same fix works for him:

SET(CMAKE_AR "gcc-ar")
SET(CMAKE_C_ARCHIVE_CREATE "<CMAKE_AR> qcs <TARGET> <LINK_FLAGS> <OBJECTS>")
SET(CMAKE_C_ARCHIVE_FINISH true)

But I think you're right that this fix seems gcc-specific. Is there a way to enclose the fix lines inside something that says "do this only if the compiler is gcc"?
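One way to scope the workaround to GCC, offered here as a minimal sketch and not as anything taken from SLiM's actual CMakeLists.txt, is to test CMAKE_C_COMPILER_ID. The variable names GCC_AR_PATH and GCC_RANLIB_PATH are hypothetical, and the fragment assumes it is placed after the project() call so that the compiler has already been detected:

```cmake
# Sketch only: apply the gcc-ar / gcc-ranlib workaround when the C compiler is GCC.
# Assumes this runs after project(), so CMAKE_C_COMPILER_ID is already populated.
if(CMAKE_C_COMPILER_ID STREQUAL "GNU")
    # GCC_AR_PATH and GCC_RANLIB_PATH are hypothetical names chosen for this example.
    find_program(GCC_AR_PATH gcc-ar)
    find_program(GCC_RANLIB_PATH gcc-ranlib)

    # Only override the archiver tools if both plugin-aware wrappers were found;
    # otherwise leave CMake's defaults untouched.
    if(GCC_AR_PATH AND GCC_RANLIB_PATH)
        set(CMAKE_AR "${GCC_AR_PATH}")
        set(CMAKE_RANLIB "${GCC_RANLIB_PATH}")
    endif()
endif()
```

With other compilers (for example Clang, whose compiler ID is "Clang" or "AppleClang") the block is skipped and the default archiver is used.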

I don't know why this bites only certain people; I think the LTO stuff is working fine for most people. Nevertheless, it is proving to be a hassle. I've done some timing tests, and it looks like the LTO change is producing a measurable speedup, but an extremely small one – well under 1% for most models. So if there's not a simple fix here I'm tempted to pull the LTO change.

molpopgen commented on June 11, 2024

bhaller commented on June 11, 2024

OK. Indeed, @petrelharp & co. are the ones who set up cmake for SLiM in the first place; I have only a vague understanding of it.

petrelharp commented on June 11, 2024

Can you point me at what these LTO changes were?

bhaller commented on June 11, 2024

@petrelharp, here: #28

douglasgscofield commented on June 11, 2024

Here to say, found the same thing in a large compute cluster compiling version 3.2.1. We're running 'CentOS Linux release 7.6.1810 (Core)' and I used cmake/3.13.2 and gcc/7.4.0 for compilation.

I got around it two different ways, either:

  • adding SET(CMAKE_AR "gcc-ar") and SET(CMAKE_RANLIB "gcc-ranlib") within the if that is true when LTO support is detected (in the top-level CMakeLists.txt, after line 74-ish), or
  • adding -DCMAKE_AR=$(which gcc-ar) -DCMAKE_RANLIB=$(which gcc-ranlib) to the cmake command

I'm sure these solutions are specific to using gcc, and I don't know CMake so I can't suggest alternative code for a pull request.

bhaller commented on June 11, 2024

@petrelharp, is anybody on your end working on this issue? If not, I think I might just remove the LTO stuff from the cmake file; the performance difference it makes is not large, and too many people are running into this issue. If we don't have a general fix for this, I think we should just pull LTO until such time as we do. @molpopgen, thoughts?

molpopgen commented on June 11, 2024

bhaller commented on June 11, 2024

I have just commented out the LTO stuff for now. I plan to leave this issue open until such time as somebody figures out how to put the LTO stuff back in without breaking people's builds; it would be nice to have it enabled.

gshamov commented on June 11, 2024

First: I do have a similar problem building on CentOS7 with GCC 7.3.0. LTO is detected by CMake, but then for every GSL and Table object I get messages about it. Like,

ranlib: vector.c.o: plugin needed to handle lto object

Second: why would you think that using a canned GSL with BLAS built from C sources is a good idea? Most HPC systems (like ours) do provide (m)any versions of GSL, and we have vendor-specified linear algebra libraries like OpenBLAS or MKL which will be easily an order of magnitude faster.

molpopgen commented on June 11, 2024

@bhaller It is interesting to note that this issue seems only to be reported from CentOS users so far?

bhaller commented on June 11, 2024

> @bhaller It is interesting to note that this issue seems only to be reported from CentOS users so far?

@molpopgen and ubuntu, as with the original reporter (see top), right? But I'm not a Linux person at all, so for all I know those are the same thing. :->

bhaller commented on June 11, 2024

> First: I do have a similar problem building on CentOS7 with GCC 7.3.0. LTO is detected by CMake, but then for every GSL and Table object I get messages about it. Like,
>
> ranlib: vector.c.o: plugin needed to handle lto object

OK, thanks for the report. The LTO stuff is now disabled in the GitHub head version of SLiM, so that should build fine for you. That will be released as SLiM 3.3 soon.

> Second: why would you think that using a canned GSL with BLAS built from C sources is a good idea? Most HPC systems (like ours) do provide (m)any versions of GSL, and we have vendor-specified linear algebra libraries like OpenBLAS or MKL which will be easily an order of magnitude faster.

I'm happy to discuss this – interesting question! – but it doesn't belong in this thread; perhaps you could open a new issue?

molpopgen commented on June 11, 2024

Looking back on the first report: 14.04 worked, but 16.04 didn't. That's odd, IMO. But, things happen. My development box where fwdpy11 is tested is 16.04 with GCC 5.5 and everything works fine there. One key difference is that I am not trying to compile GCC on that system ever.

bhaller commented on June 11, 2024

Also someone on Red Hat Enterprise Linux Server release 6.6 (Santiago), above. Seems like a pretty mixed bag. I don't know why it bites particular people and not others.

bhaller commented on June 11, 2024

> What may be happening is that more errors are being seen on CentOS/RHEL because they are commonly deployed on clusters. The Debian family (including Ubuntu and Pop OS) seems to find its home primarily on desktop/laptop/workstation setups, which may be more rare.

That makes sense, especially since many people using SLiM on a desktop will be using macOS, and thus the double-click installer or Xcode, rather than building at the command line.

molpopgen commented on June 11, 2024

One option for a short-term fix is to allow opt-in for FLTO. The default would be OFF, and then invoking something like cmake . -DUSE_FLTO=1 would enable it.
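A rough sketch of what such an opt-in switch might look like, using CMake's standard CheckIPOSupported module. This is an illustration under stated assumptions, not SLiM's actual build logic: it assumes cmake_minimum_required(VERSION 3.9) or newer (so that policy CMP0069 is NEW) and placement after project(). The variable names lto_supported and lto_error are hypothetical.

```cmake
# Sketch only: make LTO opt-in, defaulting to OFF.
# Assumes cmake_minimum_required(VERSION 3.9) or newer and a preceding project() call.
option(USE_FLTO "Enable link-time optimization" OFF)

if(USE_FLTO)
    include(CheckIPOSupported)
    # lto_supported / lto_error are hypothetical names chosen for this example.
    check_ipo_supported(RESULT lto_supported OUTPUT lto_error)
    if(lto_supported)
        # Turns on interprocedural optimization for targets created after this point.
        set(CMAKE_INTERPROCEDURAL_OPTIMIZATION ON)
    else()
        message(WARNING "USE_FLTO was requested, but LTO is not supported: ${lto_error}")
    endif()
endif()
```

A plain cmake . would then build without LTO, while cmake . -DUSE_FLTO=1 would request it and fall back with a warning if the toolchain check fails.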

bhaller commented on June 11, 2024

I'm inclined to leave it as is until a complete fix comes along. The performance gains I measured were quite small, so it's not really worth complicating the story for users; if it can work automatically that's great, but if it requires a switch, documentation, etc., then meh. Thanks for thinking about it, though, I appreciate it.
