Comments (18)

roweryan commented on August 19, 2024

elrepo-stack

from packages.

dagwieers commented on August 19, 2024

This seems to be a kernel regression.

Either this is a more generic problem (affecting multiple people), in which case a fix may already exist, or your case is very specific (a unique hardware setup or an uncommon combination), in which case we have to report it upstream to the kernel developers.

In any case, on RHEL6 this kernel is working fine for me on my Thinkpad X220:

[dag@moria ~]$ uname -r
4.3.3-1.el6.elrepo.x86_64

PS What I found interesting is the fact that your hardware was identified as "To Be Filled By O.E.M.", so your hardware vendor missed some steps when building the hardware ;-)

dagwieers commented on August 19, 2024

The kernel log does not show any major changes to add_disk that seem related to your regression:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/block/genhd.c

It may be related to the following change between 4.3.2 and 4.3.3:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/block/genhd.c?id=25520d55cdb6ee289abc68f553d364d22478ff54

This relates to a change to add_disk.

PS Can you confirm that the older kernels still boot fine, as before? (This is to rule out a coincidental hardware failure.)

roweryan commented on August 19, 2024

The famous dagwieers! 👍

You can't give a Thinkpad as an example - EVERYTHING works on a Thinkpad! 😃

I've gone back to kernel-ml-4.2.5-1.el7.elrepo.x86_64 for now, should I try the other ones too?

dagwieers commented on August 19, 2024

It is already good to know that 4.2 works fine, at least it is not a hardware-induced problem. So it's a real regression. Nice catch :-)

I was hoping you might have a 4.3.2 kernel as well (I haven't gone back through the older changes yet). If 4.3.2 fails, test 4.3.1, and maybe 4.3.0. That may give us more context on what's going on.

Also, I assume it consistently bails out with the same error in the add_disk routine in genhd.c, right?

dagwieers commented on August 19, 2024

PS Maybe also try 4.4.0, just to ensure that it's not already fixed somehow. (I don't see any related changes to genhd.c, but it may be indirect).

roweryan commented on August 19, 2024

I have to lug a monitor to the pc to get the screenshot, but I can do that. Where can I get more kernels than the ones listed here: http://elrepo.org/linux/kernel/el7/x86_64/RPMS/

I will test all of the 4.3.x series.

toracat commented on August 19, 2024

You can find earlier versions in the archive. For example:

http://repos.lax-noc.com/elrepo/archive/kernel/el7/x86_64/RPMS/

roweryan commented on August 19, 2024

Thanks! Tested. Here are the results:

  • kernel-ml-4.2.5-1.el7.elrepo.x86_64 OK
  • kernel-ml-4.3.0-1.el7.elrepo.x86_64 CRASH WITH STACKTRACE
  • kernel-ml-4.3.1-1.el7.elrepo.x86_64 CRASH WITH STACKTRACE
  • kernel-ml-4.3.2-1.el7.elrepo.x86_64 CRASH WITH STACKTRACE
  • kernel-ml-4.3.3-1.el7.elrepo.x86_64 CRASH WITH STACKTRACE
  • kernel-ml-4.4.0-1.el7.elrepo.x86_64 CRASH WITHOUT STACKTRACE
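
For anyone repeating this kind of test, the results above can be reduced to a regression window mechanically. This is only a minimal Python sketch, assuming the package naming shown in the list; the `results` mapping and the helpers `version_key` and `regression_window` are hypothetical names for illustration, not part of any ELRepo tooling:

```python
import re

# Boot results copied from the list above; True means the kernel booted.
results = {
    "kernel-ml-4.2.5-1.el7.elrepo.x86_64": True,
    "kernel-ml-4.3.0-1.el7.elrepo.x86_64": False,
    "kernel-ml-4.3.1-1.el7.elrepo.x86_64": False,
    "kernel-ml-4.3.2-1.el7.elrepo.x86_64": False,
    "kernel-ml-4.3.3-1.el7.elrepo.x86_64": False,
    "kernel-ml-4.4.0-1.el7.elrepo.x86_64": False,
}


def version_key(package):
    """Return (major, minor, patch, release) parsed from a kernel-ml package name."""
    match = re.match(r"kernel-ml-(\d+)\.(\d+)\.(\d+)-(\d+)\.", package)
    if match is None:
        raise ValueError(f"unexpected package name: {package}")
    return tuple(int(part) for part in match.groups())


def regression_window(results):
    """Return (last_good, first_bad) when packages are walked in version order."""
    last_good = None
    for package in sorted(results, key=version_key):
        if results[package]:
            last_good = package
        else:
            return last_good, package
    return last_good, None


last_good, first_bad = regression_window(results)
print("last good:", last_good)
print("first bad:", first_bad)
```

With these results the window sits between 4.2.5 and 4.3.0, so the change that introduced the failure landed somewhere in the 4.3 development cycle rather than in a 4.3.x stable update.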

Attached are screencaps, potato quality unfortunately.

[Screenshots: 4.3.0, 4.3.0 (large), 4.3.1, 4.3.2, 4.3.3, 4.4.0]

dagwieers commented on August 19, 2024

These are the only changes to block/genhd.c between 4.2 and 4.3:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/diff/block/genhd.c?id=v4.3&id2=v4.2
What's more interesting are the exact commit messages:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/block/genhd.c?id=b54e5ed8f285d62c0d242c4ef9da90937994db02
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/block/genhd.c?id=6c71013ecb7e2bddbed9f5b95e7aed22c491daa9

And between 4.3 and 4.4:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/diff/block/genhd.c?id=v4.3&id2=v4.4
With the related commit message:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/block/genhd.c?id=25520d55cdb6ee289abc68f553d364d22478ff54

Also note that this is not exactly a fatal error, just a warning (caused by the slowpath code). That may be why 4.4 behaves differently even though it still does not work (for another reason, related or unrelated). It is clear that 4.4 gets a bit further before locking up.

How long did you leave it in this state? Since it did not really panic or escalate to a fatal error, it might just be waiting for a timeout or, worse, be stuck in some sort of deadlock. In both cases I would expect some output after a while.

Would be interesting to see what the kernel developers are going to ask and what the eventual cause is.

roweryan commented on August 19, 2024

Do I need to do anything so that the kernel developers see my stack trace?

I'll try and test the lockup tonight.

roweryan commented on August 19, 2024

kernel-ml-4.4.0-2.el7.elrepo.x86_64 is also unbootable :(

[Screenshot: 4.4.0-2]

roweryan commented on August 19, 2024

sde is the usb boot disk if that helps. sd[a-d] are the btrfs volumes.
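
Here `sd[a-d]` is shell-style glob shorthand for sda through sdd. A small Python sketch of how that pattern splits the five disks, assuming the machine's disks are exactly sda to sde (the `devices` list is an assumption based on this comment, not read from the system):

```python
from fnmatch import fnmatchcase

# Assumed device names, taken from the comment above rather than from /dev.
devices = ["sda", "sdb", "sdc", "sdd", "sde"]

# "sd[a-d]" matches "sd" followed by exactly one character in the range a-d.
btrfs_disks = [name for name in devices if fnmatchcase(name, "sd[a-d]")]
other_disks = [name for name in devices if name not in btrfs_disks]

print("btrfs volumes:", btrfs_disks)
print("everything else (the USB boot disk):", other_disks)
```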

roweryan commented on August 19, 2024

I also reported this to the kernel bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111501

roweryan commented on August 19, 2024

I think this might be the fix: https://git.kernel.org/cgit/linux/kernel/git/mkp/scsi.git/commit/?h=4.5/scsi-queue&id=221255aee67ec1c752001080aafec0c4e9390d95

It should be on 4.5 - any chance of a patch before that? :D

toracat commented on August 19, 2024

4.5 is here for testing:
http://elrepo.org/people/ajb/devel/kernel-ml/el6/x86_64/RPMS/
http://elrepo.org/people/ajb/devel/kernel-ml/el7/x86_64/RPMS/

roweryan commented on August 19, 2024

Yahooooooooooo! IT BOOTS!!!!! THANKS!

dagwieers commented on August 19, 2024

Aha, sde was a USB device!
We could have looked at the USB changes in the changelog too, then.

Oh well, more knowledgeable people beat us to it ;-)
