Deadlock in indexer (cdt, CLOSED)

eclipse-cdt commented on July 24, 2024
Deadlock in indexer

from cdt.

Comments (10)

jonahgraham commented on July 24, 2024

I have had two runs now with identical stack traces, so pretty sure this is what is going on.

Extracts of key threads:

"main" #1 prio=6 os_prio=0 cpu=6380.63ms elapsed=721.72s tid=0x00007f29ec0267b0 nid=0x1eaa7c waiting for monitor entry  [0x00007f29f128e000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at java.lang.Object.wait(java.base@17.0.3/Native Method)
	- waiting on <no object reference available>
	at org.eclipse.cdt.internal.core.pdom.PDOMManager.joinIndexer(PDOMManager.java:1173)
	- locked <0x00000000e8e07aa0> (a [Z)
	at org.eclipse.cdt.core.testplugin.util.BaseTestCase5.waitForIndexer(BaseTestCase5.java:165)
	at org.eclipse.cdt.core.testplugin.util.BaseTestCase.waitForIndexer(BaseTestCase.java:234)
	at org.eclipse.cdt.internal.pdom.tests.PDOMCPPBugsTest.test191679(PDOMCPPBugsTest.java:227)


"Worker-5: Update Monitor" #56 prio=5 os_prio=0 cpu=19.67ms elapsed=712.00s tid=0x00007f2848000e70 nid=0x1eab17 waiting for monitor entry  [0x00007f2878487000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.eclipse.cdt.internal.core.pdom.PDOMManager.isIndexerIdle(PDOMManager.java:751)
	- waiting to lock <0x00000000e2035f38> (a java.util.ArrayDeque)
	at org.eclipse.cdt.internal.core.pdom.PDOMManager$7.done(PDOMManager.java:1153)
	- locked <0x00000000e8e07aa0> (a [Z)
	at org.eclipse.core.internal.jobs.JobListeners$$Lambda$127/0x0000000800da52f0.notify(Unknown Source)

"Worker-0: Notify Index Change Listeners" #43 prio=5 os_prio=0 cpu=709494.63ms elapsed=716.35s tid=0x00007f29ec8e7520 nid=0x1eaadc runnable  [0x00007f29a49f3000]
   java.lang.Thread.State: RUNNABLE
	at org.eclipse.core.internal.jobs.JobManager.schedule(JobManager.java:1353)
	at org.eclipse.core.internal.jobs.InternalJob.schedule(InternalJob.java:390)
	at org.eclipse.core.runtime.jobs.Job.schedule(Job.java:654)
	at org.eclipse.cdt.internal.core.pdom.PDOMManager.enqueue(PDOMManager.java:712)
	- locked <0x00000000e2035f38> (a java.util.ArrayDeque)
	at org.eclipse.cdt.internal.core.pdom.PDOMManager$3.run(PDOMManager.java:1012)
	- locked <0x00000000e2036538> (a java.util.HashMap)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)


"Worker-2: Building" #52 prio=5 os_prio=0 cpu=28.64ms elapsed=712.01s tid=0x00007f28c0001820 nid=0x1eab13 waiting for monitor entry  [0x00007f287888b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.eclipse.cdt.internal.core.pdom.PDOMManager.handlePostBuildEvent(PDOMManager.java:1499)
	- waiting to lock <0x00000000e2036538> (a java.util.HashMap)
	at org.eclipse.cdt.internal.core.pdom.CModelListener.resourceChanged(CModelListener.java:166)

jonahgraham commented on July 24, 2024

I think this is the same regression identified in eclipse-platform/eclipse.platform#193 - I will try updating to a newer I-build, and if that works we may need to discuss what to do about the M1 release this week.

jonahgraham commented on July 24, 2024

With the updated target platform it looks like the problem isn't resolved, but the Java deadlock detector code can identify it now at least:

Found one Java-level deadlock:
=============================
"main":
  waiting to lock monitor 0x00007f37957fa400 (object 0x00000000e9980f88, a [Z),
  which is held by "Worker-11: C/C++ Indexer"

"Worker-11: C/C++ Indexer":
  waiting to lock monitor 0x00007f358801b240 (object 0x00000000e09dbbf0, a java.util.ArrayDeque),
  which is held by "Worker-7: Notify Index Change Listeners"

"Worker-7: Notify Index Change Listeners":
  waiting to lock monitor 0x00007f359c000e70 (object 0x00000000e09dbcf8, a java.util.concurrent.ConcurrentLinkedQueue),
  which is held by "Worker-11: C/C++ Indexer"

Full jstack output jstack.txt
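
For anyone who wants the same check from inside a test run rather than from jstack, here is a minimal sketch using the JDK's ThreadMXBean. The class name DeadlockProbe and its method are made up for illustration and are not part of CDT or Platform; only the java.lang.management calls are real JDK API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not CDT code): lists threads the JVM reports as deadlocked. */
final class DeadlockProbe {
    private DeadlockProbe() {
    }

    /** Returns one line per deadlocked thread, or an empty list if none are found. */
    static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers object monitors and ownable synchronizers; null means no deadlock.
        long[] ids = mx.findDeadlockedThreads();
        List<String> lines = new ArrayList<>();
        if (ids == null) {
            return lines;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            lines.add("\"" + info.getThreadName() + "\" waits for " + info.getLockName()
                    + " held by \"" + info.getLockOwnerName() + "\"");
        }
        return lines;
    }
}
```

A test harness could call describeDeadlocks() from a watchdog thread and fail fast when the list is non-empty, instead of waiting for the overall test timeout.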

jonahgraham commented on July 24, 2024

CDT's immediate issue is fixed (hopefully) with #82 - but that really just hides the problem hoping that Platform resolves the issue in eclipse-platform/eclipse.platform#193

jukzi commented on July 24, 2024

The deadlock described in #81 (comment)
is a deadlock between
org.eclipse.core.internal.jobs.InternalJob.eventQueue
and
org.eclipse.cdt.internal.core.pdom.PDOMManager.idleCondition

CDT should not synchronize both in the JobChangeAdapter
https://github.com/eclipse-cdt/cdt/blob/main/core/org.eclipse.cdt.core/parser/org/eclipse/cdt/internal/core/pdom/PDOMManager.java#L1152

and outside the while loop
https://github.com/eclipse-cdt/cdt/blob/main/core/org.eclipse.cdt.core/parser/org/eclipse/cdt/internal/core/pdom/PDOMManager.java#L1165

Instead, you could synchronize only while waiting
https://github.com/eclipse-cdt/cdt/blob/main/core/org.eclipse.cdt.core/parser/org/eclipse/cdt/internal/core/pdom/PDOMManager.java#L1173

to be notified.

"Worker-11: C/C++ Indexer":
  waiting to lock monitor 0x00007f358801b240 (object 0x00000000e09dbbf0, a java.util.ArrayDeque),
  which is held by "Worker-7: Notify Index Change Listeners"

"Worker-7: Notify Index Change Listeners":
  waiting to lock monitor 0x00007f359c000e70 (object 0x00000000e09dbcf8, a java.util.concurrent.ConcurrentLinkedQueue),
  which is held by "Worker-11: C/C++ Indexer"
===================================================

"Worker-11: C/C++ Indexer":
	at org.eclipse.cdt.internal.core.pdom.PDOMManager.isIndexerIdle(PDOMManager.java:751)
	- waiting to lock <0x00000000e09dbbf0> (a java.util.ArrayDeque)
	at org.eclipse.cdt.internal.core.pdom.PDOMManager$7.done(PDOMManager.java:1153)
	- locked <0x00000000e9980f88> (a [Z)
	at org.eclipse.core.internal.jobs.JobListeners$$Lambda$127/0x0000000800da5228.notify(Unknown Source)
	at org.eclipse.core.internal.jobs.JobListeners.sendEvent(JobListeners.java:63)
	at org.eclipse.core.internal.jobs.JobListeners.sendEvents(JobListeners.java:49)
	- locked <0x00000000e09dbcf8> (a java.util.concurrent.ConcurrentLinkedQueue)
	at org.eclipse.core.internal.jobs.JobManager.endJob(JobManager.java:736)
	at org.eclipse.core.internal.jobs.WorkerPool.endJob(WorkerPool.java:117)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:83)
"Worker-7: Notify Index Change Listeners":
	at org.eclipse.core.internal.jobs.JobListeners.sendEvents(JobListeners.java:48)
	- waiting to lock <0x00000000e09dbcf8> (a java.util.concurrent.ConcurrentLinkedQueue)
	at org.eclipse.core.internal.jobs.JobManager.schedule(JobManager.java:1351)
	at org.eclipse.core.internal.jobs.InternalJob.schedule(InternalJob.java:392)
	at org.eclipse.core.runtime.jobs.Job.schedule(Job.java:654)
	at org.eclipse.cdt.internal.core.pdom.PDOMManager.enqueue(PDOMManager.java:712)
	- locked <0x00000000e09dbbf0> (a java.util.ArrayDeque)
	at org.eclipse.cdt.internal.core.pdom.PDOMManager$3.run(PDOMManager.java:1012)
	- locked <0x00000000e09dc218> (a java.util.HashMap)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)

jonahgraham commented on July 24, 2024

Thank you @jukzi for the analysis. I have no doubt that there are some improvements to be made to the indexer joining code (and this issue needs to be reopened), but it was written more than a decade ago. So the behaviour change in Platform has regressed CDT, and I can't see where CDT violated the API, but perhaps it is because CDT depended on undocumented behaviour.

Anyway, I will see if I can provide a reduced test case, which will either show the remaining issue in platform, or at least allow me to understand the problem more fully. The existing code is way too complicated within CDT, and frustratingly the code tends not to lock up when running under the debugger. It fails quite often (not 100% of the time) when I run the tests in the IDE, but it has never failed when debugging.

jukzi commented on July 24, 2024

Well, the timing in debugging is different, but you could for sure reproduce this obvious deadlock with breakpoints in the synchronized blocks. I understand this deadlock. I will also think about how the platform could get rid of its synchronization.

jukzi commented on July 24, 2024

I was wrong to blame PDOMManager.idleCondition. Sorry. The problem is
PDOMManager.fTaskQueue
which is synchronized on while the Job is scheduled, and is also used in the listeners.

The scheduling should be done outside the synchronized block. The same pattern is already used in

probably same problem in
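
The "schedule outside the synchronized block" idea can be sketched in plain Java roughly as follows. This is not the actual PDOMManager code: TaskQueueSketch, its fields and the Runnable standing in for Job.schedule() are all hypothetical, and the sketch only shows the shape of the change (decide under the lock, call foreign code after releasing it).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a task queue guarded by its own monitor; not CDT's code. */
final class TaskQueueSketch {
    private final Deque<Runnable> tasks = new ArrayDeque<>();
    private final Runnable scheduler; // stands in for Job.schedule()
    private boolean scheduled;        // guarded by the monitor of 'tasks'

    TaskQueueSketch(Runnable scheduler) {
        this.scheduler = scheduler;
    }

    void enqueue(Runnable task) {
        boolean mustSchedule;
        synchronized (tasks) {
            tasks.addLast(task);
            mustSchedule = !scheduled; // only decide under the lock
            scheduled = true;
        }
        // Foreign code (which may take its own locks) runs after the monitor is released.
        if (mustSchedule) {
            scheduler.run();
        }
    }

    /** Returns the next task, or null (and allows rescheduling) when the queue is empty. */
    Runnable poll() {
        synchronized (tasks) {
            Runnable next = tasks.pollFirst();
            if (next == null) {
                scheduled = false;
            }
            return next;
        }
    }

    boolean isIdle() {
        synchronized (tasks) {
            return tasks.isEmpty();
        }
    }

    /** Diagnostic only: true if the calling thread currently holds the queue monitor. */
    boolean queueLockHeldByCurrentThread() {
        return Thread.holdsLock(tasks);
    }
}
```

With this shape, a listener that calls isIdle() from another thread can never be blocked by a thread that is itself stuck inside the scheduler call, because the queue monitor is not held across that call.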

jukzi commented on July 24, 2024

BTW that code is totally wrong:
https://github.com/eclipse-cdt/cdt/blob/main/core/org.eclipse.cdt.core/parser/org/eclipse/cdt/internal/core/pdom/PDOMManager.java#L1149

		JobChangeAdapter listener = new JobChangeAdapter() {
			@Override
			public void done(IJobChangeEvent event) {
				synchronized (idleCondition) {
					if (isIndexerIdle()) {
						idleCondition[0] = true;
						idleCondition.notifyAll();
					}
				}
			}
		};
		Job.getJobManager().addJobChangeListener(listener);

It adds the listener to every job, so that any(!) job that completes will set the idleCondition even if other jobs are still running. And it synchronizes the done() of all(!) jobs. That's why all these unrelated threads:
"Worker-0: Reporting encoding changes.",
"Worker-2: Notify Index Change Listeners",
"Worker-3: C/C++ Indexer"
"Worker-4: Update Monitor"
"Worker-6: Update Job"
"Worker-7: Java problems decoration calculation..."
...
wait for the same synchronized (idleCondition)

What is probably wanted here is to listen only for fIndexerJob to be done. So you could instead add and remove the listener to/from only that Job.
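
A rough, self-contained sketch of that suggestion, using a toy job class instead of the Eclipse Jobs API so it can run on its own. MiniJob and JoinSketch are hypothetical names; in real code the per-job registration would presumably go through Job.addJobChangeListener / Job.removeJobChangeListener on fIndexerJob, and the details of joinIndexer would differ.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

/** Hypothetical miniature job with per-job completion listeners; not the Eclipse API. */
final class MiniJob {
    private final List<Runnable> doneListeners = new CopyOnWriteArrayList<>();
    private volatile boolean finished;

    void addDoneListener(Runnable listener) {
        doneListeners.add(listener);
    }

    void removeDoneListener(Runnable listener) {
        doneListeners.remove(listener);
    }

    boolean isFinished() {
        return finished;
    }

    int listenerCount() {
        return doneListeners.size();
    }

    /** Simulates completion: notifies only the listeners registered on this job. */
    void finish() {
        finished = true;
        for (Runnable listener : doneListeners) {
            listener.run();
        }
    }
}

final class JoinSketch {
    private JoinSketch() {
    }

    /** Waits until 'job' has finished or the timeout elapses; true if it finished. */
    static boolean join(MiniJob job, long timeoutMillis) throws InterruptedException {
        final Object monitor = new Object();
        final boolean[] done = { false };
        // The listener holds the monitor only to set the flag and notify.
        Runnable listener = () -> {
            synchronized (monitor) {
                done[0] = true;
                monitor.notifyAll();
            }
        };
        job.addDoneListener(listener);
        try {
            if (job.isFinished()) {
                return true; // finished before (or while) the listener was registered
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            synchronized (monitor) {
                while (!done[0]) {
                    long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                    if (remaining <= 0) {
                        return false;
                    }
                    monitor.wait(remaining);
                }
                return true;
            }
        } finally {
            job.removeDoneListener(listener);
        }
    }
}
```

Because the listener is attached to one job only, unrelated jobs finishing never touch the monitor, and the listener itself calls nothing that takes another lock while holding it.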

jonahgraham commented on July 24, 2024

You may be right - that heavy joinIndexer code is mostly used in tests to make sure everything is idle.
