Comments (10)
Since WeakDictionary is a port of WeakHashSet in Java, has anyone taken a long hard look at the WeakDictionary code to see why it's not more performant and why some of its tests fail in debug mode?
Yes, Vincent Van Den Burghe looked at this a while back, and this was a "best effort" that could be done using pure managed code. It uses WeakReference to allow references to be GC'd; however, the dictionary entries still need to be cleaned up after the WeakReference dies, and therein lies the bottleneck. Most read/write operations do a CleanIfNeeded() operation, which isn't very efficient.
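To illustrate why the cleanup step is costly, here is a minimal sketch of a weak-keyed cache in pure managed code. It is not the Lucene.NET WeakDictionary source: the class name NaiveWeakCache and its members are hypothetical, it uses a linear list rather than a hash table, and it sweeps on every call rather than only when a sweep is needed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not the actual WeakDictionary implementation.
public sealed class NaiveWeakCache<TKey, TValue> where TKey : class
{
    // Each entry holds its key weakly so the GC is free to collect it.
    private readonly List<KeyValuePair<WeakReference<TKey>, TValue>> entries =
        new List<KeyValuePair<WeakReference<TKey>, TValue>>();

    public int EntryCount => entries.Count;

    public void Add(TKey key, TValue value)
    {
        CleanIfNeeded();
        entries.Add(new KeyValuePair<WeakReference<TKey>, TValue>(
            new WeakReference<TKey>(key), value));
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        CleanIfNeeded();
        foreach (var entry in entries)
        {
            // Keys are compared by reference identity in this sketch.
            if (entry.Key.TryGetTarget(out TKey target) && ReferenceEquals(target, key))
            {
                value = entry.Value;
                return true;
            }
        }
        value = default(TValue);
        return false;
    }

    // Walks every entry and drops those whose key has been collected.
    // The GC clears the WeakReference, but the entry itself stays in the
    // list until managed code like this removes it.
    private void CleanIfNeeded()
    {
        entries.RemoveAll(e => !e.Key.TryGetTarget(out _));
    }
}
```

The point of the sketch is that a dead entry lingers until some later read or write happens to sweep it out, and that sweep has to visit every entry.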
Using native resources is really the only way we can be sure that dictionary entries are "removed" as soon as their contained reference goes out of scope. ConditionalWeakTable does exactly what we need, but in .NET Standard 2.0 it is missing 2 of the APIs we need, which is a deal breaker in those 5 cases mentioned above.
Since no version of .NET Framework yet supports .NET Standard 2.1
Microsoft considers .NET Framework to be "finished" and there will be no further development on it, including upgrading to .NET Standard 2.1. This is a problem that won't go away until everyone moves from .NET Framework to .NET 5+, which is now the best upgrade path.
This problem dies with .NET Framework (meaning it will become less of an issue over time) and we have a solution that works for now, but is less than ideal. Additionally, WeakDictionary has been marked internal. Therefore, this issue is fairly low priority.
That being said, there are other components of J2N that could benefit performance-wise by using native code as well. It would be great if we set up a precedent for building and deploying native code with J2N.
from lucenenet.
Is related to: GH-610
from lucenenet.
The source code for the dotnetcore implementation is here https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/ConditionalWeakTable.cs
from lucenenet.
Thanks. An attempt has already been made on this. The blocker is the fact that it depends on DependentHandle which has native resources.
DependentHandle calls native methods using [MethodImpl(MethodImplOptions.InternalCall)].
[MethodImpl(MethodImplOptions.InternalCall)]
private static extern IntPtr nInitialize(object primary, object? secondary);
[MethodImpl(MethodImplOptions.InternalCall)]
private static extern object? nGetPrimary(IntPtr dependentHandle);
[MethodImpl(MethodImplOptions.InternalCall)]
private static extern object? nGetPrimaryAndSecondary(IntPtr dependentHandle, out object? secondary);
[MethodImpl(MethodImplOptions.InternalCall)]
private static extern void nSetPrimary(IntPtr dependentHandle, object? primary);
[MethodImpl(MethodImplOptions.InternalCall)]
private static extern void nSetSecondary(IntPtr dependentHandle, object? secondary);
[MethodImpl(MethodImplOptions.InternalCall)]
private static extern void nFree(IntPtr dependentHandle);
It is not currently known how to translate that into a call that can be done from a 3rd party library such as J2N. Some important questions to answer:
- Can we assume the native method exists for the current platform, since .NET Standard should "just work"?
- Do we need to bundle the native resources in J2N to call them?
- If we need to bundle native resources, do they work on all platforms or do we need separate builds for specific platforms?
Note that [MethodImpl(MethodImplOptions.InternalCall)] doesn't exist on .NET Standard 1.x, but it is probably time to drop support for that target anyway.
from lucenenet.
Regarding DependentHandle:
Can we assume the native method exists for the current platform
This is a head scratcher. On one hand, the static extern is likely referencing some native binary implementation. This would suggest the binary implementation for each platform would have to be bundled with J2N so that it is available, and that doesn't feel like the right direction. On the other hand, this call is being made from .NET Core 3.0 code, which would suggest that the appropriate binaries must already be in place on all platforms that .NET Core 3.0 supports. So in the end, my money is on those extern methods being available everywhere .NET Core 3.0 supports. That of course does not ensure .NET Standard 2.0 support by any means, which brings us back to having to embed OS-specific binaries, which sounds like Yuk!
So I did some poking around and it appears there is an identical mono class that does not make extern calls. Incorporating this as the J2N implementation may be the way to go. https://github.com/dotnet/runtime/blob/master/src/mono/netcore/System.Private.CoreLib/src/System/Runtime/CompilerServices/DependentHandle.cs
from lucenenet.
So I did some poking around and it appears there is an identical mono class that does not make extern calls. Incorporating this as the J2N implementation may be the way to go. https://github.com/dotnet/runtime/blob/master/src/mono/netcore/System.Private.CoreLib/src/System/Runtime/CompilerServices/DependentHandle.cs
Good find. However, upon closer inspection, GC.register_ephemeron_array(data); is also a native call and so is the source of GC.EPHEMERON_TOMBSTONE.
So, either way it appears we need to go native to solve this. However, it looks like using the approach of Mono is simpler.
from lucenenet.
Yep it looks like you are correct.
Another option, and maybe one I like even better, is to use compiler directives to use the existing WeakDictionary for .NET Standard 2.0 and to use ConditionalWeakTable for .NET Standard 2.1 or later.
This avoids the whole "port ConditionalWeakTable to .NET Standard 2.0" altogether. This way the code works on .NET Standard 2.0 but is even more performant in highly concurrent environments when running on .NET Standard 2.1 or later.
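A rough sketch of that idea might look like the following. Everything named here is an assumption for illustration: FEATURE_CONDITIONALWEAKTABLE_ADDORUPDATE is a hypothetical symbol that the project file would define only for targets that have the newer API, WeakCache is a hypothetical wrapper, and the #else branch is a simplified stand-in for WeakDictionary rather than the real class. AddOrUpdate is used as an example of a member that, to my knowledge, is not available in .NET Standard 2.0; the thread does not name the two missing APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical wrapper; the compilation symbol below is an assumed,
// user-defined constant, not one provided by the SDK.
public sealed class WeakCache<TKey, TValue>
    where TKey : class
    where TValue : class
{
#if FEATURE_CONDITIONALWEAKTABLE_ADDORUPDATE
    // Newer targets: let the runtime manage entry lifetime.
    private readonly ConditionalWeakTable<TKey, TValue> table =
        new ConditionalWeakTable<TKey, TValue>();

    public void Set(TKey key, TValue value) => table.AddOrUpdate(key, value);

    public bool TryGet(TKey key, out TValue value) => table.TryGetValue(key, out value);
#else
    // Older targets: simplified stand-in for the WeakDictionary fallback.
    private readonly object syncRoot = new object();
    private readonly List<KeyValuePair<WeakReference<TKey>, TValue>> entries =
        new List<KeyValuePair<WeakReference<TKey>, TValue>>();

    public void Set(TKey key, TValue value)
    {
        lock (syncRoot)
        {
            // Drop dead entries and any existing entry for this key, then add.
            entries.RemoveAll(e => !e.Key.TryGetTarget(out TKey k) || ReferenceEquals(k, key));
            entries.Add(new KeyValuePair<WeakReference<TKey>, TValue>(
                new WeakReference<TKey>(key), value));
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        lock (syncRoot)
        {
            foreach (var e in entries)
            {
                if (e.Key.TryGetTarget(out TKey k) && ReferenceEquals(k, key))
                {
                    value = e.Value;
                    return true;
                }
            }
        }
        value = null;
        return false;
    }
#endif
}
```

Callers see the same Set/TryGet surface on every target, so only the wrapper needs the directive. One difference worth keeping in mind: the stand-in holds values strongly, so a value that references its own key would keep that key alive, whereas ConditionalWeakTable is designed to handle that case.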
from lucenenet.
We are already using conditional compilation to minimize usage of WeakDictionary; this issue is mainly about
- Improving performance on .NET Framework and other platforms that don't support .NET Standard 2.1
- Factoring WeakDictionary (and its tests) out of Lucene.NET altogether
Do note that WeakDictionary is basically a straight port from WeakHashSet in Java and some of its tests fail when running in Debug mode in addition to having performance issues with cleanup. AFAIK this does not cause issues with Lucene.NET, but it is not very encouraging.
We could potentially add an option in .NET Standard 2.0 for CachedOrdinalsReader to use ConditionalWeakTable by disabling the functionality of RamBytesUsed() when the user doesn't require it, which would eliminate one usage of WeakDictionary.
However, that would still leave WeakDictionary in
- FieldCacheImpl
- AssertingScorer
- CachingWrapperFilter
- ThreadedIndexingAndSearchingTestCase
Since the main usage (and main bottleneck) of WeakDictionary on .NET Standard 2.0 is in FieldCacheImpl, that wouldn't gain us very much.
from lucenenet.
Ok, that's all great information. Thank you!
Since no version of .NET Framework yet supports .NET Standard 2.1 and since .NET Framework 4.6.1 does support .NET Standard 2.0, if the performance hit of using WeakDictionary is large then I can totally see where you are coming from.
Since WeakDictionary is a port of WeakHashSet in Java, has anyone taken a long hard look at the WeakDictionary code to see why it's not more performant and why some of its tests fail in debug mode?
from lucenenet.
Moving this to Future since this is not blocking the 4.8.0 release.
from lucenenet.