icsharpcode / nullabilityinference

Global type inference for C# 8 nullable reference types

License: MIT License

C# 100.00%

csharp dotnet dotnetcore c-sharp-8 nullable-reference-types tool

nullabilityinference's Issues

Inference status: ICSharpCode.Decompiler

The primary use-case (how I ended up starting this project) was trying to annotate ILSpy's decompiler engine, which quickly felt like a task that could be automated.

Timeline:

March 2010 (yes, over 10 years ago): I had an idea for a null checking analysis as an analysis on IL code. I implemented some of that idea, but it didn't work well enough for my liking and I gave up on the idea. The original idea of the nullability constraint graph (back then I called it "nullability subtyping graph") is from back then. So is the idea of using the minimum cut for minimizing the number of warnings (back then: errors reported by the analysis tool).
Early May 2020: I wanted to annotate ILSpy's decompiler engine (ICSharpCode.Decompiler) with nullable reference types. But it felt like a bunch of monotonous work that ought to be automated. I realized that the C# 8 nullable reference type system is somewhat similar to what I did 10 years earlier, and that my ideas may be applicable to an inference tool. An inference tool doesn't need to be perfect to be useful; it just needs to automate the vast majority of the work.
2020-05-09: I started building the NullabilityInference prototype.
2020-05-17: The prototype can handle some individual code files from ICSharpCode.Decompiler.Utils
2020-06-08: The prototype finally supports enough language features to run over the whole ICSharpCode.Decompiler without crashing with a NotImplementedException
- Use .NET Core 3 + enable NRT, but not using inference --> 2714 warnings
- After running InferNull --> 1134 warnings.
- Some bugfixes reduce this to 1120 warnings.
2020-06-13: Implementing flow analysis (#5) gets us down to 1079 warnings.
2020-06-21: [NotNullWhen(true)]-inference finally works correctly --> 722 warnings.

In the remaining warnings, I see some categories of problems occurring repeatedly:

- unconstrained generics: we can't infer [AllowNull] yet
- generics: we can't infer T: notnull constraints yet
- uninitialized fields: ILSpy has many classes where fields are initialized not in the constructor, but by other methods (e.g. a single public method serves as an entry point for a class and initializes a bunch of fields, with private methods relying on the fields already being initialized)
- Roslyn doesn't realize that fields are initialized when the ctor calls a property setter
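The property-setter case is easy to reproduce in isolation. The following is a minimal sketch, assuming a hypothetical Widget class that is not taken from ILSpy. With nullable reference types enabled, the C# 8 compiler is expected to report CS8618 (non-nullable field is uninitialized) on the constructor, even though the setter it calls always assigns the field.

```csharp
#nullable enable
using System;

// Hypothetical class, for illustration only.
public class Widget
{
    private string name; // non-nullable field, never assigned directly in the ctor

    public string Name
    {
        get => name;
        set => name = value ?? throw new ArgumentNullException(nameof(value));
    }

    public Widget(string name)
    {
        // The setter initializes the field, but the compiler does not look
        // into the setter body, so it still warns about 'name' here.
        Name = name;
    }
}

public static class WidgetDemo
{
    public static void Main()
    {
        var w = new Widget("ilspy");
        Console.WriteLine(w.Name);
    }
}
```

At runtime the field is always initialized, so the warning is a false positive that an inference tool cannot remove just by changing annotations.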

Reference assemblies lack annotations

Even on .NET Core 3.1, not all system libraries are annotated.

For example, the System.Linq methods are lacking annotations.
This means a call like collection.FirstOrDefault() will:

incorrectly allow nullable collections
incorrectly allow the return value to be non-nullable

This can cause significantly wrong annotations to be inferred (and then accepted by the compiler without warning).
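A small sketch of the second problem, assuming a hypothetical helper named FirstName. When compiled against a reference assembly where System.Linq is not annotated, the return statement should compile without a nullability warning, so the non-nullable return type looks valid to both the compiler and the inference. Against annotated reference assemblies the same line is expected to produce a warning.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqDemo
{
    // Hypothetical helper: declared as returning a non-nullable string.
    public static string FirstName(List<string> names)
    {
        // FirstOrDefault returns default(string), i.e. null, for an empty
        // sequence. Without annotations on System.Linq the compiler cannot
        // see that, and accepts this return as non-nullable.
        return names.FirstOrDefault();
    }

    public static void Main()
    {
        string name = FirstName(new List<string>());
        Console.WriteLine(name is null); // the "non-nullable" string is null
    }
}
```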

Maybe we should somehow include the .NET 5 annotations with the inference tool, so that it can produce useful results on projects targeting .NET Core 3 or even .NET Framework 4.x?

Distribute as dotnet tool

It would be more convenient to install and use this tool as a dotnet tool.

Crashing on documentation comment types

Hello!
I tried running InferNull on the SharpZipLib source and it failed with a duplicate key exception.
I added some debugging output to see what is going on, and it seems like it chokes on the documentation comment types:

Perhaps this is a known issue? I will try substituting the missing SyntaxTypes to continue testing.

The initial duplicate key exception was thrown in ICSharpCode.NullabilityInference.SyntaxToNodeMapping.CreateNewNode() and the missing key was in ICSharpCode.NullabilityInference.SyntaxToNodeMapping[TypeSyntax syntax]

[return: NotNullIfNotNull(paramName)]

[return: NotNullIfNotNull(paramName)] is a semi-common attribute to use, especially in some code bases that like to use:
if (input == null) return null;
at the start of many functions.

Unlike [NotNullWhen(bool)] for out parameters, I don't see a clean way to infer NotNullIfNotNull with our current algorithm.
But it would be valuable to figure something out, so I'm creating this issue to collect some cases of [NotNullIfNotNull] methods and their constraint graphs.
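As a first case for that collection, here is a minimal sketch of the pattern, assuming a hypothetical Normalize method. The attribute itself is the real System.Diagnostics.CodeAnalysis.NotNullIfNotNullAttribute. The return type is nullable only because of the early return, and the attribute tells callers that a non-null argument yields a non-null result.

```csharp
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;

public static class NormalizeDemo
{
    // Hypothetical method showing the "if (input == null) return null;" idiom.
    [return: NotNullIfNotNull("input")]
    public static string? Normalize(string? input)
    {
        if (input == null) return null;
        return input.Trim().ToLowerInvariant();
    }

    public static void Main()
    {
        // The argument is known to be non-null, so the compiler treats the
        // result as non-null and accepts the assignment to 'string'.
        string s = Normalize(" Hello ");
        Console.WriteLine(s.Length);

        // With a possibly-null argument the result stays 'string?'.
        string? maybe = Normalize(null);
        Console.WriteLine(maybe is null);
    }
}
```

The difficulty for inference is that the return value's nullability is tied to one specific parameter, rather than to a single node that is either nullable or not.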

Flow analysis

Currently our inference re-uses Roslyn's flow-analysis.

However, this has some fundamental problems.

9:	Dictionary<T, Node> mapping = new Dictionary<T, Node>();

11:	Node? GetNode(T element)
	{
13:		Node? node;
14:		if (!mapping.TryGetValue(element, out node))
15:		{
16:			node = new Node();
17:			mapping.Add(element, node);
18:		}
19:		return node;
	}

There are no edges created for lines 16/17 because here Roslyn knows that node is non-null.
Line 14 creates two edges: one from <nullable> because TryGetValue will assign null when returning false; the other from mapping!1#2 (the mapping field's Node type argument) when TryGetValue returns true.

The return statement creates an edge from the variable's type, because Roslyn's flow analysis can't guarantee us that the variable is non-null -- our Roslyn code analysis runs under the pessimistic assumption that all types-to-be-inferred might end up nullable, so it considers mapping to be Dictionary<T, Node?>, which leaves open the possibility that GetNode returns null.

However, after our inference decides that mapping!1#2 is non-null, it would be correct to also indicate that the GetNode return value is non-null. After all, if no node exists yet, the function will create one.

The issue here is that Roslyn's flow analysis isn't aware of our types-to-be-inferred.
It would be better if, instead of using Roslyn's flow analysis, we had our own that keeps track of node's nullability.

The idea would be to create additional "helper" graph nodes for the different flow states of a local variable of reference type.
After TryGetValue initializes node, its flow-state would be (true: mapping!1#2, false: <nullable>). Within the if body, the flow-state would initially be <nullable>, but after line 16 would change to <nonnull>.
After the if, the flow-state from both alternatives could be re-combined by creating a new node "node-after-line-18" and edges from the nodes from the two if-branches -- in this case <nonnull> from the then-branch and mapping!1#2 from the else branch.
Then the return statement would create an edge from this "node-after-line-18" instead of the node for the variable's declared type.
All flow-state nodes associated with a variable would have an edge to the variable's declared type node.
We'd end up with a graph somewhat like this:

Thus in the end, node would be inferred as nullable, but the GetNode return type would only depend on mapping!1#2 and thus can be inferred depending on whether there's a mapping.Add(x, null) access somewhere else in the program.
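The proposed graph can be sketched in a few lines of code. This is a simplified illustration under stated assumptions, not the tool's implementation: the ConstraintGraph class and the node labels "node (declared)" and "GetNode return" are made up for this example, and plain reachability from <nullable> stands in for the real inference, which uses a minimum cut.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified constraint graph: an edge a -> b means
// "if a is nullable, then b must be nullable as well".
public sealed class ConstraintGraph
{
    private readonly Dictionary<string, List<string>> edges =
        new Dictionary<string, List<string>>();

    public void AddEdge(string from, string to)
    {
        if (!edges.TryGetValue(from, out var targets))
        {
            targets = new List<string>();
            edges.Add(from, targets);
        }
        targets.Add(to);
    }

    // Assumption: everything reachable from the start node is inferred nullable.
    public HashSet<string> ReachableFrom(string start)
    {
        var seen = new HashSet<string> { start };
        var pending = new Stack<string>();
        pending.Push(start);
        while (pending.Count > 0)
        {
            if (!edges.TryGetValue(pending.Pop(), out var targets)) continue;
            foreach (var target in targets)
                if (seen.Add(target)) pending.Push(target);
        }
        return seen;
    }
}

public static class FlowStateDemo
{
    public static ConstraintGraph BuildGetNodeGraph(bool addNullSomewhere)
    {
        var g = new ConstraintGraph();
        // Flow states after TryGetValue (line 14) and after line 16.
        foreach (var state in new[] { "<nullable>", "mapping!1#2", "<nonnull>", "node-after-line-18" })
            g.AddEdge(state, "node (declared)"); // every flow state feeds the declared type
        // Re-combine both branches after the if.
        g.AddEdge("<nonnull>", "node-after-line-18");   // then-branch
        g.AddEdge("mapping!1#2", "node-after-line-18"); // else-branch
        // The return statement uses the flow state, not the declared type.
        g.AddEdge("node-after-line-18", "GetNode return");
        // Models a mapping.Add(x, null) elsewhere in the program.
        if (addNullSomewhere) g.AddEdge("<nullable>", "mapping!1#2");
        return g;
    }

    public static void Main()
    {
        foreach (var addNull in new[] { false, true })
        {
            var nullable = BuildGetNodeGraph(addNull).ReachableFrom("<nullable>");
            Console.WriteLine("Add(x, null) present: " + addNull
                + ", node nullable: " + nullable.Contains("node (declared)")
                + ", GetNode return nullable: " + nullable.Contains("GetNode return"));
        }
    }
}
```

In this sketch the declared type of node is nullable in both runs, while the GetNode return type becomes nullable only when the extra edge into mapping!1#2 is present.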
