Comments (7)

gha-zund commented on May 24, 2024

@calculator@S300L160371 is one of our durable entities. We use them to do some calculations (incl. reading some data from a storage account as input and then writing the result to a database). There is one such calculator entity for each IoT device we have.

In our production environment, we currently have about 1.5k such entities, which should do their calculations more or less in parallel. However, each device must run only one calculation at a time, so we need some synchronization. Durable entities could give us that.
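The requirement above (many devices in parallel, but at most one calculation per device at a time) can be illustrated with a minimal in-process sketch. It is not the Durable Functions or durabletask-mssql API; the `calculate` function, the per-device lock table, the doubling "calculation", and the second device ID are all assumptions made up for illustration.

```python
import asyncio
from collections import defaultdict

# Hypothetical in-process stand-in for per-entity serialization; this is
# not the Durable Functions or durabletask-mssql API.
locks = defaultdict(asyncio.Lock)   # one lock per device ID
active = defaultdict(int)           # calculations currently running per device
max_active = defaultdict(int)       # highest concurrency observed per device


async def calculate(device_id: str, value: int) -> int:
    # Only one calculation per device may hold that device's lock at a time.
    async with locks[device_id]:
        active[device_id] += 1
        max_active[device_id] = max(max_active[device_id], active[device_id])
        await asyncio.sleep(0)      # yield so other tasks get a chance to run
        active[device_id] -= 1
        return value * 2            # placeholder for the real calculation


async def main() -> list:
    devices = ["S300L160371", "S300L160372"]  # second ID is made up
    jobs = [calculate(d, i) for i in range(3) for d in devices]
    return await asyncio.gather(*jobs)


results = asyncio.run(main())
```

Within a single process a lock per key is enough. Across several function app instances the same guarantee has to come from the backend, which is what durable entities are expected to provide here.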

However, we have the feeling that in some scaling scenarios it is not 100% guaranteed that only one instance of an entity is active at a time across all function app instances.
Maybe the null reference comes from a concurrency scenario?

I think we are using the latest released packages:
[screenshot not preserved]

from durabletask-mssql.

cgillum commented on May 24, 2024

OK, no worries. I'll close this issue in the meantime but ping me if you see it again and we can reopen.

gha-zund commented on May 24, 2024

I just realized that this exception always comes together with several warning logs (warnings come first).

Category: DurableTask.Core
Event name: DiscardingMessage
Message: {entityId}: Discarding [EventRaised]: Received work-item for an invalid orchestration
Details: Received work-item for an invalid orchestration

gha-zund commented on May 24, 2024

Ah, one more piece of information: we are using in-process functions with .NET 6.

cgillum commented on May 24, 2024

Glad to hear that the MSSQL package is mostly working well for you. :)

The NullReferenceException makes me think there is a bug in our code that's somehow specific to @calculator@S300L160371. For example, maybe there's some unexpected or missing state in the database. Because this exception is going unhandled in the SQL provider code, it's resulting in throttling by the Durable Task Framework, which will impact the throughput of the entire VM.

I think we need to look at this more closely to understand what's causing the null-ref and try to fix it. Which versions of the DurableTask NuGet packages are you using?

cgillum commented on May 24, 2024

@gha-zund with the information you provided above, I was able to locate your Durable Entity instance and see this exception and the warnings you mentioned. I suspect that there's some error case that results in storing invalid state into the database. Corrupt state in the database will trigger the warning you're seeing, and I suspect that warning is what leads to the exception.

I'm not sure if it's related to concurrency - the SQL backend has strong transactional consistency which should make it more resistant to state corruption. However, we can't completely rule it out yet.

To root-cause this, I'll need to see what data you have in your database. Are you able to run the following query against your SQL database and share the results? Feel free to scrub out any data that can't be shared publicly.

DECLARE @EntityID varchar(50) = '@calculator@S300L160371'
SELECT * FROM dt.Instances WHERE InstanceID = @EntityID
SELECT * FROM dt.History WHERE InstanceID = @EntityID
SELECT * FROM dt.NewEvents WHERE InstanceID = @EntityID

Ideally this would be captured when the errors are ongoing for a particular entity, and you can change the entity ID used in the query if needed. If the problem self-heals, then the result may not contain anything useful. The most important result would be the data in the dt.History table, which would give us a clue about whether there is corruption.

gha-zund commented on May 24, 2024

Alright, thanks for the hints!

I fear the data is no longer available, since we purged the history for all entities to get a clean state for further evaluation...
The warnings have not occurred in the last 24 hours, so maybe the problem is gone too.

We will keep our eyes open and provide the data if the warnings appear again!
