Comments (6)

rwynn commented on September 23, 2024

Hi @apetrik-nh

Do you delete documents in the parent collection?

I added this commit, which ignores delete events on the parent collection, since you only use the parent through a relate and have keep-src set to false. In that case the parent document is never indexed, so delete events on it might be the source of the 404 deletes.

4454a4c

from monstache.

apetrik-nh commented on September 23, 2024

@rwynn, thank you for the quick reply.
We never delete collections, and in the majority of cases we do not delete parent or child documents; instead we mark them with a "deleted" flag. I doubt your fix will help, as the failed transactions are logged against the IDs of the child documents. Our internal assumption was that if a child document is marked as deleted and the parent document is then changed, the engine tries to reindex all the children of that parent and tries to delete the child again in Elasticsearch, which fails. But we cannot replicate this use case in our local environment (no "bulk failed" logs appear there), while in production we constantly see those errors in the logs.
PS: do you have donation details so that we can support your library and your work? Like a PayPal?

rwynn commented on September 23, 2024

@apetrik-nh your assumption sounds like it's along the right lines. My guess (if I understand your case) is that something like this is happening...

  • soft deleted flag set on a parent document
  • via relate child documents run through your golang script Map function
  • you detect deleted flag and return Drop=true to delete each child from the search index.
  • later, parent updated in some other way
  • same thing happens but deletes on all children 404 since they were previously removed

So, in your Map function, if I remember correctly, you get passed a "change doc" which represents what actually changed in MongoDB. I think you could use this to determine in the case above whether a Drop=true or a Skip=true is warranted. E.g. if deleted in updatedFields then Drop=true else Skip=true.

https://www.mongodb.com/docs/manual/reference/change-events/update/
e.g.

"updateDescription": {
   "updatedFields": {
      "email": "[email protected]"
   },
   "removedFields": ["phoneNumber"],
   "truncatedArrays": [ {
      "field" : "vacation_time",
      "newSize" : 36
   } ]
}
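
A minimal sketch of that check in Go, under some loud assumptions: the soft-delete flag is a boolean field named "deleted", and both the document and the updateDescription arrive as plain map[string]interface{} values. The Decision type and the decide and fieldUpdated helpers are hypothetical names for illustration and are not part of monstache; how the change document actually reaches your Map function is not shown here.

```go
package main

// Decision is a hypothetical stand-in for the Drop and Skip flags that a
// Map function would set on its output. It is not a monstache type.
type Decision struct {
	Drop bool // remove the document from the search index
	Skip bool // leave the search index untouched for this event
}

// fieldUpdated reports whether field appears under "updatedFields" in an
// updateDescription shaped like the change event example above.
// Assumption: nested values are plain map[string]interface{}. With the
// MongoDB Go driver they may be a named map type such as bson.M, in which
// case this type assertion would need adjusting.
func fieldUpdated(updateDescription map[string]interface{}, field string) bool {
	updated, ok := updateDescription["updatedFields"].(map[string]interface{})
	if !ok {
		return false // no update description, or an unexpected shape
	}
	_, present := updated[field]
	return present
}

// decide picks Drop only for the change that actually set the flag, and
// Skip for any later change to an already soft-deleted document.
// Assumption: the flag is a boolean field named "deleted".
func decide(doc, updateDescription map[string]interface{}) Decision {
	deleted, _ := doc["deleted"].(bool)
	if !deleted {
		return Decision{} // not soft deleted: index as usual
	}
	if fieldUpdated(updateDescription, "deleted") {
		return Decision{Drop: true} // the flag changed in this very event
	}
	return Decision{Skip: true} // dropped earlier: avoid a second delete
}
```

With this shape, the first update (the one that sets the flag) yields Drop, and any later update to the same soft-deleted document yields Skip, so no second delete is sent to the index.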

rwynn commented on September 23, 2024

I think returning Drop=true from a script is the only way you can influence monstache to turn an insert or update into an Elasticsearch delete. So, if you are not actually deleting anything from watched collections, that would be where I would look.

Monstache also has a Process escape hatch that basically lets you process the events yourself, in which case a delete could also happen there. But you would have to code the delete yourself in Process (add a delete request to the bulk processor).

rwynn commented on September 23, 2024

Thanks for the offer to support this development in some way, @apetrik-nh. Unfortunately, I don't have much time to invest in monstache these days, so I just try my best to make a little improvement here and there (and accept pull requests).

I would love to hear if you and your team are interested in, and in a position to, take better care of the project.

apetrik-nh commented on September 23, 2024

Thank you for all your replies. We will try to reproduce this massive logging issue in a controlled environment, using your hint that deletes can only come from the Map script. We may also make the logic smarter so that it returns Skip instead of Drop when the deletion is not happening in the current change.
On the contribution side, our team has zero experience in Golang, and even the current script is a bit painful to support because of that missing expertise.
