This is added specifically in the core but used only for index scenario. Need to find

batchMutateForIndex() needs to be checked for better way

huawei-hadoop commented on September 28, 2024

batchMutateForIndex() needs to be checked for better way

from hindex.

Comments (15)

anoopsjohn commented on September 28, 2024

There is an open JIRA in HBase to support cross-region transactions for regions on the same RS. Once that is done, this should be rewritten to use the same mechanism.

chrajeshbabu commented on September 28, 2024

How batchMutateForIndex differs from batchMutate:

No check for resources (the blocking memstore size check). Since we already check the blocking memstore size initially in postStartRegionOperation, having this check should not cause any problem.
No start/closeRegionOperation calls. With these calls we just acquire the lock and release it, so having them should not cause any problem either.

So I am thinking we can use batchMutate directly instead of having a separate method in the kernel. What do you think?
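
The two differences listed above can be sketched with a toy class. This is only an illustration under assumptions: `ToyRegion` and all of its methods are hypothetical stand-ins, not the actual HRegion code, and the string trace simply records which steps each path runs.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a region; NOT the real HRegion. All names are hypothetical. */
class ToyRegion {
    final List<String> trace = new ArrayList<>();
    final List<String> memstore = new ArrayList<>();

    // Stand-in for the blocking memstore size check.
    private void checkResources() { trace.add("checkResources"); }

    // Stand-ins for acquiring and releasing the region operation lock.
    private void startRegionOperation() { trace.add("startRegionOperation"); }
    private void closeRegionOperation() { trace.add("closeRegionOperation"); }

    private void applyMutations(List<String> puts) {
        memstore.addAll(puts);
        trace.add("apply");
    }

    /** Full path: resource check, then lock, apply, unlock. */
    void batchMutate(List<String> puts) {
        checkResources();
        startRegionOperation();
        try {
            applyMutations(puts);
        } finally {
            closeRegionOperation();
        }
    }

    /** Reduced path as described above: no resource check, no start/close. */
    void batchMutateForIndex(List<String> puts) {
        applyMutations(puts);
    }
}
```

In this sketch, `batchMutate` records the check, the start, the apply and the close, while `batchMutateForIndex` records only the apply.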

anoopsjohn commented on September 28, 2024

No, there were issues with having a resource check. It caused a deadlock. I have forgotten the full context now; Ram can tell.

ramkrish86 commented on September 28, 2024

It was related to a flush and puts happening on the index region. During the puts and the flush, the resource check would run and try to acquire a lock.
If you can find that deadlock issue, we have a supporting test case that proves it: if you revert the change and run that test case it deadlocks, and with the fix it does not. I can check, but it may take some time.

ramkrish86 commented on September 28, 2024

"If you can find out that deadlock issue"
I mean the JIRA for that deadlock issue.

chrajeshbabu commented on September 28, 2024

I will provide the details here

chrajeshbabu commented on September 28, 2024

Description of the deadlock issue. Won't this be solved by the postStart/CloseRegionOperations hooks?

A flush of the main table region is in progress. As part of the flush we wait for the MVCC to complete.

// wait for all in-progress transactions to commit to HLog before
// we can start the flush. This prevents
// uncommitted transactions from being written into HFiles.
// We have to block before we start the flush, otherwise keys that
// were removed via a rollbackMemstore could be written to Hfiles.
mvcc.waitForRead(w);
At the same time the blocking memstore size is reached, so none of the puts to the index table or the main table get through. Now, with the blocking memstore size reached:

// hook to complete the actual put
if (coprocessorHost != null) {
  List<Mutation> mutations = new ArrayList<Mutation>();
  for (int i = firstIndex; i < lastIndexExclusive; i++) {
    // only for successful puts
    if (batchOp.retCodeDetails[i].getOperationStatusCode() != OperationStatusCode.SUCCESS) {
      continue;
    }
    Mutation m = batchOp.operations[i].getFirst();
    mutations.add(m);
  }
  coprocessorHost.postBatchMutate(mutations, walEdit);
}

  // ------------------------------------------------------------------
  // STEP 8. Advance mvcc. This will make this put visible to scanners and getters.
  // ------------------------------------------------------------------
  if (w != null) {
    mvcc.completeMemstoreInsert(w);
    w = null;
  }

We call completeMemstoreInsert only after postBatchMutate, so the flush that was waiting for the MVCC to advance just hangs. This leads to a deadlock.
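
The circular wait described here can be made explicit with a small wait-for graph. This is a sketch, not HBase code: `WaitForGraph` and the node names are hypothetical labels for the steps in this comment, and the edge from the index put to the memstore size assumes the index put performs the blocking resource check.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model: each node waits for at most one other node. */
class WaitForGraph {
    private final Map<String, String> waitsFor = new LinkedHashMap<>();

    void addWait(String waiter, String blocker) {
        waitsFor.put(waiter, blocker);
    }

    /** Returns true if following the wait chain from start revisits a node. */
    boolean hasCycleFrom(String start) {
        Set<String> seen = new HashSet<>();
        String cur = start;
        while (cur != null) {
            if (!seen.add(cur)) {
                return true; // already visited: circular wait
            }
            cur = waitsFor.get(cur);
        }
        return false;
    }

    /** Builds the scenario from this thread, with or without the resource check. */
    static WaitForGraph describedScenario(boolean indexPutChecksResources) {
        WaitForGraph g = new WaitForGraph();
        // The flush blocks in mvcc.waitForRead(w) until the main put completes.
        g.addWait("flush", "mvccComplete");
        // completeMemstoreInsert (STEP 8) runs only after the hook returns.
        g.addWait("mvccComplete", "postBatchMutate");
        if (indexPutChecksResources) {
            // The index put inside the hook waits for the memstore to shrink...
            g.addWait("postBatchMutate", "memstoreBelowBlockingSize");
            // ...which only a completed flush can achieve.
            g.addWait("memstoreBelowBlockingSize", "flush");
        }
        return g;
    }
}
```

With the resource check inside the hook, the chain starting at the flush leads back to the flush. Without that edge, the chain ends at postBatchMutate and nothing waits in a circle.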

chrajeshbabu commented on September 28, 2024

TestForComplexIssues#testHDP3015

ramkrish86 commented on September 28, 2024

@Rajesh
You suggest:

"So I am thinking we can use batchMutate directly instead of having a separate method in the kernel. What do you think?"

Would doing this solve the problem? In other words, do you already know that this was the issue, and will the solution you propose now solve it?

ramkrish86 commented on September 28, 2024

@chrajeshbabu
Were you able to reproduce the issue with the test case?

chrajeshbabu commented on September 28, 2024

If we don't have the postStart/CloseRegionOperations hooks, this problem will occur. At present there is no deadlock issue in the code.
But because of #26 the test case hangs. I corrected it and ran the test case, and then it passes.
It also passes if we use batchMutate instead of batchMutateForIndex. That is why I am suggesting we use batchMutate only. Then we can avoid kernel changes.

ramkrish86 commented on September 28, 2024

I will check the code and get back on this.

chrajeshbabu commented on September 28, 2024

ok Ram.

ramkrish86 commented on September 28, 2024

I can tell you the reason for the fix.
The table region puts have happened and we are in postBatchMutate.
Before the fix, postBatchMutate called batchMutate for the index, which would try to acquire the lock and would also check for resources.
If by this time the memstore size has reached the blocking size, the main table will be trying to flush, and the flush will wait for the MVCC to complete (this is for the main puts).
So inside the batchMutate that is called for the index, it would wait for the memstore size to come down.
As completeMemstoreInsert comes after postBatchMutate, everything hangs.
If I am explaining what you already know, kindly excuse me.

So the idea of the fix is that, even before you start the mutation for the main and index regions, you acquire the lock and check for resources. This means that once you call batchMutate(), if the resources are available at that time, nothing will block the current mutation on the index and main regions.
So when postBatchMutate is now called, it still proceeds with the puts for the index region, and once that is done the flush can complete as well.
We may have more data in the memstore at that point, and the flush completing will again make room for new entries in the memstore.
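
The ordering behind the fix can be walked through with a single-threaded toy model. This is a sketch under assumptions, not HBase code: `UpFrontCheckModel`, its fields and the `duringHook` callback are hypothetical, and a real region server would handle a failed resource check differently from simply returning false.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the ordering described above. All names are hypothetical. */
class UpFrontCheckModel {
    final long blockingSize;
    long memstoreSize = 0;
    boolean mvccPending = false;
    final List<String> trace = new ArrayList<>();

    UpFrontCheckModel(long blockingSize) {
        this.blockingSize = blockingSize;
    }

    /** A flush has to wait while a memstore insert is still pending in MVCC. */
    boolean tryFlush() {
        if (mvccPending) {
            trace.add("flush waits for mvcc");
            return false;
        }
        memstoreSize = 0;
        trace.add("flush done");
        return true;
    }

    /**
     * The resource check happens once, before any mutation starts.
     * duringHook runs while the simulated postBatchMutate hook is active.
     */
    boolean batchMutate(long mainBytes, long indexBytes, Runnable duringHook) {
        if (memstoreSize >= blockingSize) {
            trace.add("rejected up front");
            return false; // simplification: the model just refuses the batch
        }
        mvccPending = true;          // main put started, not yet visible
        memstoreSize += mainBytes;
        trace.add("main put");
        duringHook.run();            // e.g. a flush attempt arriving now
        memstoreSize += indexBytes;  // index put: no second resource check
        trace.add("index put");
        mvccPending = false;         // STEP 8: advance mvcc
        trace.add("mvcc complete");
        return true;
    }
}
```

If a flush is attempted from inside the hook it has to wait, but the index put is not blocked, so the MVCC completes and a later `tryFlush()` succeeds and empties the memstore.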

anoopsjohn commented on September 28, 2024

The fix is needed. We should have a new batchMutate in HRegion which will not do any resource check or acquire any lock. (Is acquiring the lock again fine?)
This new method will be part of the JIRA raised to support cross-region transactions.
