Apache Accumulo Proxy
Home Page: https://accumulo.apache.org
License: Apache License 2.0
ProxyDurabilityIT.testDurability() now fails after the changes in #38. I think this is caused by the difference in JUnit versions between this project and Accumulo 2.0, which uses JUnit 4. ProxyDurabilityIT.testDurability() uses MAC, and I think the version difference causes certain things not to be run, so the MAC does not get set up properly. This may be fixed as part of #33.
Currently the proxy uses internal Accumulo code for its ITs. It could use the Accumulo Maven plugin for running ITs instead. Fluo uses this plugin for its ITs; if anyone is interested, I can provide pointers within the Fluo code.
There are example client files that can be run with Ruby and Python, which seem helpful. I created this ticket to propose creating a similar file to test the C++ client.
After updating accumulo-proxy to use Accumulo 2.1, the usage of compaction also has to be updated. Currently, compaction strategies are being used, which are marked for removal and should be replaced.
testCompactionStrategy fails with the following:
AccumuloException(msg:org.apache.accumulo.core.client.AccumuloException: TabletServer could not load CompactionStrategy class org.apache.accumulo.test.EfgCompactionStrat)
at org.apache.accumulo.proxy.thrift.AccumuloProxy$compactTable_result$compactTable_resultStandardScheme.read(AccumuloProxy.java:28065)
at org.apache.accumulo.proxy.thrift.AccumuloProxy$compactTable_result$compactTable_resultStandardScheme.read(AccumuloProxy.java:28033)
at org.apache.accumulo.proxy.thrift.AccumuloProxy$compactTable_result.read(AccumuloProxy.java:27967)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
at org.apache.accumulo.proxy.thrift.AccumuloProxy$Client.recv_compactTable(AccumuloProxy.java:696)
at org.apache.accumulo.proxy.thrift.AccumuloProxy$Client.compactTable(AccumuloProxy.java:676)
at org.apache.accumulo.proxy.its.SimpleProxyBase.testCompactionStrategy(SimpleProxyBase.java:2760)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
at java.base/java.lang.Thread.run(Thread.java:829)
@keith-turner and I briefly touched upon allowing proxy.properties to be overridden for a Docker instantiation of accumulo-proxy in this pull request: #20
Currently in accumulo-docker you can override properties as follows:
export ACCUMULO_CL_OPTS="-o prop.1=var1 -o prop.2=var2"
docker run -d --network="host" accumulo monitor $ACCUMULO_CL_OPTS
This is documented here: https://github.com/apache/accumulo-docker
For standard Docker approaches this is a simple mechanism that works really cleanly. However, once you start to look at things like docker-compose or Kubernetes, it becomes advantageous to avoid overriding the CMD or ENTRYPOINT flags where possible.
An alternative would be to provide these settings as environment variables; you could then also use the Kubernetes Secret APIs to handle things such as passwords, mapping them to a file or an environment variable for use within the application.
Would this be acceptable? If so, I'm also happy to make the same change to accumulo-docker to retain consistency.
There are a lot of out-of-date dependency versions that should be updated.
I think it would be useful if the build created a tarball containing at least the following items. This would make it easy for a user to run the proxy.
File | Description
---|---
conf/proxy-env.sh | Bash code that sets up the classpath with deps like Accumulo
conf/proxy.properties | Properties needed for Accumulo and the proxy
lib/accumulo2-proxy-1.0.0.jar | The proxy Java code
bin/accumulo-proxy | A script that sources proxy-env.sh and then runs the proxy, passing in proxy.properties
The source for the scripts and configuration could be placed in src/main/scripts and src/main/config in this repo.
There has been some discussion about whether the usage of Kerberos in accumulo-proxy can be stripped out or whether it needs to stay.
This ticket can be closed once it is determined, one way or the other, whether Kerberos should stay or be removed.
This could probably be done in a single PR since there are only a few tests.
In 65fd1bb I pushed some changes to wrap the Wait.waitFor() blocks in the tests in assertTrue(). The Wait.waitFor() method will not fail on its own; it simply returns a boolean, which was being ignored in all of the tests and is still being ignored in testCompactionSelector.
The condition in the test,
Wait.waitFor(() -> countFiles(tableName) == (expectedFileCount / 2));
checks that half of the files were compacted, in accordance with the SelectHalfSelector used in the test. After correctly checking the return value of this condition, it is evident that the selector is not working as intended: the number of files does not change after compaction.
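The pattern at stake can be sketched as follows. The waitFor helper below is a simplified stand-in written only for illustration (the real utility is Accumulo's Wait.waitFor()); it shows why an ignored boolean return value lets a timeout pass silently, and why the tests need assertTrue(Wait.waitFor(...)).

```java
import java.util.function.BooleanSupplier;

// Simplified stand-in for Accumulo's Wait.waitFor() (illustration only):
// poll a condition until it holds or a timeout elapses, and RETURN a
// boolean instead of throwing on timeout.
public class WaitForExample {

  public static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long sleepMillis)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        return false; // timed out: the caller must check this result
      }
      Thread.sleep(sleepMillis);
    }
    return true;
  }

  public static void main(String[] args) throws InterruptedException {
    // Old pattern: the result is ignored, so a timeout passes silently.
    waitFor(() -> false, 100, 10);

    // Fixed pattern: check the result; in a JUnit test this would be
    // assertTrue(Wait.waitFor(...)), turning the timeout into a failure.
    boolean satisfied = waitFor(() -> false, 100, 10);
    if (!satisfied) {
      System.out.println("condition never held - a test should fail here");
    }
  }
}
```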
When the proxy was released with Accumulo, it was OK for it to use non-public Accumulo API code because there was a one-to-one correspondence between the versions. Now that it is on an independent release schedule, it should ideally use only Accumulo's public API, because this will ensure that releases of the proxy work with any version of Accumulo >= 2.0.
A first step in implementing this would be to update the pom to check the proxy source code to ensure only the public API is used. For an example of how to do this, see pom.xml line 194, contrib/checkstyle.xml, and contrib/import-control.xml in accumulo-examples. Initially we could add the checks to the pom with exceptions for any current non-API usage; this would prevent new non-API usage from being added while we work on removing all non-API usage.
The main accumulo repo and now other accumulo-* projects automatically generate license headers for new files or update license headers for existing files. The suggestion for this ticket is to add that functionality to accumulo-proxy.
It's fine if these are ignored, but another option might be to add the license header automatically, like the main accumulo repo does.
Originally posted by @ctubbsii in #54 (comment)
If automated license headers are added, the changes made in #54 can be reverted.
The current Dockerfile uses accumulo version 2.0.0 and other outdated dependencies. This file should be updated to work with accumulo 2.1.
In #59, the way a user authenticates to the proxy was changed. This changes some things that are used in proxy.properties. For example, the proxy looks for a property with the key sharedSecret, which is not currently in that file, so the sharedSecret that is used to authenticate the client becomes null.
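Until proxy.properties is updated, a sketch of the missing entry might look like the following; the key name comes from the description above, and the value shown is purely a placeholder:

```
# proxy.properties (sketch): the key the proxy reads after #59;
# without it, the secret used to authenticate the client is null
sharedSecret=<choose-a-secret>
```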
This test now fails after the changes made in #42. Based on some discussion on that ticket (linked here), it may be the case that this test should be removed or replaced entirely.
From code review of #15, Simplify the proxy by dropping this login mechanism entirely, and make the proxy single-user (using the user specified in the proxy/client properties file).
Describe the bug
Having optimized our insertion of data into Accumulo (see https://observablehq.com/@m-g-r/almost-600000-entries-per-second-from-lisp-to-accumulo), I noticed that the written data was often incomplete when deleting entries with deleteCell mutations. At the same time, there were no errors to be seen on the client side nor in any log files.
The problem seems to be caused by a combination of three things in the Accumulo Proxy, its Thrift interface, and the client library of Accumulo that is used by the proxy:
1. flush(), close(), addMutation() etc. in the BatchWriter are marked "synchronized", but the boolean closed, the MutationSet mutations, and the long integer totalMemUsed are not. "Synchronized" means that close() cannot be run at the same time by two threads, but it can still run while addMutation() is running, for example. Here, addMutation() can be running and in a waiting state (for background jobs to write data to Accumulo) while close() is run by a new thread, which then prevents addMutation() from finishing. (More on this further down.)
2. The update calls of the Thrift interface are marked "oneway". Thus errors cannot be sent back to the client immediately; instead, if something goes wrong for an update call, the client can only be informed by a subsequent call. This seems to be the intention: the flush or closeWriter calls can throw a MutationsRejectedException. But this works only if those calls are not handled too early. That is, if I send a number of update calls, the client continues without delay, as these are oneway calls. The following flush or closeWriter will be sent out immediately as well. If the threads handling the update calls are slower than the thread handling the closeWriter(), those slow update calls cannot be handled anymore. At the same time, as the close has happened already, the writer cannot be used anymore and the client will never be informed about the errors during those late updates.
3. Errors during the update call are not properly handled and do not reach the client. The reason seems to be that in 2013, when fixing "ACCUMULO-1340 made proxy update call tolerate unknown session ids", the catch clause in ProxyServer.update() got changed like this:
try {
BatchWriterPlusException bwpe = getWriter(writer);
addCellsToWriter(cells, bwpe);
- } catch (Exception e) {
- throw new TException(e);
+ } catch (UnknownWriter e) {
+ // just drop it, this is a oneway thrift call and throwing a TException seems to make all subsequent thrift calls fail
}
}
with the side effect that any exceptions other than UnknownWriter also do not get thrown as TExceptions now, and the Accumulo Proxy seems to ignore them aside from writing about them to stdout or stderr.
I only saw the reason for our dropped mutations when running the Accumulo Proxy in the foreground:
2022-08-08 13:55:05,897 [thrift.ProcessFunction] ERROR: Internal error processing update
java.lang.IllegalStateException: Closed
at org.apache.accumulo.core.clientImpl.TabletServerBatchWriter.addMutation(TabletServerBatchWriter.java:243)
at org.apache.accumulo.core.clientImpl.BatchWriterImpl.addMutation(BatchWriterImpl.java:44)
at org.apache.accumulo.proxy.ProxyServer.addCellsToWriter(ProxyServer.java:1389)
at org.apache.accumulo.proxy.ProxyServer.update(ProxyServer.java:1453)
at jdk.internal.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$8(TraceUtil.java:232)
at com.sun.proxy.$Proxy9.update(Unknown Source)
at org.apache.accumulo.proxy.thrift.AccumuloProxy$Processor$update.getResult(AccumuloProxy.java:9652)
at org.apache.accumulo.proxy.thrift.AccumuloProxy$Processor$update.getResult(AccumuloProxy.java:9633)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:61)
at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:112)
at org.apache.thrift.server.Invocation.run(Invocation.java:18)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
at java.base/java.lang.Thread.run(Thread.java:830)
Alas, the client code thinks all went well and continues to run as if no error has happened.
To Reproduce
I have written some little test cases to check the severity of the problem, but as they are written in Common Lisp they will probably not be of much help to you, so I describe them instead.
First, I add a number of simple entries to Accumulo (just numbers as key and value), then I count them. Afterwards I try to delete all entries, and count again to see whether the deletion was successful.
I do this deletion with a batch scanner over all entries, creating a simple update mutation with a ColumnUpdate with deleteCell set to true for each row found by the scanner. I send the updates to Accumulo with a writer. After the last update call I explicitly call flush and then close the writer. In Lisp this deletion function looks like this:
(defun delete-all-values (table-name &key (k *scanner-next-k-entries*))
;; use a separate connection for the scanner, to make it as quick as doing the updates after the scanning
(accumulo-client:with-connection (writer-connection)
(accumulo-client:with-connection (accumulo-client:*connection*)
(let ((writer (raccumulo.i::create-writer table-name writer-connection)))
(unwind-protect
(accumulo-client:with-scanner (scanner table-name)
(:batch-scanner-p t :threads *scanner-threads*)
(loop for (entries more-p) = (multiple-value-list (accumulo-client:scanner-next-k-entries scanner :k k))
do (dolist (key-value entries)
(let* ((key (accumulo:keyvalue-key key-value))
(row (accumulo:key-row key)))
(accumulo.accumulo-proxy:update (accumulo-client:connection-client writer-connection)
writer
(thrift:map row
(thrift:list
(accumulo:make-columnupdate :delete-cell t))))))
while more-p))
(raccumulo.i::flush-writer writer)
(raccumulo.i::close-writer writer))))))
The test function is:
(defun test (&optional (count 1000))
(delete-entries :check-at-end t)
(count-entries)
(insert-entries count)
(format t "~&inserted: ~d~%" (count-entries))
(delete-entries)
(let* ((num (count-entries))
(succ (zerop num)))
(format t "after deletion: ~d~%" num)
(format t "~a~&" (if succ :successful :failed))
(values succ num count)))
And then a loop to do it a number of times is:
(defun test-loop (&optional (times 3) (count 1000))
(every #'identity
(loop for i from 0 below times
do (format t "~&~%round: ~d" i)
collect (test count))))
When I call it to run 10 rounds with 100,000 entries each, the outcome is:
round: 0
inserted: 100000
after deletion: 15381
FAILED
round: 1
inserted: 100000
after deletion: 13338
FAILED
round: 2
inserted: 100000
after deletion: 18683
FAILED
round: 3
inserted: 100000
after deletion: 14296
FAILED
round: 4
inserted: 100000
after deletion: 9983
FAILED
round: 5
inserted: 100000
after deletion: 16286
FAILED
round: 6
inserted: 100000
after deletion: 12158
FAILED
round: 7
inserted: 100000
after deletion: 18712
FAILED
round: 8
inserted: 100000
after deletion: 10087
FAILED
round: 9
inserted: 100000
after deletion: 18290
FAILED
Each time a couple of thousand entries stay in the table: in the best case "only" 9,983 and in the worst case as many as 19,290.
The Accumulo Proxy displays "ERROR: Internal error processing update java.lang.IllegalStateException: Closed" 37 times during that call.
Full result attached: 20220808-tests-oneway_again-with-errors.txt
I wrote another very simple test function to see how many updates I can send at a time without getting a failure:
(defun meta-test-loop (&optional (times 3) (max 100) (step 1))
(every #'identity
(loop for i from 0 below max by step
do (format t "~&~%meta round: ~d" i)
collect (test-loop times i) into result
do (format t "~&~%meta round: ~d, result: ~a" i result)
finally (return result))))
I called it as "(meta-test-loop 10 10 1)", that is, go from 0 to 10 and write that number of entries 10 times each. Already in round 5 it failed once; in round 6 it failed six times; in round 9 it failed 8 times out of 10. Full result attached: 20220808-tests-oneway_again-with-errors2.txt
Workarounds
When I add a delay of at least a few hundred milliseconds before closeWriter, the problem starts to vanish. But as I do not receive any errors during an update because of problem 3 above, I can never be sure whether it really succeeded; if the machine is under heavy load, the outcome might change.
For a delete it is simple: I can count the entries at the end, and if the number is not zero, I need to wait longer. That is what I have implemented in the function "(delete-entries :check-at-end t)".
But for more complex mutations this is not feasible, as basically all mutation work would need to be retrieved from the server and checked explicitly.
The only easy workaround was to change the update call to no longer be oneway, recompile the Java and Common Lisp Thrift interfaces of the Accumulo Proxy, and then build a new Accumulo Proxy. With that change I do not see any errors anymore and all deletions are successful. The tests above, such as "(test-loop 10 100000)", run without any errors at all.
But that comes with a severe drop in performance: instead of 600,000 entries per second for my benchmark I get only 250,000. Other, more complex import tasks take 19 hours instead of 3.
More on the flush operation of the BatchWriter, and analysis
The flush operation, as implemented in BatchWriter and used by close(), just waits until all the work stored in the MutationSet has been handled by the mutation-writer background threads. This might be good enough for an in-between flush, but not if you want to close() and thus terminate or shut down the writer: there might be threads adding to the mutations at just that moment.
This code is in the core of Accumulo in the file:
https://github.com/apache/accumulo/blob/rel/2.0.1/core/src/main/java/org/apache/accumulo/core/clientImpl/TabletServerBatchWriter.java
There is a longer comment at the beginning on how it operates.
It just looks at the memory usage of the mutations, which is computed and updated: each added mutation increases it by an estimate, and each time a mutation is sent to the server it is reduced by the number of bytes sent.
flush() or close() just waits while "totalMemUsed > 0 && !somethingFailed" holds true, and assumes afterwards that all work is done. This would usually be the case when totalMemUsed reaches zero.
addMutation() increases totalMemUsed in the line:
totalMemUsed += m.estimatedMemoryUsed();
but that line comes quite late in the function, and the counter does not seem to be protected against use from threads running in parallel. The functions flush(), close(), addMutation() etc. are all marked "synchronized", but close() can still end up running while addMutation() is in progress.
When I write 100,000 entries to Accumulo in one go, I expect quite a number of threads to be running addMutation(), which would wait in the line
waitRTE(() -> (totalMemUsed > maxMem || flushing) && !somethingFailed);
But at the end, when close() is called, it immediately sets
closed = true;
which then triggers the check in addMutation() just after the waitRTE() above:
// do checks again since things could have changed while waiting and not holding lock
if (closed)
throw new IllegalStateException("Closed");
And that leads to the observed "java.lang.IllegalStateException: Closed" as reported by the Accumulo Proxy.
Hm, it is really just the flag "closed" that causes this problem.
But the waiting by the line
waitRTE(() -> totalMemUsed > 0 && !somethingFailed);
in close() is also not enough to make sure that no other thread is already adding more work in addMutation(): such a thread may have gotten past the second "if (closed)" check and handled the mutation before increasing the memory counter.
This all seems rather thread-unsafe; the precautions are not effective.
In addition, it would be good if the client of the Accumulo Proxy also had a chance to check whether all work was done, for example by flush returning the number of mutations processed.
I have no idea why this is not a problem for others. Is it not? The Common Lisp implementation of Thrift compiles to native machine code, which runs efficiently, while anything that delays the close just a little bit often alleviates the problem. But the problem should also exhibit itself when using the Java client library alone, that is, without the Accumulo Proxy (as long as one does not explicitly manage all threads oneself and make sure that close() is never run while there are threads that might still call addMutation()). Strange.
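The close()-versus-addMutation() interaction described above can be reduced to a small model. This is not Accumulo code, just a minimal sketch of the pattern: both methods are synchronized, but addMutation() releases the monitor inside wait(), so a concurrent close() can set the flag and make the post-wait re-check throw.

```java
// Minimal model (not Accumulo code) of the hazard described above.
public class RacyWriter {
  private boolean closed = false;
  private long totalMemUsed = 0;
  private static final long MAX_MEM = 1024;

  public synchronized void addMutation(long size) throws InterruptedException {
    if (closed) {
      throw new IllegalStateException("Closed");
    }
    while (totalMemUsed > MAX_MEM) {
      wait(); // the monitor is RELEASED here; close() can run now
    }
    // Re-check after waiting, analogous to what TabletServerBatchWriter
    // does; this is the check that produces the observed
    // IllegalStateException("Closed") when close() raced in.
    if (closed) {
      throw new IllegalStateException("Closed");
    }
    totalMemUsed += size;
  }

  public synchronized void close() {
    closed = true; // set immediately, even if adders are still waiting
    notifyAll();
  }
}
```

The sketch shows why "synchronized" alone does not make the shutdown safe: the flag flip in close() takes effect between an adder's wait() and its re-check, exactly the window the issue describes.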
Background
Currently the master branch of accumulo-proxy ships with maven-enforcer-plugin configured to require, at minimum, Maven 3.5.0 and Java 11. These requirements are configured in the pom.xml as shown below:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
<executions>
<execution>
<!-- must be same id as in the apache parent pom, to override the version -->
<id>enforce-maven-version</id>
<goals>
<goal>enforce</goal>
</goals>
<phase>validate</phase>
<configuration>
<rules>
<requireMavenVersion>
<version>[3.5.0,)</version>
</requireMavenVersion>
<requireJavaVersion>
<version>[11,)</version>
</requireJavaVersion>
</rules>
</configuration>
</execution>
</executions>
</plugin>
When attempting to compile accumulo-proxy with Java 14 and Maven 3.6.3 (both meet the requirements), compilation fails because the maven-enforcer-plugin fails to execute (see below for full logs).
Looking into this, it seems to be due to Java (including OpenJDK) having changed its version-numbering conventions. I checked both:
/Library/Java/JavaVirtualMachines/jdk-14.jdk/Contents/Home/bin/java --version
java 14 2020-03-17
Java(TM) SE Runtime Environment (build 14+36-1461)
Java HotSpot(TM) 64-Bit Server VM (build 14+36-1461, mixed mode, sharing)
/Library/Java/JavaVirtualMachines/openjdk-14.jdk/Contents/Home/bin/java --version
openjdk 14 2020-03-17
OpenJDK Runtime Environment (build 14+36-1461)
OpenJDK 64-Bit Server VM (build 14+36-1461, mixed mode, sharing)
maven-enforcer-plugin comes with a goal to display the values being retrieved from the environment, and I've executed it both with the default version (1.4.1) and with the latest version (3.0.0-M3).
1.4.1
This successfully gets the Maven version (3.6.3) but fails to get the Java version. Specifically, see the line "[ERROR] : begin 0, end 3, length 2" towards the end; this is where it tries to parse the version number and fails.
mvn org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:display-info
[INFO] Scanning for projects...
[INFO]
[INFO] -----------------< org.apache.accumulo:accumulo-proxy >-----------------
[INFO] Building Apache Accumulo Proxy 2.0.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-enforcer-plugin:1.4.1:display-info (default-cli) @ accumulo-proxy ---
[INFO] Maven Version: 3.6.3
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.697 s
[INFO] Finished at: 2020-04-16T14:14:05-04:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:display-info (default-cli) on project accumulo-proxy: Execution default-cli of goal org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:display-info failed: An API incompatibility was encountered while executing org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:display-info: java.lang.ExceptionInInitializerError: null
[ERROR] -----------------------------------------------------
[ERROR] realm = plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.4.1
[ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
[ERROR] urls[0] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/plugins/maven-enforcer-plugin/1.4.1/maven-enforcer-plugin-1.4.1.jar
[ERROR] urls[1] = file:/Users/nathanielfreeman/.m2/repository/backport-util-concurrent/backport-util-concurrent/3.1/backport-util-concurrent-3.1.jar
[ERROR] urls[2] = file:/Users/nathanielfreeman/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.11/plexus-interpolation-1.11.jar
[ERROR] urls[3] = file:/Users/nathanielfreeman/.m2/repository/org/slf4j/slf4j-jdk14/1.5.6/slf4j-jdk14-1.5.6.jar
[ERROR] urls[4] = file:/Users/nathanielfreeman/.m2/repository/org/slf4j/jcl-over-slf4j/1.5.6/jcl-over-slf4j-1.5.6.jar
[ERROR] urls[5] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/reporting/maven-reporting-api/2.2.1/maven-reporting-api-2.2.1.jar
[ERROR] urls[6] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.1/doxia-sink-api-1.1.jar
[ERROR] urls[7] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.1/doxia-logging-api-1.1.jar
[ERROR] urls[8] = file:/Users/nathanielfreeman/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar
[ERROR] urls[9] = file:/Users/nathanielfreeman/.m2/repository/org/codehaus/plexus/plexus-interactivity-api/1.0-alpha-4/plexus-interactivity-api-1.0-alpha-4.jar
[ERROR] urls[10] = file:/Users/nathanielfreeman/.m2/repository/org/sonatype/plexus/plexus-sec-dispatcher/1.3/plexus-sec-dispatcher-1.3.jar
[ERROR] urls[11] = file:/Users/nathanielfreeman/.m2/repository/org/sonatype/plexus/plexus-cipher/1.4/plexus-cipher-1.4.jar
[ERROR] urls[12] = file:/Users/nathanielfreeman/.m2/repository/org/codehaus/plexus/plexus-utils/3.0.22/plexus-utils-3.0.22.jar
[ERROR] urls[13] = file:/Users/nathanielfreeman/.m2/repository/commons-lang/commons-lang/2.3/commons-lang-2.3.jar
[ERROR] urls[14] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/enforcer/enforcer-api/1.4.1/enforcer-api-1.4.1.jar
[ERROR] urls[15] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/enforcer/enforcer-rules/1.4.1/enforcer-rules-1.4.1.jar
[ERROR] urls[16] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/shared/maven-common-artifact-filters/1.4/maven-common-artifact-filters-1.4.jar
[ERROR] urls[17] = file:/Users/nathanielfreeman/.m2/repository/org/beanshell/bsh/2.0b4/bsh-2.0b4.jar
[ERROR] urls[18] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/shared/maven-dependency-tree/2.2/maven-dependency-tree-2.2.jar
[ERROR] urls[19] = file:/Users/nathanielfreeman/.m2/repository/org/codehaus/plexus/plexus-component-annotations/1.5.5/plexus-component-annotations-1.5.5.jar
[ERROR] urls[20] = file:/Users/nathanielfreeman/.m2/repository/org/eclipse/aether/aether-util/0.9.0.M2/aether-util-0.9.0.M2.jar
[ERROR] urls[21] = file:/Users/nathanielfreeman/.m2/repository/org/codehaus/plexus/plexus-i18n/1.0-beta-6/plexus-i18n-1.0-beta-6.jar
[ERROR] urls[22] = file:/Users/nathanielfreeman/.m2/repository/org/apache/maven/plugin-testing/maven-plugin-testing-harness/1.3/maven-plugin-testing-harness-1.3.jar
[ERROR] urls[23] = file:/Users/nathanielfreeman/.m2/repository/org/codehaus/plexus/plexus-archiver/2.2/plexus-archiver-2.2.jar
[ERROR] urls[24] = file:/Users/nathanielfreeman/.m2/repository/org/codehaus/plexus/plexus-io/2.0.4/plexus-io-2.0.4.jar
[ERROR] urls[25] = file:/Users/nathanielfreeman/.m2/repository/junit/junit/4.11/junit-4.11.jar
[ERROR] urls[26] = file:/Users/nathanielfreeman/.m2/repository/org/hamcrest/hamcrest-core/1.3/hamcrest-core-1.3.jar
[ERROR] Number of foreign imports: 1
[ERROR] import: Entry[import from realm ClassRealm[maven.api, parent: null]]
[ERROR]
[ERROR] -----------------------------------------------------
[ERROR] : begin 0, end 3, length 2
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginContainerException
3.0.0-M3
This successfully executes and identifies the correct versions:
mvn org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M3:display-info
[INFO] Scanning for projects...
[INFO]
[INFO] -----------------< org.apache.accumulo:accumulo-proxy >-----------------
[INFO] Building Apache Accumulo Proxy 2.0.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-enforcer-plugin:3.0.0-M3:display-info (default-cli) @ accumulo-proxy ---
[INFO] Maven Version: 3.6.3
[INFO] JDK Version: 14 normalized as: 14
[INFO] OS Info: Arch: x86_64 Family: mac Name: mac os x Version: 10.15.4
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.657 s
[INFO] Finished at: 2020-04-16T14:13:00-04:00
[INFO] ------------------------------------------------------------------------
If someone can assign the issue to me, I have a proposed fix: simply upgrade the plugin to 3.0.0-M3.
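The fix would be roughly a version bump on the plugin declaration shown earlier (a sketch; the rest of the plugin block stays unchanged):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <!-- 3.0.0-M3 understands the post-JEP-223 JDK version scheme (e.g. "14") -->
  <version>3.0.0-M3</version>
  <!-- executions and rules as in the existing pom.xml -->
</plugin>
```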
accumulo-proxy is really a separate project from the core Accumulo distribution, and I think it should be shipped separately, and maintained separately in its own repo.
This will also help isolate the maintenance interest, so that if there isn't any current interest, no time is wasted on it, and if there is, work can be done independently from the releases of the core Accumulo distribution.
From #5
The Proxy server can be configured to start a MiniAccumulo instance and Mock Accumulo instance. This should be removed from the Proxy. Any set up of MiniAccumulo should be done outside of Proxy code. This will require changes in proxy tests and ITs.
The code suggests that adding non-string values through the Accumulo Proxy does not work, since the Thrift value field defaults to a string type.
Is there a way, or are there examples, to insert Date/Integer types into Accumulo through the proxy using Python?
For example, if you replace the values in basic_client.py as follows, you will get a runtime error:
row1 = {'a':[ColumnUpdate('a','a',value=1), ColumnUpdate('b','b',value=2)]}
I tried to create a proxy client in Python 3, but it did not work. This is probably because the Thrift bindings need to be regenerated to work with Python 3.
Not sure all that this entails. When making Fluo and Accumulo work well in Docker, it was important that the script could accept options on the command line. It would be nice if the proxy script also did this, so one could do something like:
docker run accumulo-proxy -o proxyopt1=foo -o proxyopt2=foofoo
Also, if the default proxy log4j config just prints to stdout, that works nicely when running in Docker.
This is related to #5
I was checking to see whether there were any methods in TableOperations that were not present in the proxy, and found a few. I assume that most, if not all, of these are purposely not implemented, but I wanted to double-check:
getConfiguration()
locate()
modifyProperties()
isOnline()
setSamplerConfiguration()
clearSamplerConfiguration()
getSamplerConfiguration()
summaries()
addSummarizers()
removeSummarizers()
listSummarizers()
getTimeType()
getSplits()
getProperties()
This test now fails after the changes made in #42. Based on some discussion on that ticket (linked here), it may be the case that this test should be removed or replaced entirely.
I think the changes in #42 have caused the logging to no longer behave correctly. This may be due to a mismatch in versions between accumulo-proxy and the main accumulo repo or something else.
In pull request #20, @mjwall spotted that we were being a bit inefficient with our container size by storing the tar before extracting it.
This should be cleaned up, ideally by doing the download and extraction in a single step in the download_bin() method.
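In shell terms the single step is piping the download straight into tar (e.g. curl -fsSL "$URL" | tar xz) so the archive never lands on disk. The same streaming idea, sketched in Python with an in-memory archive standing in for the HTTP response:

```python
import io
import tarfile

# Sketch of the single-step idea: extract a gzipped tar directly from a
# streaming file object (e.g. an open HTTP response) so the archive is
# never written to disk first. Mode 'r|gz' reads the stream sequentially.

def extract_stream(fileobj, dest):
    with tarfile.open(fileobj=fileobj, mode='r|gz') as tar:
        tar.extractall(dest)

# Demonstration: build a small in-memory archive to stand in for the
# downloaded tarball, then extract it without an intermediate file.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode='w:gz') as tar:
    data = b'accumulo'
    info = tarfile.TarInfo(name='accumulo/VERSION')
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
buf.seek(0)
```

Whether done in shell or in code, the key point is the same: the tarball is consumed as a stream, so the image layer only contains the extracted files.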
For consistency's sake we should ideally also do this in the accumulo-docker repo: https://github.com/apache/accumulo-docker/blob/master/Dockerfile
The update for the fix in Accumulo apache/accumulo#1828 caused the ITs to throw ThriftSecurityException.
This one is from org.apache.accumulo.proxy.its.TJsonProtocolProxyIT:
2022-06-01 14:39:01,565 [thrift.ProcessFunction] ERROR: Internal error processing tableIdMap
org.apache.thrift.TException: org.apache.accumulo.core.client.AccumuloSecurityException: Error BAD_CREDENTIALS for user user - Username or Password is Invalid
    at org.apache.accumulo.proxy.ProxyServer.tableIdMap(ProxyServer.java:733)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$8(TraceUtil.java:232)
    at com.sun.proxy.$Proxy20.tableIdMap(Unknown Source)
    at org.apache.accumulo.proxy.thrift.AccumuloProxy$Processor$tableIdMap.getResult(AccumuloProxy.java:8553)
    at org.apache.accumulo.proxy.thrift.AccumuloProxy$Processor$tableIdMap.getResult(AccumuloProxy.java:8533)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:61)
    at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
    at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:112)
    at org.apache.thrift.server.Invocation.run(Invocation.java:18)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.accumulo.core.client.AccumuloSecurityException: Error BAD_CREDENTIALS for user user - Username or Password is Invalid
    at org.apache.accumulo.core.clientImpl.ServerClient.executeVoid(ServerClient.java:73)
    at org.apache.accumulo.core.clientImpl.ConnectorImpl.<init>(ConnectorImpl.java:66)
    at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:228)
    at org.apache.accumulo.proxy.ProxyServer.getConnector(ProxyServer.java:228)
    at org.apache.accumulo.proxy.ProxyServer.tableIdMap(ProxyServer.java:731)
    ... 18 more
Caused by: ThriftSecurityException(user:user, code:BAD_CREDENTIALS)
    at org.apache.accumulo.core.clientImpl.thrift.ClientService$authenticate_result$authenticate_resultStandardScheme.read(ClientService.java:16388)
    at org.apache.accumulo.core.clientImpl.thrift.ClientService$authenticate_result$authenticate_resultStandardScheme.read(ClientService.java:16366)
    at org.apache.accumulo.core.clientImpl.thrift.ClientService$authenticate_result.read(ClientService.java:16310)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
    at org.apache.accumulo.core.clientImpl.thrift.ClientService$Client.recv_authenticate(ClientService.java:475)
    at org.apache.accumulo.core.clientImpl.thrift.ClientService$Client.authenticate(ClientService.java:461)
    at org.apache.accumulo.core.clientImpl.ConnectorImpl.lambda$new$0(ConnectorImpl.java:67)
    at org.apache.accumulo.core.clientImpl.ServerClient.executeRawVoid(ServerClient.java:117)
    at org.apache.accumulo.core.clientImpl.ServerClient.executeVoid(ServerClient.java:71)
    ... 22 more
In the steps to create a ruby client, it gives a command to run the example ruby file:
bundle exec client.rb
I was unable to get that to work and instead had to run:
bundle exec ruby client.rb
Not sure if this is unique to my system or not.
I have ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux] installed.
In the tests, there are a lot of places where an exception is expected to be thrown and, instead of using JUnit's assertThrows(), a try-catch block is used and fail() is called if the expected exception is not thrown. For example:
Current:
try {
client.clearLocatorCache(creds, doesNotExist);
fail("exception not thrown");
} catch (TableNotFoundException ex) {}
Proposed:
assertThrows(TableNotFoundException.class, () -> client.clearLocatorCache(creds, doesNotExist));