apache / accumulo

Apache Accumulo

Home Page: https://accumulo.apache.org

License: Apache License 2.0

Languages: Shell 0.62%, Java 97.51%, HTML 0.04%, Thrift 0.35%, C++ 0.28%, JavaScript 0.65%, CSS 0.06%, Makefile 0.02%, C 0.02%, FreeMarker 0.45%
Topics: accumulo, big-data, hacktoberfest

accumulo's People

Contributors

adamjshook, alerman, billierinaldi, bimargulies, brianloss, busbey, cjnolet, cshannon, ctubbsii, ddanielr, dependabot[bot], dhutchis, dlmarion, domgarguilo, drewfarris, edcoleman, ericnewton, hkeebler, ivakegg, jmark99, joshelser, keith-turner, lstav, madrob, manno15, mikewalch, milleruntime, mjwall, ohshazbot, wisellama


accumulo's Issues

Tiered Storage Volume Chooser

For systems with a mixture of high density, slower disks and low density, faster disks, provide a volume chooser that can be configured to place more frequently accessed data on the faster disks. This will require partitioning the fast and slow disks into separate volumes, and will also require modifying the _majorCompact code to use the volume chooser to determine where to put new files, instead of relying on srv:dir. The choice will be made based on the tablet being compacted, using a user-defined strategy for deciding which volume to use.

Needed:

  1. A new TieredStorageVolumeChooser
  2. New configuration options for the volume chooser
  3. Updates to the major compaction code that determines the output file location (specifically around line 1945 of Tablet.java)
  4. A user-definable VolumeChooserStrategy that lets users control how new volumes are picked (a rough sketch follows)
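
Below is a rough, hypothetical sketch of items 1 and 4; none of these class or method names (TieredStorageVolumeChooser, VolumeChooserStrategy, preferFastVolume) exist in Accumulo yet, they only illustrate the shape of the proposal.

import java.util.List;
import java.util.Random;

// Item 4: a user-definable strategy for picking between tiers.
interface VolumeChooserStrategy {
  // Decide, for the tablet being compacted, whether its new file
  // should land on a fast volume or a slow one.
  boolean preferFastVolume(String tableId, String endRow);
}

// Item 1: a chooser configured (item 2) with the two partitioned volume lists.
class TieredStorageVolumeChooser {
  private final List<String> fastVolumes; // e.g. volumes backed by low density, fast disks
  private final List<String> slowVolumes; // e.g. volumes backed by high density, slow disks
  private final VolumeChooserStrategy strategy;
  private final Random random = new Random();

  TieredStorageVolumeChooser(List<String> fast, List<String> slow, VolumeChooserStrategy strategy) {
    this.fastVolumes = fast;
    this.slowVolumes = slow;
    this.strategy = strategy;
  }

  // Item 3: the major compaction code would call this instead of relying on srv:dir.
  String choose(String tableId, String endRow) {
    List<String> tier = strategy.preferFastVolume(tableId, endRow) ? fastVolumes : slowVolumes;
    return tier.get(random.nextInt(tier.size()));
  }
}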

Inline BlockFile interfaces

I think we can take a step towards cleaning up RFile, BCFile and CacheableBlockFile by eliminating some of the intermediate interfaces. There are 4 in particular in org.apache.accumulo.core.file.blockfile that are redundant and only add confusion/complexity. I think the classes that implement these interfaces can be inlined.

  • ABlockWriter
  • ABlockReader
  • BlockFileReader
  • BlockFileWriter

Here is an example where one of these interfaces is misleading:

private static class LocalityGroupReader extends LocalityGroup implements FileSKVIterator {
...
  private IndexIterator iiter;
  private int entriesLeft;
  private ABlockReader currBlock;
...
  currBlock = getDataBlock(indexEntry);

The LocalityGroupReader stores an ABlockReader for the currBlock data block, which is actually a BlockRead...

GCS Hadoop Connector can't recover from Tablet Server failure.

I was attempting to use the GCS Connector (https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/gcs) to back Accumulo on GCP.

All pretty straightforward (just pointed instance.volumes to my bucket gs://<bucketname>/accumulo).

One of my Tablet Servers OOMed, and when I try to recover it, I end up getting the following error:

Failed to initiate log sort gs://<bucketname>/accumulo/wal/accumulo-gcs-w-1+9997/aa166493-637e-48e8-a9a6-3655dfb59a6c
	java.lang.IllegalStateException: Don't know how to recover a lease for com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem
		at org.apache.accumulo.server.master.recovery.HadoopLogCloser.close(HadoopLogCloser.java:70)
		at org.apache.accumulo.master.recovery.RecoveryManager$LogSortTask.run(RecoveryManager.java:96)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
		at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
		at java.lang.Thread.run(Thread.java:748)

I traced that back to HadoopLogCloser, and since the GoogleHadoopFileSystem is a FileSystem, not a DistributedFileSystem, LocalFileSystem, or RawLocalFileSystem, it bottoms out here:

"Don't know how to recover a lease for " + ns.getClass().getName());

I'm not sure what to do from here. I was also told that Azure should work with Accumulo without problems, but it looks like the NativeAzureFileSystem similarly doesn't implement any of these interfaces, so I would presume it would hit the same issue.
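
For reference, here is a paraphrased sketch of the dispatch described above (an approximation of HadoopLogCloser.close(), not the exact source). Lease recovery is only handled for the HDFS and local filesystem types, so any other FileSystem subclass, such as GoogleHadoopFileSystem or NativeAzureFileSystem, falls through to the IllegalStateException.

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocalFileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RawLocalFileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

class LeaseRecoverySketch {
  // Approximation of the logic in HadoopLogCloser.close(); not the actual source.
  void close(FileSystem ns, Path source) throws IOException {
    if (ns instanceof DistributedFileSystem) {
      // HDFS supports explicit lease recovery on the WAL before sorting it.
      ((DistributedFileSystem) ns).recoverLease(source);
    } else if (ns instanceof LocalFileSystem || ns instanceof RawLocalFileSystem) {
      // local filesystems have no lease to recover and are handled separately
    } else {
      // GoogleHadoopFileSystem (and NativeAzureFileSystem) land here.
      throw new IllegalStateException(
          "Don't know how to recover a lease for " + ns.getClass().getName());
    }
  }
}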

Add User initiated Compaction Information to IteratorEnvironment


The IteratorEnvironment only indicates that a major compaction is a full compaction if it is in the final round of the full compaction.

For example, let's say there are 19 files in the tablet and the maximum number of files per major compaction is 10. Accumulo will compact those 19 files in two rounds. The first round will merge 10 files into 1, leaving the other 9 untouched. Once that round is complete, there will be 10 remaining files. This is considered a major compaction, but not a full major compaction, according to the IteratorEnvironment.

The second and final round will compact all the remaining files into 1 file. This final round is considered a full and major compaction.

There are some filter use cases that don't need the completely compacted data before they operate, and thus could optimize their work during round 1 of the compaction. There are also use cases that may only run during a user initiated compaction. By exposing this information, filters could be constructed to use both rounds of compaction optimally and save tablet server resources.
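
As a rough illustration, a filter today can only key off the final full round via the existing IteratorEnvironment.isFullMajorCompaction() method; the proposed user-initiated-compaction flag does not exist yet and is only mentioned in a comment below as a hypothetical.

import java.io.IOException;
import java.util.Map;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.Filter;
import org.apache.accumulo.core.iterators.IteratorEnvironment;
import org.apache.accumulo.core.iterators.IteratorUtil.IteratorScope;
import org.apache.accumulo.core.iterators.SortedKeyValueIterator;

public class ExpensiveCleanupFilter extends Filter {

  private boolean fullMajc = false;

  @Override
  public void init(SortedKeyValueIterator<Key,Value> source, Map<String,String> options,
      IteratorEnvironment env) throws IOException {
    super.init(source, options, env);
    // Today this is the only signal available: it is true only during the
    // final round of a multi-round major compaction.
    fullMajc = env.getIteratorScope() == IteratorScope.majc && env.isFullMajorCompaction();
    // With the proposed change, something like env.isUserCompaction() (hypothetical)
    // would let the filter also behave differently for user initiated compactions.
  }

  @Override
  public boolean accept(Key k, Value v) {
    if (!fullMajc) {
      return true; // skip the expensive per-entry work during intermediate rounds
    }
    return doExpensiveCheck(k, v);
  }

  private boolean doExpensiveCheck(Key k, Value v) {
    return true; // placeholder for the real filtering logic
  }
}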

Seeing more data loss when running continuous ingest with agitation.

With the fixes for #432 and #441 applied to 1.9.0-SNAPSHOT, I ran continuous ingest for 24 hrs with agitation. I ran the agitator as follows.

nohup ./tserver-agitator.pl 1:10 1:10 1 3 &> logs/tserver-agitator.log &

The above command sleeps 1 to 10 minutes randomly between killing tserver processes and randomly kills 1 to 3 tablet servers. I think in the past I have run with a non-random period, but I am not sure. I suspect the random period may be uncovering some new bugs.

After running verify, I saw the following UNDEFINED count, which indicates data was lost. This count is much lower than what I saw before #441 (~600K). I am going to attempt to track down the cause of this.

        org.apache.accumulo.test.continuous.ContinuousVerify$Counts
                REFERENCED=22559054297
                UNDEFINED=6176
                UNREFERENCED=8007075

Organize Fate Operations

The Repos of every fate operation are all in the org.apache.accumulo.master.tableOps package. It is annoying to track down which Repo goes with which operation. Creating a package in o.a.a.master.tableOps for each operation would organize them nicely. This would have to be done in 2.0 since it will break serialization.

Seeing data loss when running continuous ingest with agitation

I did not get a chance to run continuous ingest with agitation before 1.9.0 was released. I ran it after the release and saw some data loss. I have tracked the bug down: it happens when a closed write ahead log is only referenced by minor compacting tablets. When this happens, the tablet server may prematurely mark the write ahead log as unreferenced in ZooKeeper. The following is an example of this bug:

  1. tserverA creates WAL1
  2. Data is written to tabletX on tserverA
  3. tserverA creates WAL2
  4. Data is written to tabletX on tserverA
  5. Minor compaction of tabletX starts
  6. tserverA marks WAL1 as unreferenced
  7. tserverA is killed
  8. the master adds only WAL2 to tabletX
  9. tabletX is loaded on tserverB and recovers data from WAL2

In the example above the tablet has data in WAL1; however, since the tablet server marked it as unreferenced, the master did not assign it to the tablet. The tablet server should only mark WAL1 as unreferenced after the minor compaction finishes, not after it starts.

This bug probably exists in 1.8.0 and later.

Make bulk import metadata scan split tolerant

This is follow-on work for #436. The metadata scan that is done for loading files should be split tolerant. The scan should check that the metadata table forms a linked list and back up when it does not. The GC scans the metadata table like this.

Token file functionality for mapred code deprecated, but not replaced

As fallout from the recent Connector builder work that @mikewalch did for 2.0, many of the old APIs were deprecated in the MapReduce code, to be replaced with ConnectionInfo. However, the mapreduce APIs for storing an authentication token in a job's distributed cache, and reading it from within the mappers, do not have an equivalent in the new code.

Two things are needed:

  1. The ability to serialize the entire connection info into the job, without exposing the credentials in the Hadoop configuration, and
  2. The ability to serialize arbitrary authentication tokens, not just the known ones that the current Connector builder has special code to handle. (A whole authentication token serializer was previously created for this purpose, but it is not used in the new code; see the sketch below.)
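
For reference, a minimal sketch of using that serializer to round-trip an arbitrary token (assuming the AuthenticationToken.AuthenticationTokenSerializer methods the old mapreduce code relied on; check the exact signatures before reusing this):

import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
import org.apache.accumulo.core.client.security.tokens.AuthenticationToken.AuthenticationTokenSerializer;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;

class TokenRoundTripSketch {
  // Works for any AuthenticationToken implementation, not just the known ones
  // the Connector builder special-cases.
  static byte[] toBytes(AuthenticationToken token) {
    return AuthenticationTokenSerializer.serialize(token);
  }

  static AuthenticationToken fromBytes(String tokenClassName, byte[] bytes) {
    return AuthenticationTokenSerializer.deserialize(tokenClassName, bytes);
  }

  public static void main(String[] args) {
    byte[] bytes = toBytes(new PasswordToken("secret"));
    AuthenticationToken restored = fromBytes(PasswordToken.class.getName(), bytes);
    System.out.println(restored.getClass().getSimpleName());
  }
}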

Vfs2 error in Shell

I changed table.file.compress.type on a table and, after writing to it, saw the vfs2 exception below in the shell:

06:37:55 {master} ~/sw/uno$ accumulo org.apache.accumulo.test.TestIngest -i uno -u root -p secret --rows 3000
2018-04-24 18:38:08,541 [trace.DistributedTrace] INFO : SpanReceiver org.apache.accumulo.tracer.ZooTraceClient was loaded successfully.
       3,000 records written |   10,489 records/sec |    3,087,000 bytes written | 10,793,706 bytes/sec |  0.286 secs   
Exception in thread "Thread-2" java.lang.RuntimeException: No files-cache implementation set.
	at org.apache.commons.vfs2.provider.AbstractFileSystem.getCache(AbstractFileSystem.java:209)
	at org.apache.commons.vfs2.provider.AbstractFileSystem.getFileFromCache(AbstractFileSystem.java:222)
	at org.apache.commons.vfs2.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:332)
	at org.apache.commons.vfs2.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:317)
	at org.apache.commons.vfs2.provider.AbstractFileObject.resolveFile(AbstractFileObject.java:2007)
	at org.apache.commons.vfs2.provider.AbstractFileObject.resolveFiles(AbstractFileObject.java:2052)
	at org.apache.commons.vfs2.provider.AbstractFileObject.getChildren(AbstractFileObject.java:1222)
	at org.apache.commons.vfs2.impl.DefaultFileMonitor$FileMonitorAgent.checkForNewChildren(DefaultFileMonitor.java:553)
	at org.apache.commons.vfs2.impl.DefaultFileMonitor$FileMonitorAgent.check(DefaultFileMonitor.java:667)
	at org.apache.commons.vfs2.impl.DefaultFileMonitor$FileMonitorAgent.access$200(DefaultFileMonitor.java:423)
	at org.apache.commons.vfs2.impl.DefaultFileMonitor.run(DefaultFileMonitor.java:376)
	at java.lang.Thread.run(Thread.java:748)

06:40:10 {master} ~/sw/uno$ accumulo shell -u root -p secret -e 'flush -t test_ingest'
2018-04-24 18:40:22,106 [trace.DistributedTrace] INFO : SpanReceiver org.apache.accumulo.tracer.ZooTraceClient was loaded successfully.
2018-04-24 18:40:22,387 [shell.Shell] INFO : Flush of table test_ingest initiated...
Exception in thread "Thread-2" java.lang.RuntimeException: No files-cache implementation set.
	at org.apache.commons.vfs2.provider.AbstractFileSystem.getCache(AbstractFileSystem.java:209)
	at org.apache.commons.vfs2.provider.AbstractFileSystem.getFileFromCache(AbstractFileSystem.java:222)
	at org.apache.commons.vfs2.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:332)
	at org.apache.commons.vfs2.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:317)
	at org.apache.commons.vfs2.provider.AbstractFileObject.resolveFile(AbstractFileObject.java:2007)
	at org.apache.commons.vfs2.provider.AbstractFileObject.resolveFiles(AbstractFileObject.java:2052)
	at org.apache.commons.vfs2.provider.AbstractFileObject.getChildren(AbstractFileObject.java:1222)
	at org.apache.commons.vfs2.impl.DefaultFileMonitor$FileMonitorAgent.checkForNewChildren(DefaultFileMonitor.java:553)
	at org.apache.commons.vfs2.impl.DefaultFileMonitor$FileMonitorAgent.check(DefaultFileMonitor.java:667)
	at org.apache.commons.vfs2.impl.DefaultFileMonitor$FileMonitorAgent.access$200(DefaultFileMonitor.java:423)
	at org.apache.commons.vfs2.impl.DefaultFileMonitor.run(DefaultFileMonitor.java:376)
	at java.lang.Thread.run(Thread.java:748)

The data was written and flushed. Nothing in the tserver logs. Could be a configuration error but I was using Accumulo 1.8.1 and Hadoop 2.9.0.

Investigate ThriftBehaviorIT checking for runtime exception handling

ThriftBehaviorIT should be checking to ensure thrift is behaving as expected with regard to exception handling. However, the test seems to pass whether or not the thrift code is generated with the handle_runtime_exceptions flag. This indicates a potentially broken test, or insufficient test coverage. Either way, it must be investigated prior to releasing 2.0.

Mini Accumulo Cluster class loaders behaviour

Hey all!

I'm trying to use Mini Accumulo Cluster in SBT tests (my first attempt to move off the Accumulo mock client) and am experiencing an issue with this line: https://github.com/apache/accumulo/blob/rel/1.9.1/minicluster/src/main/java/org/apache/accumulo/minicluster/impl/MiniAccumuloClusterImpl.java#L272

The thing is that it loads the following class loaders:

URLClassLoader with NativeCopyLoader with RawResources(
  urls = List(...),
  parent = DualLoader(a = java.net.URLClassLoader@1c345372, b = java.net.URLClassLoader@69ea3742),
  resourceMap = Set(app.class.path, boot.class.path),
  nativeTemp = ...
)
DualLoader(a = java.net.URLClassLoader@1c345372, b = java.net.URLClassLoader@69ea3742)
NullLoader // <-- here is the issue
sun.misc.Launcher$AppClassLoader@75b84c92
sun.misc.Launcher$ExtClassLoader@483f6d77

The error I'm getting is: Unknown classloader type : sbt.internal.inc.classpath.NullLoader
Where the NullLoader itself is:

package sbt.internal.inc.classpath
final class NullLoader() extends java.lang.ClassLoader { ... }

Maybe it makes sense to skip unknown class loaders rather than throw, and only throw an exception if there are no usable class loaders at all? Any workarounds are also welcome, and let me know if it makes more sense to create an issue on the SBT side rather than here.
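
A minimal sketch of the "skip unknown loaders" idea (illustrative only; the actual classpath-building code in MiniAccumuloClusterImpl is more involved): walk the loader hierarchy, collect URLs from the URLClassLoaders it understands, and only fail if nothing usable was found.

import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

class ClasspathFromLoadersSketch {
  static List<URL> collectUrls(ClassLoader start) {
    List<URL> urls = new ArrayList<>();
    for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
      if (cl instanceof URLClassLoader) {
        for (URL u : ((URLClassLoader) cl).getURLs()) {
          urls.add(u);
        }
      }
      // Unknown loader types (e.g. sbt.internal.inc.classpath.NullLoader)
      // are simply skipped instead of triggering an exception.
    }
    if (urls.isEmpty()) {
      throw new IllegalStateException("No usable class loaders found on the chain");
    }
    return urls;
  }
}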

I'd be glad to fix and test this if it's indeed needed.

Update:

I also discovered a workaround on the SBT side (accidentally, right after posting this issue): fork the JVM in tests (SBT setting: fork in Test := true).

NoSuchElementException in AccumuloMonitorAppenderTest

Originally mentioned by @joshelser on https://issues.apache.org/jira/browse/ACCUMULO-4409. @milleruntime told me he also saw it.

It looks like the problem is that the Enumeration in the set is not thread-safe. We could synchronize on updates to the appender's children, but it's only ever updated in a single thread; the only other thread reading this set is the test... so it's easier to just have the test handle the race condition by catching the NoSuchElementException.
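
A minimal sketch of that test-side handling (illustrative only): retry the read if the enumeration races with the updater thread.

import java.util.Enumeration;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

class RetryOnRaceSketch {
  // Re-run the read when the single writer updates the children mid-enumeration.
  static <T> T readWithRetry(Supplier<Enumeration<T>> reader, int attempts) {
    NoSuchElementException last = null;
    for (int i = 0; i < attempts; i++) {
      try {
        Enumeration<T> e = reader.get();
        return e.hasMoreElements() ? e.nextElement() : null;
      } catch (NoSuchElementException race) {
        last = race; // the appender's children changed under us; try again
      }
    }
    if (last != null)
      throw last;
    throw new NoSuchElementException("gave up after " + attempts + " attempts");
  }
}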

Find unused Accumulo properties

Several unused Accumulo properties were found and removed in #447, but there may be more! It would be nice if someone went through Property.java and looked for others.

Exception while running integration tests using SunnyDay profile

Exception occurred in monitor module:

java.lang.RuntimeException: Unable to load category: org.apache.accumulo.test.categories.SunnyDayTests
	at org.apache.maven.surefire.group.match.SingleGroupMatcher.loadGroupClasses(SingleGroupMatcher.java:142)
	at org.apache.maven.surefire.common.junit48.FilterFactory.createGroupFilter(FilterFactory.java:100)
	at org.apache.maven.surefire.junitcore.JUnitCoreProvider.createJUnit48Filter(JUnitCoreProvider.java:279)
	at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:126)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
Caused by: java.lang.ClassNotFoundException: org.apache.accumulo.test.categories.SunnyDayTests
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.apache.maven.surefire.group.match.SingleGroupMatcher.loadGroupClasses(SingleGroupMatcher.java:138)
	... 7 more

Shell not working

I built the latest master and ran using Uno. Trying to run the Shell prints this error on the command line (with no other errors in the logs):
[shell.Shell] ERROR: Path must not end with / character

Sounds like an error coming from ZooKeeper or an external library because I couldn't find that string in the Accumulo code base.

1.8 Master marks bulk import as failed when it could still be successful

Using the default configuration for bulk.timeout and bulk.retry.max, the master will sometimes mark a bulk import as failed even though it was successful. Sample timings:

11:18:08 master asks tserver1 to import
11:18:08 tserver1 gets request
11:23:08 master marks attempt 1 failed (SocketTimeoutException)
11:23:08 master asks tserver2 to import
11:23:08 tserver2 gets request
11:25:27 tserver1 finishes calculating overlapping tablets
11:25:27 tserver1 completes import
11:28:08 master marks attempt 2 failed (SocketTimeoutException)
11:28:22 tserver2 finishes calculating overlapping tablets
11:28:22 tserver2 successfully completes import
etc., until the import is marked as completely failed.

Include examination of map files in tserver timeout? Check isActive between master reattempts? Other?

Re-implement write ahead log archiving

In the past, Accumulo could archive files and write ahead logs instead of deleting them. This was useful for debugging data loss situations that occurred during testing. It seems the functionality to archive write ahead logs no longer exists; it may be useful to bring it back.

A workaround for not having this functionality during testing is to not run the Accumulo GC process. However, the drawback of this is that the GC is not tested. What if the GC process, had it been running, would have deleted a file that it shouldn't have? It would be nice to be able to see that happen.

Add createIfNotExists method to table/namespace operations

Currently, if you want to create a namespace or table in Accumulo that may already exist, you need to try to create it and ignore the exception:

  try {
    connector.namespaceOperations().create("mynamespace");
  } catch (NamespaceExistsException e) {
    // ignore
  }

  try {
    connector.tableOperations().create("mynamespace.mytable");
  } catch (TableExistsException e) {
    // ignore
  }

It would be nice if a createIfNotExists method was added to the API that ignored this exception. This would simplify the above code to the following:

connector.namespaceOperations().createIfNotExists("mynamespace");
connector.tableOperations().createIfNotExists("mynamespace.mytable");

Can not create Connector from existing Connector

Using only the new Connector builder APIs introduced in ACCUMULO-4784, there is no way to create a Connector from an existing Connector with a different user. I noticed this when working on #410 and wrote a test that creates a user and then creates a Connector for that user. With the old APIs this could be done as follows:

Connector conn = ...
//create connector as another user for same instance.
conn = conn.getInstance().getConnector("user", "pass");

WAL Recovery directories not being removed

While running tests for 1.9.0, I noticed some files in /accumulo/recovery that were a few days old. I investigated this and could not find any code in the garbage collector that actually deletes WALs in the recovery directory. There is code in 1.7 to delete recovered WALs, so I suspect this problem was introduced in 1.8.0 with the change in how WALs are tracked.

I also found that the property master.recovery.max.age has not been used by Accumulo internals since 1.4.

-1 tablet log id used in WAL when table durability is set to None

When a table's durability is set to none, tablets get a tablet log id of -1. The durability can be set per batch writer, and the table setting can be changed. However, when this change in durability happens, the tablet id in the WALs is still -1. This means that a tablet may recover data from other tablets, because multiple tablets would be mapped to the same id of -1.

I discovered this with changes I made for #458. My changes caused an IT to fail with this pre-existing bug. This bug may have existed since 1.7.0. This bug would only be seen when setting table.durability=NONE and then later changing it to something else.

Generate native header files with `-h` javac option instead of `javah`

We currently use javah via the native-maven-plugin to generate our native code JNI headers. This may not be necessary anymore, since https://bugs.openjdk.java.net/browse/JDK-7150368, because we should be able to just use the normal Java compiler (with the -h flag in Java 8+) to output those files during the Java compilation step.

(Also, javah is deprecated in Java 9: http://openjdk.java.net/jeps/313, so we should probably try to figure out the new method of generation.)

New bulk importer fails if directory contains files other than RFiles

This may be intended behavior... I'm not sure.
I tried to convert some bulk import code from AuditMessageIT:

auditConnector.tableOperations().importDirectory(THIRD_TEST_TABLE_NAME, exportDir.toString(), failDir.toString(), false);

to:

auditConnector.tableOperations().addFilesTo(THIRD_TEST_TABLE_NAME).from(exportDir.toString()).load();

The former works fine with the export directory containing:

distcp.txt
exportMetadata.zip
tmp/

However, the latter failed because it did not recognize the zip file format. I'm not sure why this test is trying to bulk import from an exportTable directory, but it works fine using the old method (which seems to silently ignore unrecognized files... although I can't find the relevant filtering code).

Do we want the new bulk importer to ignore unrecognized files like the old method did, or not?

Cache rfile file lengths

RFiles store metadata at the end of the file. To open an RFile, the file length must first be obtained. If file lengths were cached, this could possibly avoid a trip to the namenode. This caching could be done in tservers and in the new bulk import code (#436).
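
A minimal sketch of the kind of cache this could be, assuming Guava's CacheBuilder and Hadoop's FileSystem.getFileStatus(); where the cache lives and how entries get invalidated would still need to be worked out.

import java.io.IOException;
import java.util.concurrent.ExecutionException;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class FileLengthCacheSketch {
  // RFiles are immutable once written, so a cached length stays valid
  // until the file is deleted.
  private final Cache<Path,Long> lengths =
      CacheBuilder.newBuilder().maximumSize(100_000).build();

  long getLength(FileSystem fs, Path file) throws IOException {
    try {
      // Only the first lookup for a file goes to the namenode.
      return lengths.get(file, () -> fs.getFileStatus(file).getLen());
    } catch (ExecutionException e) {
      throw new IOException(e.getCause());
    }
  }
}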

Utilize NewTableConfiguration in ITs

There are places in our ITs that I think could use NewTableConfiguration when tables are created. This should help make the tests more stable. A minimal usage sketch follows the lists below.

  • MiniAccumuloClusterTest
  • LargeRowIT
  • TabletIT
  • SplitIT
  • RegexGroupBalanceIT
  • SessionDurabilityIT
  • ConfigurableCompactionIT
  • DurabilityIT

Not sure about the replication tests... what if you have to create both tables first, then set the table replication property?

  • ReplicationIT
  • UnorderedWorkAssignerReplicationIT
  • CyclicReplicationIT
  • KerberosReplicationIT
  • MultiInstanceReplicationIT
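
For reference, a minimal usage sketch with the existing NewTableConfiguration API (the property shown is just an example):

import java.util.Collections;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.admin.NewTableConfiguration;

class CreateTableInTestSketch {
  static void createTestTable(Connector conn, String tableName) throws Exception {
    // Setting properties at creation time avoids the race where a test
    // starts using the table before a separately-set property takes effect.
    NewTableConfiguration ntc = new NewTableConfiguration()
        .setProperties(Collections.singletonMap("table.file.replication", "1"));
    conn.tableOperations().create(tableName, ntc);
  }
}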

LogSorter InputStream closes prematurely

Some time ago, LogSorter was modified to handle partial headers, and the FSDataInputStream in sort() was moved to a try-with-resources. It appears this change caused the FSDataInputStream to close prematurely, before the finally can call the close() method. The close method gets bytesCopied before closing the input, so this probably results in bytesCopied in the Monitor always being -1. I think this happens every time a WAL is sorted on recovery, producing this error:

2018-05-23 14:19:55,298 [log.LogSorter] ERROR: Error during cleanup sort/copy 2841a75d-8086-4fdb-a736-5f0bf60ff42e
java.io.IOException: Stream is closed!
        at org.apache.hadoop.fs.BufferedFSInputStream.getPos(BufferedFSInputStream.java:56)
        at org.apache.hadoop.fs.FSDataInputStream.getPos(FSDataInputStream.java:72)
        at org.apache.accumulo.tserver.log.LogSorter$LogProcessor.close(LogSorter.java:198)
        at org.apache.accumulo.tserver.log.LogSorter$LogProcessor.sort(LogSorter.java:169)
        at org.apache.accumulo.tserver.log.LogSorter$LogProcessor.process(LogSorter.java:96)
        at org.apache.accumulo.server.zookeeper.DistributedWorkQueue$1.run(DistributedWorkQueue.java:109)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
        at java.lang.Thread.run(Thread.java:748)

We could move the FSDataInputStream back to a regular try/catch or refactor the sort method to close things properly.
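
A small self-contained sketch of the ordering problem (not the LogSorter code itself): in a try-with-resources statement the resource is closed before the finally block runs, so cleanup in the finally that reads the stream position finds it already closed.

import java.io.Closeable;

class CloseOrderingSketch {
  static class TrackingStream implements Closeable {
    boolean closed = false;
    long getPos() {
      if (closed)
        throw new IllegalStateException("Stream is closed!");
      return 42;
    }
    @Override
    public void close() {
      closed = true;
    }
  }

  public static void main(String[] args) {
    TrackingStream stream = new TrackingStream();
    try (TrackingStream in = stream) {
      // ... copy/sort work happens here ...
    } finally {
      // By the time this runs, in.close() has already executed, which is the
      // same ordering that makes LogProcessor.close() see "Stream is closed!".
      try {
        System.out.println("bytes copied: " + stream.getPos());
      } catch (IllegalStateException e) {
        System.out.println(e.getMessage());
      }
    }
  }
}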

Hadoop2 Metrics does not retry sending if metrics server is down

Original issue: https://issues.apache.org/jira/browse/ACCUMULO-4849

2018-03-16 11:01:14,726 [impl.MetricsSystemImpl] WARN : Error creating sink 'graphite'
org.apache.hadoop.metrics2.impl.MetricsConfigException: Error creating plugin: org.apache.hadoop.metrics2.sink.GraphiteSink
	at org.apache.hadoop.metrics2.impl.MetricsConfig.getPlugin(MetricsConfig.java:203)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.newSink(MetricsSystemImpl.java:529)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configureSinks(MetricsSystemImpl.java:501)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.configure(MetricsSystemImpl.java:480)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:189)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:164)
	at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:54)
	at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50)
	at org.apache.accumulo.server.metrics.MetricsSystemHelper$MetricsSystemHolder.<clinit>(MetricsSystemHelper.java:46)
	at org.apache.accumulo.server.metrics.MetricsSystemHelper.getInstance(MetricsSystemHelper.java:50)
	at org.apache.accumulo.tserver.metrics.TabletServerMetricsFactory.<init>(TabletServerMetricsFactory.java:45)
	at org.apache.accumulo.tserver.TabletServer.<init>(TabletServer.java:401)
	at org.apache.accumulo.tserver.TabletServer.main(TabletServer.java:3086)
	at org.apache.accumulo.tserver.TServerExecutable.execute(TServerExecutable.java:43)
	at org.apache.accumulo.start.Main.lambda$execKeyword$0(Main.java:122)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.metrics2.MetricsException: Error creating connection, localhost:2004
	at org.apache.hadoop.metrics2.sink.GraphiteSink$Graphite.connect(GraphiteSink.java:160)
	at org.apache.hadoop.metrics2.sink.GraphiteSink.init(GraphiteSink.java:64)
	at org.apache.hadoop.metrics2.impl.MetricsConfig.getPlugin(MetricsConfig.java:199)
	... 15 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at java.net.Socket.connect(Socket.java:538)
	at java.net.Socket.<init>(Socket.java:434)
	at java.net.Socket.<init>(Socket.java:211)
	at org.apache.hadoop.metrics2.sink.GraphiteSink$Graphite.connect(GraphiteSink.java:152)
	... 17 more

Monitor 2.0 Bulk Import State is funky

Follow-on for #436 and for the new Monitor. The Monitor previously displayed (or attempted to display) the state of bulk import processes. Hopefully, for Bulk Import 2.0, this won't be needed; if so, the bulk import page could be removed in Monitor 2.0. Otherwise, we need to determine how and what to monitor for the new bulk import.
