
scylla-jmx's Introduction

Scylla JMX Server

The Scylla JMX server implements the Apache Cassandra JMX interface for compatibility with tooling such as nodetool. The JMX server uses Scylla's REST API to communicate with a Scylla server.

Compiling

To compile the JMX server, run:

$ mvn --file scylla-jmx-parent/pom.xml package

Running

To start the JMX server, run:

$ ./scripts/scylla-jmx

To get help on supported options:

$ ./scripts/scylla-jmx --help

scylla-jmx's People

Contributors

ambantis, amnonh, amoskong, asias, avelanarius, avikivity, bhalevy, deexie, denesb, dependabot[bot], duarten, elcallio, haaawk, jul-stas, lmr, mykaul, nyh, penberg, sitano, slivne, syuu1228, tarzanek, tchaikov, tgrabiec, yaronkaikov, ycui1984

scylla-jmx's Issues

dist: when building Ubuntu package, maven tries to use /root/.m2/repository for local repository

mvn install
[ERROR] Could not create local repository at /root/.m2/repository -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/LocalRepositoryNotAccessibleException
make[1]: *** [override_dh_auto_build] Error 1
make[1]: Leaving directory `/home/ubuntu/workspace/scylla-ubuntu-deb/scylla-jmx'
make: *** [binary] Error 2
dpkg-buildpackage: error: fakeroot debian/rules binary gave error exit status 2
debuild: fatal error at line 1364:
dpkg-buildpackage -rfakeroot -D -us -uc failed

Build fails on Ubuntu 18.04

While building on Ubuntu 18.04, a java.security.InvalidAlgorithmParameterException is thrown when Maven starts downloading a .pom from an HTTPS URL:

[INFO] ------------------------------------------------------------------------
[INFO] Building Scylla JMX 1.0
[INFO] ------------------------------------------------------------------------
[INFO] Downloading from central: https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-resources-plugin/2.6/maven-resources-plugin-2.6.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.587 s
[INFO] Finished at: 2018-06-01T08:53:27Z
[INFO] Final Memory: 8M/120M
[INFO] ------------------------------------------------------------------------
[ERROR] Plugin org.apache.maven.plugins:maven-resources-plugin:2.6 or one of its dependencies could not be resolved: Failed to read artifact descriptor for org.apache.maven.plugins:maven-resources-plugin:jar:2.6: Could not transfer artifact org.apache.maven.plugins:maven-resources-plugin:pom:2.6 from/to central (https://repo.maven.apache.org/maven2): java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
debian/rules:8: recipe for target 'override_dh_auto_build' failed
make[1]: *** [override_dh_auto_build] Error 1
make[1]: Leaving directory '/build/scylla-jmx-666.development-20180601.71f857b'
debian/rules:32: recipe for target 'build' failed
make: *** [build] Error 2

This looks like a CA certificates problem. It may be solved by running update-ca-certificates before starting the build:
https://stackoverflow.com/questions/6784463/error-trustanchors-parameter-must-be-non-empty
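The "trustAnchors parameter must be non-empty" message indicates that the JVM found no trusted CA certificates. As a rough diagnostic sketch (the class and method names here are hypothetical and not part of scylla-jmx), the number of certificates in the JVM's default trust store can be counted before starting the build:

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper: counts the CA certificates the JVM trusts by default.
final class TrustAnchorCheck {
    static int countDefaultTrustAnchors() {
        try {
            TrustManagerFactory factory =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            factory.init((KeyStore) null); // null selects the JVM's default trust store
            int count = 0;
            for (TrustManager manager : factory.getTrustManagers()) {
                if (manager instanceof X509TrustManager) {
                    count += ((X509TrustManager) manager).getAcceptedIssuers().length;
                }
            }
            return count;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not load the default trust store", e);
        }
    }

    public static void main(String[] args) {
        // A count of 0 would be consistent with the error in the build log above.
        System.out.println("Default trust anchors: " + countDefaultTrustAnchors());
    }
}
```

If this prints 0 on the build machine, the trust store is empty and HTTPS downloads from Maven Central can be expected to fail in the way shown above.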

Missing dependency on /usr/bin/hostname

From scylla-jmx.spec.in:

%pre
/usr/sbin/groupadd scylla 2> /dev/null || :
/usr/sbin/useradd -g scylla -s /sbin/nologin -r -d ${_sharedstatedir}/scylla scylla 2> /dev/null || :
ping -c1 `hostname` > /dev/null 2>&1
if [ $? -ne 0 ]; then
echo
echo "**************************************************************"
echo "* WARNING: You need to add hostname on /etc/hosts, otherwise *"
echo "*          scylla-jmx will not able to start up.             *"
echo "**************************************************************"
echo
fi

If hostname is not installed, this fails.

Could be trivially fixed by adding Requires: /usr/bin/hostname, but do we really need this?

Log spammed

Dec 28 19:20:18 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:18 db scylla-jmx[29165]: Dec 28, 2015 5:20:18 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:18 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:17 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:17 db scylla-jmx[29165]: Dec 28, 2015 5:20:17 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:17 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:16 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:16 db scylla-jmx[29165]: Dec 28, 2015 5:20:16 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:16 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:15 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:15 db scylla-jmx[29165]: Dec 28, 2015 5:20:15 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:15 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:14 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:14 db scylla-jmx[29165]: Dec 28, 2015 5:20:14 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:14 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:13 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:13 db scylla-jmx[29165]: Dec 28, 2015 5:20:13 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:13 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:12 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:12 db scylla-jmx[29165]: Dec 28, 2015 5:20:12 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:12 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:11 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:11 db scylla-jmx[29165]: Dec 28, 2015 5:20:11 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:11 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:10 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:10 db scylla-jmx[29165]: Dec 28, 2015 5:20:10 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:10 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:09 db scylla-jmx[29165]: INFO:  getDroppedMessages()
Dec 28 19:20:09 db scylla-jmx[29165]: Dec 28, 2015 5:20:09 PM org.apache.cassandra.net.MessagingService log
Dec 28 19:20:09 db scylla-jmx[29165]: getDroppedMessages()
Dec 28 19:20:08 db scylla-jmx[29165]: INFO:  getDroppedMessages()

These should be debug or trace.
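The log format above looks like java.util.logging output, so one way to quiet it would be to log per-call traces at FINE instead of INFO. This is a minimal sketch under that assumption; QuietCallLogger and its methods are hypothetical stand-ins, not the actual MessagingService code:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the per-call logging seen in the journal output.
final class QuietCallLogger {
    private static final Logger LOGGER = Logger.getLogger(QuietCallLogger.class.getName());

    // Routine per-call traces go to FINE, which is below the usual INFO threshold.
    static void log(String message) {
        LOGGER.fine(message);
    }

    // Lets an operator turn the traces back on when debugging.
    static void setTraceEnabled(boolean enabled) {
        LOGGER.setLevel(enabled ? Level.FINE : Level.INFO);
    }

    // True when a FINE record would pass this logger's level check.
    static boolean traceEnabled() {
        return LOGGER.isLoggable(Level.FINE);
    }

    public static void main(String[] args) {
        setTraceEnabled(false);
        log(" getDroppedMessages()"); // dropped by the logger's INFO level
        setTraceEnabled(true);
        log(" getDroppedMessages()"); // passes the logger; handlers apply their own level too
    }
}
```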

Unimplemented methods in StorageService.java silently doing the wrong thing

scylla-jmx's StorageService.java includes a dozen or so variants of the method forceRepairAsync(). Each one takes a different combination of parameters and is supposed to call the generic repairAsync() function, which takes a general option map.

However, some of these variants wrongly run repairAsync() with an empty option map, and some are even more broken, in that they don't even call repairAsync()!

Eventually we should correctly fix all these variants, as I did for one variant - forceRepairAsync(String keyspace, int parallelismDegree, Collection dataCenters, Collection hosts, boolean primaryRange, boolean fullRepair, String... columnFamilies).

But perhaps it is wiser to start with all the wrong variants throwing some sort of "unimplemented" exception - instead of silently doing the wrong thing, as they do now.

By the way, looking at other methods in the same file (StorageService.java), I see many other methods, not just forceRepair variants, doing the wrong thing. For example, we have

    @Override
    public void takeMultipleColumnFamilySnapshot(String tag,
            String... columnFamilyList) throws IOException {
        // TODO Auto-generated method stub
        log(" takeMultipleColumnFamilySnapshot");
    }

A user of this method will not know that it was never actually implemented and will assume it worked correctly. I think all these methods should either be fixed or be made to report that an unimplemented method was used.
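As an illustration of the "throw instead of silently succeeding" option, here is a minimal sketch. SnapshotOperations and FailFastStorageService are hypothetical, trimmed-down stand-ins for the real MBean interface and class. UnsupportedOperationException is used because it is a JDK type that any JMX client can deserialize.

```java
import java.io.IOException;

// Hypothetical, trimmed-down stand-in for the real MBean interface.
interface SnapshotOperations {
    void takeMultipleColumnFamilySnapshot(String tag, String... columnFamilyList) throws IOException;
}

final class FailFastStorageService implements SnapshotOperations {
    // One consistent error so callers can tell "not implemented" from "succeeded".
    static UnsupportedOperationException unimplemented(String operation) {
        return new UnsupportedOperationException(operation + " is not implemented");
    }

    @Override
    public void takeMultipleColumnFamilySnapshot(String tag, String... columnFamilyList)
            throws IOException {
        // Fail loudly instead of logging and returning as if the snapshot was taken.
        throw unimplemented("takeMultipleColumnFamilySnapshot");
    }

    public static void main(String[] args) throws IOException {
        try {
            new FailFastStorageService().takeMultipleColumnFamilySnapshot("tag", "ks.cf");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```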

alternate port switch not being respected

Tried to start JMX on a non-standard port. The option was ignored and we are listening on 7199 nevertheless.

$ java -jar target/scylla-jmx-1.0.jar -Dcassandra.jmx.local.port=7200
Connecting to http://localhost:10000
Starting the JMX server
JMX is not enabled to receive remote connections.
service:jmx:rmi://localhost/jndi/rmi://localhost:7199/jmxrmi
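One possible explanation, not confirmed in this report: the JVM treats everything after "-jar <file>" as program arguments, so a -D option placed after the jar name arrives in main(String[] args) and never becomes a system property. The sketch below shows how a launcher could tolerate both placements. JmxPortOption, fromArgs and resolvePort are hypothetical names; the property name and the 7199 default are taken from the output above.

```java
// Hypothetical helper showing where a "-D" option ends up depending on its position.
final class JmxPortOption {
    static final String PROPERTY = "cassandra.jmx.local.port"; // name from the report
    static final int DEFAULT_PORT = 7199;                       // port seen in the output

    // Finds a "-Dname=value" that was passed through as a plain program argument.
    static String fromArgs(String[] args, String name) {
        String prefix = "-D" + name + "=";
        for (String arg : args) {
            if (arg.startsWith(prefix)) {
                return arg.substring(prefix.length());
            }
        }
        return null;
    }

    // Prefers a real system property, then a trailing argument, then the default.
    static int resolvePort(String[] args) {
        String value = System.getProperty(PROPERTY);
        if (value == null) {
            value = fromArgs(args, PROPERTY);
        }
        return value == null ? DEFAULT_PORT : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        System.out.println("JMX port: " + resolvePort(args));
    }
}
```

Independently of any such fallback, placing the option before -jar (java -Dcassandra.jmx.local.port=7200 -jar target/scylla-jmx-1.0.jar) makes it a real system property.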

RMI class loader errors in nodetool if JMX proxy is not connected

If the proxy is unable to connect to Scylla:

Connecting to http://localhost:10000
Starting the JMX server
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "Dropped messages" javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused
    at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:287)
    at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:255)
    at org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:700)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:228)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:444)
    at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:696)
    at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:420)
    at org.glassfish.jersey.client.JerseyInvocation$Builder.get(JerseyInvocation.java:316)
    at com.scylladb.jmx.api.APIClient.getRawValue(APIClient.java:152)
    at com.scylladb.jmx.api.APIClient.getRawValue(APIClient.java:169)
    at com.scylladb.jmx.api.APIClient.getReader(APIClient.java:196)
    at com.scylladb.jmx.api.APIClient.getJsonArray(APIClient.java:600)
    at com.scylladb.jmx.api.APIClient.getJsonArray(APIClient.java:607)
    at org.apache.cassandra.net.MessagingService.getDroppedMessages(MessagingService.java:190)
    at org.apache.cassandra.net.MessagingService$CheckDroppedMessages.run(MessagingService.java:129)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
    at sun.net.www.http.HttpClient.New(HttpClient.java:308)
    at sun.net.www.http.HttpClient.New(HttpClient.java:326)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1513)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
    at org.glassfish.jersey.client.internal.HttpUrlConnector._apply(HttpUrlConnector.java:394)
    at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:285)
    ... 18 more

We report RMI classloader issues to nodetool

[penberg@nero scylla-tools-java]$ ./bin/nodetool getlogginglevels

Logger Name                                        Log Level
error: javax.ws.rs.ProcessingException (no security manager: RMI class loader disabled)
-- StackTrace --
java.lang.ClassNotFoundException: javax.ws.rs.ProcessingException (no security manager: RMI class loader disabled)
    at sun.rmi.server.LoaderHandler.loadClass(LoaderHandler.java:396)
    at sun.rmi.server.LoaderHandler.loadClass(LoaderHandler.java:186)
    at java.rmi.server.RMIClassLoader$2.loadClass(RMIClassLoader.java:637)
    at java.rmi.server.RMIClassLoader.loadClass(RMIClassLoader.java:264)
    at sun.rmi.server.MarshalInputStream.resolveClass(MarshalInputStream.java:214)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:245)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:162)
    at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:906)
    at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:273)
    at com.sun.proxy.$Proxy7.getLoggingLevels(Unknown Source)
    at org.apache.cassandra.tools.NodeProbe.getLoggingLevels(NodeProbe.java:1293)
    at org.apache.cassandra.tools.NodeTool$GetLoggingLevels.execute(NodeTool.java:2684)
    at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:288)
    at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)

We should clean this up and return a "Scylla API server is unavailable" exception or something like that.
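A minimal sketch of that cleanup, assuming the REST call can be wrapped at a single point: catch the client-library exception on the server side and rethrow a JDK exception type that nodetool can always deserialize. RmiSafeErrors and TransportException are hypothetical; TransportException stands in for javax.ws.rs.ProcessingException so that the example has no external dependencies.

```java
import java.util.function.Supplier;

final class RmiSafeErrors {
    // Hypothetical stand-in for a client-library exception that the
    // remote caller does not have on its classpath.
    static final class TransportException extends RuntimeException {
        TransportException(String message) {
            super(message);
        }
    }

    // Runs the API call and rethrows transport failures as a JDK exception.
    // The original exception is deliberately not attached as the cause,
    // because the cause would be serialized to the remote client as well.
    static <T> T call(Supplier<T> apiCall) {
        try {
            return apiCall.get();
        } catch (TransportException e) {
            throw new IllegalStateException(
                    "Scylla API server is unavailable: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        try {
            call(() -> {
                throw new TransportException("Connection refused");
            });
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```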

Scylla-JMX remote connectivity lost in DPDK mode

Versions:

scylla-jmx-1.2.3-20160731.ce0f696.el7.centos.noarch
scylla-tools-1.2.3-20160731.d26e9da.el7.centos.noarch

In DPDK mode, after scylla-jmx startup, a remote nodetool could not connect to scylla-jmx.
I had the remote nodetool connectivity working in the same setup when experimenting with POSIX mode.
Moving to DPDK mode, I lost this connectivity.

When Scylla-server is run in DPDK mode it consumes a complete physical interface with the native stack owning it.
To provide remote JMX connectivity, Scylla-JMX needs to listen on a different interface/address.

Current configurable params only deal with JMX port (see PARAM_JMX_PORT in /usr/lib/scylla/jmx/scylla-jmx).
There are API listen port and API address params but no JMX address param.

After some debugging/experimentation/googling I came across "-Djava.rmi.server.hostname" as the option.
Setting this to the non-DPDK owned IP address in /usr/lib/scylla/jmx/scylla-jmx made the connectivity work when remote mode is enabled.

Here's a summary of my observations:

(DPDK mode) When scylla-jmx is run in local only mode then this setting is not needed.
(DPDK mode) When scylla-jmx is run in remote mode then this setting is needed. Both local and remote nodetools need this to be set.
(POSIX mode) When scylla-jmx is run in either local/remote mode this setting is not needed.

The configuration should be made more consistent across the POSIX and DPDK modes if possible or else documented.
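The java.rmi.server.hostname property controls the host name that RMI embeds in the stubs it hands to clients, which is why it affects remote connectivity. As a sketch only (RmiHostname and the 192.0.2.10 address are hypothetical, and nothing here reflects how the scylla-jmx script sets the option), the same setting could be applied programmatically, provided it happens before any RMI object is exported:

```java
// Hypothetical helper: sets the address advertised in RMI stubs.
final class RmiHostname {
    static final String PROPERTY = "java.rmi.server.hostname";

    // Keeps an explicit -Djava.rmi.server.hostname from the command line,
    // otherwise applies the configured address. Returns the effective value.
    static String advertise(String address) {
        if (System.getProperty(PROPERTY) == null) {
            System.setProperty(PROPERTY, address);
        }
        return System.getProperty(PROPERTY);
    }

    public static void main(String[] args) {
        // 192.0.2.10 is a documentation-only placeholder address.
        String address = args.length > 0 ? args[0] : "192.0.2.10";
        System.out.println("RMI stubs will advertise: " + advertise(address));
    }
}
```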

scylla-jmx process crashes on out of memory

scylla 2.1.1

Apr 23 23:55:01 ... scylla-jmx[10764]: Exception in thread "main" javax.management.RuntimeErrorException: Error thrown in preRegister method
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.throwMBeanRegistrationException(DefaultMBeanServerInterceptor.java:988)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.preRegister(DefaultMBeanServerInterceptor.java:1009)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:919)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.metrics.APIMBean.checkRegistration(APIMBean.java:78)
Apr 23 23:55:01 ... scylla-jmx[10764]: at org.apache.cassandra.db.ColumnFamilyStore.checkRegistration(ColumnFamilyStore.java:112)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.utils.APIMBeanServer.checkRegistrations(APIMBeanServer.java:280)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.utils.APIMBeanServer.queryNames(APIMBeanServer.java:95)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.main.Main.main(Main.java:49)
Apr 23 23:55:01 ... scylla-jmx[10764]: Caused by: java.lang.OutOfMemoryError: Java heap space
Apr 23 23:55:01 ... scylla-jmx[10764]: at java.util.Arrays.copyOfRange(Arrays.java:3664)
Apr 23 23:55:01 ... scylla-jmx[10764]: at java.lang.String.<init>(String.java:207)
Apr 23 23:55:01 ... scylla-jmx[10764]: at java.lang.String.substring(String.java:1969)
Apr 23 23:55:01 ... scylla-jmx[10764]: at javax.management.ObjectName.getDomain(ObjectName.java:1566)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.metrics.MetricsMBean.lambda$getTypePredicate$0(MetricsMBean.java:46)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.metrics.MetricsMBean$$Lambda$8/1784662007.test(Unknown Source)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.metrics.APIMBean$1.apply(APIMBean.java:105)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.objectNamesFromFilteredNamedObjects(DefaultMBeanServerInterceptor.java:1521)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.queryNamesImpl(DefaultMBeanServerInterceptor.java:564)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.queryNames(DefaultMBeanServerInterceptor.java:554)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.mbeanserver.JmxMBeanServer.queryNames(JmxMBeanServer.java:619)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.metrics.APIMBean.queryNames(APIMBean.java:97)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.metrics.MetricsMBean.register(MetricsMBean.java:52)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.scylladb.jmx.metrics.MetricsMBean.preRegister(MetricsMBean.java:66)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.mbeanserver.MBeanSupport.preRegister(MBeanSupport.java:167)
Apr 23 23:55:01 ... scylla-jmx[10764]: at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.preRegister(DefaultMBeanServerInterceptor.java:1007)
Apr 23 23:55:01 ... scylla-jmx[10764]: ... 9 more
Apr 23 23:55:01 ... systemd[1]: scylla-jmx.service: main process exited, code=exited, status=1/FAILURE
Apr 23 23:55:01 ... systemd[1]: Unit scylla-jmx.service entered failed state.
Apr 23 23:55:01 ... systemd[1]: scylla-jmx.service failed.

scylla-jmx is not restarted when scylla is killed

Extracted from scylladb/scylladb#1776

All of these issues seem to be related to the fact that the JMX process is not restarted if Scylla is killed.

I reproduced this on the AMI using the instructions that Lucas provided:

  1. started a cluster of 6 servers - 3 should be enough in my view (i2.2x)
  2. started a loader (c4.8x)
  3. ran c-s according to the command provided
  4. I waited for the load to run a bit (5 minutes)
  5. entered a server
  6. found the PID of scylla and killed only it: sudo kill -9 <id>
  7. scylla-jmx does not come up (even after waiting 15 minutes)

Oct 30 15:50:34 ip-172-30-0-149 scylla[2392]:  [shard 5] compaction - Compacting [/var/lib/scylla/data/ks2/standard1-19ab33009eb811e68c7d000000000003/ks2-standard1-ka-37-Data.db:level=0, /var/lib/scylla/data/ks2
Oct 30 15:50:50 ip-172-30-0-149 scylla[2392]:  [shard 5] compaction - Compacted 4 sstables to [/var/lib/scylla/data/ks2/standard1-19ab33009eb811e68c7d000000000003/ks2-standard1-ka-69-Data.db:level=0, ]. 94138259
Oct 30 15:51:22 ip-172-30-0-149 scylla[2392]:  [shard 2] compaction - Compacting [/var/lib/scylla/data/ks2/standard1-19ab33009eb811e68c7d000000000003/ks2-standard1-ka-18-Data.db:level=0, /var/lib/scylla/data/ks2
Oct 30 15:51:34 ip-172-30-0-149 sudo[3737]:   centos : TTY=pts/0 ; PWD=/home/centos ; USER=root ; COMMAND=/bin/kill -9 2392
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-server.service: main process exited, code=killed, status=9/KILL
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Stopped Run Scylla Housekeeping daily.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Stopping Run Scylla Housekeeping daily.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Stopping Scylla JMX...
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Unit scylla-server.service entered failed state.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-server.service failed.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Requested transaction contradicts existing jobs: Resource deadlock avoided
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-server.service holdoff time over, scheduling restart.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-server.service failed to schedule restart job: Transaction is destructive.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Unit scylla-server.service entered failed state.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-server.service failed.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-jmx.service: main process exited, code=exited, status=143/n/a
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Stopped Scylla JMX.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Unit scylla-jmx.service entered failed state.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-jmx.service failed.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Starting Scylla Server...
Oct 30 15:51:34 ip-172-30-0-149 scylla_prepare[3748]: Setting a physical interface eth0...
Oct 30 15:51:34 ip-172-30-0-149 scylla_prepare[3748]: Setting mask 00000011 in /proc/irq/100/smp_affinity
Oct 30 15:51:34 ip-172-30-0-149 scylla_prepare[3748]: Setting mask 00000022 in /proc/irq/101/smp_affinity
Oct 30 15:51:34 ip-172-30-0-149 scylla_prepare[3748]: Setting mask 00000044 in /proc/irq/102/smp_affinity
Oct 30 15:51:34 ip-172-30-0-149 scylla_prepare[3748]: Setting mask 00000011 in /sys/class/net/eth0/queues/tx-0/xps_cpus
Oct 30 15:51:34 ip-172-30-0-149 scylla_prepare[3748]: Setting mask 00000022 in /sys/class/net/eth0/queues/tx-1/xps_cpus
Oct 30 15:51:35 ip-172-30-0-149 scylla[3783]: Scylla version 1.4.rc3-20161028.e87bed5 starting ...
Oct 30 15:51:35 ip-172-30-0-149 collectd[2651]: network plugin: Ignoring notification with unknown severity 0.
[centos@ip-172-30-0-149 ~]$ sudo systemctl status scylla-jmx.service
โ— scylla-jmx.service - Scylla JMX
   Loaded: loaded (/usr/lib/systemd/system/scylla-jmx.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sun 2016-10-30 15:51:34 UTC; 14min ago
  Process: 2409 ExecStart=/usr/lib/scylla/jmx/scylla-jmx -l /usr/lib/scylla/jmx (code=exited, status=143)
 Main PID: 2409 (code=exited, status=143)

Oct 30 15:30:09 ip-172-30-0-149 scylla-jmx[2409]: Using config file: /etc/scylla/scylla.yaml
Oct 30 15:30:11 ip-172-30-0-149 scylla-jmx[2409]: Connecting to http://127.0.0.1:10000
Oct 30 15:30:11 ip-172-30-0-149 scylla-jmx[2409]: Starting the JMX server
Oct 30 15:30:11 ip-172-30-0-149 scylla-jmx[2409]: JMX is not enabled to receive remote connections.
Oct 30 15:30:12 ip-172-30-0-149 scylla-jmx[2409]: service:jmx:rmi://localhost/jndi/rmi://localhost:7199/jmxrmi
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Stopping Scylla JMX...
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-jmx.service: main process exited, code=exited, status=143/n/a
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Stopped Scylla JMX.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: Unit scylla-jmx.service entered failed state.
Oct 30 15:51:34 ip-172-30-0-149 systemd[1]: scylla-jmx.service failed.

Checking the scylla-jmx.service definition:

cat /usr/lib/systemd/system/scylla-jmx.service
[Unit]
Description=Scylla JMX
Requisite=scylla-server.service
After=scylla-server.service
BindsTo=scylla-server.service

[Service]
Type=simple
EnvironmentFile=/etc/sysconfig/scylla-jmx
User=scylla
Group=scylla
ExecStart=/usr/lib/scylla/jmx/scylla-jmx -l /usr/lib/scylla/jmx
KillMode=process
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

This is also true when using the 1.4 / 1.3 AMI without running any load: killing the scylla process with kill -9 causes scylla-jmx to exit and fail, and it does not restart (scylla does restart).

Nodetool cfstats stops with an error

nodetool cfstats returns an error and does not print all the tables and keyspaces for keyspaces other than system.

For example:
Bloom filter space used: 0
Bloom filter off heap memory used: 524336
Index summary off heap memory used: 0
Compression metadata off heap memory used: 0
Compacted partition minimum bytes: 259
Compacted partition maximum bytes: 310
nodetool: For input string: "310.000000"
See 'nodetool help' or 'nodetool help <command>'.
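The message format matches what Long.parseLong produces for a non-integer string, so a plausible reading (an assumption, since the report does not show where the value is parsed) is that an integer field arrived formatted as "310.000000". A minimal sketch of a more tolerant parse, using the hypothetical helper LenientLong:

```java
import java.math.BigDecimal;

// Hypothetical helper: accepts integers that arrive with a zero fractional part.
final class LenientLong {
    // Parses "310" and "310.000000" to 310. A value with a non-zero fraction,
    // such as "310.5", is rejected with ArithmeticException rather than truncated.
    static long parse(String text) {
        return new BigDecimal(text.trim()).longValueExact();
    }

    public static void main(String[] args) {
        System.out.println(parse("310.000000"));
        try {
            Long.parseLong("310.000000");
        } catch (NumberFormatException e) {
            // Same wording as the nodetool error above.
            System.out.println(e.getMessage());
        }
    }
}
```

Rejecting non-zero fractions keeps the parse strict about real data errors while accepting the formatting seen in the report.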

dist: 0.13.1 on centos - jmx client is not able to connect to server

When installing scylla-server and scylla-jmx from 0.13.1 on CentOS 7, the JMX client is not able to connect to the server.

scylla-jmx is using the correct conf file, yet it is connecting to 127.0.0.1, which is not present. @amnonh, how does scylla-jmx decide which IP to connect to?

Dec 12 21:03:44 server-01.localdomain scylla-jmx[8951]: Using config file: /etc/scylla/scylla.yaml
Dec 12 21:03:44 server-01.localdomain scylla-jmx[8951]: Connecting to http://127.0.0.1:10000
Dec 12 21:03:44 server-01.localdomain scylla-jmx[8951]: Starting the JMX server
Dec 12 21:03:44 server-01.localdomain scylla-jmx[8951]: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
Dec 12 21:03:44 server-01.localdomain scylla-jmx[8951]: SLF4J: Defaulting to no-operation (NOP) logger implementation
Dec 12 21:03:44 server-01.localdomain scylla-jmx[8951]: SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: getDroppedMessages()
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: Dec 12, 2015 9:03:45 PM org.apache.cassandra.net.MessagingService log
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: INFO:  getDroppedMessages()
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: Exception in thread "Dropped messages" javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:287)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:255)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:700)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.internal.Errors.process(Errors.java:228)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:444)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:696)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:420)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.client.JerseyInvocation$Builder.get(JerseyInvocation.java:316)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at com.cloudius.urchin.api.APIClient.getRawValue(APIClient.java:150)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at com.cloudius.urchin.api.APIClient.getRawValue(APIClient.java:167)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at com.cloudius.urchin.api.APIClient.getReader(APIClient.java:194)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at com.cloudius.urchin.api.APIClient.getJsonArray(APIClient.java:598)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at com.cloudius.urchin.api.APIClient.getJsonArray(APIClient.java:605)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.apache.cassandra.net.MessagingService.getDroppedMessages(MessagingService.java:190)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.apache.cassandra.net.MessagingService$CheckDroppedMessages.run(MessagingService.java:128)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.util.TimerThread.mainLoop(Timer.java:555)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.util.TimerThread.run(Timer.java:505)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: Caused by: java.net.ConnectException: Connection refused
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.net.PlainSocketImpl.socketConnect(Native Method)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.net.Socket.connect(Socket.java:589)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.http.HttpClient.New(HttpClient.java:308)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.http.HttpClient.New(HttpClient.java:326)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1513)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.client.internal.HttpUrlConnector._apply(HttpUrlConnector.java:394)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: at org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:285)
Dec 12 21:03:45 server-01.localdomain scylla-jmx[8951]: ... 18 more

nodetool cfstats show invalid

When using nodetool cfstats without leveled compaction, it shows:
SSTables in each level: [ Space used (live): 0
The expected output has no "SSTables in each level" line, and "Space used (live)" appears on a new line.

README mistake?

@amnonh
Your instructions to run jmx-urchin are:

java -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -jar target/urchin-mbean-1.0.jar

The README is just

java -jar target/urchin-mbean-1.0.jar

which is it?

scylla-jmx has significant GC activity when not used

There is no external load on scylla-jmx, but the GC log shows minor and major collections occurring, inducing pauses of up to 200ms.

This is probably caused by scylla-jmx polling scylla every second for some metrics even when not used.

1080.775: Application time: 0.2937128 seconds
1080.775: Total time for which application threads were stopped: 0.0001911 seconds, Stopping threads took: 0.0000178 seconds
1080.775: Application time: 0.0000257 seconds
1080.775: Total time for which application threads were stopped: 0.0000717 seconds, Stopping threads took: 0.0000089 seconds
1080.775: Application time: 0.0000209 seconds
1080.775: Total time for which application threads were stopped: 0.0000731 seconds, Stopping threads took: 0.0000084 seconds
1080.775: Application time: 0.0000132 seconds
1080.775: Total time for which application threads were stopped: 0.0000630 seconds, Stopping threads took: 0.0000062 seconds
1080.775: Application time: 0.0000136 seconds
1080.775: Total time for which application threads were stopped: 0.0000631 seconds, Stopping threads took: 0.0000062 seconds
1080.776: Application time: 0.0000350 seconds
1080.776: Total time for which application threads were stopped: 0.0000699 seconds, Stopping threads took: 0.0000063 seconds
1095.481: Application time: 14.7054517 seconds
1095.482: [Full GC (Allocation Failure)  237491K->30993K(253440K), 0.2362866 secs]
1095.718: Total time for which application threads were stopped: 0.2369596 seconds, Stopping threads took: 0.0000210 seconds
1097.718: Application time: 2.0001543 seconds
1097.718: Total time for which application threads were stopped: 0.0001689 seconds, Stopping threads took: 0.0000208 seconds
1110.482: Application time: 12.7636455 seconds
1110.483: [GC (Allocation Failure)  100945K->52084K(253440K), 0.0860028 secs]
1110.569: Total time for which application threads were stopped: 0.0868157 seconds, Stopping threads took: 0.0003010 seconds
1126.776: Application time: 16.2074052 seconds
1126.777: [GC (Allocation Failure)  122036K->71547K(253440K), 0.1061292 secs]
1126.883: Total time for which application threads were stopped: 0.1065841 seconds, Stopping threads took: 0.0000201 seconds
1142.778: Application time: 15.8951925 seconds
1142.778: [GC (Allocation Failure)  141499K->91035K(253440K), 0.1146070 secs]
1142.894: Total time for which application threads were stopped: 0.1155050 seconds, Stopping threads took: 0.0000227 seconds
1158.781: Application time: 15.8874965 seconds
1158.781: [GC (Allocation Failure)  160987K->110368K(253440K), 0.1066694 secs]

Build fails on JDK 1.7 environment

Right now our package script accepts both JDK 1.7 and 1.8, but since commit 9c2d6ce we specify 1.8 as the target VM.
This causes compilation to fail in JDK 1.7 environments:

[INFO] Compiling 63 source files to /home/syuu/scylla-jmx/target/classes
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR : 
[INFO] -------------------------------------------------------------
[ERROR] Failure executing javac, but could not parse the error:
javac: invalid target release: 1.8
Usage: javac <options> <source files>
use -help for a list of possible options

It may break packaging on multiple distributions: even where JDK 1.8 is available in the repository, we only ask for the 'default version of SDK' to be installed, so the build will fail.

The question is whether we need Java 1.8 for JMX, since we recently started to use 1.8 features.
Or was the commit wrong, and we can keep using 1.7?

dist: errors/warnings on ubuntu package building

We need a patchset similar to the one we merged for scylla-server.

Now running lintian...
warning: the authors of lintian do not recommend running it with root privileges!
W: scylla-jmx source: native-package-with-dash-version
W: scylla-jmx source: diff-contains-git-control-dir .git
W: scylla-jmx source: diff-contains-editor-backup-file .pom.xml.swp
W: scylla-jmx source: missing-license-paragraph-in-dep5-copyright agpl-3.0 (paragraph at line 10)
W: scylla-jmx source: ancient-standards-version 3.9.2 (current is 3.9.5)
W: scylla-jmx: extended-description-line-too-long
W: scylla-jmx: extra-license-file usr/share/doc/scylla-jmx/LICENSE.AGPL.gz
E: scylla-jmx: postrm-does-not-call-updaterc.d-for-init.d-script etc/init.d/scylla-jmx
W: scylla-jmx: init.d-script-not-marked-as-conffile etc/init.d/scylla-jmx
E: scylla-jmx: init.d-script-not-included-in-package etc/init.d/scylla-jmx
W: scylla-jmx: jar-not-in-usr-share usr/lib/scylla/jmx/urchin-mbean-1.0.jar
Finished running lintian.

scylla-jmx rpm caused NoClassDefFoundError on service startup

Looks like I'm missing a dependency package.

Sep 14 09:11:48 ip-172-30-0-168.ec2.internal java[5638]: Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
Sep 14 09:11:48 ip-172-30-0-168.ec2.internal java[5638]: at com.cloudius.urchin.main.Main.main(Main.java:22)
Sep 14 09:11:48 ip-172-30-0-168.ec2.internal java[5638]: Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig
Sep 14 09:11:48 ip-172-30-0-168.ec2.internal java[5638]: at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
Sep 14 09:11:48 ip-172-30-0-168.ec2.internal java[5638]: at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
Sep 14 09:11:48 ip-172-30-0-168.ec2.internal java[5638]: at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
Sep 14 09:11:48 ip-172-30-0-168.ec2.internal java[5638]: at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

Bad error message when non-existing keyspace is passed to cfhistograms

In the following command, by mistake I queried keyspace instead of keyspace1. Instead of a graceful error message, I got a scary stack trace:

$ nodetool cfhistograms keyspace standard1
error: org.apache.cassandra.metrics:type=Table,keyspace=keyspace,scope=standard1,name=EstimatedPartitionSizeHistogram
-- StackTrace --
javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=Table,keyspace=keyspace,scope=standard1,name=EstimatedPartitionSizeHistogram
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:643)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
	at com.scylladb.jmx.utils.APIMBeanServer.getAttribute(APIMBeanServer.java:130)
	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445)
	at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401)
	at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639)
	at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
	at sun.rmi.transport.Transport$1.run(Transport.java:200)
	at sun.rmi.transport.Transport$1.run(Transport.java:197)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
	at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:283)
	at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:260)
	at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
	at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
	at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
	at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:903)
	at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:273)
	at com.sun.proxy.$Proxy21.getValue(Unknown Source)
	at org.apache.cassandra.tools.NodeProbe.getColumnFamilyMetric(NodeProbe.java:1155)
	at org.apache.cassandra.tools.nodetool.TableHistograms.execute(TableHistograms.java:49)
	at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:265)
	at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:175)

build_deb.sh fails on Debian8

apt-get install openjdk-8-jre-headless fails because it depends on ca-certificates-java/jessie-backports 20161107~bpo8+1, but apt does not override official jessie packages by default.
So we need to install the newer ca-certificates-java before installing openjdk-8.

Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
debhelper is already the newest version.
debhelper set to manually installed.
devscripts is already the newest version.
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 openjdk-8-jdk-headless : Depends: openjdk-8-jre-headless (= 8u121-b13-1~bpo8+1) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

nodetool status should display formatted Load

The nodetool status command should display the Load parameter formatted.

$ nodetool status

Datacenter: datacenter1

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load         Tokens  Owns  Host ID                               Rack
UN  127.0.0.1  4.14157e+08  256     ?     292a6c7f-2063-484c-b54d-9015216f1750  rack1
UN  127.0.0.2  1.58411e+08  256     ?     102b6ecd-2081-4073-8172-bf818c35e27b  rack1

The Load field should be displayed with the proper units rather than as a raw number in scientific notation.
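
A minimal sketch of the kind of formatting being asked for, written in Python for brevity and not taken from nodetool or scylla-jmx. The function name `format_load`, the 1024-based units and the two-decimal precision are all assumptions made for illustration.

```python
def format_load(size_bytes):
    """Render a raw byte count with a unit suffix (assumed 1024-based units)."""
    value = float(size_bytes)
    if value < 1024:
        return f"{value:.0f} bytes"
    for unit in ("KB", "MB", "GB", "TB"):
        value /= 1024
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit this sketch knows about.
        if value < 1024 or unit == "TB":
            return f"{value:.2f} {unit}"


# The two raw Load values from the nodetool status output above.
for raw in (4.14157e+08, 1.58411e+08):
    print(format_load(raw))
```

With the two values from the report this prints 394.97 MB and 151.07 MB.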

JMX no longer shuts down when the connection to Scylla drops

scylla-jmx used to check that the Scylla API is available; if it was not, it shut down after 30s.

Since the change in commit "APIClient: Make API server errors human readable", which changes the thrown exception, this mechanism no longer works and the JMX does not shut down.
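
One way to make such a check robust against changes in the exception type is to key the shutdown decision on reachability alone. The following is a rough Python sketch of that idea, not the scylla-jmx implementation. The names `ApiWatchdog` and `api_reachable` are hypothetical, and the 30-second grace period is the only value taken from this report.

```python
import urllib.error
import urllib.request


class ApiWatchdog:
    """Tracks how long the REST API has been continuously unreachable."""

    def __init__(self, grace_seconds=30.0):
        self.grace_seconds = grace_seconds
        self.first_failure = None  # start time of the current failure streak

    def record(self, reachable, now):
        """Record one probe result; return True once the process should exit."""
        if reachable:
            self.first_failure = None
            return False
        if self.first_failure is None:
            self.first_failure = now
        return now - self.first_failure >= self.grace_seconds


def api_reachable(url, timeout=2.0):
    """Probe the API; only connection-level failures count as unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except OSError:
        # Covers refused connections, timeouts and urllib's URLError wrapper.
        return False
```

A caller would invoke `api_reachable` periodically, pass the result and the current time to `record`, and exit the process when `record` returns True. Because the watchdog only sees a boolean, changing how errors are wrapped or reported elsewhere does not affect it.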

Scylla-JMX should expose the number of shards

Relates to #3612
There are cases where the nodetool needs the number of shards.
For example, for tablestats it needs the number of shards to know how many SSTables there are per level under leveled compaction.

read parameters from configuration file

Right now, if we want to change the behavior of scylla-jmx, for example to accept remote connections, we need to change the systemd file itself.

That is both inconvenient and ugly by systemd standards. The best way to handle this is what we do in Scylla, using environment variables that are present in /etc/sysconfig/scylla-jmx.

The latter file already exists, but it is only passing some path variables. We should extend it to allow us to change scylla-jmx behavior (remote, ssl, auth, etc), through external configuration files.
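
To illustrate the idea, here is a small Python sketch that reads KEY=VALUE settings from a sysconfig-style file and turns them into JVM options. The variable names (`SCYLLA_JMX_PORT` and so on) are hypothetical and do not exist in the current file; the `-Dcom.sun.management.jmxremote.*` properties are the standard JVM ones.

```python
import shlex

# Hypothetical sysconfig variable names mapped to standard JMX system properties.
PROPERTY_FOR = {
    "SCYLLA_JMX_PORT": "com.sun.management.jmxremote.port",
    "SCYLLA_JMX_SSL": "com.sun.management.jmxremote.ssl",
    "SCYLLA_JMX_AUTH": "com.sun.management.jmxremote.authenticate",
    "SCYLLA_JMX_LOCAL_ONLY": "com.sun.management.jmxremote.local.only",
}


def parse_sysconfig(text):
    """Parse KEY=VALUE lines with shell-style quoting; skip blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def jmx_flags(settings):
    """Translate the recognised settings into -D options for the java command."""
    return [
        f"-D{prop}={settings[key]}"
        for key, prop in PROPERTY_FOR.items()
        if key in settings
    ]


example = 'SCYLLA_JMX_PORT="7199"\nSCYLLA_JMX_SSL=false\n'
print(jmx_flags(parse_sysconfig(example)))
```

In practice systemd can load such a file with `EnvironmentFile=` and the start script can expand the variables itself; the sketch only shows the mapping from settings to options.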

MBean test coverage

Scylla MBeans (via scylla-jmx process) are tested indirectly via nodetool.
It would be good to include direct JMX testing for each of the MBeans' methods and properties.

-Dcom.sun.management.jmxremote.host generates an error on Ubuntu

It seems that bash under Ubuntu does not respect the string substitution and generates an error instead:
~$ SCYLLA_HOME=/var/lib/scylla SCYLLA_CONF=/etc/scylla /usr/lib/scylla/jmx/scylla-jmx -l /usr/lib/scylla/jmx -r -Dcom.sun.management.jmxremote.host=0.0.0.0
/usr/lib/scylla/jmx/scylla-jmx: 101: /usr/lib/scylla/jmx/scylla-jmx: Bad substitution

dist: errors/warnings with rpmlint

build/rpmbuild/SPECS/scylla-jmx.spec:29: E: hardcoded-library-path in %{_prefix}/lib/scylla/
build/rpmbuild/SPECS/scylla-jmx.spec:33: E: hardcoded-library-path in %{_prefix}/lib/scylla
build/rpmbuild/SPECS/scylla-jmx.spec:34: E: hardcoded-library-path in %{_prefix}/lib/scylla/jmx
build/rpmbuild/SPECS/scylla-jmx.spec:35: E: hardcoded-library-path in %{_prefix}/lib/scylla/jmx/
build/rpmbuild/SPECS/scylla-jmx.spec:36: E: hardcoded-library-path in %{_prefix}/lib/scylla/jmx
build/rpmbuild/SPECS/scylla-jmx.spec:69: E: hardcoded-library-path in %{_prefix}/lib/scylla/jmx/jmx_run
build/rpmbuild/SPECS/scylla-jmx.spec:70: E: hardcoded-library-path in %{_prefix}/lib/scylla/jmx/urchin-mbean-1.0.jar
build/rpmbuild/SPECS/scylla-jmx.spec: W: invalid-url Source0: scylla-jmx-1.0-20151027.3bc0754.tar
0 packages and 1 specfiles checked; 7 errors, 1 warnings.

issue running cfstats

Running scylla-jmx master (dd8d5c8) against scylladb master (scylladb/scylladb@d9c80ca)

calling nodetool cfstats, I get:

Keyspace: keyspace1
	Read Count: 0
	Read Latency: NaN ms.
	Write Count: 8330327
	Write Latency: 7.1027223781251324E-6 ms.
	Pending Flushes: 0
		Table: standard1
		SSTable count: 52
		SSTables in each level: [0, 4, 48]
		Space used (live): 8803137728
		Space used (total): 8803137728
		Space used by snapshots (total): 0
		Off heap memory used (total): 183969660
		SSTable Compression Ratio: 0.0
		Number of keys (estimate): 8236890
		Memtable cell count: 93437
		Memtable data size: 157908530
		Memtable off heap memory used: 183500800
		Memtable switch count: 53
		Local read count: 0
		Local read latency: NaN ms
		Local write count: 8330327
		Local write latency: 0.007 ms
		Pending flushes: 0
		Bloom filter false positives: 0
		Bloom filter false ratio: 0.00000
		Bloom filter space used: 121456
		Bloom filter off heap memory used: 121452
		Index summary off heap memory used: 347408
		Compression metadata off heap memory used: 0
		Compacted partition minimum bytes: 925
		Compacted partition maximum bytes: 1109
		Compacted partition mean bytes: 1109
		Average live cells per slice (last five minutes): 0.0
error: java.lang.Long cannot be cast to java.lang.Double
-- StackTrace --
java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Double
	at com.sun.proxy.$Proxy22.getMax(Unknown Source)
	at org.apache.cassandra.tools.NodeTool$CfStats.execute(NodeTool.java:902)
	at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:288)
	at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)

dist: remove / comment out some of the sysconfig args

We need to remove the address and port from /etc/sysconfig/scylla-jmx, as the Java process now extracts this information from scylla.yaml.

The start command of the scylla-jmx service (/usr/lib/systemd/system/scylla-jmx.service) should also be updated to reflect that.

The proxy needs to cache some of the requests

Some requests, such as network topology, tokens and nodes, are used multiple times in a single nodetool command and are unlikely to change within a short interval.

As a result, nodetool takes a long time. A simple solution would be to add a local cache to the APIClient: selected requests could be marked so that their replies are cached (with some reasonable TTL). This would speed up nodetool with little change to the code.
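
A minimal sketch of such a TTL cache, in Python rather than the project's Java and not based on the real APIClient. The class name `TTLCache`, the `fetch` callback, the injectable `clock` and the example path are all assumptions for illustration.

```python
import time


class TTLCache:
    """Caches replies per request path and refetches once an entry is older than the TTL."""

    def __init__(self, fetch, ttl_seconds=1.0, clock=time.monotonic):
        self._fetch = fetch        # function that performs the real request
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the cache can be tested without sleeping
        self._entries = {}         # path -> (time stored, reply)

    def get(self, path):
        now = self._clock()
        hit = self._entries.get(path)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        reply = self._fetch(path)
        self._entries[path] = (now, reply)
        return reply


# Usage with a stand-in fetch function; "/example/path" is a placeholder, not a real endpoint.
requests_made = []
cache = TTLCache(lambda path: requests_made.append(path) or "reply", ttl_seconds=5.0)
cache.get("/example/path")
cache.get("/example/path")
print(len(requests_made))  # the second call is served from the cache
```

Only requests known to change rarely would go through the cache; everything else would keep calling the API directly, so stale data is limited to at most one TTL for the selected requests.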

api: Add force_remove_endpoint for gossip

It is used to force-remove a node from gossip membership if something
goes wrong, or to speed up the removal of a decommissioned node's
gossip state from the cluster.

Note: run the force_remove_endpoint API on all the nodes in the
cluster at the same time, to prevent the removed node from coming
back. Nodes that have not run the force_remove_endpoint API command
can gossip the removed node's information to other nodes within
2 * ring_delay (2 * 30 seconds by default).

For instance, in a 3-node cluster where node 3 has been decommissioned,
to remove node 3 from gossip membership before the automatic removal
(3 days by default), run the API command on both node 1 and node 2 at
the same time.

$ curl -X POST --header "Accept: application/json" \
    "http://127.0.0.1:10000/gossiper/force_remove_endpoint/127.0.0.3"
$ curl -X POST --header "Accept: application/json" \
    "http://127.0.0.2:10000/gossiper/force_remove_endpoint/127.0.0.3"

Then run 'nodetool gossipinfo' on all the nodes to check the removed nodes
are not present.
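
Since the calls have to reach all remaining nodes at roughly the same time, they can be scripted. The following Python sketch issues the same POST requests as the curl commands above in parallel. The helper names are hypothetical; the port and URL path are the ones shown in the curl commands.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def force_remove_urls(live_nodes, removed_node, port=10000):
    """Build one force_remove_endpoint URL per remaining node."""
    return [
        f"http://{node}:{port}/gossiper/force_remove_endpoint/{removed_node}"
        for node in live_nodes
    ]


def post(url, timeout=5.0):
    """Send an empty POST and return the HTTP status code."""
    request = Request(url, method="POST", headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return response.status


def force_remove_everywhere(live_nodes, removed_node):
    """Issue the requests concurrently so all nodes drop the endpoint together."""
    urls = force_remove_urls(live_nodes, removed_node)
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        return list(pool.map(post, urls))


# URLs for the 3-node example: node 3 is removed via node 1 and node 2.
print(force_remove_urls(["127.0.0.1", "127.0.0.2"], "127.0.0.3"))
```

Calling `force_remove_everywhere(["127.0.0.1", "127.0.0.2"], "127.0.0.3")` would perform the actual requests; it raises if any node cannot be reached, which is a reasonable signal to retry on all nodes.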

cfstats command fails

[root@m2 apache-cassandra-2.1.9]# nodetool --port=7199 cfstats
Keyspace: system
error: null
-- StackTrace --
java.lang.NullPointerException
    at com.cloudius.urchin.api.APIClient.getHistogramValue(APIClient.java:554)
    at com.cloudius.urchin.api.APIClient.getHistogramValue(APIClient.java:564)
    at com.yammer.metrics.core.APIHistogram.update(APIHistogram.java:105)
    at com.yammer.metrics.core.APIHistogram.count(APIHistogram.java:129)
    at com.yammer.metrics.core.Timer.count(Timer.java:108)
    at com.yammer.metrics.reporting.JmxReporter$Meter.getCount(JmxReporter.java:118)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
    at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1448)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1312)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1404)
    at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:641)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$93(TCPTransport.java:683)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


scylla-jmx SLF4J errors

When starting nodetool service on Scylla AMI 0.15 I get the following errors in the log:

Jan 12 08:33:35 ip-172-31-26-164 scylla-jmx[1792]: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
Jan 12 08:33:35 ip-172-31-26-164 scylla-jmx[1792]: SLF4J: Defaulting to no-operation (NOP) logger implementation
Jan 12 08:33:35 ip-172-31-26-164 scylla-jmx[1792]: SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

nodetool seems to work fine, but we need to clean up these errors.

nodetool repair st/et options broken

As explained in scylladb's commit f9ee74f5, scylladb expects the nodetool options "st" and "et" to generate the "startToken" and "endToken" REST API parameters, respectively. The scylladb code in repair.cc intersects this user-given range with the actual ranges held by the node.

scylla-jmx commit 4ed0497 implemented new variants of repairRangeAsync() functions which in this case set the "ranges" parameter instead of the startToken/endToken parameters. This is incorrect. It should set the startToken/endToken parameters, as one pre-existing forceRepairRangeAsync() implementation does.

While at it we should:

  1. check that nothing else got broken in those new repair variants
  2. add a test for "-st"/"-et" to the dtest, so we can't break it again.
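
The expected behaviour can be stated as a small function plus checks. This Python sketch only illustrates the parameter mapping described above and is not code from scylla-jmx or the dtest suite. The function name `repair_query` is hypothetical; `startToken` and `endToken` are the parameter names given in this report.

```python
from urllib.parse import urlencode


def repair_query(start_token=None, end_token=None):
    """Build the query string for a repair restricted by -st/-et.

    The tokens are passed through as startToken/endToken and never
    folded into a "ranges" parameter.
    """
    params = {}
    if start_token is not None:
        params["startToken"] = str(start_token)
    if end_token is not None:
        params["endToken"] = str(end_token)
    return urlencode(params)


# What a regression test for "-st"/"-et" would assert.
query = repair_query(start_token=-100, end_token=200)
assert "startToken=-100" in query
assert "endToken=200" in query
assert "ranges" not in query
```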

urchin-jmx should "kill itself" in case it cannot connect to the urchin process

If the urchin process crashes or does not boot, we end up in a situation in which the jmx process is still up. This is not compatible with ORIGIN, in which there is only a single process.

urchin-jmx should assume that it is started after an urchin process is up, and if that process is not available it should not start.

The same holds for the case when urchin crashes: urchin-jmx should also "kill itself".
