hazelcast-gcp's Issues

Members don't find each other when network interfaces are provided in a certain order.

I used Terraform to deploy two VMs on GCP. At first, each VM definition had just one network interface:

  network_interface {
    subnetwork = google_compute_subnetwork.vpc_common_subnetwork.self_link
    access_config {
      nat_ip = google_compute_address.vm_static_ip[count.index].address
    }
  }

With this setup, the two members could find each other on the cloud and form a cluster.

However, when I added a second network interface to both instances, they could not form a cluster. The definitions of the networks and VMs are as follows:


#COMMON NETWORK - SUBNETWORK - FIREWALL

resource "google_compute_network" "vpc_common_network" {
  name                    = "terraform-common-network"
  auto_create_subnetworks = false
}


resource "google_compute_subnetwork" "vpc_common_subnetwork" {
  name          = "terraform-common-subnetwork"
  ip_cidr_range = "10.0.10.0/24"
  region        = var.region
  network       = google_compute_network.vpc_common_network.id
}

# PER INSTANCE NETWORK

resource "google_compute_network" "vpc_network" {
  count                   = var.member_count
  name                    = "terraform-network-${count.index}"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "vpc_subnetwork" {
  count         = var.member_count
  name          = "terraform-subnetwork-${count.index}"
  ip_cidr_range = "10.0.${count.index + 1}.0/24"
  region        = var.region
  network       = google_compute_network.vpc_network[count.index].id
}


# PUBLIC IP PER INSTANCE

resource "google_compute_address" "vm_static_ip" {
  count = var.member_count
  name  = "terraform-static-ip${count.index}"
}

# INSTANCES
resource "google_compute_instance" "hazelcast_vm" {
  count                     = var.member_count
  name                      = "hazelcast-instance-${count.index}-test"
  machine_type              = "f1-micro"
  hostname                  = "hazelcast-instance-${count.index}-test.com"
  allow_stopping_for_update = "true"
  zone                      = var.zone
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-9"
    }
  }

  network_interface {
    subnetwork = google_compute_subnetwork.vpc_common_subnetwork.self_link
    access_config {
      nat_ip = google_compute_address.vm_static_ip[count.index].address
    }
  }

  network_interface {
    subnetwork = google_compute_subnetwork.vpc_subnetwork[count.index].self_link
  }

  service_account {
    scopes = ["cloud-platform"]
  }

  metadata = {
    ssh-keys = "${var.gce_ssh_user}:${file(var.gce_ssh_pub_key_file)}"
  }

}

In both VMs I ran the following command to start members:

CLASSPATH="${HOME}/jars/hazelcast-4.0.2.jar:${HOME}/jars/hazelcast-gcp-2.0.1.jar:${HOME}/hazelcast.yaml"
nohup java -cp ${CLASSPATH} -server com.hazelcast.core.server.HazelcastMemberStarter >> ${HOME}/logs/hazelcast.stderr.log 2>> ${HOME}/logs/hazelcast.stdout.log &

The hazelcast.yaml file is as follows:

hazelcast:
  network:
    join:
      multicast:
        enabled: false
      gcp:
        enabled: true

The logs of one of the members are as follows:

Aug 06, 2020 10:48:51 AM com.hazelcast.internal.config.AbstractConfigLocator
INFO: Loading 'hazelcast.yaml' from the working directory.
Aug 06, 2020 10:48:52 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.2] Prefer IPv4 stack is true, prefer IPv6 addresses is false
Aug 06, 2020 10:48:52 AM com.hazelcast.instance.AddressPicker
INFO: [LOCAL] [dev] [4.0.2] Picked [10.0.1.2]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
Aug 06, 2020 10:48:52 AM com.hazelcast.system
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Hazelcast 4.0.2 (20200702 - 2de3027) starting at [10.0.1.2]:5701
Aug 06, 2020 10:48:52 AM com.hazelcast.system
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Copyright (c) 2008-2020, Hazelcast, Inc. All Rights Reserved.
Aug 06, 2020 10:48:52 AM com.hazelcast.spi.impl.operationservice.impl.BackpressureRegulator
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Backpressure is disabled
Aug 06, 2020 10:48:54 AM com.hazelcast.instance.impl.Node
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Activating Discovery SPI Joiner
Aug 06, 2020 10:48:54 AM com.hazelcast.cp.CPSubsystem
WARNING: [10.0.1.2]:5701 [dev] [4.0.2] CP Subsystem is not enabled. CP data structures will operate in UNSAFE mode! Please note that UNSAFE mode will not provide strong consistency guarantees.
Aug 06, 2020 10:48:55 AM com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Starting 2 partition threads and 3 generic threads (1 dedicated for priority tasks)
Aug 06, 2020 10:48:55 AM com.hazelcast.internal.diagnostics.Diagnostics
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
Aug 06, 2020 10:48:55 AM com.hazelcast.core.LifecycleService
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5701 is STARTING
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.1.2:5703, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.1.2:5703. Reason: SocketException[Connection refused to address /10.0.1.2:5703]
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.1.2:5704, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.1.2:5704. Reason: SocketException[Connection refused to address /10.0.1.2:5704]
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5704 is added to the blacklist.
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5703 is added to the blacklist.
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.2.2:5702, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.2.2:5701, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.2.2:5704, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.1.2:5702, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.1.2:5702. Reason: SocketException[Connection refused to address /10.0.1.2:5702]
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5702 is added to the blacklist.
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.1.2:5705, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.1.2:5705. Reason: SocketException[Connection refused to address /10.0.1.2:5705]
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5705 is added to the blacklist.
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.1.2:5708, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.1.2:5708. Reason: SocketException[Connection refused to address /10.0.1.2:5708]
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5708 is added to the blacklist.
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.2.2:5707, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.2.2:5703, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.2.2:5706, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.1.2:5706, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.1.2:5706. Reason: SocketException[Connection refused to address /10.0.1.2:5706]
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.1.2:5707, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.1.2:5707. Reason: SocketException[Connection refused to address /10.0.1.2:5707]
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5707 is added to the blacklist.
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.2.2:5705, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Connecting to /10.0.2.2:5708, timeout: 10000, bind-any: true
Aug 06, 2020 10:48:56 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5706 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.2.2:5702. Reason: SocketTimeoutException[null]
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.2.2:5704. Reason: SocketTimeoutException[null]
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.2.2]:5702 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.2.2:5701. Reason: SocketTimeoutException[null]
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.2.2]:5704 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.2.2]:5701 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.2.2:5707. Reason: SocketTimeoutException[null]
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.2.2]:5707 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.2.2:5703. Reason: SocketTimeoutException[null]
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.2.2]:5703 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.2.2:5706. Reason: SocketTimeoutException[null]
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.2.2]:5706 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.2.2:5705. Reason: SocketTimeoutException[null]
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.2.2]:5705 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.nio.tcp.TcpIpConnector
INFO: [10.0.1.2]:5701 [dev] [4.0.2] Could not connect to: /10.0.2.2:5708. Reason: SocketTimeoutException[null]
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.impl.DiscoveryJoiner
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.2.2]:5708 is added to the blacklist.
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.cluster.ClusterService
INFO: [10.0.1.2]:5701 [dev] [4.0.2] 

Members {size:1, ver:1} [
        Member [10.0.1.2]:5701 - 559d6cdb-1937-4561-bb11-7465cd83bd39 this
]

Aug 06, 2020 10:49:06 AM com.hazelcast.core.LifecycleService
INFO: [10.0.1.2]:5701 [dev] [4.0.2] [10.0.1.2]:5701 is STARTED
Aug 06, 2020 10:49:06 AM com.hazelcast.internal.diagnostics.HealthMonitor
INFO: [10.0.1.2]:5701 [dev] [4.0.2] processors=1, physical.memory.total=592.5M, physical.memory.free=41.4M, swap.space.total=0, swap.space.free=0, heap.memory.used=14.4M, heap.memory.free=3.4M, heap.memory.total=17.9M, heap.memory.max=145.0M, heap.memory.used/total=80.30%, heap.memory.used/max=9.90%, minor.gc.count=56, minor.gc.time=75ms, major.gc.count=2, major.gc.time=42ms, load.process=0.00%, load.system=100.00%, load.systemAverage=0.63, thread.count=46, thread.peakCount=46, cluster.timeDiff=0, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.client.query.size=0, executor.q.client.blocking.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operations.size=0, executor.q.priorityOperation.size=0, operations.completed.count=1, executor.q.mapLoad.size=0, executor.q.mapLoadAllKeys.size=0, executor.q.cluster.size=0, executor.q.response.size=0, operations.running.count=0, operations.pending.invocations.percentage=0.00%, operations.pending.invocations.count=0, proxy.count=0, clientEndpoint.count=0, connection.active.count=0, client.connection.count=0, connection.count=0
Aug 06, 2020 10:51:26 AM com.hazelcast.internal.diagnostics.HealthMonitor
INFO: [10.0.1.2]:5701 [dev] [4.0.2] processors=1, physical.memory.total=592.5M, physical.memory.free=39.5M, swap.space.total=0, swap.space.free=0, heap.memory.used=14.6M, heap.memory.free=3.3M, heap.memory.total=17.9M, heap.memory.max=145.0M, heap.memory.used/total=81.48%, heap.memory.used/max=10.04%, minor.gc.count=57, minor.gc.time=78ms, major.gc.count=2, major.gc.time=42ms, load.process=0.00%, load.system=75.00%, load.systemAverage=0.06, thread.count=35, thread.peakCount=47, cluster.timeDiff=0, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.client.query.size=0, executor.q.client.blocking.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operations.size=0, executor.q.priorityOperation.size=0, operations.completed.count=1, executor.q.mapLoad.size=0, executor.q.mapLoadAllKeys.size=0, executor.q.cluster.size=0, executor.q.response.size=0, operations.running.count=0, operations.pending.invocations.percentage=0.00%, operations.pending.invocations.count=0, proxy.count=0, clientEndpoint.count=0, connection.active.count=0, client.connection.count=0, connection.count=0

Apparently the members use the subnetwork of the second interface (10.0.1.2 and 10.0.2.2 in the log) instead of the common subnetwork on the first interface, so they cannot reach each other and form the cluster.
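
To double-check this reading of the log, the addresses can be compared against the CIDR ranges from the Terraform definitions. The following is a minimal Java sketch using only the standard library. `SubnetCheck` and its methods are hypothetical helpers written for this illustration; they are not part of Hazelcast or the GCP plugin. It tests whether the address the member picked (10.0.1.2) and the address it tries to reach (10.0.2.2) fall inside the common subnetwork 10.0.10.0/24.

```java
// Hypothetical helper for illustration only; not part of Hazelcast or hazelcast-gcp.
class SubnetCheck {

    // Converts a dotted IPv4 literal such as "10.0.1.2" to a 32-bit integer.
    static int toInt(String ipv4) {
        String[] octets = ipv4.split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String octet : octets) {
            int o = Integer.parseInt(octet);
            if (o < 0 || o > 255) {
                throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
            }
            value = (value << 8) | o;
        }
        return value;
    }

    // Returns true if the address lies inside the CIDR block, e.g. "10.0.10.0/24".
    static boolean inCidr(String ipv4, String cidr) {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        // A prefix of 0 matches everything; shifting by 32 would be a no-op in Java.
        int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
        return (toInt(ipv4) & mask) == (toInt(parts[0]) & mask);
    }

    public static void main(String[] args) {
        String common = "10.0.10.0/24"; // ip_cidr_range of the common subnetwork
        for (String ip : new String[] {"10.0.1.2", "10.0.2.2"}) {
            System.out.println(ip + " in " + common + ": " + inCidr(ip, common));
        }
    }
}
```

Both checks print false. The addresses in the log come from the per-instance subnetworks (10.0.1.0/24 and 10.0.2.0/24), which sit in separate networks, and that is consistent with the connection timeouts to 10.0.2.2.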

Improve error messages when GCP Plugin is misconfigured

We should address the following scenarios with a message which is clear to users:

  • Insufficient GCP Permissions of VM Instance
  • Insufficient GCP Permissions of Service Account (when run with private-key-path)
  • No file in private-key-path
  • Client running outside GCP but not specifying any of the mandatory parameters (private-key-path, projects, zones)
  • Hazelcast member running outside GCP

In case of any "known error situation", we should print a comprehensive error message to the end user.
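
One possible shape for this is a single lookup from each known scenario to a user-facing message. The sketch below is only an illustration under assumptions: the class name, the enum constants, and the message wording are all hypothetical and do not reflect the plugin's actual code. The parameter names (private-key-path, projects, zones) are the ones listed above.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch; names and message texts are assumptions, not plugin code.
class GcpMisconfigurationMessages {

    // One constant per "known error situation" from the list above.
    enum Scenario {
        INSTANCE_PERMISSIONS,
        SERVICE_ACCOUNT_PERMISSIONS,
        PRIVATE_KEY_FILE_MISSING,
        CLIENT_OUTSIDE_GCP_MISSING_PARAMETERS,
        MEMBER_OUTSIDE_GCP
    }

    private static final Map<Scenario, String> HINTS = new EnumMap<>(Scenario.class);

    static {
        HINTS.put(Scenario.INSTANCE_PERMISSIONS,
                "The VM instance has insufficient GCP permissions.");
        HINTS.put(Scenario.SERVICE_ACCOUNT_PERMISSIONS,
                "The service account given by 'private-key-path' has insufficient GCP permissions.");
        HINTS.put(Scenario.PRIVATE_KEY_FILE_MISSING,
                "No file was found at the location given by 'private-key-path'.");
        HINTS.put(Scenario.CLIENT_OUTSIDE_GCP_MISSING_PARAMETERS,
                "A client running outside GCP must set 'private-key-path', 'projects' and 'zones'.");
        HINTS.put(Scenario.MEMBER_OUTSIDE_GCP,
                "The Hazelcast member does not appear to be running inside GCP.");
    }

    // Builds the full message shown to the user for a detected scenario.
    static String describe(Scenario scenario) {
        return "GCP discovery is misconfigured: " + HINTS.get(scenario);
    }

    public static void main(String[] args) {
        for (Scenario scenario : Scenario.values()) {
            System.out.println(describe(scenario));
        }
    }
}
```

Keeping the messages in one place would make it easy to check that every scenario in the list has a message.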

Create "region" property

Currently, there is no way to filter instances by region. You can only do it by zone.
We should:

  • Add a region parameter
  • Make discovery default to all zones in the current region (currently the default is discovery in the current zone only).
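
As a rough illustration of the second point, the region can be derived from a zone name, assuming the usual GCP naming in which a zone is its region plus a suffix (for example, zone us-east1-b in region us-east1). The Java sketch below is a hypothetical helper, not plugin code; `GcpZones`, `regionOf` and `zonesInRegion` are names invented for this example, and the list of candidate zones is passed in rather than fetched from GCP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes zone names have the form "<region>-<suffix>".
class GcpZones {

    // Derives the region from a zone name by dropping the last "-<suffix>" part.
    static String regionOf(String zone) {
        int lastDash = zone.lastIndexOf('-');
        if (lastDash <= 0 || lastDash == zone.length() - 1) {
            throw new IllegalArgumentException("Unexpected zone name: " + zone);
        }
        return zone.substring(0, lastDash);
    }

    // Keeps only the zones that belong to the given region, preserving order.
    static List<String> zonesInRegion(String region, List<String> candidateZones) {
        List<String> result = new ArrayList<>();
        for (String zone : candidateZones) {
            if (regionOf(zone).equals(region)) {
                result.add(zone);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> zones = Arrays.asList("us-east1-b", "us-east1-c", "europe-west1-b");
        String currentRegion = regionOf("us-east1-b");
        System.out.println(currentRegion + " -> " + zonesInRegion(currentRegion, zones));
    }
}
```

With such a helper, the default could be computed from the zone the member runs in, and an explicit region parameter would simply replace the derived value.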

Is "region" property in configuration deprecated?

Hi!

I am using the GCP discovery plugin and I am trying to build a "region-aware" cluster.

I am getting an error stating "Unknown properties: '[region]' on discovery strategy" when I set the region property.

I then debugged the code and saw that region is NOT in fact a valid setting. However the documentation says it is.

factory.getConfigurationProperties() = {ArrayList@9631} size = 5
  0 = {SimplePropertyDefinition@9633}
    key = "private-key-path"
    optional = true
    typeConverter = {PropertyTypeConverter$1@9639} "STRING"
    validator = null
  1 = {SimplePropertyDefinition@9634}
    key = "projects"
    optional = true
    typeConverter = {PropertyTypeConverter$1@9639} "STRING"
    validator = null
  2 = {SimplePropertyDefinition@9635}
    key = "zones"
    optional = true
    typeConverter = {PropertyTypeConverter$1@9639} "STRING"
    validator = null
  3 = {SimplePropertyDefinition@9636}
    key = "label"
    optional = true
    typeConverter = {PropertyTypeConverter$1@9639} "STRING"
    validator = null
  4 = {SimplePropertyDefinition@9637}
    key = "hz-port"
    optional = true
    typeConverter = {PropertyTypeConverter$1@9639} "STRING"
    validator = null

It seems this setting is no longer needed, since leaving out both the region and zones properties already means "the region you are in" and "all zones", but...

The documentation and/or the code seem to be in error.

Thank you!
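
For reference, the reported behaviour can be reproduced in isolation with a small check of configured keys against the five property keys from the debugger output above. This is a minimal Java sketch under assumptions: `DiscoveryPropertyCheck` and `unknownKeys` are hypothetical names, and this is not Hazelcast's actual validation code. Only the key names and the wording of the error message are taken from this report.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; not the validation code used by Hazelcast.
class DiscoveryPropertyCheck {

    // The five keys seen in the debugger output of getConfigurationProperties().
    static final Set<String> KNOWN_KEYS = Collections.unmodifiableSet(new HashSet<>(
            Arrays.asList("private-key-path", "projects", "zones", "label", "hz-port")));

    // Returns the configured keys that are not in KNOWN_KEYS, in sorted order.
    static Set<String> unknownKeys(Map<String, String> configured) {
        Set<String> unknown = new TreeSet<>(configured.keySet());
        unknown.removeAll(KNOWN_KEYS);
        return unknown;
    }

    public static void main(String[] args) {
        Map<String, String> configured = new HashMap<>();
        configured.put("zones", "us-east1-b");
        configured.put("region", "us-east1");

        Set<String> unknown = unknownKeys(configured);
        if (!unknown.isEmpty()) {
            // Mirrors the wording of the error quoted in this report.
            System.out.println("Unknown properties: '" + unknown + "' on discovery strategy");
        }
    }
}
```

Running it prints "Unknown properties: '[region]' on discovery strategy", which matches the error above: region is rejected simply because it is not among the defined keys.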
