weaviate / weaviate-helm

Helm charts to deploy Weaviate to k8s

Home Page: https://weaviate.io/developers/weaviate/current/

License: BSD 3-Clause "New" or "Revised" License

Languages: Shell 78.96%, Smarty 21.04%

weaviate-helm's Introduction

Weaviate Helm Chart

Helm chart for the Weaviate application. Weaviate can be deployed to a Kubernetes cluster using this chart.

Usage

Helm must be installed in order to use the weaviate chart. Please refer to Helm's documentation on how to get started.

Once Helm is set up properly, add the repo as follows:

helm repo add weaviate https://weaviate.github.io/weaviate-helm
helm install my-weaviate weaviate/weaviate
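
For example, to install a pinned Weaviate version into its own namespace (the version shown is illustrative; image.tag is the same chart value used in the deploy commands quoted later in this document):

helm install my-weaviate weaviate/weaviate \
  --namespace weaviate \
  --create-namespace \
  --set image.tag=1.17.2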

Documentation can be found here.

Migration from older versions to v1.25.x and above

Weaviate v1.25 has brought a significant change in how we bootstrap the Weaviate cluster. We have changed the podManagementPolicy from OrderedReady to Parallel. This change is required for the Raft-based consensus model that Weaviate now utilizes under the hood. For the Raft cluster to be properly bootstrapped, all pods in the cluster must start simultaneously.

Please note that once the Raft cluster is established, rolling updates are possible. This change will only take effect during migration from versions prior to v1.25 (or when bootstrapping a new v1.25 cluster).

If you are upgrading from a version older than v1.25 to v1.25 or above, you must first delete Weaviate's StatefulSet. This is a one-time operation that will not remove your data; it is necessary to make the update of the StatefulSet settings possible.
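
A minimal sketch of that one-time step, assuming the default release name weaviate in the weaviate namespace (adjust both to your setup; the PVCs, and therefore your data, remain in place):

kubectl delete statefulset weaviate -n weaviate

After the StatefulSet is gone, run your usual helm upgrade to roll out the new chart.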

Detailed information can be found in the documentation.

(for contributors) How to make new releases

  1. Bump chart version in ./weaviate/Chart.yaml
  2. Create a commit
  3. Create an annotated tag matching the version number in Chart.yaml (prefix with a v, such as v1.4.3)
  4. Push commit with git push
  5. Push tag with git push origin --tags
  6. Wait for the GH Action to complete; it will create a draft release with the packaged chart attached
  7. Edit the draft to include useful release notes and publish when appropriate
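
As a concrete sketch of steps 2-5 (the version number is an example):

git commit -am "Bump chart version to 1.4.3"
git tag -a v1.4.3 -m "v1.4.3"
git push
git push origin --tags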

weaviate-helm's People

Contributors

ahsanemon, alembiewski, aliszka, andrewisplinghoff, antas-marcin, bobvanluijt, cdpierse, dirkkul, donomii, dvanderrijst, etiennedi, fefi42, goodgravy, hkhairy, jgoldin-skillz, kcm, kristofvc, kvbutler, laura-ham, lewiky, matthew-graves, parkerduckworth, rbtz-openai, redouan-rhazouani, samos123, stefanbogdan, trengrj, vukor, zoltan-fedor


weaviate-helm's Issues

Adapt to Weaviate standalone

Required Changes

  • Previously the weaviate container was stateless and did not require local persistence. With standalone requiring disk access, weaviate can no longer be a Deployment but needs to be a StatefulSet (see the sketch after this list)
  • Remove esvector
  • Remove etcd (unless still required by the contextionary?)
  • Possibly update contextionary requirements if necessary. Unknown as of today, since this depends on the outcome of semitechnologies/weaviate#1252
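
A minimal sketch of the StatefulSet shape the first item implies (all names, the image tag, mount path, and storage size are illustrative, not the chart's final template):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: weaviate
spec:
  serviceName: weaviate-headless
  replicas: 1
  selector:
    matchLabels:
      app: weaviate
  template:
    metadata:
      labels:
        app: weaviate
    spec:
      containers:
        - name: weaviate
          image: semitechnologies/weaviate:latest
          volumeMounts:
            - name: weaviate-data
              mountPath: /var/lib/weaviate
  # each replica gets its own PVC, giving the local persistence standalone needs
  volumeClaimTemplates:
    - metadata:
        name: weaviate-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 32Gi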

CrashLoopBackOff etcd-0 kubernetes docker for windows

I am discovering your product and tried to deploy Weaviate on my local Kubernetes cluster that comes with Docker for Windows.

The deployment using Helm was successful.

D:\repos\weaviate\weaviate-helm\weaviate>helm install --values .\values-minimal.yaml --namespace "weaviate" "weaviate" weaviate-8.2.1.tgz
NAME: weaviate
LAST DEPLOYED: Tue Jan 21 08:45:25 2020
NAMESPACE: weaviate
STATUS: deployed
REVISION: 1

D:\repos\weaviate\weaviate-helm\weaviate>kubectl get deployment --namespace weaviate
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
contextionary   1/1     1            1           21s
weaviate        0/1     1            0           21s

But etcd-0 seems to have a hard time.

D:\repos\weaviate\weaviate-helm\weaviate>kubectl get pods --namespace weaviate
NAME                            READY   STATUS             RESTARTS   AGE
contextionary-fd94d5bb4-vt9tp   1/1     Running            0          39s
esvector-master-0               0/1     Running            1          39s
etcd-0                          0/1     CrashLoopBackOff   2          39s
weaviate-7d5b6fb5c4-pkrqb       0/1     Running            0          39s

The log contains:

==> Creating data dir...
mkdir: cannot create directory '/bitnami/etcd/data': Permission denied
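
The error indicates the etcd container lacks write permission on its data volume. If the packaged Bitnami etcd subchart exposes Bitnami's usual volumePermissions toggle (an assumption about the subchart version in use), a workaround could be to enable the init container that fixes ownership of the data dir:

etcd:
  volumePermissions:
    enabled: true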

Make all images overwritable

The subcharts already allow this.

Our custom charts don't.

Todos

  • Make weaviate image overwritable
  • Make contextionary image overwritable

Background

This is required for GCP Marketplace integration where every image needs to be pushed to a Google registry.
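
The desired end state could mirror the image block the weaviate chart already uses (the GCP registry value here is illustrative for a Marketplace push):

image:
  registry: gcr.io/my-marketplace-project
  repo: weaviate
  tag: 1.2.3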

Set resource requests/limits for weaviate and docker container

At the moment, we can only set resource requests and limits for the esvector and etcd containers, but not for the other two. This should be fixed.

Todos

  • add to templates
  • add appropriate defaults in values.yaml
  • add appropriate defaults in values-minimal.yaml
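
For example, using the same resources shape the subcharts already accept (values are illustrative):

resources:
  requests:
    cpu: '500m'
    memory: '300Mi'
  limits:
    cpu: '1000m'
    memory: '1Gi'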

Bug: DNS names hard-coded, not based on chart name

All DNS discovery names appear to be hardcoded, but the services are named dynamically based on the Helm release name. This means that unless the user chooses to name the release exactly weaviate, all service discovery breaks.

Vectorizer deployments can become unhealthy under heavy load due to liveness probe timeout being reached

I was running the transformer vectorizer on CPU and sending in heavy traffic, at times maxing out its allotted CPU resources. I observed these transformer vectorizer pods failing and restarting, causing errors to be returned by the Weaviate app.

After investigation it became clear that these pods become unhealthy because the liveness probe times out 3 times in a row, which triggers the pod being marked as unhealthy and restarted by K8s.

The issue is that the vectorizer deployments use the default timeout (1s) for the liveness probe, which is low; under heavy load the liveness probes might hit this limit, causing the pod to be restarted. This 1s timeout could be increased to 3s to decrease the chance of a restart under heavy load.
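
In probe terms, the proposed change is just a larger timeoutSeconds (the endpoint and port here are illustrative; only the timeout is the point):

livenessProbe:
  httpGet:
    path: /.well-known/live
    port: 8080
  timeoutSeconds: 3  # raised from the 1s default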

Add custom pvc for etcd disaster recovery

To support other NFS provisioners like Amazon's EFS, some additional properties must be set in the PVC, e.g. annotations. This is not supported by the etcd chart. However, it allows referencing a custom PVC, which can be created as part of this Helm chart.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: etcd-snapshot-disaster-recovery
  namespace: weaviate
  annotations:
    volume.beta.kubernetes.io/storage-class: aws-efs
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 8Gi
  storageClassName: aws-efs

etcd.disasterRecovery.pvc.existingClaim: etcd-snapshot-disaster-recovery

Set NVIDIA_DRIVER_CAPABILITIES env var for CUDA

If NVIDIA_DRIVER_CAPABILITIES is not set alongside the currently utilized NVIDIA_VISIBLE_DEVICES=all, the NVIDIA runtime seems not to pass the GPU through properly.

The documentation states the default value is "compute, utility", but in my testing, not setting this explicitly causes nvidia-smi to report no CUDA version available, while setting it explicitly makes nvidia-smi report an available CUDA version (based on the host's).
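
A sketch of the proposed container environment (the capabilities value matches NVIDIA's documented default):

env:
  - name: NVIDIA_VISIBLE_DEVICES
    value: all
  - name: NVIDIA_DRIVER_CAPABILITIES
    value: compute,utility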

Make /weaviate-config/conf.yaml configurable in Helm chart

Issue

Currently the information is hard-coded in the template file for the config map.

Consequence

This means the user cannot set any config values in the k8s setup that deviate from the default configuration.

Fix

Instead, the contents should be configurable in values.yaml, similar to other configuration.
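
A hypothetical values.yaml shape for this (key names are illustrative; the example values.yml quoted in the next issue shows the direction the chart eventually took, with authentication, authorization, and query_defaults exposed as top-level values):

authentication:
  anonymous_access:
    enabled: true
query_defaults:
  limit: 100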

Changing resource.request.cpu doesn't actually do anything

oops nvm seems I commented out the whole line

example values.yml:

image:
  # registry where weaviate image is stored
  registry: docker.io
  # Tag of weaviate image to deploy
  # Note: We strongly recommend you overwrite this value in your own values.yaml.
  # Otherwise a mere upgrade of the chart could lead to an unexpected upgrade
  # of weaviate. In accordance with Infra-as-code, you should pin this value
  # down and only change it if you explicitly want to upgrade the Weaviate
  # version.
  # TODO change to proper v1.14 version after weaviate release
  tag: latest@sha256:6089441e49cf24a0bd453d8609621bafebf3b292e989ef35e6cec5028f61ece8
  repo: semitechnologies/weaviate

# overwrite command and args if you want to run specific startup scripts, for
# example setting the nofile limit
command: ["/bin/weaviate"]
args:
  - '--host'
  - '0.0.0.0'
  - '--port'
  - '8080'
  - '--scheme'
  - 'http'
  - '--config-file'
  - '/weaviate-config/conf.yaml'

# below is an example that can be used to set an arbitrary nofile limit at
# startup:
#
# command: 
#   - "/bin/sh"
# args: 
#   - "-c"
#   - "ulimit -n 65535 && /bin/weaviate --host 0.0.0.0 --port 8080 --scheme http --config-file /weaviate-config/conf.yaml"

# Scale replicas of Weaviate. Note that as of v1.8.0 dynamic scaling is limited
# to cases where no data is imported yet. Scaling down after importing data may
# break usability. Full dynamic scalability will be added in a future release.
replicas: 1
resources: {}
  # requests:
  #   cpu: '500m'
  #   memory: '300Mi'
  # limits:
  #   cpu: '1000m'
  #   memory: '1Gi'

# The Persistent Volume Claim settings for Weaviate. If there's a
# storage.fullnameOverride field set, then the default pvc will not be
# created; instead, the one defined in fullnameOverride will be used
storage:
  size: 32Gi
  storageClassName: gp2

# The service controls how weaviate is exposed to the outside world. If you
# don't want a public load balancer, you can also choose 'ClusterIP' to make
# weaviate only accessible within your cluster.
service:
  name: weaviate
  type: LoadBalancer
  loadBalancerSourceRanges: []
  # optionally set cluster IP if you want to set a static IP
  clusterIP:
  annotations: {}

# Adjust liveness, readiness and startup probes configuration
startupProbe:
  # For Kubernetes versions prior to 1.18, startupProbe is not supported and thus can be disabled.
  enabled: false

  initialDelaySeconds: 300
  periodSeconds: 60
  failureThreshold: 50
  successThreshold: 1
  timeoutSeconds: 3

livenessProbe:
  initialDelaySeconds: 900
  periodSeconds: 10
  failureThreshold: 30
  successThreshold: 1
  timeoutSeconds: 3

readinessProbe:
  initialDelaySeconds: 3
  periodSeconds: 10
  failureThreshold: 3
  successThreshold: 1
  timeoutSeconds: 3


terminationGracePeriodSeconds: 600

# Weaviate Config
#
# The following settings allow you to customize Weaviate to your needs, for
# example set authentication and authorization options. See weaviate docs
# (https://www.semi.technology/documentation/weaviate/current/) for all
# configuration.
authentication:
  anonymous_access:
    enabled: true
authorization:
  admin_list:
    enabled: false
query_defaults:
  limit: 100
debug: false


# Insert any custom environment variables or envSecrets by putting the exact name
# and desired value into the settings below. Any env name passed will be automatically
# set for the statefulSet.
env:
  # The aggressiveness of the Go Garbage Collector. 100 is the default value.
  GOGC: 100

  # Expose metrics on port 2112 for Prometheus to scrape
  PROMETHEUS_MONITORING_ENABLED: false

envSecrets:


# Configure backup providers
backups:
  # The backup-filesystem module enables creation of the DB backups in
  # the local filesystem
  filesystem:
    enabled: false
    envconfig:
      # Configure folder where backups should be saved
      BACKUP_FILESYSTEM_PATH: /tmp/backups
  
  s3:
    enabled: false
    # If one is using AWS EKS and has already configured a K8s Service Account
    # that holds the AWS credentials, one can pass the name of that service account
    # here using this setting:
    # serviceAccountName: service-account-name
    envconfig:
      # Configure the bucket where backups should be saved; this setting is mandatory
      BACKUP_S3_BUCKET: weaviate-backups

      # Optional setting. Defaults to empty string. 
      # Set this option if you want to save backups to a given location
      # inside the bucket
      # BACKUP_S3_PATH: path/inside/bucket

      # Optional setting. Defaults to AWS S3 (s3.amazonaws.com). 
      # Set this option if you have a MinIO storage configured in your environment
      # and want to use it instead of the AWS S3.
      # BACKUP_S3_ENDPOINT: custom.minio.endpoint.address

      # Optional setting. Defaults to true. 
      # Set this option if you don't want to use SSL.
      # BACKUP_S3_USE_SSL: true

      # You can pass environment AWS settings here:
      # Define the region
      # AWS_REGION: eu-west-1

    # If one uses an access key and a secret key to authorize in AWS, one can set them using secrets
    secrets: {}
    #   AWS_ACCESS_KEY_ID: access-key
    #   AWS_SECRET_ACCESS_KEY: secret-key

    # If one has already defined secrets with AWS credentials one can pass them using
    # this setting:
    envSecrets: {}
    #   AWS_ACCESS_KEY_ID: name-of-the-k8s-secret-containing-that-key

  gcs:
    enabled: false
    envconfig:
      # Configure the bucket where backups should be saved; this setting is mandatory
      BACKUP_GCS_BUCKET: weaviate-backups

      # Optional setting. Defaults to empty string.
      # Set this option if you want to save backups to a given location
      # inside the bucket
      # BACKUP_GCS_PATH: path/inside/bucket

      # You can pass environment Google settings here:
      # Define the project
      # GOOGLE_CLOUD_PROJECT: project-id

    # In order to pass GOOGLE_APPLICATION_CREDENTIALS one can do this using secrets
    secrets: {}
    #   GOOGLE_APPLICATION_CREDENTIALS: credentials-json-string

    # If one has already defined a secret with GOOGLE_APPLICATION_CREDENTIALS one can pass them using
    # this setting:
    envSecrets: {}
    #   GOOGLE_APPLICATION_CREDENTIALS: name-of-the-k8s-secret-containing-that-key


# Modules are extensions to Weaviate; they can be used to support various
# ML models, but also other features unrelated to model inference.
# An inference/vectorizer module is not required; you can also run without any
# modules and import your own vectors.
modules:

  # The text2vec-contextionary module uses a fastText-based vector-space to
  # derive vector embeddings for your objects. It is very efficient on CPUs,
  # but in some situations it cannot reach the same level of accuracy as
  # transformers-based models.
  text2vec-contextionary:
    # disable if you want to use transformers or import your own vectors
    enabled: false

    # The configuration below is ignored if enabled==false
    fullnameOverride: contextionary
    tag: en0.16.0-v1.0.2
    repo: semitechnologies/contextionary
    registry: docker.io
    replicas: 1
    envconfig:
      occurrence_weight_linear_factor: 0.75
      neighbor_occurrence_ignore_percentile: 5
      enable_compound_splitting: false
      extensions_storage_mode: weaviate
    resources:
      requests:
        cpu: '600m'
        memory: '500Mi'
      limits:
        cpu: '1000m'
        memory: '5000Mi'

    # You can guide where the pods are scheduled on a per-module basis,
    # as well as for Weaviate overall. Each module accepts nodeSelector,
    # tolerations, and affinity configuration. If it is set on a per-
    # module basis, this configuration overrides the global config.

    nodeSelector: {}

    tolerations:
      - key: "my-example-key"
        operator: "Exists"
        effect: "NoSchedule"

    affinity: {}

  # The text2vec-transformers module uses neural networks, such as BERT,
  # DistilBERT, etc. to dynamically compute vector embeddings based on the
  # sentence's context. It is very slow on CPUs and should run with
  # CUDA-enabled GPUs for optimal performance.
  text2vec-transformers:

    # enable if you want to use transformers instead of the
    # text2vec-contextionary module
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    # The configuration below is ignored if enabled==false

    # replace with model of choice, see
    # https://www.semi.technology/developers/weaviate/current/modules/text2vec-transformers.html
    # for all supported models or build your own container.
    tag: distilbert-base-uncased
    repo: semitechnologies/transformers-inference
    registry: docker.io
    replicas: 1
    fullnameOverride: transformers-inference
    probeInitialDelaySeconds: 120
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

    passageQueryServices:
      passage:
        enabled: false
        # You can set the inference URL of this module directly, without deploying it with this release.
        # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
        inferenceUrl: {}

        tag: facebook-dpr-ctx_encoder-single-nq-base
        repo: semitechnologies/transformers-inference
        registry: docker.io
        replicas: 1
        fullnameOverride: transformers-inference-passage
        envconfig:
          # enable for CUDA support. Your K8s cluster needs to be configured
          # accordingly and you need to explicitly set GPU requests & limits below
          enable_cuda: false

          # only used when cuda is enabled
          nvidia_visible_devices: all

          # only used when cuda is enabled
          ld_library_path: /usr/local/nvidia/lib64

        resources:
          requests:
            cpu: '1000m'
            memory: '3000Mi'

            # enable if running with CUDA support
            # nvidia.com/gpu: 1
          limits:
            cpu: '1000m'
            memory: '5000Mi'

            # enable if running with CUDA support
            # nvidia.com/gpu: 1
      query:
        enabled: false
        # You can set the inference URL of this module directly, without deploying it with this release.
        # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
        inferenceUrl: {}

        tag: facebook-dpr-question_encoder-single-nq-base
        repo: semitechnologies/transformers-inference
        registry: docker.io
        replicas: 1
        fullnameOverride: transformers-inference-query
        envconfig:
          # enable for CUDA support. Your K8s cluster needs to be configured
          # accordingly and you need to explicitly set GPU requests & limits below
          enable_cuda: false

          # only used when cuda is enabled
          nvidia_visible_devices: all

          # only used when cuda is enabled
          ld_library_path: /usr/local/nvidia/lib64

        resources:
          requests:
            cpu: '1000m'
            memory: '3000Mi'

            # enable if running with CUDA support
            # nvidia.com/gpu: 1
          limits:
            cpu: '1000m'
            memory: '5000Mi'

            # enable if running with CUDA support
            # nvidia.com/gpu: 1

  # The text2vec-openai module uses OpenAI Embeddings API
  # to dynamically compute vector embeddings based on the
  # sentence's context.
  # More information about OpenAI Embeddings API can be found here:
  # https://beta.openai.com/docs/guides/embeddings/what-are-embeddings
  text2vec-openai:

    # enable if you want to use OpenAI module
    enabled: false

    # Set your OpenAI API Key to be passed to Weaviate pod as
    # an environment variable
    apiKey: ''

  # The text2vec-huggingface module uses HuggingFace API
  # to dynamically compute vector embeddings based on the
  # sentence's context.
  # More information about HuggingFace API can be found here:
  # https://huggingface.co/docs/api-inference/detailed_parameters#feature-extraction-task
  text2vec-huggingface:

    # enable if you want to use HuggingFace module
    enabled: false

    # Set your HuggingFace API Key to be passed to Weaviate pod as
    # an environment variable
    apiKey: ''

  # The multi2vec-clip module uses CLIP transformers to vectorize both images
  # and text in the same vector space. It is typically slow(er) on CPUs and should
  # run with CUDA-enabled GPUs for optimal performance.
  multi2vec-clip:

    # enable if you want to use the multi2vec-clip module instead of the
    # text2vec-contextionary module
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}

    # The configuration below is ignored if enabled==false

    # replace with model of choice, see
    # https://www.semi.technology/developers/weaviate/current/modules/multi2vec-clip.html
    # for all supported models or build your own container.
    tag: sentence-transformers-clip-ViT-B-32-multilingual-v1
    repo: semitechnologies/multi2vec-clip
    registry: docker.io
    replicas: 1
    fullnameOverride: clip-inference
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The qna-transformers module uses neural networks, such as BERT or
  # DistilBERT, to find an answer in a given text to a given question
  qna-transformers:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: bert-large-uncased-whole-word-masking-finetuned-squad-34d66b1
    repo: semitechnologies/qna-transformers
    registry: docker.io
    replicas: 1
    fullnameOverride: qna-transformers
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The img2vec-neural module uses neural networks to generate
  # a vector representation of the image
  img2vec-neural:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: resnet50
    repo: semitechnologies/img2vec-pytorch
    registry: docker.io
    replicas: 1
    fullnameOverride: img2vec-neural
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The text-spellcheck module uses a spellchecker library to check for
  # misspellings in a given text
  text-spellcheck:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: pyspellchecker-en
    repo: semitechnologies/text-spellcheck-model
    registry: docker.io
    replicas: 1
    fullnameOverride: text-spellcheck
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The ner-transformers module uses transformer models to extract
  # named entities from a given text
  ner-transformers:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: dbmdz-bert-large-cased-finetuned-conll03-english-0.0.2
    repo: semitechnologies/ner-transformers
    registry: docker.io
    replicas: 1
    fullnameOverride: ner-transformers
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The sum-transformers module generates summaries of result texts
  sum-transformers:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: facebook-bart-large-cnn-1.0.0
    repo: semitechnologies/sum-transformers
    registry: docker.io
    replicas: 1
    fullnameOverride: sum-transformers
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # by choosing the default vectorizer module, you can tell Weaviate to always
  # use this module as the vectorizer if nothing else is specified. Can be
  # overwritten on a per-class basis.
  # set to text2vec-transformers if running with transformers instead
  default_vectorizer_module: none

# It is also possible to configure authentication and authorization through a
# custom configmap. The authorization and authentication values defined in
# values.yaml will be ignored when defining a custom config map.
custom_config_map:
  enabled: false
  name: 'custom-config'

# The collector proxy collects metadata about the requests. It deploys a
# second service that, if used, will capture metadata about the incoming
# requests. The collected data may be used to optimize the software or detect
# malicious attacks.
collector_proxy:
  enabled: false
  tag: latest
  weaviate_enterprise_usage_collector_origin: ''
  weaviate_enterprise_token: ''
  weaviate_enterprise_project: ''
  service:
    name: 'usage-proxy'
    port: 80
    type: LoadBalancer
    annotations: {}

# Pass any annotations to Weaviate pods
annotations: {}

nodeSelector: {}

tolerations: []

affinity: {}

However, the pod spec doesn't have resource.request.cpu set:

k get pod weaviate-0 -o yaml | grep resources -C 5 
        scheme: HTTP
      initialDelaySeconds: 3
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 3
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /weaviate-config
      name: weaviate-config
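
The resolution noted at the top of this issue (the relevant lines were commented out) means the fix is simply to uncomment them; with the block restored, the chart renders the expected requests and limits (values as in the example above):

resources:
  requests:
    cpu: '500m'
    memory: '300Mi'
  limits:
    cpu: '1000m'
    memory: '1Gi'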

Support transformers module

Goals

  • when the contextionary module is enabled: false and no other module is enabled, weaviate is deployed without any module
  • when the contextionary module is enabled: false and the transformers module is enabled: true, weaviate is deployed with the transformers module, but without the contextionary module
  • the transformers inference container can be configured (by replacing the docker image)
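
In values.yaml terms, the second goal could look like this (keys as in the example values.yml quoted in the previous issue):

modules:
  text2vec-contextionary:
    enabled: false
  text2vec-transformers:
    enabled: true
    registry: docker.io
    repo: semitechnologies/transformers-inference
    tag: distilbert-base-uncased
  default_vectorizer_module: text2vec-transformers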

Unable to install multiple Weaviate clusters in same namespace

The templates hardcode a resource name instead of relying on the Helm release name. So when you try to deploy another Weaviate cluster, it fails because there is already a StatefulSet named weaviate.

I propose to instead use this format for all resources:

kind: StatefulSet
metadata:
  name: "{{ .Release.Name }}-weaviate"
  labels:
    name: "{{ .Release.Name }}-weaviate"
    app: "{{ .Release.Name }}-weaviate"
    app.kubernetes.io/name: "{{ .Release.Name }}-weaviate"
    app.kubernetes.io/managed-by: helm
    app.kubernetes.io/component: database
    app.kubernetes.io/version: "{{ .Values.image.tag }}"

Note this feature will only be considered if there is end-user need for this. Please upvote this issue if this would be needed for your use case and leave a comment with your use case.

Publish the helm chart to a helm repo

It would be nice to have the Helm chart published here: https://artifacthub.io/packages/search?kind=0

Then getting new versions would be easier, and users would always get the latest version. Right now the docs tell us to do this:

# Set the Weaviate chart version
export CHART_VERSION="v15.0.0"
# Download the Weaviate Helm chart
wget https://github.com/weaviate/weaviate-helm/releases/download/$CHART_VERSION/weaviate.tgz
# Download an example values.yml (with the default configuration)
wget https://raw.githubusercontent.com/weaviate/weaviate-helm/$CHART_VERSION/weaviate/values.yaml
# Create a Weaviate namespace
kubectl create namespace weaviate

# set the desired Weaviate version
export WEAVIATE_VERSION="1.17.2"

# Deploy
helm upgrade \
  "weaviate" \
  weaviate.tgz \
  --install \
  --namespace "weaviate" \
  --values ./values.yaml \
  --set "image.tag=$WEAVIATE_VERSION"

Issues with this approach:

  • The docs are always outdated since right now v15.4.0 would be the latest version
  • more steps
  • People have to download the chart locally and sometimes seem to get the location incorrect (saw that on slack once)

What would be nicer instead:

helm repo add weaviate https://weaviate.github.io/helm-charts
helm install my-weaviate-stack weaviate/weaviate

^^ which would automatically use the latest published chart

Helm chart incompatible with K8s version >= 1.16

I was following the documentation and got stuck on:

Deploy

$ helm upgrade \
  --values ./values.yaml \
  --install \
  --namespace "weaviate" \
  "weaviate" \
  weaviate.tgz

Here is what I get:
$ helm upgrade --values ./values.yaml --install --namespace "weaviate" "weaviate" weaviate.tgz
Release "weaviate" does not exist. Installing it now.
Error: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "Deployment" in version "apps/v1beta1", unable to recognize "": no matches for kind "StatefulSet" in version "apps/v1beta1", unable to recognize "": no matches for kind "StatefulSet" in version "apps/v1beta2"]

My setup:
$ helm version
version.BuildInfo{Version:"v3.0.1", GitCommit:"7c22ef9ce89e0ebeb7125ba2ebf7d421f3e82ffa", GitTreeState:"clean", GoVersion:"go1.13.4"}
$ kubectl -n kube-system get pods
NAME                               READY   STATUS    RESTARTS   AGE
coredns-6955765f44-9vx7p           1/1     Running   0          2d17h
coredns-6955765f44-g67tv           1/1     Running   0          2d17h
etcd-minikube                      1/1     Running   0          2d17h
kube-addon-manager-minikube        1/1     Running   0          2d17h
kube-apiserver-minikube            1/1     Running   0          2d17h
kube-controller-manager-minikube   1/1     Running   0          2d17h
kube-proxy-mfdkp                   1/1     Running   0          2d17h
kube-scheduler-minikube            1/1     Running   1          2d17h
storage-provisioner                1/1     Running   0          2d17h
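
The error means the manifests reference API versions that were removed in Kubernetes 1.16 (apps/v1beta1 and apps/v1beta2); the fix is to move the affected templates to the GA API group:

apiVersion: apps/v1
kind: StatefulSet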
