weaviate / weaviate-helm

Helm charts to deploy Weaviate to k8s

Home Page: https://weaviate.io/developers/weaviate/current/

License: BSD 3-Clause "New" or "Revised" License

Languages: Shell 78.96%, Smarty 21.04%

weaviate-helm's Introduction

Weaviate Helm Chart

Helm chart for the Weaviate application. Weaviate can be deployed to a Kubernetes cluster using this chart.

Usage

Helm must be installed in order to use the weaviate chart. Please refer to Helm's documentation on how to get started.

Once Helm is set up properly, add the repo as follows:

helm repo add weaviate https://weaviate.github.io/weaviate-helm
helm install my-weaviate weaviate/weaviate
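
For example, to install a pinned Weaviate version into its own namespace (the version shown is illustrative; image.tag is the same chart value used in the deploy commands quoted later in this document):

helm install my-weaviate weaviate/weaviate \
  --namespace weaviate \
  --create-namespace \
  --set image.tag=1.17.2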

Documentation can be found here.

Migration from older versions to v1.25.x and above

Weaviate v1.25 has brought a significant change in how we bootstrap the Weaviate cluster. We have changed the podManagementPolicy from OrderedReady to Parallel. This change is required for the Raft-based consensus model that Weaviate now utilizes under the hood. For the Raft cluster to be properly bootstrapped, all pods in the cluster must start simultaneously.

Please note that once the Raft cluster is established, rolling updates are possible. This change will only take effect during migration from versions prior to v1.25 (or when bootstrapping a new v1.25 cluster).

If you are upgrading from a version older than v1.25 to v1.25 or above, you must first delete Weaviate's StatefulSet. This is a one-time operation that will not remove your data; it is necessary to make the update of the StatefulSet settings possible.
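
A minimal sketch of that one-time step, assuming the default release name weaviate in the weaviate namespace (adjust both to your setup; the PVCs, and therefore your data, remain in place):

kubectl delete statefulset weaviate -n weaviate

After the StatefulSet is gone, run your usual helm upgrade to roll out the new chart.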

Detailed information can be found in the documentation.

(for contributors) How to make new releases

  1. Bump chart version in ./weaviate/Chart.yaml
  2. Create a commit
  3. Create an annotated tag matching the version number in Chart.yaml (prefix with a v, such as v1.4.3)
  4. Push commit with git push
  5. Push tag with git push origin --tags
  6. Wait for the GH Action to complete; it will create a draft release with the packaged chart attached
  7. Edit the draft to include useful release notes and publish when appropriate
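
As a concrete sketch of steps 2-5 (the version number is an example):

git commit -am "Bump chart version to 1.4.3"
git tag -a v1.4.3 -m "v1.4.3"
git push
git push origin --tags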

weaviate-helm's People

Contributors

ahsanemon, alembiewski, aliszka, andrewisplinghoff, antas-marcin, bobvanluijt, cdpierse, dirkkul, donomii, dvanderrijst, etiennedi, fefi42, goodgravy, hkhairy, jgoldin-skillz, kcm, kristofvc, kvbutler, laura-ham, lewiky, matthew-graves, parkerduckworth, rbtz-openai, redouan-rhazouani, samos123, stefanbogdan, trengrj, vukor, zoltan-fedor


weaviate-helm's Issues

Adapt to Weaviate standalone

Required Changes

  • Previously the weaviate container was stateless and did not require local persistence. With standalone requiring disk access, weaviate can no longer be a Deployment but needs to be a StatefulSet (see the sketch after this list)
  • Remove esvector
  • Remove etcd (unless still required by the contextionary?)
  • Possibly update contextionary requirements if necessary. Unknown as of today, since this depends on the outcome of semitechnologies/weaviate#1252
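
A minimal sketch of the StatefulSet shape the first item implies (all names, the image tag, mount path, and storage size are illustrative, not the chart's final template):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: weaviate
spec:
  serviceName: weaviate-headless
  replicas: 1
  selector:
    matchLabels:
      app: weaviate
  template:
    metadata:
      labels:
        app: weaviate
    spec:
      containers:
        - name: weaviate
          image: semitechnologies/weaviate:latest
          volumeMounts:
            - name: weaviate-data
              mountPath: /var/lib/weaviate
  # each replica gets its own PVC, giving the local persistence standalone needs
  volumeClaimTemplates:
    - metadata:
        name: weaviate-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 32Gi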

CrashLoopBackOff etcd-0 kubernetes docker for windows

I am discovering your product and tried to deploy Weaviate on my local Kubernetes cluster that comes with Docker for Windows.

The deployment using Helm was successful.

D:\repos\weaviate\weaviate-helm\weaviate>helm install --values .\values-minimal.yaml --namespace "weaviate" "weaviate" weaviate-8.2.1.tgz
NAME: weaviate
LAST DEPLOYED: Tue Jan 21 08:45:25 2020
NAMESPACE: weaviate
STATUS: deployed
REVISION: 1

D:\repos\weaviate\weaviate-helm\weaviate>kubectl get deployment --namespace weaviate
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
contextionary   1/1     1            1           21s
weaviate        0/1     1            0           21s

But etcd-0 seems to have a hard time.

D:\repos\weaviate\weaviate-helm\weaviate>kubectl get pods --namespace weaviate
NAME                            READY   STATUS             RESTARTS   AGE
contextionary-fd94d5bb4-vt9tp   1/1     Running            0          39s
esvector-master-0               0/1     Running            1          39s
etcd-0                          0/1     CrashLoopBackOff   2          39s
weaviate-7d5b6fb5c4-pkrqb       0/1     Running            0          39s

The log contains:

==> Creating data dir...
mkdir: cannot create directory '/bitnami/etcd/data': Permission denied
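
The error indicates the etcd container lacks write permission on its data volume. If the packaged Bitnami etcd subchart exposes Bitnami's usual volumePermissions toggle (an assumption about the subchart version in use), a workaround could be to enable the init container that fixes ownership of the data dir:

etcd:
  volumePermissions:
    enabled: true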

Make all images overwritable

The subcharts already allow this.

Our custom charts don't.

Todos

  • Make weaviate image overwritable
  • Make contextionary image overwritable

Background

This is required for GCP Marketplace integration where every image needs to be pushed to a Google registry.
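
The desired end state could mirror the image block the weaviate chart already uses (the GCP registry value here is illustrative for a Marketplace push):

image:
  registry: gcr.io/my-marketplace-project
  repo: weaviate
  tag: 1.2.3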

Set resource requests/limits for weaviate and docker container

At the moment, we can only set resource requests and limits for the esvector and etcd containers, but not for the other two. This should be fixed.

Todos

  • add to templates
  • add appropriate defaults in values.yaml
  • add appropriate defaults in values-minimal.yaml
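
For example, using the same resources shape the subcharts already accept (values are illustrative):

resources:
  requests:
    cpu: '500m'
    memory: '300Mi'
  limits:
    cpu: '1000m'
    memory: '1Gi'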

Bug: DNS names hard-coded, not based on chart name

All DNS discovery names appear to be hardcoded, but the services are named dynamically based on the Helm release name. This means that unless the user chooses to name the release exactly weaviate, all service discovery breaks.

Vectorizer deployments can become unhealthy under heavy load due to liveness probe timeout being reached

I was running the transformer vectorizer on CPU and sending in heavy traffic, at times maxing out its allotted CPU resources. I observed these transformer vectorizer pods failing and restarting, causing errors to be returned by the Weaviate app.

After investigation it became clear that these pods become unhealthy because the liveness probe times out 3 times in a row, which triggers the pod being marked as unhealthy and restarted by K8s.

The issue is that the vectorizer deployments use the default timeout (1s) for the liveness probe, which is low; under heavy load the liveness probes might hit this limit, causing the pod to be restarted. This 1s timeout could be increased to 3s to decrease the chance of a restart under heavy load.
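
In probe terms, the proposed change is just a larger timeoutSeconds (the endpoint and port here are illustrative; only the timeout is the point):

livenessProbe:
  httpGet:
    path: /.well-known/live
    port: 8080
  timeoutSeconds: 3  # raised from the 1s default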

Add custom pvc for etcd disaster recovery

To support other NFS provisioners like Amazon's EFS, some additional properties must be set in the PVC, e.g. annotations. This is not supported by the etcd chart. However, it allows referencing a custom PVC, which can be created as part of this Helm chart.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: etcd-snapshot-disaster-recovery
  namespace: weaviate
  annotations:
    volume.beta.kubernetes.io/storage-class: aws-efs
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 8Gi
  storageClassName: aws-efs

etcd.disasterRecovery.pvc.existingClaim: etcd-snapshot-disaster-recovery

Set NVIDIA_DRIVER_CAPABILITIES env var for CUDA

If NVIDIA_DRIVER_CAPABILITIES is not set alongside the currently utilized NVIDIA_VISIBLE_DEVICES=all, the NVIDIA runtime seems not to pass the GPU through properly.

The documentation states the default value is "compute, utility", but in my testing, not setting this explicitly causes nvidia-smi to report no CUDA version available, while setting it explicitly makes nvidia-smi report an available CUDA version (based on the host's).
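
A sketch of the proposed container environment (the capabilities value matches NVIDIA's documented default):

env:
  - name: NVIDIA_VISIBLE_DEVICES
    value: all
  - name: NVIDIA_DRIVER_CAPABILITIES
    value: compute,utility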

Make /weaviate-config/conf.yaml configurable in Helm chart

Issue

Currently the information is hard-coded in the template file for the config map.

Consequence

This means the user cannot set any config values in the k8s setup that deviate from the default configuration.

Fix

Instead, the contents should be configurable in values.yaml, similar to other configuration.
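
A hypothetical values.yaml shape for this (key names are illustrative; the example values.yml quoted in the next issue shows the direction the chart eventually took, with authentication, authorization, and query_defaults exposed as top-level values):

authentication:
  anonymous_access:
    enabled: true
query_defaults:
  limit: 100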

Changing resource.request.cpu doesn't actually do anything

oops nvm seems I commented out the whole line

example values.yml:

image:
  # registry where weaviate image is stored
  registry: docker.io
  # Tag of weaviate image to deploy
  # Note: We strongly recommend you overwrite this value in your own values.yaml.
  # Otherwise a mere upgrade of the chart could lead to an unexpected upgrade
  # of weaviate. In accordance with Infra-as-code, you should pin this value
  # down and only change it if you explicitly want to upgrade the Weaviate
  # version.
  # TODO change to proper v1.14 version after weaviate release
  tag: latest@sha256:6089441e49cf24a0bd453d8609621bafebf3b292e989ef35e6cec5028f61ece8
  repo: semitechnologies/weaviate

# overwrite command and args if you want to run specific startup scripts, for
# example setting the nofile limit
command: ["/bin/weaviate"]
args:
  - '--host'
  - '0.0.0.0'
  - '--port'
  - '8080'
  - '--scheme'
  - 'http'
  - '--config-file'
  - '/weaviate-config/conf.yaml'

# below is an example that can be used to set an arbitrary nofile limit at
# startup:
#
# command: 
#   - "/bin/sh"
# args: 
#   - "-c"
#   - "ulimit -n 65535 && /bin/weaviate --host 0.0.0.0 --port 8080 --scheme http --config-file /weaviate-config/conf.yaml"

# Scale replicas of Weaviate. Note that as of v1.8.0 dynamic scaling is limited
# to cases where no data is imported yet. Scaling down after importing data may
# break usability. Full dynamic scalability will be added in a future release.
replicas: 1
resources: {}
  # requests:
  #   cpu: '500m'
  #   memory: '300Mi'
  # limits:
  #   cpu: '1000m'
  #   memory: '1Gi'

# The Persistent Volume Claim settings for Weaviate. If there's a
# storage.fullnameOverride field set, then the default pvc will not be
# created; instead, the one defined in fullnameOverride will be used
storage:
  size: 32Gi
  storageClassName: gp2

# The service controls how weaviate is exposed to the outside world. If you
# don't want a public load balancer, you can also choose 'ClusterIP' to make
# weaviate only accessible within your cluster.
service:
  name: weaviate
  type: LoadBalancer
  loadBalancerSourceRanges: []
  # optionally set cluster IP if you want to set a static IP
  clusterIP:
  annotations: {}

# Adjust liveness, readiness and startup probes configuration
startupProbe:
  # For Kubernetes versions prior to 1.18, startupProbe is not supported and thus can be disabled.
  enabled: false

  initialDelaySeconds: 300
  periodSeconds: 60
  failureThreshold: 50
  successThreshold: 1
  timeoutSeconds: 3

livenessProbe:
  initialDelaySeconds: 900
  periodSeconds: 10
  failureThreshold: 30
  successThreshold: 1
  timeoutSeconds: 3

readinessProbe:
  initialDelaySeconds: 3
  periodSeconds: 10
  failureThreshold: 3
  successThreshold: 1
  timeoutSeconds: 3


terminationGracePeriodSeconds: 600

# Weaviate Config
#
# The following settings allow you to customize Weaviate to your needs, for
# example set authentication and authorization options. See weaviate docs
# (https://www.semi.technology/documentation/weaviate/current/) for all
# configuration.
authentication:
  anonymous_access:
    enabled: true
authorization:
  admin_list:
    enabled: false
query_defaults:
  limit: 100
debug: false


# Insert any custom environment variables or envSecrets by putting the exact name
# and desired value into the settings below. Any env name passed will be automatically
# set for the statefulSet.
env:
  # The aggressiveness of the Go Garbage Collector. 100 is the default value.
  GOGC: 100

  # Expose metrics on port 2112 for Prometheus to scrape
  PROMETHEUS_MONITORING_ENABLED: false

envSecrets:


# Configure backup providers
backups:
  # The backup-filesystem module enables creation of the DB backups in
  # the local filesystem
  filesystem:
    enabled: false
    envconfig:
      # Configure folder where backups should be saved
      BACKUP_FILESYSTEM_PATH: /tmp/backups
  
  s3:
    enabled: false
    # If one is using AWS EKS and has already configured a K8s Service Account
    # that holds the AWS credentials, one can pass the name of that service account
    # here using this setting:
    # serviceAccountName: service-account-name
    envconfig:
      # Configure the bucket where backups should be saved; this setting is mandatory
      BACKUP_S3_BUCKET: weaviate-backups

      # Optional setting. Defaults to empty string. 
      # Set this option if you want to save backups to a given location
      # inside the bucket
      # BACKUP_S3_PATH: path/inside/bucket

      # Optional setting. Defaults to AWS S3 (s3.amazonaws.com). 
      # Set this option if you have a MinIO storage configured in your environment
      # and want to use it instead of the AWS S3.
      # BACKUP_S3_ENDPOINT: custom.minio.endpoint.address

      # Optional setting. Defaults to true. 
      # Set this option if you don't want to use SSL.
      # BACKUP_S3_USE_SSL: true

      # You can pass environment AWS settings here:
      # Define the region
      # AWS_REGION: eu-west-1

    # If one uses an access key and a secret key to authorize in AWS, one can set them using secrets
    secrets: {}
    #   AWS_ACCESS_KEY_ID: access-key
    #   AWS_SECRET_ACCESS_KEY: secret-key

    # If one has already defined secrets with AWS credentials one can pass them using
    # this setting:
    envSecrets: {}
    #   AWS_ACCESS_KEY_ID: name-of-the-k8s-secret-containing-that-key

  gcs:
    enabled: false
    envconfig:
      # Configure the bucket where backups should be saved; this setting is mandatory
      BACKUP_GCS_BUCKET: weaviate-backups

      # Optional setting. Defaults to empty string.
      # Set this option if you want to save backups to a given location
      # inside the bucket
      # BACKUP_GCS_PATH: path/inside/bucket

      # You can pass environment Google settings here:
      # Define the project
      # GOOGLE_CLOUD_PROJECT: project-id

    # In order to pass GOOGLE_APPLICATION_CREDENTIALS one can do this using secrets
    secrets: {}
    #   GOOGLE_APPLICATION_CREDENTIALS: credentials-json-string

    # If one has already defined a secret with GOOGLE_APPLICATION_CREDENTIALS one can pass them using
    # this setting:
    envSecrets: {}
    #   GOOGLE_APPLICATION_CREDENTIALS: name-of-the-k8s-secret-containing-that-key


# Modules are extensions to Weaviate; they can be used to support various
# ML models, but also other features unrelated to model inference.
# An inference/vectorizer module is not required; you can also run without any
# modules and import your own vectors.
modules:

  # The text2vec-contextionary module uses a fastText-based vector-space to
  # derive vector embeddings for your objects. It is very efficient on CPUs,
  # but in some situations it cannot reach the same level of accuracy as
  # transformers-based models.
  text2vec-contextionary:
    # disable if you want to use transformers or import your own vectors
    enabled: false

    # The configuration below is ignored if enabled==false
    fullnameOverride: contextionary
    tag: en0.16.0-v1.0.2
    repo: semitechnologies/contextionary
    registry: docker.io
    replicas: 1
    envconfig:
      occurrence_weight_linear_factor: 0.75
      neighbor_occurrence_ignore_percentile: 5
      enable_compound_splitting: false
      extensions_storage_mode: weaviate
    resources:
      requests:
        cpu: '600m'
        memory: '500Mi'
      limits:
        cpu: '1000m'
        memory: '5000Mi'

    # You can guide where the pods are scheduled on a per-module basis,
    # as well as for Weaviate overall. Each module accepts nodeSelector,
    # tolerations, and affinity configuration. If it is set on a per-
    # module basis, this configuration overrides the global config.

    nodeSelector: {}

    tolerations:
      - key: "my-example-key"
        operator: "Exists"
        effect: "NoSchedule"

    affinity: {}

  # The text2vec-transformers module uses neural networks, such as BERT,
  # DistilBERT, etc. to dynamically compute vector embeddings based on the
  # sentence's context. It is very slow on CPUs and should run with
  # CUDA-enabled GPUs for optimal performance.
  text2vec-transformers:

    # enable if you want to use transformers instead of the
    # text2vec-contextionary module
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    # The configuration below is ignored if enabled==false

    # replace with model of choice, see
    # https://www.semi.technology/developers/weaviate/current/modules/text2vec-transformers.html
    # for all supported models or build your own container.
    tag: distilbert-base-uncased
    repo: semitechnologies/transformers-inference
    registry: docker.io
    replicas: 1
    fullnameOverride: transformers-inference
    probeInitialDelaySeconds: 120
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

    passageQueryServices:
      passage:
        enabled: false
        # You can set the inference URL of this module directly, without deploying it with this release.
        # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
        inferenceUrl: {}

        tag: facebook-dpr-ctx_encoder-single-nq-base
        repo: semitechnologies/transformers-inference
        registry: docker.io
        replicas: 1
        fullnameOverride: transformers-inference-passage
        envconfig:
          # enable for CUDA support. Your K8s cluster needs to be configured
          # accordingly and you need to explicitly set GPU requests & limits below
          enable_cuda: false

          # only used when cuda is enabled
          nvidia_visible_devices: all

          # only used when cuda is enabled
          ld_library_path: /usr/local/nvidia/lib64

        resources:
          requests:
            cpu: '1000m'
            memory: '3000Mi'

            # enable if running with CUDA support
            # nvidia.com/gpu: 1
          limits:
            cpu: '1000m'
            memory: '5000Mi'

            # enable if running with CUDA support
            # nvidia.com/gpu: 1
      query:
        enabled: false
        # You can set the inference URL of this module directly, without deploying it with this release.
        # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
        inferenceUrl: {}

        tag: facebook-dpr-question_encoder-single-nq-base
        repo: semitechnologies/transformers-inference
        registry: docker.io
        replicas: 1
        fullnameOverride: transformers-inference-query
        envconfig:
          # enable for CUDA support. Your K8s cluster needs to be configured
          # accordingly and you need to explicitly set GPU requests & limits below
          enable_cuda: false

          # only used when cuda is enabled
          nvidia_visible_devices: all

          # only used when cuda is enabled
          ld_library_path: /usr/local/nvidia/lib64

        resources:
          requests:
            cpu: '1000m'
            memory: '3000Mi'

            # enable if running with CUDA support
            # nvidia.com/gpu: 1
          limits:
            cpu: '1000m'
            memory: '5000Mi'

            # enable if running with CUDA support
            # nvidia.com/gpu: 1

  # The text2vec-openai module uses OpenAI Embeddings API
  # to dynamically compute vector embeddings based on the
  # sentence's context.
  # More information about OpenAI Embeddings API can be found here:
  # https://beta.openai.com/docs/guides/embeddings/what-are-embeddings
  text2vec-openai:

    # enable if you want to use OpenAI module
    enabled: false

    # Set your OpenAI API Key to be passed to Weaviate pod as
    # an environment variable
    apiKey: ''

  # The text2vec-huggingface module uses HuggingFace API
  # to dynamically compute vector embeddings based on the
  # sentence's context.
  # More information about HuggingFace API can be found here:
  # https://huggingface.co/docs/api-inference/detailed_parameters#feature-extraction-task
  text2vec-huggingface:

    # enable if you want to use HuggingFace module
    enabled: false

    # Set your HuggingFace API Key to be passed to Weaviate pod as
    # an environment variable
    apiKey: ''

  # The multi2vec-clip module uses CLIP transformers to vectorize both images
  # and text in the same vector space. It is typically slow(er) on CPUs and should
  # run with CUDA-enabled GPUs for optimal performance.
  multi2vec-clip:

    # enable if you want to use the multi2vec-clip module instead of the
    # text2vec-contextionary module
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}

    # The configuration below is ignored if enabled==false

    # replace with model of choice, see
    # https://www.semi.technology/developers/weaviate/current/modules/multi2vec-clip.html
    # for all supported models or build your own container.
    tag: sentence-transformers-clip-ViT-B-32-multilingual-v1
    repo: semitechnologies/multi2vec-clip
    registry: docker.io
    replicas: 1
    fullnameOverride: clip-inference
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The qna-transformers module uses neural networks, such as BERT or
  # DistilBERT, to find an answer in a given text to a given question
  qna-transformers:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: bert-large-uncased-whole-word-masking-finetuned-squad-34d66b1
    repo: semitechnologies/qna-transformers
    registry: docker.io
    replicas: 1
    fullnameOverride: qna-transformers
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The img2vec-neural module uses neural networks to generate
  # a vector representation of the image
  img2vec-neural:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: resnet50
    repo: semitechnologies/img2vec-pytorch
    registry: docker.io
    replicas: 1
    fullnameOverride: img2vec-neural
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The text-spellcheck module uses a spellchecker library to check for
  # misspellings in a given text
  text-spellcheck:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: pyspellchecker-en
    repo: semitechnologies/text-spellcheck-model
    registry: docker.io
    replicas: 1
    fullnameOverride: text-spellcheck
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The ner-transformers module uses transformer models to extract
  # named entities from a given text
  ner-transformers:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: dbmdz-bert-large-cased-finetuned-conll03-english-0.0.2
    repo: semitechnologies/ner-transformers
    registry: docker.io
    replicas: 1
    fullnameOverride: ner-transformers
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # The sum-transformers module generates summaries of result texts
  sum-transformers:
    enabled: false
    # You can set the inference URL of this module directly, without deploying it with this release.
    # You can do so by setting a value for `inferenceUrl` here AND by setting `enabled` to `false`.
    inferenceUrl: {}
    tag: facebook-bart-large-cnn-1.0.0
    repo: semitechnologies/sum-transformers
    registry: docker.io
    replicas: 1
    fullnameOverride: sum-transformers
    envconfig:
      # enable for CUDA support. Your K8s cluster needs to be configured
      # accordingly and you need to explicitly set GPU requests & limits below
      enable_cuda: false

      # only used when cuda is enabled
      nvidia_visible_devices: all

      # only used when cuda is enabled
      ld_library_path: /usr/local/nvidia/lib64

    resources:
      requests:
        cpu: '1000m'
        memory: '3000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1
      limits:
        cpu: '1000m'
        memory: '5000Mi'

        # enable if running with CUDA support
        # nvidia.com/gpu: 1

  # by choosing the default vectorizer module, you can tell Weaviate to always
  # use this module as the vectorizer if nothing else is specified. Can be
  # overwritten on a per-class basis.
  # set to text2vec-transformers if running with transformers instead
  default_vectorizer_module: none

# It is also possible to configure authentication and authorization through a
# custom configmap. The authorization and authentication values defined in
# values.yaml will be ignored when defining a custom config map.
custom_config_map:
  enabled: false
  name: 'custom-config'

# The collector proxy collects metadata about the requests. It deploys a
# second service that, if used, will capture metadata about the incoming
# requests. The collected data may be used to optimize the software or detect
# malicious attacks.
collector_proxy:
  enabled: false
  tag: latest
  weaviate_enterprise_usage_collector_origin: ''
  weaviate_enterprise_token: ''
  weaviate_enterprise_project: ''
  service:
    name: 'usage-proxy'
    port: 80
    type: LoadBalancer
    annotations: {}

# Pass any annotations to Weaviate pods
annotations: {}

nodeSelector: {}

tolerations: []

affinity: {}

However, the pod spec doesn't have resource.request.cpu set:

k get pod weaviate-0 -o yaml | grep resources -C 5 
        scheme: HTTP
      initialDelaySeconds: 3
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 3
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /weaviate-config
      name: weaviate-config
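
The resolution noted at the top of this issue (the relevant lines were commented out) means the fix is simply to uncomment them; with the block restored, the chart renders the expected requests and limits (values as in the example above):

resources:
  requests:
    cpu: '500m'
    memory: '300Mi'
  limits:
    cpu: '1000m'
    memory: '1Gi'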

Support transformers module

Goals

  • when the contextionary module is enabled: false and no other module is enabled, weaviate is deployed without any module
  • when the contextionary module is enabled: false and the transformers module is enabled: true, weaviate is deployed with the transformers module, but without the contextionary module
  • the transformers inference container can be configured (by replacing the docker image)
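
In values.yaml terms, the second goal could look like this (keys as in the example values.yml quoted in the previous issue):

modules:
  text2vec-contextionary:
    enabled: false
  text2vec-transformers:
    enabled: true
    registry: docker.io
    repo: semitechnologies/transformers-inference
    tag: distilbert-base-uncased
  default_vectorizer_module: text2vec-transformers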

Unable to install multiple Weaviate clusters in same namespace

The templates hardcode a resource name instead of relying on the Helm release name. So when you try to deploy another Weaviate cluster, it fails because there is already a StatefulSet named weaviate.

I propose to instead use this format for all resources:

kind: StatefulSet
metadata:
  name: "{{ .Release.Name }}-weaviate"
  labels:
    name: "{{ .Release.Name }}-weaviate"
    app: "{{ .Release.Name }}-weaviate"
    app.kubernetes.io/name: "{{ .Release.Name }}-weaviate"
    app.kubernetes.io/managed-by: helm
    app.kubernetes.io/component: database
    app.kubernetes.io/version: "{{ .Values.image.tag }}"

Note this feature will only be considered if there is end-user need for this. Please upvote this issue if this would be needed for your use case and leave a comment with your use case.

Publish the helm chart to a helm repo

It would be nice to have the Helm chart published here: https://artifacthub.io/packages/search?kind=0

Then getting new versions would be easier, and users would always get the latest version. Right now the docs tell us to do this:

# Set the Weaviate chart version
export CHART_VERSION="v15.0.0"
# Download the Weaviate Helm chart
wget https://github.com/weaviate/weaviate-helm/releases/download/$CHART_VERSION/weaviate.tgz
# Download an example values.yml (with the default configuration)
wget https://raw.githubusercontent.com/weaviate/weaviate-helm/$CHART_VERSION/weaviate/values.yaml
# Create a Weaviate namespace
kubectl create namespace weaviate

# set the desired Weaviate version
export WEAVIATE_VERSION="1.17.2"

# Deploy
helm upgrade \
  "weaviate" \
  weaviate.tgz \
  --install \
  --namespace "weaviate" \
  --values ./values.yaml \
  --set "image.tag=$WEAVIATE_VERSION"

Issues with this approach:

  • The docs are always outdated since right now v15.4.0 would be the latest version
  • more steps
  • People have to download the chart locally and sometimes seem to get the location incorrect (saw that on slack once)

What would be nicer instead:

helm repo add weaviate https://weaviate.github.io/helm-charts
helm install my-weaviate-stack weaviate/weaviate

^^ which would automatically use the latest published chart

Helm chart incompatible with K8s version >= 1.16

I was following the documentation and got stuck on:

Deploy

$ helm upgrade \
  --values ./values.yaml \
  --install \
  --namespace "weaviate" \
  "weaviate" \
  weaviate.tgz

Here is what I get:
$ helm upgrade --values ./values.yaml --install --namespace "weaviate" "weaviate" weaviate.tgz
Release "weaviate" does not exist. Installing it now.
Error: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "Deployment" in version "apps/v1beta1", unable to recognize "": no matches for kind "StatefulSet" in version "apps/v1beta1", unable to recognize "": no matches for kind "StatefulSet" in version "apps/v1beta2"]

My setup:
$ helm version
version.BuildInfo{Version:"v3.0.1", GitCommit:"7c22ef9ce89e0ebeb7125ba2ebf7d421f3e82ffa", GitTreeState:"clean", GoVersion:"go1.13.4"}
$ kubectl -n kube-system get pods
NAME                               READY   STATUS    RESTARTS   AGE
coredns-6955765f44-9vx7p           1/1     Running   0          2d17h
coredns-6955765f44-g67tv           1/1     Running   0          2d17h
etcd-minikube                      1/1     Running   0          2d17h
kube-addon-manager-minikube        1/1     Running   0          2d17h
kube-apiserver-minikube            1/1     Running   0          2d17h
kube-controller-manager-minikube   1/1     Running   0          2d17h
kube-proxy-mfdkp                   1/1     Running   0          2d17h
kube-scheduler-minikube            1/1     Running   1          2d17h
storage-provisioner                1/1     Running   0          2d17h
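
The error means the manifests reference API versions that were removed in Kubernetes 1.16 (apps/v1beta1 and apps/v1beta2); the fix is to move the affected templates to the GA API group:

apiVersion: apps/v1
kind: StatefulSet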
