oracle / weblogic-kubernetes-operator

WebLogic Kubernetes Operator

Home Page: https://oracle.github.io/weblogic-kubernetes-operator/

License: Universal Permissive License v1.0

Shell 8.96% Java 86.92% Python 2.34% Dockerfile 0.07% Smarty 0.54% HCL 1.11% XSLT 0.05% HTML 0.01% Mustache 0.01%
weblogic weblogic-server kubernetes operator

weblogic-kubernetes-operator's Introduction

WebLogic Kubernetes Operator

The WebLogic Kubernetes Operator (the “operator”) supports running your WebLogic Server and Fusion Middleware Infrastructure domains on Kubernetes, an industry standard, cloud neutral deployment platform. It lets you encapsulate your entire WebLogic Server installation and layered applications into a portable set of cloud neutral images and simple resource description files. You can run them on any on-premises or public cloud that supports Kubernetes where you've deployed the operator.

Furthermore, the operator is well suited to CI/CD processes. You can easily inject changes when moving between environments, such as from test to production. For example, you can externally inject database URLs and credentials during deployment or you can inject arbitrary changes to most WebLogic configurations.

The operator takes advantage of the Kubernetes operator pattern, which means that it uses Kubernetes APIs to provide support for operations, such as: provisioning, lifecycle management, application versioning, product patching, scaling, and security. The operator also enables the use of tooling that is native to this infrastructure for monitoring, logging, tracing, and security.

You can:

  • Deploy an operator that manages all WebLogic domains in all namespaces in a Kubernetes cluster, or that only manages domains in a specific subset of the namespaces, or that manages only domains that are located in the same namespace as the operator. At most, a namespace can be managed by one operator.
  • Supply WebLogic domain configuration using:
    • Model in Image: Includes WebLogic Deploy Tooling models and archives in a container image.
    • Domain in Image: Includes a WebLogic domain home in a container image.
    • Domain on PV: Locates WebLogic domain homes in a Kubernetes PersistentVolume (PV). This PV can reside in an NFS file system or other Kubernetes volume types.
  • Configure deployment of WebLogic domains as Kubernetes resources (using Kubernetes custom resource definitions).
  • Override certain aspects of the WebLogic domain configuration; for example, use a different database password for different deployments.
  • Start and stop servers and clusters in the domain based on declarative startup parameters and desired states.
  • Scale WebLogic domains by starting and stopping Managed Servers on demand, by using Kubernetes scale commands, by setting up a Kubernetes Horizontal Pod Autoscaler, or by integrating with a REST API to initiate scaling based on the WebLogic Diagnostics Framework (WLDF), Prometheus, Grafana, or other rules.
  • Expose HTTP paths on a WebLogic domain outside the Kubernetes cluster with load balancing, and automatically update the load balancer when Managed Servers in the WebLogic domain are started or stopped.
  • Expose the WebLogic Server Administration Console outside the Kubernetes cluster, if desired.
  • Expose T3 channels outside the Kubernetes cluster, if desired.
  • Publish operator and WebLogic Server logs into Elasticsearch and interact with them in Kibana.

The fastest way to experience the operator is to follow the Quick Start guide, or you can peruse our documentation, read our blogs, or try out the samples.

Documentation

Documentation for the operator is available here.

This documentation includes information for users and for developers. It provides samples, reference material, security information and a Quick Start guide if you just want to get up and running quickly.

Documentation for prior releases of the operator: 3.4, 4.0, and 4.1.

Backward compatibility guidelines

The 2.0 release introduced some breaking changes and did not maintain compatibility with previous releases.

Starting with the 2.0.1 release, operator releases are intended to be backward compatible with respect to the domain resource schema, operator Helm chart input values, configuration overrides template, Kubernetes resources created by the operator Helm chart, Kubernetes resources created by the operator, and the operator REST interface. We intend to maintain compatibility for three releases, except in the case of a clearly communicated deprecated feature, which will be maintained for one release after a replacement is available.

Need more help? Have a suggestion? Come and say, "Hello!"

We have a public Slack channel where you can get in touch with us to ask questions about using the operator or give us feedback or suggestions about what features and improvements you would like to see. We would love to hear from you. To join our channel, please visit this site to get an invitation. The invitation email will include details of how to access our Slack workspace. After you are logged in, please come to #operator and say, "hello!"

Contributing

This project welcomes contributions from the community. Before submitting a pull request, please review our contribution guide.

Security

Please consult the security guide for our responsible security vulnerability disclosure process.

License

Copyright (c) 2017, 2024, Oracle and/or its affiliates.

Released under the Universal Permissive License v1.0 as shown at https://oss.oracle.com/licenses/upl/.

weblogic-kubernetes-operator's People

Contributors

alai8, ankedia, anpanigr, bhavaniravichandran, ddsharpe, dependabot[bot], doxiao, galiacheng, hzhao-github, jgrundback, jshum2479, lennyphan, lilyhe123, maggiehe00, marinakog, markxnelson, moreaut, mriccell, pfmackin, rjeberhard, robertpatrick, rosemarymarano, russgold, sankarpn, simon-meng-cn, swapnanitin, thefrogpad, vanajamukkara, vkraemer, xiancao

weblogic-kubernetes-operator's Issues

WebLogic Admin Console not available

Hello,
The architecture section of the documentation says:

NodePort type service is created for the Administration Server pod. This service provides HTTP access to the Administration Server to clients that are outside the Kubernetes cluster. This service is intended to be used to access the WebLogic Server Administration Console only. This service is labeled with weblogic.domainUID and weblogic.domainName.

The service list from Kubernetes shows:

NAME                                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
domain1-admin-server                  ClusterIP   10.21.239.105   <none>        7001/TCP            37m
domain1-cluster-1-traefik             NodePort    10.21.184.4     <none>        80:30305/TCP        37m
domain1-cluster-1-traefik-dashboard   NodePort    10.21.49.200    <none>        8080:30315/TCP      37m
domain1-managed-server1               ClusterIP   10.21.22.33     <none>        8001/TCP            36m
domain1-managed-server2               ClusterIP   10.21.160.52    <none>        8001/TCP            36m
elasticsearch                         ClusterIP   10.21.52.22     <none>        9200/TCP,9300/TCP   40m
external-weblogic-operator-service    NodePort    10.21.27.227    <none>        8081:31001/TCP      41m
internal-weblogic-operator-service    ClusterIP   10.21.68.70     <none>        8082/TCP            41m
kibana                                NodePort    10.21.211.209   <none>        5601:32344/TCP      40m
kubernetes                            ClusterIP   10.21.0.1       <none>        443/TCP             41m

It seems that this NodePort service is missing: domain1-admin-server is listed with type ClusterIP.

Provide an easy way to remove a domain

Provide a script that will remove a domain and all of the associated resources, e.g. load balancer, ingress.

Should not actually delete the domain directory, etc. on the PV. Just remove the domain from the cluster.

Provide an option to overwrite an existing domain

Currently, the create domain job will give an error and stop if there is any data already in the PV where it wants to create the domain. We should provide an "override" option which will delete & replace instead of failing.

Order of entries in weblogic-operator.yaml

It may have just been a timing issue in my deploys, but the deployment would fail with the config map and secret at the bottom of the yaml file. When I moved them to the top instead, it deployed successfully every time (6 deploys, 5 deletes).

Pod killed does not restart

Hello,
I have a WebLogic domain running. I delete a pod manually (it can be the admin server or any of the managed servers) with

kubectl delete pod managed2

and I notice that the pod does not come back. Is it expected that the WebLogic operator does not react in this case?
I checked the WebLogic configuration and the definition of the deleted server is still there, so if the weblogic-operator is responsible for keeping the domain up to date, it should restart the pod.

By contrast, if I connect to the minion where the pod is running and run
docker rm -f CONTAINER_ID, the weblogic-operator restarts it.

I looked at the weblogic-operator logs while performing the deletion, and they show only this:

{"exception":"","headers":{},"code":"","method":"isReady","level":"INFO","thread":36,"timeInMillis":1517581530299,"message":"Pod domain1-managed-server2 is ready","body":"","class":"oracle.kubernetes.operator.PodWatcher","timestamp":"02-02-2018T14:25:30.299+0000"}

Thank you.
Antonio

Issues within kubernetes/internal/generate-security-policy.sh

While installing the operator (./create-weblogic-operator.sh -i create-operator-inputs.yaml), I ran into the following issues:

  1. The if test on line 47 fails:

/home/ubuntu/wls-operator/git/weblogic-kubernetes-operator/kubernetes/internal/generate-security-policy.sh: 47: [: -o: unexpected operator

It looks like the wrong comparison operator is used:
if [ "$1" == "-o" ] ; then
The following seems more correct:
if [ "$1" = "-o" ] ; then

  2. I found no mention of which shell the scripts require. On my Ubuntu 16.04, /bin/sh is dash version 0.5.8-2.1ubuntu2, which seems to have problems with the substitution on line 191 of kubernetes/internal/generate-security-policy.sh. Current version:

for i in ${TARGET_NAMESPACES//,/ }
It may be worth using a more portable form, e.g.:
for i in $(echo $TARGET_NAMESPACES | sed "s/,/ /g")
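
Both points can be illustrated with a small POSIX sh sketch that should behave the same under dash and bash. It assumes nothing about the real script beyond the two lines quoted above; the function names is_output_flag and split_namespaces are hypothetical and exist only for this illustration.

```sh
#!/bin/sh
# Illustrative helpers only; these names do not come from the operator scripts.

# Succeed when the first argument is the -o flag.
# A single "=" is the string comparison that POSIX test defines.
is_output_flag() {
  [ "$1" = "-o" ]
}

# Print each comma-separated namespace on its own line.
# Splitting on IFS avoids the bash-only ${VAR//,/ } substitution.
split_namespaces() {
  old_ifs=$IFS
  IFS=,
  set -f                      # no filename globbing while $1 is unquoted
  for ns in $1; do
    printf '%s\n' "$ns"
  done
  set +f
  IFS=$old_ifs
}

TARGET_NAMESPACES="default,domain1,domain2"
for i in $(split_namespaces "$TARGET_NAMESPACES"); do
  echo "namespace: $i"
done
```

An alternative is to declare bash explicitly in the shebang line of the scripts, in which case the original syntax would work as written.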

Error when using CoreOS Tectonic - HealthCheckHelper.java

I have received the following error from the pod running the operator:
{"exception":"","headers":{},"code":"","method":"verifyK8sVersion","level":"INFO","thread":1,"timeInMillis":1517275411833,"message":"Verifying Kubernetes minimum version","body":"","class":"oracle.kubernetes.operator.helpers.HealthCheckHelper","timestamp":"01-30-2018T01:23:31.833+0000"}
{"exception":"\njava.lang.NumberFormatException: For input string: "9+coreos"\n\tat java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)\n\tat java.lang.Integer.parseInt(Integer.java:580)\n\tat java.lang.Integer.parseInt(Integer.java:615)\n\tat oracle.kubernetes.operator.helpers.HealthCheckHelper.verifyK8sVersion(HealthCheckHelper.java:251)\n\tat oracle.kubernetes.operator.helpers.HealthCheckHelper.performNonSecurityChecks(HealthCheckHelper.java:170)\n\tat oracle.kubernetes.operator.Main.main(Main.java:150)\n","headers":{},"code":"","method":"main","level":"WARNING","thread":1,"timeInMillis":1517275411852,"message":"Exception thrown: {0}","body":"","class":"oracle.kubernetes.operator.Main","timestamp":"01-30-2018T01:23:31.852+0000"}
{"exception":"","headers":{},"code":"","method":"main","level":"INFO","thread":1,"timeInMillis":1517275411858,"message":"The Oracle WebLogic Server Operator for Kubernetes is shutting down","body":"","class":"oracle.kubernetes.operator.Main","timestamp":"01-30-2018T01:23:31.858+0000"}

kubectl version output:
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T21:12:46Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.9+coreos.0", GitCommit:"2ded8a1912d014561208d882cfcc12dfa5374f22", GitTreeState:"clean", BuildDate:"2017-10-24T13:07:42Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

It seems the 9+coreos in the GitVersion of the server throws an error in the HealthCheckHelper.java.

Any thoughts?

doc update

From Grzegorz
Need doc update for the PV removal - needs namespace on PVC.

Domain Custom Resource Definition

Hello,
I installed a Kubernetes cluster on OCI and added an NFS volume to all workers.

But I have a problem running a WebLogic domain.

Basically, both the create-domain-job script and kubectl get domain --all-namespaces return this:

the server doesn't have a resource type "domain"

I was not able to find any CustomResourceDefinition resource (kubectl get crd returns nothing).

Could it be that the APIs you use are not available in the Kubernetes version installed on my cluster? Do I need to restart the API server with a particular flag?

My versions are:

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"6e937839ac04a38cac63e6a7a306c5d035fe7b0a", GitTreeState:"clean", BuildDate:"2017-09-28T22:57:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.9+coreos.0", GitCommit:"2ded8a1912d014561208d882cfcc12dfa5374f22", GitTreeState:"clean", BuildDate:"2017-10-24T13:07:42Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

Thank you.
Cheers,
Antonio

verifyK8sVersion fails with java.lang.ArrayIndexOutOfBoundsException exception

Hello,
I was running the operator source code in Eclipse and noticed that it was failing in the verifyK8sVersion function in HealthCheckHelper.java.
That's because the gitVersion of my server is not just v1.7.9 but v1.7.9+coreos.0 (as you can see from the kubectl output below).

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"6e937839ac04a38cac63e6a7a306c5d035fe7b0a", GitTreeState:"clean", BuildDate:"2017-09-28T22:57:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.9+coreos.0", GitCommit:"2ded8a1912d014561208d882cfcc12dfa5374f22", GitTreeState:"clean", BuildDate:"2017-10-24T13:07:42Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

I fixed it by adding this piece of code:

         // git version is of the form v1.7.5 or v1.7.5+coreos.0
         // Check the 3rd part of the version.
         String[] splitVersion = gitVersion.split("\\.");
         splitVersion = splitVersion[2].split("\\+");
         if (Integer.parseInt(splitVersion[0]) < 5) {

It works in both cases.

Cheers,
Antonio
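
For quick experimentation outside the operator, the same parsing rule can be sketched in shell. This is not the operator's Java code: patch_level is a hypothetical helper, and unlike the snippet above it also rejects strings with fewer than three dot-separated parts rather than indexing past the end of the split result.

```sh
#!/bin/sh
# Hypothetical helper: print the numeric patch level of a Kubernetes
# gitVersion such as v1.7.5 or v1.7.9+coreos.0.
# Returns non-zero when the string is not of the form MAJOR.MINOR.PATCH.
patch_level() {
  version=${1#v}                # drop the leading "v"
  version=${version%%[+-]*}     # drop a "+vendor" or "-suffix" tail
  case $version in
    *.*.*) ;;                   # at least three dot-separated parts
    *) return 1 ;;
  esac
  patch=${version#*.*.}         # remove "MAJOR.MINOR."
  patch=${patch%%.*}            # keep only the third part
  case $patch in
    ''|*[!0-9]*) return 1 ;;    # reject empty or non-numeric values
  esac
  printf '%s\n' "$patch"
}

for v in v1.7.5 v1.7.9+coreos.0; do
  echo "$v -> patch $(patch_level "$v")"
done
```

Stripping the suffix before splitting means a vendor tag cannot reach the integer conversion, whatever separator the vendor uses.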

Validation for create-domain-job.sh request

The script's success relies on the namespace for the domain existing, the secret for the WebLogic credentials existing, and the persistent volume for the hostpath existing (with proper permissions).

It would be helpful to provide a few checks for the existence of these resources and exiting the script with an error message describing the failure. A little validation goes a long way.
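
A minimal sketch of such checks, assuming kubectl is on the PATH and already pointed at the right cluster. The function names and messages are hypothetical and not taken from create-domain-job.sh. Directory permissions on the hostpath cannot be verified through kubectl, so this sketch only covers existence.

```sh
#!/bin/sh
# Hypothetical pre-flight checks; names and messages are illustrative.

# Succeed when the named resource exists.
# Arguments: kind, name, optional namespace.
resource_exists() {
  if [ -n "$3" ]; then
    kubectl get "$1" "$2" -n "$3" >/dev/null 2>&1
  else
    kubectl get "$1" "$2" >/dev/null 2>&1
  fi
}

# Report every missing prerequisite, then return non-zero if any was missing.
# Arguments: namespace, secret name, persistent volume name.
validate_prerequisites() {
  namespace=$1
  secret=$2
  pv=$3
  missing=0
  if ! resource_exists namespace "$namespace"; then
    echo "[ERROR] namespace $namespace does not exist" >&2
    missing=1
  fi
  if ! resource_exists secret "$secret" "$namespace"; then
    echo "[ERROR] secret $secret not found in namespace $namespace" >&2
    missing=1
  fi
  if ! resource_exists pv "$pv"; then
    echo "[ERROR] persistent volume $pv does not exist" >&2
    missing=1
  fi
  return "$missing"
}
```

A script could call validate_prerequisites NAMESPACE SECRET-NAME PV-NAME near the top and exit when it returns non-zero, so that all missing resources are listed in one run.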

Where to upload operator.tar to?

From https://github.com/oracle/weblogic-kubernetes-operator/blob/master/site/installation.md

Next, upload your image to your Kubernetes server as follows:

on your build machine

docker save weblogic-kubernetes-operator:developer > operator.tar
scp operator.tar YOUR_USER@YOUR_SERVER:/some/path/operator.tar

on the Kubernetes server

docker load < /some/path/operator.tar

Which kubernetes server is the operator.tar uploaded to? Is it the kubernetes master or worker? If multiple masters and workers exist then is the tar uploaded to all nodes?

AdminServer is not starting

Hello,
The AdminServer fails with:

Update JVM arguments                                                                                
Start the server                                                                                    
                                                                                                    
Initializing WebLogic Scripting Tool (WLST) ...                                                     
                                                                                                    
Welcome to WebLogic Server Administration Scripting Shell                                           
                                                                                                    
Type help() for help on available commands                                                          
                                                                                                    
Problem invoking WLST - Traceback (innermost last):                                                 
  File "/shared/domain/base_domain/servers/admin-server/nodemgr_home/start-server.py", line 5, in ? 
IOError: No such file or directory: /weblogic-operator/secrets/username                             
                                                                                                    
Wait indefinitely so that the Kubernetes pod does not exit and try to restart                       

This is because the secret with the credentials is mounted at /var/run/secrets-{DOMAIN-UID} (PodHelper.java line 149), while the job template contains:

# Validate the domain secrets exist before proceeding.
    if [ ! -f /weblogic-operator/secrets/username ]; then
      fail "The domain secret /weblogic-operator/secrets/username was not found"
    fi
    if [ ! -f /weblogic-operator/secrets/password ]; then
      fail "The domain secret /weblogic-operator/secrets/password was not found"
    fi

Previously the template was parameterized, but that has changed here.

"image" parameter in create-operator-inputs.yaml

For me it was unclear how to name the image parameter in the configuration file. The default points to container-registry.oracle.com. I had built my image within the kubernetes master. I could not find out how to use that. After pushing that image into my local docker-registry, the create-weblogic-operator.sh script finished successfully.

Installation Instructions Dockerfile

The installation instructions give the command:
docker build -t weblogic-kubernetes-operator:developer --no-cache=true .
However, they do not explicitly state that the command is expected to be run from the weblogic-kubernetes-operator directory. This should be clearly stated at the beginning of the instructions.

Create prometheus annotations on pods

We need to add a parameter that tells the operator that it should add these annotations to pods that it creates when we are using the prometheus integration and WLS Exporter:

	prometheus.io/port: "8001"    # should be the ListenPort of the server in the pod 
	prometheus.io/path: /wls-exporter/metrics
	prometheus.io/scrape: "true"

generate-security-policy.sh execution throws -o: unexpected operator

When executing ./create-weblogic-operator.sh -i create-operator-inputs.yaml, the following errors are thrown:

Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
MAC verified OK
Generating /home/molina/WLSoperator/weblogic-kubernetes-operator/kubernetes/weblogic-operator.yaml
Running the rbac customization script
/home/molina/WLSoperator/weblogic-kubernetes-operator/kubernetes/internal/generate-security-policy.sh: 47: [: -o: unexpected operator
/home/molina/WLSoperator/weblogic-kubernetes-operator/kubernetes/internal/generate-security-policy.sh: 50: /home/molina/WLSoperator/weblogic-kubernetes-operator/kubernetes/internal/generate-security-policy.sh: [[: not found
...

Reviewing generate-security-policy.sh, it seems that line 47 uses "==" as the operator, which some shells do not accept. For instance, I'm running weblogic-kubernetes-operator on Ubuntu 16.04, where /bin/sh is linked to dash (and dash does not seem to accept that operator):

molina@iceland:~/WLSoperator/weblogic-kubernetes-operator/kubernetes$ ll /bin/sh
lrwxrwxrwx 1 root root 4 oct  9  2016 /bin/sh -> dash*

molina@iceland:~/WLSoperator/weblogic-kubernetes-operator/kubernetes$ uname -a
Linux iceland 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

In any case, I think it would be better if the whole script used the same operator (lines 23, 31, 39, etc. use "=" instead of "==").

As shown in the output above, line 50 also throws an error, because it uses "[[" instead of "[".

I made these two changes on my machine and the execution continued successfully.

I'm running weblogic-kubernetes-operator with minikube, but I do not think that is related to the issue.

Questions I'm thinking about on the WebLogic Operator

One concern is the necessity to have a unique domainUID...is this required only among those resources managed by a particular weblogic-operator?

How will HA work for the operator? Is it just down to the replication set, etc.?

What are the primary resources that must be considered when tuning the operator and its required resources for relative load levels?

How many namespaces and domains should a deployment of weblogic operator be responsible for managing?

Has any work on using alternate ingress controller classes been done yet? Big vote for HAProxy here...maybe others...need a nice way to proxy tcp not just http and https for T3, etc.

adminserver starts but socket error stops operator from working - Close Me after reading

So I created a domain with domainUid = wi, domainName = widomain.

Last lines of wi-adminserver log:
<Feb 2, 2018 8:20:39 PM GMT> <Starting WebLogic server with command line: /usr/java/jdk1.8.0_151/bin/java -Dweblogic.Name=adminserver -Dbea.home=/u01/oracle/wlserver/.. -Djava.security.policy=/u01/oracle/wlserver/server/lib/weblogic.policy -Djava.library.path=::/u01/oracle/wlserver/server/native/linux/x86_64:/u01/oracle/wlserver/server/native/linux/x86_64/oci920_8:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib -Djava.class.path=/usr/java/jdk1.8.0_151/lib/tools.jar:/u01/oracle/wlserver/server/lib/weblogic.jar:/u01/oracle/wlserver/../oracle_common/modules/thirdparty/ant-contrib-1.0b3.jar:/u01/oracle/wlserver/modules/features/oracle.wls.common.nodemanager.jar::/u01/oracle/wlserver/..:/u01/oracle/wlserver/modules/features/oracle.wls.common.grizzly.jar -Dweblogic.system.BootIdentityFile=/shared/domain/widomain/servers/adminserver/security/boot.properties -Dweblogic.nodemanager.ServiceEnabled=true -Dweblogic.nmservice.RotationEnabled=true -Djava.security.egd=file:/dev/./urandom -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap weblogic.Server >
<Feb 2, 2018 8:20:39 PM GMT> <Working directory is '/shared/domain/widomain'>
<Feb 2, 2018 8:20:39 PM GMT> <Server output log file is '/shared/domain/widomain/servers/adminserver/logs/adminserver.out'>
<Feb 2, 2018 8:20:39 PM GMT> <Wrote process id 448>
<Feb 2, 2018 8:23:02 PM GMT> <The server 'adminserver' is running now.>
Successfully started server adminserver ...
Successfully disconnected from Node Manager.
<Feb 2, 2018 8:23:02 PM GMT> <NM Command(s) processing completed>

Attached some files for review:

issue-svc-wi-adminserver.log
issue-describe-wi-adminserver.log
issue-wi-adminserver.log

Dockerfile Operator Snapshot

In the Dockerfile, where is the following file supposed to come from?

COPY target/weblogic-kubernetes-operator-0.1.0-alpha-SNAPSHOT.jar /operator/weblogic-kubernetes-operator.jar

This file does not appear to be included in the repository.

Add more detail to the doc for operator inputs

From Christophe.
Also strengthen the explanation of where to put the docker image after building it - which nodes, etc.
e.g. how to refer to a locally built image vs one from a repo.
which parameters do you actually need to set.
SANs - master or worker nodes, etc.
etc.

JCE Unlimited Strength Jurisdiction Policy Files

We received the following request in a comment on a separate issue:

@RisingPhorce

Similarly, what about Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files?
Our standard Weblogic install uses the policy files. I can open a separate issue if needed.
http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html

We'll have to do some research... I assume we can't distribute these policy files because I think there are import restrictions in certain countries.

Update README.md to delete PVC in a namespace.

In the section "Removing a domain" in the README.md:

To permanently remove a domain from a Kubernetes cluster, first shut down the domain using the instructions provided above in the section titled “Shutting down a domain”, then remove the persistent volume claim and the persistent volume using these commands:

kubectl delete pvc PVC-NAME
kubectl delete pv PV-NAME

The PVC is created per namespace. So the delete pvc command will be:
kubectl delete pvc PVC-NAME -n NAMESPACE

Please add "-n NAMESPACE" on this command in the README.md file. The command "kubectl delete pv PV-NAME" in the README.md is correct.

Error pulling weblogic:12.2.1.3 before creating a domain

I'm following this guide to create a domain: https://github.com/oracle/weblogic-kubernetes-operator/blob/master/site/creating-domain.md (I have also checked the associated video in youtube).

I'm having trouble with the step "Pull the WebLogic Server image".

I executed:

kubectl create secret docker-registry docker-domain1 -n domain1 --docker-server=index.docker.io/v1/ --docker-username=cmolinah --docker-password=************** --docker-email=******@mail.com

docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: cmolinah
Password:
Login Succeeded
docker pull store/oracle/weblogic:12.2.1.3
Error response from daemon: pull access denied for store/oracle/weblogic, repository does not exist or may require 'docker login'

Is the documentation correct?

Connection problem with WLST script.

Hello,
I'm not able to connect to the Admin Server using a WLST script. On the Kubernetes side it appears to be correctly configured:

NAME                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
domain1-admin-server                        NodePort    10.21.64.58     <none>        7001:30701/TCP      3h
domain1-admin-server-extchannel-t3channel   NodePort    10.21.172.82    <none>        30012:30012/TCP     3h

I tried from my PC and got this:

wls:/offline> connect('weblogic','welcome1','t3://PUBLIC_IP:30012')
Connecting to t3://PUBLIC_IP:30012 with userid weblogic ...
Traceback (innermost last):
  File "<console>", line 1, in ?
  File "<iostream>", line 19, in connect
  File "<iostream>", line 553, in raiseWLSTException
WLSTException: Error occurred while performing connect : Error getting the initial context. There is no server running at t3://PUBLIC_IP:30012 : Timed out while attempting to establish connection to :t3://PUBLIC_IP:30012 
Use dumpStack() to view the full stacktrace :

or

wls:/offline> connect('weblogic','welcome1','http://PUBLIC_IP:30701')
Connecting to http://PUBLIC_IP:30701 with userid weblogic ...
Traceback (innermost last):
  File "<console>", line 1, in ?
  File "<iostream>", line 19, in connect
  File "<iostream>", line 553, in raiseWLSTException
WLSTException: Error occurred while performing connect : Error getting the initial context. There is no server running at http://PUBLIC_IP:30701 : http://PUBLIC_IP:30701: [RJVM:000575]Destination PUBLIC_IP, 30701 unreachable.; nested exception is: 
        java.io.IOException: Could not connect to http://PUBLIC_IP:30701; [RJVM:000576]No available router to destination.; nested exception is: 
        java.rmi.ConnectException: [RJVM:000576]No available router to destination. 
Use dumpStack() to view the full stacktrace :

Tcpdump on my PC and on the OCI servers looks fine (packets are exchanged, and telnet also works). Curl on the HTTP port works as well:

curl --user weblogic:welcome1 -H X-Requested-By:MyClient -H Accept:application/json -X GET http://130.61.63.87:30701/management/weblogic

I also tried from a server in the Kubernetes cluster and faced the same issue.

The only approach that worked was to go to the server running the admin container and connect to the pod IP:

wls:/offline> connect('weblogic','welcome1','t3://10.99.66.5:7001')                        
Connecting to t3://10.99.66.5:7001 with userid weblogic ...                                
Successfully connected to Admin Server "admin-server" that belongs to domain "base_domain".
                                                                                           
Warning: An insecure protocol was used to connect to the server.                           
To ensure on-the-wire security, the SSL port or Admin port should be used instead.         
                                                                                           
wls:/base_domain/serverConfig/>                                                            

Any clue as to what is wrong?

Domain deletion procedure issue

In shutdown-domain.md, the shutdown command:
kubectl delete domain DOMAINUID
should be extended with -n NAMESPACE, i.e.:
kubectl delete domain DOMAINUID -n NAMESPACE

After the domain is stopped (AdminServer and Managed Servers), the traefik Deployment (and its related pod) is left running. Please consider adding something like the following to the shutdown procedure:
kubectl delete deployment domain1-cluster-1-traefik -n domain1

create-domain-job: status checks of PV can occur too quickly

It has been observed that the status checks of the persistent volume (e.g. state bound) can occur before the volume has time to get into that state, thus causing the script to fail. Use a similar loop as the create-weblogic-operator script to poll for the expected status and fail after 10 tries.
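
One way such a loop could look, as a sketch only: wait_for_status is a hypothetical helper, not the loop from create-weblogic-operator.sh. It re-runs a status command until the output matches the expected value or the tries are used up.

```sh
#!/bin/sh
# Hypothetical polling helper; not taken from the operator scripts.
# Arguments: expected status, max tries, seconds between tries,
# followed by the command that prints the current status.
wait_for_status() {
  expected=$1
  max_tries=$2
  delay=$3
  shift 3
  try=1
  while [ "$try" -le "$max_tries" ]; do
    current=$("$@" 2>/dev/null)
    if [ "$current" = "$expected" ]; then
      return 0
    fi
    echo "try $try of $max_tries: status is '$current', expected '$expected'" >&2
    try=$((try + 1))
    # Only wait when another attempt will follow.
    if [ "$try" -le "$max_tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For a persistent volume, a call might look like wait_for_status Bound 10 3 kubectl get pv PV-NAME -o jsonpath='{.status.phase}', where the 3 second delay is an arbitrary choice for illustration.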

create-weblogic-operator.sh fails to complete successfully

Snippet of the output...

Checking if the persistent volume elk-pv-weblogic-operator already exists
The persistent volume elk-pv-weblogic-operator already exists and will not be re-created
Checking if the persistent volume claim elk-pvc already exists
No resources found.
Creating the persistent volume claim elk-pvc
persistentvolumeclaim "elk-pvc" created
Checking if the persistent volume elk-pv-weblogic-operator is Bound
[ERROR] The persistent volume state should be Bound but is Released

I have applied chmod -R 777 for elkPersistentVolume: /home/holuser/K8S/elkPersistentVolume

Docker Registry Creation : docker-server parameter

Hello,

I added the imagePullSecret variable to the domain-job-template.yaml file, but that was not enough.

I also had to recreate the registry secret, specifying the URL index.docker.io/v1/ in the --docker-server parameter instead of docker.com. That way Kubernetes was able to pull the image itself.

Cheers,
Antonio

Installation Instructions mvn

The installation instructions begin by telling you to run:
mvn clean install
But the instructions do not explain why you need Apache Maven, or even that Maven is what this command invokes. Why is this needed?

create-domain-job.sh doesn't start any servers

Running ./create-domain-job.sh -i create-domain-job-inputs.yaml completes successfully, but doesn't start any WLS servers. How can I find out what is going wrong?
My domain namespace is wls1domain (the one that the operator should manage).
kubectl get pods -n wls1domain
NAME READY STATUS RESTARTS AGE
domain1-cluster-1-traefik-6b75757549-w49wr 1/1 Running 0 1h

kubectl get pods -n weblogic-operator
NAME READY STATUS RESTARTS AGE
weblogic-operator-766c9f495f-fvqjj 1/1 Running 0 9h

But judging by the last lines of the operator log, it seems to be no longer running (?):
kubectl logs weblogic-operator-766c9f495f-fvqjj -n weblogic-operator
...
{"exception":"\njava.lang.NumberFormatException: For input string: "8+"\n\tat java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)\n\tat java.lang.Integer.parseInt(Integer.java:580)\n\tat java.lang.Integer.parseInt(Integer.java:615)\n\tat oracle.kubernetes.operator.helpers.HealthCheckHelper.verifyK8sVersion(HealthCheckHelper.java:251)\n\tat oracle.kubernetes.operator.helpers.HealthCheckHelper.performNonSecurityChecks(HealthCheckHelper.java:170)\n\tat oracle.kubernetes.operator.Main.main(Main.java:150)\n","headers":{},"code":"","method":"main","level":"WARNING","thread":1,"timeInMillis":1517052358865,"message":"Exception thrown: {0}","body":"","class":"oracle.kubernetes.operator.Main","timestamp":"01-27-2018T11:25:58.865+0000"}
{"exception":"","headers":{},"code":"","method":"main","level":"INFO","thread":1,"timeInMillis":1517052358867,"message":"The Oracle WebLogic Server Operator for Kubernetes is shutting down","body":"","class":"oracle.kubernetes.operator.Main","timestamp":"01-27-2018T11:25:58.867+0000"}

kubectl version
Client Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.4+2.0.1.el7", GitCommit:"538ac53c74231a70b7ceca01b8f8d09a735b4ffb", GitTreeState:""git", BuildDate:"2017-12-14T22:28:43Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

I am lost as to where to look and what I might have done wrong. Could it be my k8s version (v1.8.4+2.0.1.el7)?

Thoughts on supporting dynamic persistent volumes

I have made edits to the input and template yaml files and the create shell scripts to support my own custom provisioner and storage class (AWS EFS based).

I added the following values to the inputs:
#expecting dynamic or static
persistenceVolumeType: dynamic
#expecting size - for dynamic 1Mi is sufficient
persistenceVolumeSize: 1Mi

I added the following to shell scripts:
# To check for value and set sane default if unavailable
function validateVolumeType {
  if [ -z "${persistenceVolumeType}" ]; then
    persistenceVolumeType=static
    echo "Defaulting the input parameter persistenceVolumeType to be static"
  fi
}
# Uses persistenceVolumeSize so that it matches the input name added above
function validateVolumeSize {
  if [ -z "${persistenceVolumeSize}" ]; then
    persistenceVolumeSize=1Mi
    echo "Defaulting the input parameter persistenceVolumeSize to be 1Mi"
  fi
}

# To check for Bound status of PVC
function checkPvcState {
  echo "Checking if the persistent volume claim ${1:?} is ${2:?} in namespace ${namespace}"
  pvc_state=$(kubectl -n "${namespace}" get pvc "$1" -o jsonpath='{$..status.phase}')
  attempts=0
  # Poll once per second, up to 10 times, until the claim reaches the expected state
  while [ "$pvc_state" != "$2" ] && [ "$attempts" -lt 10 ]; do
    attempts=$((attempts + 1))
    sleep 1
    pvc_state=$(kubectl -n "${namespace}" get pvc "$1" -o jsonpath='{$..status.phase}')
  done
  if [ "$pvc_state" != "$2" ]; then
    fail "The persistent volume claim state should be $2 but is $pvc_state"
  fi
}

I also added if statements around the PV-related sections of the shell scripts to skip them based on the value of persistenceVolumeType, as in the following excerpt from the function initialize:
if [ "${persistenceVolumeType}" = "static" ]; then
  pvInput="${scriptDir}/internal/persistent-volume-template.yaml"
  pvOutput="${scriptDir}/persistent-volume.yaml"
  if [ ! -f "${pvInput}" ]; then
    validationError "The template file ${pvInput} for generating a persistent volume was not found"
  fi
fi
I hope this helps. I like not having to worry about creating the persistent volumes.

It may also be helpful to add a few while loops (as much as I hate them) to check for creation of resources, similar to the checkPvcState function, with the attempts limit adjustable, etc.

Oracle Container Registry Access

The installation step:
./create-weblogic-operator.sh -i /path/to/create-operator-inputs.yaml

is failing due to missing authentication for the Oracle Container Registry. "Registering for access to the Oracle Container Registry" is mentioned as a step in the Installation section of the readme, but is not mentioned in installation.md. Is this a step the installation guide is missing?

4m 1m 5 kubelet, k8s-worker-ad1-0.k8sworkerad1.k8sbmcs.oraclevcn.com spec.containers{weblogic-operator} Warning FaileFailed to pull image "container-registry.oracle.com/middleware/weblogic-kubernetes-operator:latest": rpc error: code = 2 desc = Error response from daemon: {"message":"Get https://container-registry.oracle.com/v2/middleware/weblogic-kubernetes-operator/manifests/latest: unauthorized: authentication required"}

The only mention of Oracle Container Registry in installation.md is in weblogic-operator.yaml:
# Update the imagePullSecrets with the name of your Kubernetes secret containing
# your credentials for the Docker Store/Oracle Container Registry.
imagePullSecrets:
- name: ocr-secret

Is the user meant to create these secrets before trying to deploy the weblogic operator?

Pull image store/oracle/weblogic:12.2.1.3

Hello there,

The documentation suggests manually pulling all the images on the minions, to avoid problems such as wrong credentials or a secret in the wrong namespace.

But I let Kubernetes pull the image for me in the case of the weblogic-operator (no problem), and I wanted to do the same for the job that generates the domain.
However, the yaml generated by the create-domain-job.sh script (domain-job.yaml) has no reference to the registry secret, so the only solution is to pull the image manually.

In particular, I think two lines like these are missing from domain-job-template.yaml:

imagePullSecrets:
- name: DOCKER_STORE_REGISTRY_SECRET

Cheers,
Antonio

docker pull for early access users

I'm installing following this guide: https://github.com/oracle/weblogic-kubernetes-operator/blob/master/site/installation.md

It is said:

ATTENTION EARLY ACCESS USERS You will need to use the early access image in quay.io.
Please create your secret as shown below:

kubectl create namespace weblogic-operator
kubectl create secret docker-registry earlybird-secret \
  -n weblogic-operator \
  --docker-server=quay.io \
  --docker-username=earlybird \
  --docker-password=welcome1 \
  --docker-email=[email protected]

After that, it is recommended to pull the Docker image, but there is no reference for early access users.

You can let Kubernetes pull the Docker image for you the first time you try to create a pod that uses the image, but we have found that you can generally avoid various common issues like putting the secret in the wrong namespace or getting the credentials wrong by just manually pulling the image by running these commands on the Kubernetes master:

docker login container-registry.oracle.com
docker pull container-registry.oracle.com/middleware/weblogic-kubernetes-operator:latest

I tried to log in to quay.io, but I didn't find any Docker image.

Failed to create service account weblogic-operator

Create operator script fails with this message:

Checking to see if the service account weblogic-operator already exists
Error: unknown shorthand flag: 'n' in -n
Usage:
kubectl get [(-o|--output=)json|yaml|wide|go-template=...|go-template-file=...|jsonpath=...|jsonpath-file=...] (TYPE [NAME | -l label] | TYPE/NAME ...) [flags] [flags]

Examples:

List all pods in ps output

Problem with minor gitversion

I recreated the cluster from scratch and hit this problem again:

{"exception":"\njava.lang.NumberFormatException: For input string: \"8+\"\n\tat java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)\n\tat java.lang.Integer.parseInt(Integer.java:580)\n\tat java.lang.Integer.parseInt(Integer.java:615)\n\tat oracle.kubernetes.operator.helpers.HealthCheckHelper.verifyK8sVersion(HealthCheckHelper.java:244)\n\tat oracle.kubernetes.operator.helpers.HealthCheckHelper.performNonSecurityChecks(HealthCheckHelper.java:170)\n\tat oracle.kubernetes.operator.Main.main(Main.java:150)\n","headers":{},"code":"","method":"main","level":"WARNING","thread":1,"timeInMillis":1518021925935,"message":"Exception thrown: {0}","body":"","class":"oracle.kubernetes.operator.Main","timestamp":"02-07-2018T16:45:25.935+0000"}
{"exception":"","headers":{},"code":"","method":"main","level":"INFO","thread":1,"timeInMillis":1518021925939,"message":"The Oracle WebLogic Server Operator for Kubernetes is shutting down","body":"","class":"oracle.kubernetes.operator.Main","timestamp":"02-07-2018T16:45:25.939+0000"}

The problem comes from the fact that the minor version is not just a number:

class VersionInfo {
    buildDate: 2017-12-12T11:01:08Z
    compiler: gc
    gitCommit: b8e596026feda7b97f4337b115d1a9a250afa8ac
    gitTreeState: clean
    gitVersion: v1.8.5+coreos.0
    goVersion: go1.8.3
    major: 1
    minor: 8+
    platform: linux/amd64
}

And

Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"6e937839ac04a38cac63e6a7a306c5d035fe7b0a", GitTreeState:"clean", BuildDate:"2017-09-28T22:57:57Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.5+coreos.0", GitCommit:"b8e596026feda7b97f4337b115d1a9a250afa8ac", GitTreeState:"clean", BuildDate:"2017-12-12T11:01:08Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

I'm fixing it.
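
For illustration only, here is a minimal shell sketch of the normalization idea: keep the leading digits of the reported minor version so that "8+" is treated as 8. The operator itself does this check in Java (HealthCheckHelper.verifyK8sVersion, per the stack trace), so this is not the actual fix, and the function name normalize_minor is hypothetical.

```shell
# normalize_minor RAW
# Hypothetical helper: prints the leading digits of a Kubernetes minor
# version string, so vendor builds that report "8+" are treated as 8.
# Returns non-zero if the string does not start with a digit.
normalize_minor() {
  local raw="$1"
  # Remove everything from the first non-digit character to the end.
  local digits="${raw%%[!0-9]*}"
  if [ -z "$digits" ]; then
    echo "Cannot parse minor version: '$raw'" >&2
    return 1
  fi
  echo "$digits"
}

# Example usage:
#   minor=$(normalize_minor "8+") && [ "$minor" -ge 8 ]
```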

Need to delete domain job before re-running create-domain-job.sh

To re-run create-domain-job.sh and re-create a domain, we need to shut down the domain and remove it (including the domain directory), as described in README.md.

However, the doc does not mention removing the domain job.

If the domain job exists when we re-run create-domain-job.sh, the script first deletes the job and then re-creates it. This is correct and makes sense. But the pod for the domain job then fails because it complains that the domain directory already exists, which causes create-domain-job.sh to fail.

The domain directory was manually removed before create-domain-job.sh was re-run; I am not sure at what point during the run it is re-created.

Here is the command console output:

====================
The persistent volume domain1-pv001 already exists and will not be re-created
Checking if the persistent volume claim domain1-pv001-claim already exists
The persistent volume claim domain1-pv001-claim already exists and will not be re-created
Checking if object type job with name domain-domain1-job exists
Deleting domain-domain1-job using /scratch/rpan/wls-decoperator/weblogic-operator/kubernetes/domain-job.yaml
job "domain-domain1-job" deleted
configmap "domain-domain1-scripts" deleted
Creating the domain by creating the job /scratch/rpan/wls-decoperator/weblogic-operator/kubernetes/domain-job.yaml
job "domain-domain1-job" created
configmap "domain-domain1-scripts" created
Waiting for the job to complete...
status on iteration 1 of 10
pod domain-domain1-job-03jsk status is Error
pod domain-domain1-job-3bkjc status is Error
pod domain-domain1-job-54wg9 status is Error
pod domain-domain1-job-fx149 status is Error

===============

The pod log shows: Error: /shared/domain/base_domain already exists.

If I manually delete the domain job and the domain directory first, create-domain-job.sh finishes correctly.

Is this a code issue or a doc issue? If it's a doc issue, please update the doc to add the domain job removal.

adminserver environment and setup scripts - BLOCKER

It looks like it tries to mount the secret for the WebLogic credentials at a directory based on /var/run/secrets-$DomainUid, instead of the secretsMountPath from create-domain-job-inputs.yaml.

Also, the domain creation script in the configmap seems to leave the domain name as base_domain, even after writing the domain. When I exec into the adminserver pod and check the env, it lists the DOMAIN_PATH correctly after sourcing the /shared/domain/widomain/bin/setDomainEnv.sh file, but the DOMAIN_NAME is always base_domain. When I connect and check the name and read the domain via wlst.sh, it seems correct. Thoughts???

Design retry on failure mechanism

We're discussing how to improve the retry / failure processing for the operator. In the current code base, we read domain resources when the operator starts and respond to watch events on these resources. If the activity for any given domain fails (from start-up or watch event), there is no follow-up retry activity. We do have retry behavior for each individual k8s request, so the overall activity rarely fails. Likewise, if an administrator intentionally or unintentionally deletes a pod, service, or ingress, etc., we have no mechanism to run through the configuration for that domain again.
Update: the operator now detects deletion of a pod, service, or Ingress and recreates it.

Current ideas:

  1. Definitely flag domains for which the activity flow failed and retry these after some delay.
  2. On some polling interval, recheck each domain to see that all of the pods/services/ingress entries, etc., are still correct.
