
adoptium / infrastructure


This repo contains all information about machine maintenance.

License: Apache License 2.0

Shell 35.85% Ruby 12.78% Python 18.36% PowerShell 0.24% HTML 0.24% C 0.19% Jinja 32.35%
ansible backup hacktoberfest infrastructure infrastructure-systems nagios

infrastructure's Introduction

Eclipse Adoptium

This organization provides a home for Git repositories that contain the activities of the Adoptium Working Group, the Eclipse Adoptium Top Level Project and several Eclipse projects that fall under that top level project:

NOTE: The high-level project and issue tracking across all projects is kept in the Adoptium repo issue tracking system.


Please see the Eclipse Adoptium Project description for more information regarding the Adoptium top-level project or its sub-projects (visually depicted in the diagram below).

graph TD

subgraph Eclipse Adoptium
    classDef public fill:#CFE1F3,stroke:#333,stroke-width:4px,color:#000000
    classDef private fill:#FF0000,stroke:#333,stroke-width:4px,color:#000000
    style public fill:#CFE1F3,stroke:#333,stroke-width:4px,color:#000000
    style private fill:#FF0000,stroke:#333,stroke-width:4px,color:#000000
    subgraph Adoptium
        AdoptiumTrigger[adoptium]:::public --- website["adoptium.net"]:::public --- api["api.adoptium.net"]:::public --- blog["blog.adoptium.net"]:::public --- dash["dash.adoptium.net"]:::public
    end
    subgraph Temurin
        subgraph Build
            buildTrigger[temurin-build]:::public --- mirror["mirror-scripts"]:::public --- src["jdk, jdk8u, jdk8u-aarch32, jdk17u"]:::public --- release["github-release-scripts"]:::public --- binaries["temurin8-binaries,<br/>temurin11-binaries,<br/>temurin17-binaries,<br/>temurin19-binaries"]:::public --- installer["installer"]:::public --- build["build-jdk"]:::public
        end
        subgraph Infrastructure
            direction LR
            infraTrigger[infrastructure]:::public --- jenkins["ci-jenkins-pipelines"]:::public --- jenkinshelper["jenkins-helper"]:::public --- support["adoptium-support"]:::public
        end
    end
    subgraph Temurin Compliance
        TCKTrigger[temurin-compliance]:::private --- infra["infrastructure"]:::private --- jck8["JCK8-unzipped"]:::private --- jck11["JCK11-unzipped"]:::private --- jck17["JCK17-unzipped"]:::private --- jck19["JCK19-unzipped"]:::private
    end
    subgraph AQAvit
        AQAvitTrigger[aqa-tests]:::public --- tkg["TKG"]:::public --- test-tools["aqa-test-tools"]:::public --- stf["STF"]:::public --- systemtest["aqa-systemtest"]:::public --- bumblebench["bumblebench"]:::public --- run-aqa["run-aqa"]:::public
    end
    subgraph Incubator
        IncubatorTrigger[jdk11u-fast-startup-incubator]:::public
    end
end

Eclipse Adoptium Working Group

The Adoptium Working Group promotes and supports high-quality runtimes and associated technology for use across the Java ecosystem. Our vision is to meet the needs of Eclipse and the broader Java community by providing a marketplace for high-quality Java runtimes for Java-based applications. We embrace existing standards and a wide variety of hardware and cloud platforms.

Eclipse Adoptium Top Level Project

The mission of the Eclipse Adoptium Top-Level Project is to distribute high-quality runtimes and associated technology for use within the Java ecosystem. We achieve this through a set of Projects under the Adoptium Project Management Committee (PMC) and a close working partnership with external projects, most notably OpenJDK for providing the Java SE runtime implementation. Our goal is to meet the needs of both the Eclipse community and broader runtime users by providing a comprehensive set of technologies around runtimes for Java applications that operate alongside existing standards, infrastructures, and cloud platforms.

Eclipse AQAvit project

AQAvit is the quality and runtime branding evaluation project for Java SE runtimes and associated technology. During a release it takes a functionally complete Java runtime and ensures that all the additional qualities are present that make it suitable for production use. These quality criteria include good performance, exceptional security, resilience and endurance, and the ability to pass a wide variety of application test suites. In addition to verifying that functionally complete runtimes are release ready, the AQA tests may also serve to verify new functionality during runtime development.

Eclipse Temurin project

The Eclipse Temurin project provides code and processes that support the building of runtime binaries and associated technologies that are high performance, enterprise-caliber, cross-platform, open-source licensed, and Java SE TCK-tested for general use across the Java ecosystem.

Eclipse Temurin Compliance project

The Eclipse Temurin Compliance project is responsible for obtaining, managing, and executing the Oracle Java SE Compatibility Kit (JCK) on Eclipse Temurin binaries. The work is done on private infrastructure and using code managed in closed repositories only available to committers of Temurin Compliance. The public artefacts produced by this project are limited to an indication of whether a particular Eclipse Temurin binary is Java SE compliant or not.

Eclipse Mission Control project

Eclipse Mission Control enables you to monitor and manage Java applications without introducing the performance overhead normally associated with these types of tools. It uses data collected for normal adaptive dynamic optimization of the Java Virtual Machine (JVM). Besides minimizing the performance overhead, this approach eliminates the problem of the observer effect, which occurs when monitoring tools alter the execution characteristics of the system.

infrastructure's People

Contributors

aahlenst, adambrousseau, aixtools, ali-ince, aswinkr77, bblondin, cjkwork, cwesmills, dependabot[bot], fredg02, gdams, geraintwjones, haroon-khel, husainyusufali, jdekonin, julian55455, karianna, luhenry, lumuchris256, mbarbero, neomatrix369, olvap377, pstankie, sej-jackson, steelhead31, sxa, sxa555, vsebe, willsparker, zdtsw


infrastructure's Issues

Jenkins machine configuration on Windows test machines needs updating

The OpenJDK test build on Windows fails with a permissions error:

java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\openjdk_test_x86-64_windows\openjdk-test\OpenJDK_Playlist\openjdk-jdk8u\jdk\test\sun\management\windows\revokeall.exe

According to the last two comments in adoptium/aqa-tests#37 (comment), the Jenkins machine configuration needs to be updated to specify the tools location for git.

The issue is still present, presumably because the configuration has not been updated.

zLinux machine(s) for non-JCK testing

Request for at least one zLinux machine (eventually two, if we do not start sharing machines across build/test functionality) for non-JCK testing. This is a similar request to #77: "Spec wise a 2-core, 8Gb, and a fast disk of around 100Gb".

Bring AIX boxes on-line for build / test

The following new AIX build / test boxes are available to the project. I have added the keybox public key to the list of authorized_keys. Note that there are existing authorized keys that should be retained for the hoster's maintenance use.

power8-aix-openjdk1.osuosl.org - 140.211.9.10
power8-aix-openjdk2.osuosl.org - 140.211.9.12

Each system has 32GB of memory, 5 vCPUs, 1 CPU unit that can dynamically adapt to 10 CPUs, and a minimal AIX 7.1 install. The AIX 7.1.4.4 DVD1 is still "mounted". The OS and related files are installed on filesystems allocated from rootvg, and /home is allocated from homevg. Each volume group is 80GB, and most of rootvg is unallocated, leaving considerable room for expansion.

Both systems have been set up with a larger queue depth for the hdisks, which improves performance a little. A ramdisk can also be created.

You can customize the systems as you wish.

sigtest: Ubuntu machines need to have JDK 5, JDK 6 and JDK 9

To build certain code-tools artefacts (e.g. SigTest), we need the following JDKs installed:

  • version 5
  • version 6
  • version 9

Versions 7 and 8 are already installed.

Note that OpenJDK versions 5 and 6 are not easily installable via Ansible scripts, so an alternative source will need to be found. That may complicate matters, because our artefacts would then be built using different flavours of JDK (a bit inconsistent).

We could download from Oracle, but with the latest changes we would need a login and password to download old versions of the JDK, which means passing these details into the Ansible script.

Get a Tier 1 x86 sponsor

Currently we are hosting a lot of our x86 hardware with packet.net.

I want to move away from this to free up our usage limit so that we can provision more ARM machines for testing.

FYI @vielmetti

jenkins: Add ability to let more users view the jenkins job configurations

I've had a few people ask if they can see the job configurations to understand what the jobs are doing. Jenkins doesn't have a built-in way to grant read-only access (by default, if you can view a job's configuration you can edit it), but there are plugins such as https://wiki.jenkins.io/display/JENKINS/Extended+Read+Permission+Plugin that change this. I'm opening this issue for discussion to see if there is any reason not to have this in place. Do we have sensitive material in the jobs that wouldn't be hidden by this plugin?

Move automated posting to Slack into their own channels

We have a number of automated 'bots' that post to Slack about various topics.

The bots are swamping some channels with automated messages and hiding any real posts. It is also unnecessary to archive most of the bot postings, so with separate channels we can choose which are archived.

This issue is to create #<blah>-bots channels and switch the bots to posting on there so the humans have a chance.

Biggest offenders are likely:
#infrastructure where Nagios should be posting to #infrastructure-bot (un-archived), and
#website where Localize should be posting to #website-bot (un-archived).

Vagrant script for ubuntu fails when run in an isolated environment

Standing up an environment using vagrant (for ubuntu 14.04) and running the ubuntu.yml Ansible script halts with the below message:

fatal: [localhost]: FAILED! => {"failed": true, "msg": "An unhandled exception occurred while
running the lookup plugin 'file'. Error was a <class 'ansible.errors.AnsibleError'>, 
original message: could not locate file in lookup: /Vendor_Files/keys/id_rsa.pub"})

This occurs when run on a local machine (reproducible on Linux and MacOSX environments).

Re-running with the -v flag will help; -vv, -vvv, etc. will give more verbose info about the issue.

See #58 (comment) in #58

Once resolved, add any additional info, or any findings made during the investigation, to https://github.com/AdoptOpenJDK/openjdk-infrastructure/blob/master/ansible/README.md.

#helpwanted #bug

nagios.adoptopenjdk.net certificate about to expire

Hello,

Your certificate (or certificates) for the names listed below will expire in
19 days (on 26 Jul 17 00:42 +0000). Please make sure to renew
your certificate before then, or visitors to your website will encounter errors.

nagios.adoptopenjdk.net

For any questions or support, please visit https://community.letsencrypt.org/.
Unfortunately, we can't provide support by email.

For details about when we send these emails, please visit
https://letsencrypt.org/docs/expiration-emails/. In particular, note
that this reminder email is still sent if you've obtained a slightly
different certificate by adding or removing names. If you've replaced
this certificate with a newer one that covers more or fewer names than
the list above, you may be able to ignore this message.

If you want to stop receiving all email from this address, click
http://mandrillapp.com/track/unsub.php?u=30850198&id=8fb004715c47471b98c23130d1ca600a.OYLci%2Fk79LBUOvvM5JFmpLp8Mdw%3D&r=https%3A%2F%2Fmandrillapp.com%2Funsub%3Fmd_email%3Dbrad_blondin%2540ca.ibm.com
(Warning: this is a one-click action that cannot be undone)

Regards,
The Let's Encrypt Team

ci.adoptopenjdk.net package upgrade problems

The Jenkins host ci.adoptopenjdk.net had a number of critical OS package updates pending. Upgrading the packages has introduced problems with Jenkins.

Jenkins is up and running, but a number of nodes are currently flagged as offline.

MacOS machine for JCK testing

Agreed that MacStadium will provide us with two further Macs for this purpose. Waiting to find out which OS level to deploy.

pLinux-LE machines for all non-JCK testing

I will piggy-back on the requests for JCK test machines (asking for same requirements as #76),
"Spec-wise something like 2-core, 8Gb RAM and an SSD of around 100Gb ".

One (or eventually two) machines, so that we can enable the following types of tests:

  • openjdk regression tests
  • system/stress tests
  • functional tests

(optionally/eventually some perf micro benchmarks).

Get a second windows build machine

We could do with a Windows Server 2012 machine with Visual Studio 2013 to build the OpenJ9 binaries, as this is the level OpenJ9 requires to build. I will investigate where we could source one from.

Update Nagios to latest version

Our installation of Nagios Core 4.3.1 is outdated and should be upgraded. The latest version of Nagios Core, 4.3.4, was released on 2017-08-24.

build-marist-s390x-sles-12 can't resolve itself

I'm getting an issue on build-marist-s390x-sles-12 (148.100.110.56) where it is unable to resolve its own hostname. Can we get an entry for openjdk-sles12 (the output from hostname) added to /etc/hosts on the machine, either with its real IP or just 127.0.0.1, so that it resolves? This is causing some tests to fail as per adoptium/aqa-systemtest#9.
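A minimal Python sketch of the check and the proposed fix, for illustration only. The helper names `can_resolve` and `hosts_entry` are hypothetical, and the sketch only prints a suggested line; it does not edit /etc/hosts.

```python
import socket


def can_resolve(name, resolver=socket.gethostbyname):
    """Return True if `name` resolves to an address, False otherwise."""
    try:
        resolver(name)
        return True
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        return False


def hosts_entry(address, name):
    """Format one /etc/hosts line mapping `name` to `address`."""
    return f"{address}\t{name}"


if __name__ == "__main__":
    name = socket.gethostname()
    if can_resolve(name):
        print(f"{name} resolves")
    else:
        # Suggest the loopback mapping mentioned in the issue.
        print("suggested /etc/hosts line:", hosts_entry("127.0.0.1", name))
```

The `resolver` parameter exists only so the check can be exercised without touching the real resolver.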

Windows machine for JCK testing

We need to run the JCK suite on Windows, and its access needs to be locked down so it cannot be shared with other jobs.

Needs to have a fast disk (so I'd say SSD) and ideally powerful cores (but doesn't need many of them) so something like 2 core/8Gb/100Gb SSD should suffice. Perhaps 16Gb+ if we decide to use a ramdrive for holding the JCK test suite itself.

Windows version TBD - what do Oracle test on?

Request for access to Packet ARM systems

I'm working on getting the OpenJDK/OpenJ9 builds working on ARM. Would it be possible to get access to the ARM build systems for some basic toe-in-the-water evaluations of my initial builds?

text/csv.pm

Text::CSV is installed via ansible playbook:

    - name: Install Text::CSV
      shell: |
        cpanm --with-recommends Text::CSV
      tags: text_csv 

However, it doesn't appear on the machine:

[root@rhel7hcxrt1 ~]# perl -MText::CVS -e 1
Can't locate Text/CVS.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).

Running the installation command manually says it's already installed and up to date:

[root@rhel7hcxrt1 ~]# cpanm --with-recommends Text::CSV
Text::CSV is up to date. (1.95)

The issue has been seen everywhere we have checked so far:
RHEL 6 PPC64, RHEL 7 x86 PPC64, Ubuntu 14/16 x86.
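One thing that may be worth ruling out first: the playbook and the manual `cpanm` run both name `Text::CSV`, while the `perl -M` check and its error message name `Text::CVS`. Perl's `require` locates a module by replacing `::` with `/` and appending `.pm`, so the two names map to different files. The sketch below (Python, with a hypothetical helper `perl_module_path`) only illustrates that mapping; it does not inspect any machine.

```python
def perl_module_path(module):
    """Map a Perl module name to the relative path searched for in @INC."""
    return module.replace("::", "/") + ".pm"


installed = "Text::CSV"  # name used in the playbook's cpanm command
checked = "Text::CVS"    # name used in the `perl -M...` check above

print(perl_module_path(installed))  # Text/CSV.pm
print(perl_module_path(checked))    # Text/CVS.pm
print(installed == checked)         # False
```

If the check is repeated with the same spelling as the install command and still fails, the mismatch is not the cause and the original report stands.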

Create a Nagios System Configuration Tool

Create a Nagios System Configuration Tool (script) to help set up and configure host.cfg files for new systems that Nagios should monitor. The tool should:

  • ask questions, then generate the host.cfg file
  • test and enable monitoring
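A rough Python sketch of the "ask questions, then generate" step. Everything here is an assumption for illustration: the function names are hypothetical, the questions are a guess at a minimal set, and the `linux-server` template is the one shipped in the Nagios sample configuration, which may not match this installation.

```python
def render_host(host_name, address, alias=None, template="linux-server"):
    """Render a Nagios `define host` block as text."""
    fields = [
        ("use", template),           # assumed template name
        ("host_name", host_name),
        ("alias", alias or host_name),
        ("address", address),
    ]
    body = "\n".join(f"    {key:<12}{value}" for key, value in fields)
    return "define host {\n" + body + "\n}\n"


def ask(prompt, default=None, reader=input):
    """Ask one question; an empty answer falls back to `default`."""
    suffix = f" [{default}]" if default else ""
    answer = reader(f"{prompt}{suffix}: ").strip()
    return answer or default


def build_host_cfg(reader=input):
    """Ask for the host details and return the generated host.cfg text."""
    host_name = ask("Host name", reader=reader)
    alias = ask("Alias", default=host_name, reader=reader)
    address = ask("IP address or FQDN", reader=reader)
    return render_host(host_name, address, alias)
```

Calling `build_host_cfg()` prompts on the terminal and returns the text to write to host.cfg. The "test" step is not covered here; Nagios can verify a configuration with `nagios -v` followed by the path to the main config file.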

pLinux-LE machine for JCK testing

We need to run the JCK suite on pLinux-LE, and its access needs to be locked down so it cannot be shared with other jobs.

Spec-wise something like 2-core, 8Gb RAM and an SSD of around 100Gb would be ideal as that's what we're using for xLinux.

zLinux machine for JCK testing

We need to run the JCK suite on zLinux, and its access needs to be locked down so it cannot be shared with other jobs.

Spec wise a 2-core, 8Gb, and a fast disk of around 100Gb should be ideal.

build-marist-s390x-rhel-7.3 unable to resolve host

I am having issues using wget or curl on this machine. This is also preventing me from connecting it to Jenkins.

~ ssh [email protected]
[linux1@adoptopenjdk ~]$ wget https://google.com
--2017-06-26 04:02:53--  https://google.com/
Resolving google.com (google.com)... failed: Connection refused.
wget: unable to resolve host address ‘google.com’
[linux1@adoptopenjdk ~]$ 

CC @bblondin @AdoptOpenJDK/getopenjdk

Add new s390x Linux machines to build test farm

Marist have generously created two new Ubuntu 16.04 systems for us. One is the replacement for our old RHEL6 image (148.100.110.55) while the other is an extra one that we requested to cope with the additional workload.

Both images have 8 GB memory / 100 GB disk / 4 CPs.

Systems:
LXEOJ905 - 148.100.33.178
LXEOJ906 - 148.100.33.179

I have the login details for these for those that need them.

This task is to configure the machines for build/test as appropriate, add the new nodes to Jenkins, Nagios, etc.

Create missing Ansible playbooks for build machines

To ensure that machine images can be reliably recreated for AdoptOpenJDK build/test we need entirely scripted configuration that sets up a VM "from scratch".

A number of the Ansible playbooks exist in the openjdk-build repo, but they are not complete in their coverage.

Proposed steps are:

  • create an initial provisioning script to establish sufficient capability on a new node type to run as an Ansible client (e.g. keybox public key, python, more?)
  • ensure Ansible scripts exist for each CPU/OS type we manage, and are complete.

Goal is that we can discard a VM at any point and recreate it entirely using the public information in our scripts.

Add additional hosts and services to Nagios

The following machines are not currently known to our Nagios installation, and should be added to ensure their basic health is monitored:

  • api.adoptopenjdk.net
  • staging.adoptopenjdk.net

The following publicly available services should also be monitored so the #infrastructure channel is notified if they go down:

  • HTTP/HTTPS
    • www.adoptopenjdk.net
    • api.adoptopenjdk.net
    • ci.adoptopenjdk.net
    • keybox.adoptopenjdk.net
    • staging.adoptopenjdk.net
    • ansible.adoptopenjdk.net
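As a sketch of what the service side might look like, the Python snippet below renders one Nagios `define service` block per host in the HTTP/HTTPS list above. It assumes the `generic-service` template and the `check_http` command from the Nagios sample configuration are available, and that each host already has a matching host definition; none of that is confirmed by this issue. HTTPS checks and the notification to #infrastructure would need additional configuration not shown here.

```python
HTTP_HOSTS = [
    "www.adoptopenjdk.net",
    "api.adoptopenjdk.net",
    "ci.adoptopenjdk.net",
    "keybox.adoptopenjdk.net",
    "staging.adoptopenjdk.net",
    "ansible.adoptopenjdk.net",
]


def render_http_service(host_name, template="generic-service"):
    """Render a Nagios service block that runs check_http against a host."""
    return (
        "define service {\n"
        f"    use                  {template}\n"   # assumed template name
        f"    host_name            {host_name}\n"
        "    service_description  HTTP\n"
        "    check_command        check_http\n"    # assumed command name
        "}\n"
    )


if __name__ == "__main__":
    print("\n".join(render_http_service(host) for host in HTTP_HOSTS))
```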

Free up disk space on build-marist-s390x-sles-12 root partition

The /dev/dasdb2 file system on build-marist-s390x-sles-12 (148.100.110.56) is filling up, currently at 96%.

On the Ubuntu sister machine, the system upgrades had multiple versions of the kernel left behind. That may be happening on SLES too.

This task is to clear out any unused packages and kernels to free up the root partition.

Investigate running Jenkins master as a service

Launching Jenkins currently requires remembering a long command line. To keep things simple, it would be preferable to wrap this in a system service so that Jenkins starts at normal machine boot and is easier to stop and restart.

Original suggestion by @karianna

Jenkins server - root partition is almost full

The Jenkins server (http://ci.adoptopenjdk.net) root partition is almost full.
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       188G  142G   37G  80% /

101GB of this is in /home/jenkins/.jenkins/jobs

Looking at the builds, it does not appear that they are configured to clean themselves up.
This should be configured for all builds.
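For reference, the figures above can be checked with a little arithmetic. The sketch below assumes the GNU df convention of reporting Use% as used / (used + available), rounded up, and treats the rounded "G" values from the output as exact, so the results are approximate. Size is larger than Used + Avail, presumably because of reserved blocks.

```python
import math


def df_use_percent(used, avail):
    """Use% in the style of GNU df: used / (used + avail), rounded up."""
    return math.ceil(100 * used / (used + avail))


used_gb, avail_gb, jobs_gb = 142, 37, 101

print(df_use_percent(used_gb, avail_gb))  # 80
print(round(100 * jobs_gb / used_gb))     # 71
```

On these numbers the jobs directory accounts for roughly 71% of the used space, which is why build cleanup is the obvious first target.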


We do not currently have any sles12 s390x machines tagged with "test"

At the moment we run the system tests on sles12 because it has a suitable version of the libffi library available (https://github.com/AdoptOpenJDK/openjdk-infrastructure/issues/19). At present none of those machines has a tag of test, so I cannot use that tag to run the systemtest jobs. If I leave the tag off, jobs can end up running on master, which doesn't work well at all because it doesn't have make installed. I could use build, but that would stop the other platforms from using the dedicated test machines, so that's not a sensible solution either. For now I've set the jobs to use !hg, which knocks out three machines including master and is adequate until we get a sles12/s390x box tagged with test.

See also the work item about machines tags: #93

Include host time synchronization pkgs in Ansible scripts

New machines that are configured for AdoptOpenJDK should have some real time clock synchronization package installed (e.g. NTP, timesyncd, etc) to ensure they do not drift too far and disrupt Jenkins pipeline coordination.

Although many of our jobs are quite long-running, when they fail they may fail quickly, so being out of sync by tens of seconds matters.
