Production
The main dashboard on linux.kernelci.org shows all the results for the native tests run by KernelCI with upstream kernels. It is typically updated once a week with all the changes that have been merged in the various components: core tools, YAML configuration, backend, frontend, Jenkins …
Each project has a main branch used for development; this is where all the PRs are normally merged. There is also a kernelci.org branch in each project for production deployment. These branches are updated every time the production instance is updated. All the changes should be tested on staging before being merged into the main branches and later deployed in production.
The kernelci-deploy project contains some tools to automate parts of the production update. This is gradually getting more automated but a few steps still require manual intervention. Updating the production instance requires having SSH access to the main kernelci.org server.
Update procedure
-
Release new versions
Each component that has changes ready to be deployed needs a new version tag. The respective maintainers normally do this. They should also check whether any PRs are ready to be merged before creating the new version.
-
Update the kernelci-deploy checkout
The production update uses tools in kernelci-deploy. The first step is to update it to the latest version, which can be done with this command:
./kernelci.org checkout
-
Build new rootfs images
Root file system (rootfs) images should be built ahead of the production update, so they can be tested on staging. It can take several hours to build all of them, so ideally this should be a couple of days before the production update. For example, with production updates usually occurring every Monday, the root file system images can be built every Friday to get tested on staging over the weekend.
To build a new set of rootfs images:
./kernelci.org rootfs
This will first rebuild the Docker images used to build the rootfs images, then trigger the rootfs image builds, and wait for them to complete. You may abort while waiting for the builds to complete and resume later with this command:
./kernelci.org rootfs_flush
-
Flush kernel builds
Once rootfs images have been tested and the new URLs have been merged in test-configs.yaml, the next step is to abort any pending kernel revision to be built in Jenkins and wait for all the ones currently in the pipeline to complete. The production update needs to be done while the pipeline is empty to avoid glitches while multiple components are being updated. The kernelci.org script in kernelci-deploy has a command to automate this part:
./kernelci.org pause
This may take a while, typically 1h or so, depending on the number of kernel builds still pending. The script monitors the status and exits once completed. It also cancels all bisections, as they can take a very long time to run. Regressions that persist will cause other bisections to be run after the production update.
-
Restart Jenkins
Once there are no more kernel builds running in Jenkins, it should be restarted with the latest changes from kernelci-jenkins and the encrypted data in kernelci-jenkins-data (labs API tokens etc.). This is also a good time for rebooting all the node VMs and pruning Docker resources. Doing this requires SSH access to all the nodes. There is a reboot.sh helper script in kernelci-jenkins-data:
./bot.kernelci.org/reboot.sh
There is currently no automated way to detect that Jenkins has fully restarted. This is a gap to be filled in the kernelci.org helper script before the production update can be fully automated.
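Until the helper script covers this, a polling loop is one way to approximate the check by hand. The sketch below is not part of kernelci-deploy: `wait_until_ready` and `jenkins_is_up` are hypothetical helpers, and using the Jenkins `/login` page as the readiness probe is an assumption. A successful HTTP response only shows that the web interface is answering, so the job list should still be checked before continuing.

```shell
#!/bin/sh
# Sketch only: poll a probe command until it succeeds or attempts run out.
# Usage: wait_until_ready MAX_ATTEMPTS DELAY_SECONDS PROBE [ARGS...]
wait_until_ready() {
    wur_max=$1
    wur_delay=$2
    shift 2
    wur_attempt=1
    while [ "$wur_attempt" -le "$wur_max" ]; do
        # Run the probe; a zero exit status means "ready".
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt will follow.
        if [ "$wur_attempt" -lt "$wur_max" ]; then
            sleep "$wur_delay"
        fi
        wur_attempt=$((wur_attempt + 1))
    done
    return 1
}

# Hypothetical probe (assumption): treat a successful HTTP response from the
# Jenkins login page as "restarted". curl --fail exits non-zero on HTTP
# errors, such as the 503 that Jenkins is expected to return while starting.
jenkins_is_up() {
    curl --silent --fail --output /dev/null "$1"
}

# Example, not executed here: retry every 10 seconds, for roughly 10 minutes.
# wait_until_ready 60 10 jenkins_is_up https://bot.kernelci.org/login
```

Passing the probe as a command keeps the retry logic independent of how readiness is defined, so a stricter check could replace `jenkins_is_up` later.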
-
Update everything
Once Jenkins has restarted, the kernelci.org script provides another command to automatically update everything:
./kernelci.org update
This does the following:
- Check out the kernelci-core repository locally with the head of the kernelci.org branch
- Run the Jenkins DSL job to recreate all the job definitions
- Redeploy the backend using Ansible
- Redeploy the frontend using Ansible
- Update the static website on kernelci.org
- Update the KernelCI LAVA test-definitions fork with a rebase on top of upstream
- Rebuild all the kernel Docker images (toolchains, device tree validation, QEMU…) and push them to the Docker hub
- Push a test revision to the KernelCI linux tree to run a “pipe cleaner” build
- Start a “pipe cleaner” build trigger job in Jenkins to build the test kernel revision
It should take around 30min for the test kernel builds to complete, and then a while longer for any tests to complete in all the labs. The results should be checked manually, by comparing with previous revisions on the web frontend. The number of builds and test results as well as their pass rates should be similar. These builds can be seen here:
https://linux.kernelci.org/job/kernelci/branch/kernelci.org/
If there appear to be some issues with the results, the monitor job should be stopped so no new revisions will be automatically built. Some changes may be reverted or a fix applied depending on the root cause of the issues. If the results look fine, then the monitor job will start discovering new kernel revisions periodically and the production update is complete.
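The manual comparison of pass rates can be made a little more systematic. The sketch below is only an illustration: the helper names, the result counts and the 5-point tolerance are all hypothetical, and the real numbers would have to be read from the web frontend for the previous and the new revision.

```shell
#!/bin/sh
# Sketch only: compare pass rates between two revisions.

# pass_rate PASSED TOTAL -> prints the pass rate as an integer percentage.
pass_rate() {
    if [ "$2" -eq 0 ]; then
        echo 0
        return 0
    fi
    echo $(( $1 * 100 / $2 ))
}

# rates_similar OLD NEW TOLERANCE -> succeeds if the two rates differ by at
# most TOLERANCE percentage points.
rates_similar() {
    rs_diff=$(( $1 - $2 ))
    if [ "$rs_diff" -lt 0 ]; then
        rs_diff=$(( 0 - rs_diff ))
    fi
    [ "$rs_diff" -le "$3" ]
}

# Hypothetical counts, for illustration only.
previous=$(pass_rate 950 1000)
current=$(pass_rate 930 1000)
if rates_similar "$previous" "$current" 5; then
    echo "pass rates are similar: ${previous}% vs ${current}%"
else
    echo "pass rates differ: ${previous}% vs ${current}%"
fi
```

The same comparison applies to the raw number of builds and test results, since a large drop in either can hide behind an unchanged pass rate.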
Known issues
-
Scheduler settings of the kernel-tree-monitor job are not restored after Jenkins restart.
Once Jenkins is restarted with the latest changes from kernelci-jenkins, the scheduler settings are never set up. This is a known issue and the only solution is to set the scheduler manually.
- Log in to Jenkins on https://bot.kernelci.org
- Go to the kernel-tree-monitor job configuration page at https://bot.kernelci.org/job/kernel-tree-monitor/configure
- In the Build Triggers section choose Build periodically
- Paste the following settings to the Schedule text box to have the monitor job run every hour:
H * * * *
- Save the settings by clicking the Save button
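In Jenkins cron syntax, `H` stands for a value derived from a hash of the job name rather than a fixed number, so `H * * * *` runs the job once an hour at a minute that stays the same for that job. The sketch below only illustrates the idea of a stable, name-derived minute: `stable_minute` is a hypothetical helper, and `cksum` is used for illustration; it is not the hash Jenkins itself uses.

```shell
#!/bin/sh
# Sketch only: map a job name to a stable minute in the range 0-59.
# cksum is used for illustration; it is NOT the hash Jenkins itself uses.
stable_minute() {
    # cksum prints "<checksum> <byte count>" for data read from stdin.
    sm_sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( sm_sum % 60 ))
}

# The same job name always yields the same minute.
echo "kernel-tree-monitor maps to minute $(stable_minute kernel-tree-monitor)"
```

Because the minute depends on the job name, different jobs tend to land on different minutes instead of all starting at the top of the hour.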