This project sets up a server capable of receiving GitHub webhooks and hosting a webpage. The server is designed to automate the testing of pull request code from the Satpy repository using Behave. Ansible is used to automate the deployment and configuration of the server. The application was created by Accso - Accelerated Solutions in collaboration with Pytroll.
These steps are only necessary for changes to the server code, not for changes to the behave tests. Otherwise, the server should already be installed and running. Changes to the behave tests belong in the Satpy repository itself.
- Install Ansible
- Clone this repository to your local computer. Ansible is designed to run on Unix-like systems, so keeping the repository in a Windows file system might cause issues. On Windows, put it in the mounted WSL file system and execute the commands using the Linux shell.
- Install the certbot-nginx role within the directory ansible/roles:
  ansible-galaxy role install coopdevs.certbot_nginx,v0.3.1
- The satellite data required to run the behave tests is not included in this repository because the files are very large. The behave tests currently use GOES16 and GOES17 data; this may change if the tests are extended. Place the required data on your local computer in pytroll-image-comparison-tests/data/satellite_data. Ansible copies it to the server.
- You need access to a user with sudo privileges on the server listed in ansible/inventory.ini. Right now, this is image-test.int-pytroll-development.s.ewcloud.host. To check whether you have access, run ssh [email protected]
- Change the host_path variable in inventory.yml to the path that contains the repository. For example, if the repository is at /home/your_user/pytroll-image-comparison-tests/, set host_path: /home/your_user.
- In the Linux shell, navigate to pytroll-image-comparison-tests/ansible. Then execute:
  ansible-playbook playbooks/deploy_image_comparison.yml --ask-become-pass --ask-pass
- Before the server can start, the file /home/<comparison-user>/pytroll-image-comparison-tests/serverLogic/secret.py must exist. It stores the webhook secret and the GitHub token in the following format:
  WEBHOOK_SECRET = "xxx"
  GITHUB_TOKEN = "ghp_xxx"
The webhook secret is used by GitHub to authenticate its requests to the server. The GitHub token is used by the server to post GitHub comments. Note that these tokens typically expire after a limited time. If the bot appears to be on strike, check on GitHub that the access tokens for user rymdulf are still current.
The webhook needs to be configured in GitHub for the Satpy repository. For a general guide on how to do this, see the GitHub docs. Use the following settings:
Payload URL: https://pytroll-image-test-dev.int-pytroll-development.s.ewcloud.host/webhook
Content type: application/json
Secret: your_secret
Enable SSL verification
Let me select individual events: Pull request reviews
Active
Choose a secret with which to validate the incoming webhook deliveries. It should be a random string of text with high entropy. This secret needs to be saved to secret.py, which should never be pushed to an online repository.
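One way to produce such a string is Python's standard secrets module. The following is a minimal sketch; the helper name generate_webhook_secret is hypothetical and not part of this repository.

```python
import secrets


def generate_webhook_secret(num_bytes: int = 32) -> str:
    """Return a random hex string drawn from the OS's secure random source.

    32 random bytes give a 64-character hex string.
    """
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into secret.py and into the GitHub webhook form.
    print(generate_webhook_secret())
```

The same value has to be entered both in the GitHub webhook settings and in secret.py, otherwise the deliveries cannot be validated.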
In the 'Recent Deliveries' tab, you can see the recently sent webhooks and resend them if necessary.
The personal access token (PAT) for a GitHub (bot) account is needed to pull the PR code and make comments on the review thread. The account needs to have rights to clone the repository and access discussions. Create a fine-grained PAT using the GitHub Docs. The PAT only needs the following permissions: 'Read access to metadata' and 'Read and Write access to discussions, pull requests, and repository hooks'.
Ansible is an open-source automation tool used for configuration management, application deployment, and task automation. In this project, Ansible is used to automate the setup of the image comparison server.
- playbooks/deploy_image_comparison.yml: This playbook automates the deployment of the image comparison service. It runs the roles explained below. It also schedules a cron job to clean up old test result files. The job currently runs daily at 2 AM and deletes test result data older than 60 days as well as empty directories in the test_results folder.
- roles/add_user: Creates the comparison user on the target system with limited shell access (/bin/false) for security. This ensures the service runs under a controlled user account for proper access management.
- roles/configure_nginx: Copies the Nginx configuration for the image comparison service from the template location, enables the configuration by creating a symbolic link, tests the Nginx configuration, and reloads Nginx if the configuration is valid.
- roles/deploy_data: Copies the serverLogic and data directories to the target system, ensuring they are owned by the comparison user. It also converts Windows line endings in the start_server.sh script to Unix format, and sets appropriate permissions for executing the script.
- roles/install_dependencies: Installs the necessary system and Docker packages and ensures the Docker service is running. It also sets up a Python virtual environment and installs Python packages such as FastAPI, Flask, Gunicorn, and Docker's Python module for managing containers. Finally, it configures the permissions for the virtual environment.
- roles/manage_services: Copies the image-comparison.service file to the appropriate system directory, reloads the systemd daemon, and ensures the image comparison service is restarted and running on the target system.
- ansible/templates/systemd/image-comparison.service: This is the systemd service file for managing the image comparison server. It specifies that the service should run under the comparison user and defines the working directory and the script to start the server (start_server.sh). The service is configured to restart on failure with a 10-second delay and is limited to 5 restarts within a 300-second interval. The service will be automatically started after the network is available and will be enabled to run at system boot.
- ansible/templates/nginx/sites-available/image-comparison: The templates directory contains Jinja2 templates that are used for configuring services like Nginx. This Nginx configuration template sets up the web server to route traffic for the image comparison service, ensuring secure connections using SSL certificates from Let's Encrypt.
- ansible.cfg: This is the central configuration file for Ansible, specifying options like paths, SSH settings, and default values. In this case, it specifies that the roles are stored in the roles/ directory and the inventory file is inventory.ini.
- inventory.yml: Defines the target hosts for Ansible. Here, it points to the domain image-test.int-pytroll-development.s.ewcloud.host, which represents the system where the playbooks will be executed. This is also the URL that serves the test result website.
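For illustration, the cleanup that the cron job in playbooks/deploy_image_comparison.yml performs can be sketched in Python. This is only a sketch of the behavior described above (test result data older than 60 days and empty directories are removed); the actual cron command is not shown in this document and may differ, and the function name clean_old_results is hypothetical.

```python
import os
import time
from pathlib import Path
from typing import List, Optional


def clean_old_results(root, max_age_days: int = 60,
                      now: Optional[float] = None) -> List[Path]:
    """Delete files older than max_age_days under root, then empty directories.

    Returns the list of removed paths. The root directory itself is kept.
    """
    root = Path(root)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed: List[Path] = []
    # Walk bottom-up so child directories are handled before their parents.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        current = Path(dirpath)
        for name in filenames:
            path = current / name
            if path.stat().st_mtime < cutoff:
                path.unlink()
                removed.append(path)
        # Remove directories that are (now) empty, but never the root.
        if current != root and not any(current.iterdir()):
            current.rmdir()
            removed.append(current)
    return removed
```

Walking bottom-up means a directory whose only content was outdated files is removed in the same pass.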
The serverLogic directory contains all the code needed to automate the PR testing.
start_server.sh is a shell script that starts either the Flask server (development mode) or Gunicorn (production mode). This is the script that runs as image-comparison.service, which Ansible sets up. It uses the following variables:
- WEBHOOK_SECRET and GITHUB_TOKEN: These are decrypted from an encrypted file (secret.env.enc) and passed to the Flask or Gunicorn server.
- IS_DEBUG: Determines whether to start the Flask server in development mode or use Gunicorn in production mode.
config.py holds configuration settings for the application, which are loaded from environment variables. Most of them exist to make the paths adjustable if necessary. The exceptions are the following variables:
- DEBUG: Determines whether the application is running in debug mode. If an error occurs, changing DEBUG to True may help.
- HOST_URL: The URL where the server is hosted. This should be https://image-test.int-pytroll-development.s.ewcloud.host.
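As a rough sketch of how such settings could be read from environment variables: DEBUG and HOST_URL are the variables named above, while load_config, env_flag, and TEST_RESULTS_DIR are hypothetical names used only for illustration. The real configuration file may be organized differently.

```python
import os
from typing import Mapping


def env_flag(environ: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_config(environ: Mapping[str, str] = os.environ) -> dict:
    """Build the settings from the environment, falling back to defaults."""
    return {
        "DEBUG": env_flag(environ, "DEBUG", default=False),
        "HOST_URL": environ.get(
            "HOST_URL",
            "https://image-test.int-pytroll-development.s.ewcloud.host",
        ),
        # Hypothetical path setting, shown only as an example of an adjustable path.
        "TEST_RESULTS_DIR": environ.get("TEST_RESULTS_DIR", "test_results"),
    }
```

Passing the environment as a parameter keeps the function easy to test, for example load_config({"DEBUG": "True"}).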
server.py is the main Flask server file. It processes incoming GitHub webhooks, runs tests, and serves a web interface for viewing test results. The webhook_secret and github_token need to be given as arguments for the server to start correctly. However, the start_server.sh script does this automatically.
- create_app: Sets up the Flask application, including routes for handling webhook events and displaying test results.
- github_webhook: The endpoint to handle incoming webhook requests from GitHub. It processes pull requests, triggers tests, and posts comments back to the PR.
- display_test_results: Displays the test results for a specific timestamp, including the generated and difference images.
- more_results: Lists all previous test result directories.
- display_latest_results: Displays the latest test results.
- serve_test_results: Serves static files (test images) from the results directory.
This module provides utility functions to handle GitHub communication and to validate the POST requests sent to the server URL.
- post_github_comment: Sends a comment to a specific pull request in a GitHub repository using the GitHub API.
- verify_signature: Validates the payload's authenticity from GitHub using HMAC with SHA256.
- extract_pull_request_info: Extracts key information (repository, branch, and pull request number) from the webhook payload.
- validate_safe_path: Ensures that paths are safe, avoiding directory traversal attacks.
- validate_timestamp_path_component: Validates a timestamp string, ensuring it follows a specific format.
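The signature check can be sketched as follows. GitHub sends the HMAC-SHA256 of the request body in the X-Hub-Signature-256 header, prefixed with sha256=. The parameter names and the exact signature of verify_signature below are assumptions for illustration; the function in this repository may differ.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(payload_body: bytes, secret: str,
                     signature_header: Optional[str]) -> bool:
    """Return True if signature_header matches the HMAC-SHA256 of the body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), payload_body, hashlib.sha256)
    expected = "sha256=" + digest.hexdigest()
    # compare_digest runs in constant time, which avoids timing attacks.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature_header.encode("utf-8"))
```

The raw request bytes must be used for the check. Re-serializing the parsed JSON can change whitespace or key order and would produce a different digest.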
This file contains the utility functions concerning the setup, execution, and teardown of Docker containers used to test pull requests.
- remove_existing_container: Removes an existing Docker container if it is found running.
- clear_directory: Empties and recreates a specified directory.
- check_container: Checks whether a Docker container is currently running.
- mask_sensitive_data: Replaces sensitive information (e.g. tokens) with placeholders for logging purposes.
- clone_and_test_pull_request: Manages the process of cloning the repository, setting up a virtual environment, installing dependencies, and running the Behave tests inside a Docker container. It posts a comment back to the GitHub PR once the tests are complete or an error occurs.
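A minimal sketch of the masking idea is shown below. The signature, the placeholder text, and the example values are assumptions and may not match the actual mask_sensitive_data in this repository.

```python
from typing import Iterable


def mask_sensitive_data(text: str, sensitive_values: Iterable[str],
                        placeholder: str = "***") -> str:
    """Replace every occurrence of each sensitive value with a placeholder."""
    for value in sensitive_values:
        # Skip empty strings, which would otherwise match between all characters.
        if value:
            text = text.replace(value, placeholder)
    return text


if __name__ == "__main__":
    command = "git clone https://ghp_example@github.com/pytroll/satpy.git"
    print(mask_sensitive_data(command, ["ghp_example"]))
```

Masking before logging matters here because the clone command can embed the GitHub token in the repository URL.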
- serverLogic/static/styles.css: Stylesheet for the webpage.
- serverLogic/templates/: HTML files for each webpage, used in server.py
- serverLogic/secret.py: File containing the GITHUB_TOKEN and WEBHOOK_SECRET. Needs to be set up manually and must not be pushed to a remote repository.
These are the actual tests that are run on the Satpy code of the PR. The idea is to create reference images using the current 'gold standard' version of Satpy and to compare the images generated from the PR code against them. The reference images are saved to behave/features/data/reference/.
The behave tests are not included in this repository but are going to be integrated into Satpy with PR 2912.
In this version, the behave tests need to be executed by the Docker container on the EWC server, since the paths are configured for that purpose. For testing purposes, a Satpy developer could change these paths to their local environment and run the tests manually. The packages needed to run the tests are behave, Pillow, pytest, numpy, opencv-python, dask, netcdf4, and h5netcdf, plus the local development version of Satpy (pip install -e {satpy_dir}). However, the paths must not be changed in the PR itself, or the automatic testing will fail.
By default, the application expects the behave tests to be located in the Satpy repository at satpy/tests/behave. You may adjust this path in serverLogic/config.py.
The Behave feature file that defines the tests for comparing satellite images. It specifies the scenarios where images from different satellites and composites are compared for pixel-level differences. Add or remove tests for different satellite composite combinations. The corresponding reference images need to exist in the reference directory first.
The step definitions for the Behave tests. It includes steps to:
- Load the reference and generated images.
- Perform pixel-wise comparison of the images using OpenCV.
- Save and report the differences between the reference and generated images.
- Log the results of each test in the test_results folder with detailed output on pixel differences. This folder is hosted by the website.
Here, you can adjust the pixel threshold (how much difference is allowed before the test fails). If you wish to develop and debug the behave tests themselves, you may want to generate a corrupted reference image and save it to the reference_different directory. Then change the context for imageA from reference_image to reference_different_image.
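The pass/fail decision behind the threshold can be sketched as below. The real step definitions use OpenCV on image arrays; this sketch uses plain nested lists of RGB tuples so that it runs without dependencies, and it assumes the threshold is a count of differing pixels. The function names are hypothetical.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]
Image = Sequence[Sequence[Pixel]]  # rows of RGB pixels


def count_different_pixels(image_a: Image, image_b: Image) -> int:
    """Count the pixel positions at which the two images differ."""
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(image_a, image_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def images_match(image_a: Image, image_b: Image, threshold: int) -> bool:
    """Pass if at most `threshold` pixels differ between the two images."""
    return count_different_pixels(image_a, image_b) <= threshold
```

With a threshold of 0 the images must be identical; a larger value tolerates small rendering differences between Satpy versions.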
This utility script generates reference images from satellite data using the Satpy library. Be sure to use the current 'gold standard' Satpy and not a development version. The images are stored in the features/data/reference/ directory. The script is meant to be run manually, and only when the reference images need to be changed or extended. Once the reference images are up to date, the script is not needed for the automatic behave testing process.
This section provides a few commands to help with debugging the application. For easier debugging, we recommend using a gateway for remote development, for example the JetBrains Gateway.
- check the output file of the last PR testing attempt in the directory Docker creates to clone and test the PR
- e.g. cat /home/imagetester/pull_request_pull_branch/output.log
- scroll down to see why it failed
- run sudo journalctl -u docker.service for Docker related logs
- if the service is not running, the website will give a 502 error
- check server status: sudo systemctl status image-comparison.service
- get more information: journalctl -u image-comparison.service -f
- try sudo systemctl restart image-comparison.service
- Gunicorn is used to host the server in production mode (Debug=False in config.py)
- check for errors: sudo cat /var/log/gunicorn/error.log
- check the access history: sudo cat /var/log/gunicorn/access.log
- switch to Debug=True to use the Flask server in debug mode. This provides more debugging statements, and changes you make to the code take effect on the server directly. Change back to production mode afterwards.
- check for errors: sudo cat /var/log/nginx/error.log
- check the access history: sudo cat /var/log/nginx/access.log
- make changes using sudo nano /etc/nginx/sites-available/image-comparison (also change this in ansible/templates/nginx/sites-available/image-comparison once you are satisfied with the new configuration)
- update the configuration using sudo systemctl reload nginx
- check the status of the certbot certificate using sudo certbot certificates
- renew the certificate manually using sudo certbot --nginx -d image-test.int-pytroll-development.s.ewcloud.host --debug
- check logs: sudo cat /var/log/letsencrypt/letsencrypt.log
- check that the personal access tokens are up to date. Log in to GitHub as user rymdulf and browse to https://github.com/settings/tokens. Gerrit has the login credentials.