GitLab

Learn how to set up GitLab CI pipelines so that publishing your data science work becomes automatic.

Prerequisites

  • Have a Kyso account - either on kyso.io or on your company's private Kyso installation.

  • Create a Kyso access token - follow these instructions. Save this for later!

  • Ensure your directory (or folders) contains a valid kyso.yaml file. Check the following instructions for more info.

How to publish to Kyso with GitLab CI Pipelines

About the values passed to kyso login in the pipeline file below:

  • The username and token values are meant to come from secret variables rather than being hard-coded. To make this work, create a Kyso access token and define KYSO_USERNAME and KYSO_TOKEN as CI/CD variables, as explained in the GitHub documentation.

  • The URL (--kysoInstallUrl) has to point to your Kyso deployment. If your company, Acme Inc., has its own Kyso instance available at https://acme.kyso.io, that is the value to assign to it.

  1. Create a .gitlab-ci.yml file in your GitLab repository.

  2. Copy the following YAML contents into .gitlab-ci.yml:

image: node:18

Upload to Kyso:
  script:
    - npm install -g kyso
    - kyso login --kysoInstallUrl https://kyso.io --provider kyso --username [YOUR EMAIL] --token [YOUR ACCESS TOKEN]
    - kyso push

We're using Node.js 18 here; this may need to change if you're reading this doc in the future. The configuration works as of September 27th, 2022.
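To keep credentials out of the repository, the same job can read them from CI/CD variables instead of literal values. The following is a minimal sketch, assuming you have defined KYSO_USERNAME and KYSO_TOKEN under Settings > CI/CD > Variables in your GitLab project; marking the token as masked keeps it from being printed in job logs.

```yaml
image: node:18

Upload to Kyso:
  script:
    - npm install -g kyso
    # Assumption: KYSO_USERNAME and KYSO_TOKEN are defined as CI/CD variables
    # in the project settings, so neither value is committed to the repository.
    # Replace https://kyso.io with your own Kyso deployment URL if you have one.
    - kyso login --kysoInstallUrl https://kyso.io --provider kyso --username "$KYSO_USERNAME" --token "$KYSO_TOKEN"
    - kyso push
```

The runner's shell expands the variables when the job runs, so the command is otherwise identical to the one with literal values.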

With this configuration, every push to the repository (to any branch or tag, since the job defines no rules) triggers a CI pipeline that runs the commands in the pipeline file above.

Remember to also check that the metadata in each sub-directory's YAML file is correct:

  • organization: [Your Organisation Name]

  • team/channel: [Destination channel for each report; this may differ between sub-directories]
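As a rough illustration of these fields, a repository with several sub-folders might contain files like the ones sketched here. All values are hypothetical placeholders, the folder name is only an example, and the exact key for the destination (team or channel) depends on your Kyso version, so treat this as a sketch rather than a complete specification.

```yaml
# Root directory: kyso.yaml
# Only the type key is taken from this page; any other root-level keys
# are omitted here because they are not specified.
type: meta
---
# Sub-folder (for example rna-sequences/): kyso.yaml
# All values are placeholders.
organization: your-organization-name
# This page lists the key as "team/channel"; use whichever name
# your Kyso version expects.
channel: your-destination-channel
title: Example report title
description: Short description of this individual report
tags:
  - example-tag
```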

Note that if your repository contains multiple sub-folders, the root directory's kyso.yaml file will be of type: meta, and each sub-folder will have its own kyso.yaml file specifying that individual project's metadata (e.g. title, description, tags). Read more here:

Executing the CI Pipeline

When we push changes to the GitLab repository, a runner starts a Docker container from the node:18 image to run our job. We can check the job logs at https://gitlab.com/user/repository/-/jobs/job-id and should see something like this:

Running with gitlab-runner 15.4.0 (43b2dc3d)
  on Gitlab Shared Runner (Docker) Q2W747Cx
Preparing the "docker" executor
00:03
Using Docker executor with image node:18 ...
Pulling docker image node:18 ...
Using docker image sha256:5eb5251ba1b16c9549a23229d4fb1211e2a7f4ca6ce7b91fb86d939fd886440e for node:18 with digest node@sha256:d6ed353d022f6313aa7c3f3df69f3a216f1c9f8c3374502eb5e6c45088ce68e8 ...
Preparing environment
00:00
Running on runner-q2w747cx-project-94-concurrent-0 via 276b460ab7aa...
Getting source from Git repository
00:01
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/kyle/kyso-demo/.git/
Checking out 433188c3 as main...
Skipping Git submodules setup
Executing "step_script" stage of the job script
00:20
Using docker image sha256:5eb5251ba1b16c9549a23229d4fb1211e2a7f4ca6ce7b91fb86d939fd886440e for node:18 with digest node@sha256:d6ed353d022f6313aa7c3f3df69f3a216f1c9f8c3374502eb5e6c45088ce68e8 ...
$ npm install -g kyso
added 236 packages, and audited 237 packages in 13s
33 packages are looking for funding
  run `npm fund` for details
found 0 vulnerabilities
$ kyso login --kysoInstallUrl https://kyso.io --provider kyso --username kyle@kyso.io --token my-kyso-token
Logged successfully
$ kyso push
4 reports found
No new or modified files to upload in the 'rna-sequences' folder.
🎉🎉🎉 Report was uploaded to
https://kyso.io/kyso-examples/life-sciences/graphing-genomic-mutation-ratios
🎉🎉🎉
No new or modified files to upload in the 'small-molecules' folder.
No new or modified files to upload in the '10xgenomics' folder.
Cleaning up project directory and file based variables
00:00
Job succeeded

Maintaining a QA process with Merge Requests

As in our documentation on GitHub Actions, we can maintain a QA process that ensures only merged, reviewed work is published to Kyso.

For example, if you work with a team of scientists, data scientists, etc., all pushing to and pulling from the same repository, you're going to want to control how, when and what changes get published to Kyso.

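Because a merged merge request (GitLab's equivalent of a pull request) always results in a push to the target branch, we can rely on push pipelines to accomplish our goal. Note that the configuration above defines no rules, so it runs on every push to any branch. To publish only when a merge request is merged, or when a commit is made directly to the main branch, restrict the job to that branch.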

To make this workflow even more secure, we recommend protecting your main branch so that changes can only reach it through reviewed merge requests.
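One way to limit publishing to the main branch is a rules clause on the job. The sketch below assumes that the repository's default branch is the one you want to publish from, and that KYSO_USERNAME and KYSO_TOKEN are defined as CI/CD variables; CI_COMMIT_BRANCH and CI_DEFAULT_BRANCH are predefined GitLab variables.

```yaml
image: node:18

Upload to Kyso:
  rules:
    # Run only for pushes to the default branch (for example main),
    # which includes the push created when a merge request is merged.
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    - npm install -g kyso
    # Assumption: credentials come from CI/CD variables, not literal values.
    - kyso login --kysoInstallUrl https://kyso.io --provider kyso --username "$KYSO_USERNAME" --token "$KYSO_TOKEN"
    - kyso push
```

With this rule in place, pushes to feature branches do not create the job, so work in progress is not published until it is merged.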
