82 changes: 82 additions & 0 deletions rfcs/0061-ci-2-0.md
@@ -0,0 +1,82 @@
- Start Date: 2026-06-08
- Authors: @AdamGS
- RFC PR: [vortex-data/rfcs#61](https://github.com/vortex-data/rfcs/pull/61)

# CI 2.0

## Summary

This RFC proposes a cohesive architecture for the project's CI and workflow system. There have been many conversations (both public and private) on this topic,
and I would like to propose a holistic vision that can be discussed and iterated on in a public forum.

I've tried to include the current state of the project to the best of my understanding, and the issues I see that motivate my proposed changes.

## Current state

The current solution includes (but is not limited to) CI, publishing, and benchmarking. It is fully implemented using GitHub Actions workflows and apps, and relies on
a mix of [runs-on](https://runs-on.com/) powered self-hosted runners and GitHub-hosted runners.

### Permission Model

There are four different scopes of permissions:

1. Admins - very limited group of users that can manage the organization-wide settings, and the core repo settings.
2. Maintainers - All TSC members, can push branches and open pull requests directly from within the core repo, and approve and merge pull requests.
CI runs automatically for them, and they can trigger CI for external contributions.
3. Members - Non-TSC members that are explicitly trusted. They have the same development experience a maintainer has, aside from the ability to approve and merge pull requests.
4. Everyone else - must open PRs from a fork in the classic GitHub flow, and need a maintainer to run any Actions-based workflow, label their PR, etc.

Everyone in the maintainers group is automatically subscribed to every repo event, which creates a lot of noise in both emails and notifications. The distinction between the two
groups is hard to communicate externally.

### Testing and CI

Our current test suite includes many workflows. I think they can be grouped into the following categories:

1. Linters - these include rustfmt, clippy, cargo-deny and other language-specific tooling.
2. Core tests - the full test suite, coverage tests
3. Integration-focused tests - Python, Java/JNI, SLT, C/C++
4. Environment tests - making sure Vortex works on various environments like wasm and Windows.

### Benchmarking

Benchmarks run on dedicated runs-on runners, orchestrated by GitHub. They run on a nightly schedule and on every commit, and a maintainer can trigger them on demand by
labeling the PR.

Any user who can run CI potentially has unbounded access to the AWS account, allowing them (or a third party that took control of their account) to hijack resources.

### Versioning and publishing

Every pull request must include a label indicating the sort of change it includes (feature, break, fix, chore, performance, ci or skip).
Those changes are accumulated into a draft release. When that draft is published, a workflow is triggered that gradually publishes all Rust crates
and all language or framework bindings in order. Those include:

1. Rust crates (including integrations like `vortex-datafusion`).
2. Python package
3. JNI bindings
4. Spark bindings
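
The labeling requirement described at the start of this section can be sketched as a small check. This is a minimal Python model, not the project's actual workflow: the label names come from the list above, while the function name and the "exactly one change label" rule are assumptions made for illustration.

```python
# The change labels listed in this RFC.
CHANGE_LABELS = frozenset(
    {"feature", "break", "fix", "chore", "performance", "ci", "skip"}
)


def change_label(pr_labels):
    """Return the PR's single change label, or raise ValueError.

    Assumption: exactly one change label is required; any other labels
    on the PR are ignored.
    """
    found = sorted(set(pr_labels) & CHANGE_LABELS)
    if len(found) != 1:
        raise ValueError(f"expected exactly one change label, found {found}")
    return found[0]
```

A check along these lines would reject both unlabeled PRs and PRs carrying conflicting change labels before they reach the draft release.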

Our versioning is tightly coupled between the Rust implementation and external integrations. Fixing any issue in the Java bindings or DataFusion integration
requires a full release, which might include many unrelated changes. It also means that changes in the periphery of the project get included in the changelog AND
the versioning scheme for the core of the project.

This also makes the meaning of the version itself murky. We currently vaguely follow semantic versioning, but we effectively bump our minor version on every release, which means that
a breaking change is always acceptable.

We also don't have clear and consistent definitions of what each kind of change means, making the changelog inconsistent and somewhat confusing.

## Suggested changes

### Permissions

I suggest we get rid of the two-tier member status and move to GitHub's built-in permission model, which allows users who have contributed to the project in the past to run CI by default. This will make the committers group redundant.

Their CI flow will not run on the project's runs-on infrastructure, which we can enforce by making CI check `github.event.pull_request.head.repo.full_name` instead of `github.repository`. We can further reduce the risk from external contributors by adding more restrictions on attempted changes to the `.github` directory.
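
The fork check described above can be sketched as a small decision function. This is a minimal Python model of the logic, not a workflow definition: in GitHub Actions the comparison would be an expression over `github.event.pull_request.head.repo.full_name` and `github.repository`. The function name and both runner labels are assumptions chosen for illustration.

```python
from typing import Optional

# Hypothetical runner labels, used only for illustration.
SELF_HOSTED = "runs-on"
GITHUB_HOSTED = "ubuntu-latest"


def select_runner(base_repo: str, head_repo: Optional[str]) -> str:
    """Pick a runner class for a pull request.

    base_repo models `github.repository`; head_repo models
    `github.event.pull_request.head.repo.full_name` (None when unknown).
    """
    # Only PRs whose head branch lives in the core repo get self-hosted runners.
    if head_repo is not None and head_repo == base_repo:
        return SELF_HOSTED
    # Forks (and unknown heads) fall back to GitHub-hosted runners.
    return GITHUB_HOSTED
```

Defaulting to the GitHub-hosted runner whenever the head repository is missing or different keeps the failure mode on the safe side: a misconfigured event never lands on the self-hosted infrastructure.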

### Benchmarks Bot

For benchmarks, we'll create a new dedicated bot with a limited list of allowed users who can trigger a benchmark run using a comment (or any other preferred control flow). The bot will be hosted in a different repo, so the runs-on surface area exposed to external contributors is reduced. Hosting the bot externally will also make it easier to run it on PRs coming from forks, and to give it a more flexible scheduling system and more powerful permissions.
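
The comment-based trigger could look roughly like the following. This is a sketch under stated assumptions, not a design for the bot: the `/benchmark` trigger phrase, the user names in the allow-list, and the `Comment` type are all hypothetical, since the RFC does not specify them.

```python
from dataclasses import dataclass

# Hypothetical allow-list and trigger phrase; neither is specified in this RFC.
ALLOWED_USERS = frozenset({"alice", "bob"})
TRIGGER = "/benchmark"


@dataclass(frozen=True)
class Comment:
    author: str
    body: str


def should_run_benchmarks(comment: Comment, allowed: frozenset = ALLOWED_USERS) -> bool:
    """Return True only for a trigger comment written by an allow-listed user."""
    lines = comment.body.strip().splitlines()
    first_line = lines[0].strip() if lines else ""
    # Accept the bare trigger or the trigger followed by arguments.
    is_trigger = first_line == TRIGGER or first_line.startswith(TRIGGER + " ")
    return is_trigger and comment.author in allowed
```

Because the decision depends only on the comment author and body, the same check applies whether the PR comes from a branch in the core repo or from a fork.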

### Versioning

I suggest we move some of the binding into their own