Changes from all commits (75 commits)
4eac5b4
CUDA: refactor mma data loading for AMD (#22051)
JohannesGaessler Apr 19, 2026
e365e65
vendor : update cpp-httplib to 0.42.0 (#21781)
cabelo Apr 19, 2026
9d49acb
server: rename --clear-idle to --cache-idle-slots (#21741)
yychyo Apr 20, 2026
788fcbc
[SYCL] Fix reorder MMVQ assert on unaligned vocab sizes (#22035)
PMZFX Apr 20, 2026
de71b5f
server : refactor "use checkpoint" logic (#22114)
ggerganov Apr 20, 2026
81df3f7
fix: GLM-DSA crash in llama-tokenize when using vocab_only (#22102)
ssam18 Apr 20, 2026
a678916
mtmd: refactor mtmd_decode_use_mrope (#22161)
ngxson Apr 20, 2026
a6cc43c
ggml-webgpu: updated matrix-vector multiplication (#21738)
neha-ha Apr 20, 2026
7f251fd
ggml-cpu: Optimized x86 and generic cpu q1_0 dot (follow up) (#21636)
pl752 Apr 20, 2026
fb19f94
TP: fix 0-sized tensor slices, AllReduce fallback (#21808)
JohannesGaessler Apr 20, 2026
fd6ae4c
Tensor-parallel: Fix delayed AllReduce on Gemma-4 MoE (#22129)
gaugarg-nv Apr 20, 2026
cf8b0db
server : remove /api endpoints (#22165)
ggerganov Apr 20, 2026
86f8daa
mtmd: correct get_n_pos / get_decoder_pos (#22175)
ngxson Apr 20, 2026
9789512
ggml-cuda: flush legacy pool on OOM and retry (#22155)
leonardHONG Apr 20, 2026
ff6b106
server : fix hardcoded proxy connection timeout in router mode (#1876…
xris99 Apr 21, 2026
cfe9838
fit-params : refactor + add option to output estimated memory per dev…
ggerganov Apr 21, 2026
041fe83
ggml : bump version to 0.10.0 (ggml/1463)
ggerganov Apr 21, 2026
4889afb
sync : ggml
ggerganov Apr 21, 2026
cd03ec7
llama-ext : fix exports (#22202)
ggerganov Apr 21, 2026
9998d88
mtmd: correct mtmd_decode_use_mrope() (#22188)
ngxson Apr 21, 2026
82209ef
vulkan: Support F16 OP_FILL (#22177)
jeffbolznv Apr 21, 2026
7fc1c4e
metal : workaround macOS GPU interactivity watchdog (#22216)
ggerganov Apr 21, 2026
606fa42
vendor : update cpp-httplib to 0.43.1 (#22143)
cabelo Apr 21, 2026
52f1096
openvino: driver setup, CI split, thread safety, and NPU optimization…
wine99 Apr 21, 2026
84652b8
arg : add --spec-default (#22223)
ggerganov Apr 21, 2026
98d2d28
mtmd: Add support for Reka Edge 2603 (#21616)
kwajiehao Apr 21, 2026
72d693e
spec : reset i_last when low acceptance streak occurs (#22168)
treo Apr 21, 2026
2248799
hexagon: fix missing v79 entry in libggml-htp.inf (#22194)
mengshengwu Apr 21, 2026
5a4cd67
Hexagon: DAIG op (#22195)
shreyajn Apr 21, 2026
04fe84b
server: allow cancel loading model (#21814)
ngxson Apr 21, 2026
2799d93
ggml-webgpu: reset CPU/GPU profiling time when freeing context (#22050)
yomaytk Apr 21, 2026
0dedb9e
hexagon: add support for FILL op (#22198)
aparmp-quic Apr 21, 2026
ca7f7b7
ggml-webgpu(shader): support conv2d kernels. (#21964)
Constannnnnt Apr 22, 2026
134d6e5
common/chat, server: refactor, move all conversion functions to commo…
pwilkin Apr 22, 2026
750579f
common: Refactoring sampler parameters (#20429) (#22233)
ezturner Apr 22, 2026
7bfe60f
mtmd, llama : Update HunyuanVL vision-language model support (#22037)
ManaEstras Apr 22, 2026
17f6245
server: ignore reasoning content from transcription api (#21905)
ngxson Apr 22, 2026
82d3f4d
mtmd: also support LLAMA_ROPE_TYPE_NONE (#22242)
ngxson Apr 22, 2026
225088e
sycl: Improve mul_mat_id memory efficiency and add BF16 fast path (#2…
qnixsynapse Apr 22, 2026
bcb5eeb
speculative-simple : add checkpoint support (#22227)
ggerganov Apr 22, 2026
8bccdbb
chat: fix parallel_tool_calls default setting based on model capabili…
pwilkin Apr 22, 2026
6da7168
ggml-webgpu: Add fused RMS_NORM + MUL (#21983)
yomaytk Apr 22, 2026
0d0764d
[WebGPU] Implement async tensor api and event api (#22099)
nikhilJain17 Apr 22, 2026
6217b49
HIP: flip GGML_HIP_GRAPHS to default on (#22254)
IMbackK Apr 23, 2026
86db42e
CUDA: fuse relu + sqr (#22249)
anavp-nvidia Apr 23, 2026
b76429a
ggml-webgpu: add support for im2col (#22259)
Constannnnnt Apr 23, 2026
60b68a6
sycl : fused MoE mul_mat_vec_q for TG (#21920)
abotsis Apr 23, 2026
5eaee65
convert : Handle ModelOpt produced mixed precision model during conve…
ynankani Apr 23, 2026
4ead6fd
[SYCL] Update oneapi 2025.3.3, Separate SYCL build, release Ubuntu 24…
NeoZhangJianyu Apr 23, 2026
96c1db2
ggml-base: use MATH_LIBRARY variable instead of hardcoded 'm' (#22239)
ggerganov Apr 23, 2026
930e021
gitignore: add AGENTS.local.md (#22246)
ggerganov Apr 23, 2026
8635e22
metal : fix event synchronization (#22260)
ggerganov Apr 23, 2026
550d684
server: Enable transcriptions API for LFM2-Audio (#22000)
tdakhran Apr 23, 2026
0dd7f91
cli : cleanup auto-completion code (#21745)
matthiasstraka Apr 23, 2026
9012c50
model-conversion : fix mmproj output file name [no ci] (#22274)
danbev Apr 23, 2026
0949beb
fix build number for sycl release (#22283)
CISC Apr 23, 2026
c807c6e
server: (anthropic API) fix prefix caching (#21793)
kvc0 Apr 23, 2026
12568ca
vendor : update LibreSSL to 4.3.1 (#22285)
angt Apr 23, 2026
c78fb90
server: fix heap-buffer-overflow from negative n_discard (CVE-2026-21…
SongTonyLi Apr 23, 2026
185cbff
server : convert_anthropic_to_oai: also copy chat_template_kwargs (#2…
Soreepeong Apr 23, 2026
187a456
Enable testing on Snapdragon devices (#21051)
shreyajn Apr 23, 2026
5d2b52d
hexagon: add support for basic and extended Op profiling (#22269)
max-krasnyansky Apr 23, 2026
fa0b8a7
cli: Remove redundant local sampling variables (#20429) (#22264)
ezturner Apr 23, 2026
e5f070a
fix(shader): handle the buffer aliasing for rms fuse (#22266)
Constannnnnt Apr 23, 2026
8bc492e
hexagon: add SOLVE_TRI op (#21974)
mengshengwu Apr 24, 2026
793d0a7
server: rename debug tags to match --cache-idle-slots naming (#22292)
yychyo Apr 24, 2026
ffdd983
server : fix swa-full logic (#22288)
ggerganov Apr 24, 2026
017f090
jinja : remove unused header (#22310)
ggerganov Apr 24, 2026
e583f3b
ggml : minor coding style (#22308)
ggerganov Apr 24, 2026
dc80c52
common : fix jinja warnings with clang 21 (#22313)
angt Apr 24, 2026
15fa3c4
metal : print GPU description (#22318)
ggerganov Apr 24, 2026
298826e
ggml-cpu: add rvv 512b,1024b impls for iq4_xs
taimur-10x Feb 13, 2026
e6a4ff4
ggml-cpu: refactor; add rvv 512b, 1024b impls for q6_K, i-quants
taimur-10x Feb 14, 2026
ebe9650
added 512 and 1024 implementations of tq3_s, iq3_xxs, iq2_s, iq2_xs, …
RehanQasim-dev Feb 24, 2026
aa171a2
ggml-cpu: refactor; improve iq2_xs impl for rvv 256
RehanQasim-dev Feb 24, 2026
2 changes: 1 addition & 1 deletion .devops/intel.Dockerfile
@@ -1,4 +1,4 @@
-ARG ONEAPI_VERSION=2025.3.2-0-devel-ubuntu24.04
+ARG ONEAPI_VERSION=2025.3.3-0-devel-ubuntu24.04
 
 ## Build Image
50 changes: 48 additions & 2 deletions .devops/openvino.Dockerfile
@@ -2,7 +2,19 @@
 ARG OPENVINO_VERSION_MAJOR=2026.0
 ARG OPENVINO_VERSION_FULL=2026.0.0.20965.c6d6a13a886
 ARG UBUNTU_VERSION=24.04
 
-# Optional proxy build arguments - empty by default
+# Intel GPU driver versions. https://github.com/intel/compute-runtime/releases
+ARG IGC_VERSION=v2.30.1
+ARG IGC_VERSION_FULL=2_2.30.1+20950
+ARG COMPUTE_RUNTIME_VERSION=26.09.37435.1
+ARG COMPUTE_RUNTIME_VERSION_FULL=26.09.37435.1-0
+ARG IGDGMM_VERSION=22.9.0
+
+# Intel NPU driver versions. https://github.com/intel/linux-npu-driver/releases
+ARG NPU_DRIVER_VERSION=v1.32.0
+ARG NPU_DRIVER_FULL=v1.32.0.20260402-23905121947
+ARG LIBZE1_VERSION=1.27.0-1~24.04~ppa2
+
+# Optional proxy build arguments
 ARG http_proxy=
 ARG https_proxy=
 
@@ -78,13 +90,47 @@ ARG http_proxy
 ARG https_proxy
 
 RUN apt-get update \
-    && apt-get install -y libgomp1 libtbb12 curl \
+    && apt-get install -y libgomp1 libtbb12 curl wget ocl-icd-libopencl1 \
     && apt autoremove -y \
     && apt clean -y \
     && rm -rf /tmp/* /var/tmp/* \
     && find /var/cache/apt/archives /var/lib/apt/lists -not -name lock -type f -delete \
     && find /var/cache -type f -delete
 
+# Install GPU drivers
+ARG IGC_VERSION
+ARG IGC_VERSION_FULL
+ARG COMPUTE_RUNTIME_VERSION
+ARG COMPUTE_RUNTIME_VERSION_FULL
+ARG IGDGMM_VERSION
+RUN mkdir /tmp/neo/ && cd /tmp/neo/ \
+    && wget https://github.com/intel/intel-graphics-compiler/releases/download/${IGC_VERSION}/intel-igc-core-${IGC_VERSION_FULL}_amd64.deb \
+    && wget https://github.com/intel/intel-graphics-compiler/releases/download/${IGC_VERSION}/intel-igc-opencl-${IGC_VERSION_FULL}_amd64.deb \
+    && wget https://github.com/intel/compute-runtime/releases/download/${COMPUTE_RUNTIME_VERSION}/intel-ocloc-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
+    && wget https://github.com/intel/compute-runtime/releases/download/${COMPUTE_RUNTIME_VERSION}/intel-ocloc_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
+    && wget https://github.com/intel/compute-runtime/releases/download/${COMPUTE_RUNTIME_VERSION}/intel-opencl-icd-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
+    && wget https://github.com/intel/compute-runtime/releases/download/${COMPUTE_RUNTIME_VERSION}/intel-opencl-icd_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
+    && wget https://github.com/intel/compute-runtime/releases/download/${COMPUTE_RUNTIME_VERSION}/libigdgmm12_${IGDGMM_VERSION}_amd64.deb \
+    && wget https://github.com/intel/compute-runtime/releases/download/${COMPUTE_RUNTIME_VERSION}/libze-intel-gpu1-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
+    && wget https://github.com/intel/compute-runtime/releases/download/${COMPUTE_RUNTIME_VERSION}/libze-intel-gpu1_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
+    && dpkg --install *.deb \
+    && rm -rf /tmp/neo/
+
+# Install NPU drivers
+ARG NPU_DRIVER_VERSION
+ARG NPU_DRIVER_FULL
+ARG LIBZE1_VERSION
+RUN mkdir /tmp/npu/ && cd /tmp/npu/ \
+    && wget https://github.com/intel/linux-npu-driver/releases/download/${NPU_DRIVER_VERSION}/linux-npu-driver-${NPU_DRIVER_FULL}-ubuntu2404.tar.gz \
+    && tar -xf linux-npu-driver-${NPU_DRIVER_FULL}-ubuntu2404.tar.gz \
+    && dpkg --install *.deb \
+    && rm -rf /tmp/npu/
+
+RUN cd /tmp \
+    && wget https://snapshot.ppa.launchpadcontent.net/kobuk-team/intel-graphics/ubuntu/20260324T100000Z/pool/main/l/level-zero-loader/libze1_${LIBZE1_VERSION}_amd64.deb \
+    && dpkg --install libze1_${LIBZE1_VERSION}_amd64.deb \
+    && rm libze1_${LIBZE1_VERSION}_amd64.deb
 
 COPY --from=build /app/lib/ /app/
 
 ### Full (all binaries)
113 changes: 113 additions & 0 deletions .github/workflows/build-and-test-snapdragon.yml
@@ -0,0 +1,113 @@
name: CI (snapdragon)

on:
  workflow_dispatch:
  push:
    branches:
      - master
    paths:
      - '.github/workflows/build-and-test-snapdragon.yml'
      - 'ggml/include/ggml-hexagon.h'
      - 'ggml/src/ggml-hexagon/**'
      - 'docs/backend/snapdragon/**'
      - 'scripts/snapdragon/**'
      - 'CMakePresets.json'

  pull_request:
    types: [opened, synchronize, reopened]
    paths:
      - '.github/workflows/build-and-test-snapdragon.yml'
      - 'ggml/include/ggml-hexagon.h'
      - 'ggml/src/ggml-hexagon/**'
      - 'docs/backend/snapdragon/**'
      - 'scripts/snapdragon/**'
      - 'CMakePresets.json'

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

jobs:
  android-ndk-snapdragon:
    runs-on: ubuntu-latest
    container:
      image: 'ghcr.io/snapdragon-toolchain/arm64-android:v0.3'
    defaults:
      run:
        shell: bash

    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          lfs: false

      - name: Build Llama.CPP for Snapdragon Android
        id: build_llama_cpp_snapdragon_android
        run: |
          cp docs/backend/snapdragon/CMakeUserPresets.json .
          cmake --preset arm64-android-snapdragon-release -B build
          cmake --build build
          cmake --install build --prefix pkg-adb/llama.cpp

      - name: Upload Llama.CPP Snapdragon Android Build Artifact
        if: ${{ always() && steps.build_llama_cpp_snapdragon_android.outcome == 'success' }}
        uses: actions/upload-artifact@v6
        with:
          name: llama-cpp-android-arm64-snapdragon
          path: pkg-adb/llama.cpp

  check-secret:
    runs-on: ubuntu-latest
    outputs:
      has-key: ${{ steps.check.outputs.has-key }}
    steps:
      - id: check
        run: echo "has-key=${{ secrets.QDC_API_KEY != '' }}" >> "$GITHUB_OUTPUT"

  test-snapdragon-qdc:
    name: Test on QDC Android Device (${{ matrix.device }})
    needs: [android-ndk-snapdragon, check-secret]
    if: needs.check-secret.outputs.has-key == 'true'
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        device: [SM8750, SM8650, SM8850]

    steps:
      - name: Checkout
        uses: actions/checkout@v6

      - name: Download build artifact
        uses: actions/download-artifact@v4
        with:
          name: llama-cpp-android-arm64-snapdragon
          path: pkg-snapdragon/

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.x'
          cache: pip

      - name: Install QDC SDK wheel
        run: |
          curl -fSL -o qdc_sdk.zip https://softwarecenter.qualcomm.com/api/download/software/tools/Qualcomm_Device_Cloud_SDK/All/0.2.3/qualcomm_device_cloud_sdk-0.2.3.zip
          unzip qdc_sdk.zip -d qdc_sdk
          pip install qdc_sdk/qualcomm_device_cloud_sdk-0.2.3-py3-none-any.whl

      - name: Run QDC tests (${{ matrix.device }})
        run: |
          python scripts/snapdragon/qdc/run_qdc_jobs.py \
            --test all \
            --pkg-dir pkg-snapdragon/llama.cpp \
            --model-url "https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct-Q4_0.gguf" \
            --device ${{ matrix.device }}
        env:
          QDC_API_KEY: ${{ secrets.QDC_API_KEY }}

      - name: Cleanup
        if: always()
        run: rm -rf pkg-snapdragon qdc_sdk qdc_sdk.zip
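The new workflows guard against duplicate runs with the concurrency group `${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}`. GitHub expressions short-circuit like JavaScript and return operand values, which Python's `and`/`or` mirror. A minimal sketch (function name and sample values are illustrative, not part of the PR):

```python
def concurrency_group(workflow: str, head_ref: str, ref: str, run_id: str) -> str:
    # Emulates `${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}`.
    # On pull requests head_ref is non-empty, so the suffix is the ref and a
    # newer push cancels the in-flight run for the same branch; on pushes to
    # master head_ref is empty, so the unique run id is used and runs never
    # cancel each other.
    suffix = (head_ref and ref) or run_id
    return f"{workflow}-{suffix}"

# Pull request event: grouped by ref
print(concurrency_group("CI (snapdragon)", "my-branch", "refs/pull/42/merge", "1001"))
# → CI (snapdragon)-refs/pull/42/merge

# Push to master: grouped by the unique run id
print(concurrency_group("CI (snapdragon)", "", "refs/heads/master", "1002"))
# → CI (snapdragon)-1002
```

With `cancel-in-progress: true`, two runs landing in the same group means the older one is cancelled, which is the desired behavior for PR pushes but not for sequential master builds.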
49 changes: 18 additions & 31 deletions .github/workflows/build-android.yml
@@ -1,26 +1,24 @@
 name: CI (android)
 
 on:
-  workflow_dispatch: # allows manual triggering
+  workflow_dispatch:
   push:
     branches:
       - master
-    paths: [
-      '.github/workflows/build-android.yml',
-      '**/CMakeLists.txt',
-      '**/.cmake',
-      '**/*.h',
-      '**/*.hpp',
-      '**/*.c',
-      '**/*.cpp'
-    ]
+    paths:
+      - '.github/workflows/build-android.yml'
+      - '**/CMakeLists.txt'
+      - '**/.cmake'
+      - '**/*.h'
+      - '**/*.hpp'
+      - '**/*.c'
+      - '**/*.cpp'
 
   pull_request:
     types: [opened, synchronize, reopened]
-    paths: [
-      '.github/workflows/build-android.yml',
-      'examples/llama.android/**'
-    ]
+    paths:
+      - '.github/workflows/build-android.yml'
+      - 'examples/llama.android/**'
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
@@ -67,35 +65,24 @@ jobs:
     defaults:
       run:
         shell: bash
-    strategy:
-      matrix:
-        include:
-          - build: 'arm64-cpu'
-            defines: '-D ANDROID_ABI=arm64-v8a -D ANDROID_PLATFORM=android-31 -D CMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_ROOT}/build/cmake/android.toolchain.cmake -D GGML_NATIVE=OFF -DGGML_CPU_ARM_ARCH=armv8.5-a+fp16+i8mm -G Ninja -D LLAMA_OPENSSL=OFF -D GGML_OPENMP=OFF'
-          - build: 'arm64-snapdragon'
-            defines: '--preset arm64-android-snapdragon-release'
 
     steps:
       - name: Clone
         id: checkout
         uses: actions/checkout@v6
         with:
           fetch-depth: 0
           lfs: false
 
-      - name: Build Llama.CPP for Hexagon Android
-        id: build_llama_cpp_hexagon_android
+      - name: Build
+        id: ndk_build
         run: |
-          if [[ "${{ matrix.build }}" == "arm64-snapdragon" ]]; then
-            cp docs/backend/snapdragon/CMakeUserPresets.json .
-          fi
-          cmake ${{ matrix.defines }} -B build
+          cmake -D ANDROID_ABI=arm64-v8a -D ANDROID_PLATFORM=android-31 -D CMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_ROOT}/build/cmake/android.toolchain.cmake -D GGML_NATIVE=OFF -DGGML_CPU_ARM_ARCH=armv8.5-a+fp16+i8mm -G Ninja -D LLAMA_OPENSSL=OFF -D GGML_OPENMP=OFF -B build
           cmake --build build
           cmake --install build --prefix pkg-adb/llama.cpp
 
-      - name: Upload Llama.CPP Hexagon Android Build Artifact
-        if: ${{ always() && steps.build_llama_cpp_hexagon_android.outcome == 'success' }}
+      - name: Upload Android Build Artifact
+        if: ${{ always() && steps.ndk_build.outcome == 'success' }}
         uses: actions/upload-artifact@v6
         with:
-          name: llama-cpp-android-${{ matrix.build }}
+          name: llama-cpp-android-arm64-cpu
           path: pkg-adb/llama.cpp
120 changes: 120 additions & 0 deletions .github/workflows/build-openvino.yml
@@ -0,0 +1,120 @@
name: CI (openvino)

on:
  workflow_dispatch: # allows manual triggering
  push:
    branches:
      - master
    paths: [
      '.github/workflows/build-openvino.yml',
      '**/CMakeLists.txt',
      '**/.cmake',
      '**/*.h',
      '**/*.hpp',
      '**/*.c',
      '**/*.cpp',
    ]

  pull_request:
    types: [opened, synchronize, reopened]
    paths: [
      '.github/workflows/build-openvino.yml',
      'ggml/src/ggml-openvino/**'
    ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
  cancel-in-progress: true

env:
  GGML_NLOOP: 3
  GGML_N_THREADS: 1
  LLAMA_LOG_COLORS: 1
  LLAMA_LOG_PREFIX: 1
  LLAMA_LOG_TIMESTAMPS: 1

jobs:
  ubuntu-24-openvino:
    name: ubuntu-24-openvino-${{ matrix.openvino_device }}

    concurrency:
      group: openvino-${{ matrix.variant }}-${{ github.head_ref || github.ref }}
      cancel-in-progress: false

    strategy:
      matrix:
        include:
          - variant: cpu
            runner: '"ubuntu-24.04"'
            openvino_device: "CPU"
          - variant: gpu
            runner: '["self-hosted","Linux","Intel","OpenVINO"]'
            openvino_device: "GPU"

    runs-on: ${{ fromJSON(matrix.runner) }}

    env:
      # Sync versions in build-openvino.yml, build-self-hosted.yml, release.yml, build-cache.yml, .devops/openvino.Dockerfile
      OPENVINO_VERSION_MAJOR: "2026.0"
      OPENVINO_VERSION_FULL: "2026.0.0.20965.c6d6a13a886"

    steps:
      - name: Clone
        id: checkout
        uses: actions/checkout@v6

      - name: ccache
        if: runner.environment == 'github-hosted'
        uses: ggml-org/ccache-action@v1.2.21
        with:
          key: ubuntu-24-openvino-${{ matrix.variant }}-no-preset-v1
          evict-old-files: 1d
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

      - name: Dependencies
        id: depends
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential libssl-dev libtbb12 cmake ninja-build python3-pip
          sudo apt-get install -y ocl-icd-opencl-dev opencl-headers opencl-clhpp-headers intel-opencl-icd

      - name: Use OpenVINO Toolkit Cache
        if: runner.environment == 'github-hosted'
        uses: actions/cache@v5
        id: cache-openvino
        with:
          path: ./openvino_toolkit
          key: openvino-toolkit-v${{ env.OPENVINO_VERSION_FULL }}-${{ runner.os }}

      - name: Setup OpenVINO Toolkit
        if: steps.cache-openvino.outputs.cache-hit != 'true'
        uses: ./.github/actions/linux-setup-openvino
        with:
          path: ./openvino_toolkit
          version_major: ${{ env.OPENVINO_VERSION_MAJOR }}
          version_full: ${{ env.OPENVINO_VERSION_FULL }}

      - name: Install OpenVINO dependencies
        run: |
          cd ./openvino_toolkit
          chmod +x ./install_dependencies/install_openvino_dependencies.sh
          echo "Y" | sudo -E ./install_dependencies/install_openvino_dependencies.sh

      - name: Build
        id: cmake_build
        run: |
          source ./openvino_toolkit/setupvars.sh
          cmake -B build/ReleaseOV -G Ninja \
            -DCMAKE_BUILD_TYPE=Release \
            -DGGML_OPENVINO=ON
          time cmake --build build/ReleaseOV --config Release -j $(nproc)

      - name: Test
        id: cmake_test
        # TODO: fix and re-enable the `test-llama-archs` test below
        run: |
          cd ${{ github.workspace }}
          if [ "${{ matrix.openvino_device }}" = "GPU" ]; then
            export GGML_OPENVINO_DEVICE=GPU
          fi
          ctest --test-dir build/ReleaseOV -L main -E "test-llama-archs" --verbose --timeout 2000