AssertionError in OffHeapStorageArea During Concurrent Puts Under Memory Pressure with heap-offheap config #3306

@a-amir01

Description

There is a race condition in Ehcache's off-heap eviction logic that causes an AssertionError during concurrent put() operations when off-heap memory is at or near capacity.

When the cache is configured with heap and off-heap tiers (heap-offheap) and the off-heap tier is sized at about the total size of the data, leaving very little room for metadata overhead, put operations throw an AssertionError.
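To make the sizing concrete, here is a minimal sketch of the arithmetic behind "off-heap about the same size as the data". OffHeapSizing and its methods are hypothetical helpers written for this issue, not part of Ehcache, and the headroom factor is an assumption: 2.0 mirrors the doubled pool in testWithOffHeapExtraHeapSize in the reproducer. Whether any particular amount of headroom avoids the assertion is not established here.

```java
/** Hypothetical helper (not an Ehcache API): sizes an off-heap pool from the payload size. */
final class OffHeapSizing {
    private static final long BYTES_PER_MB = 1024L * 1024L;

    /** Rounds the payload up to whole megabytes, like the Math.ceil call in the reproducer. */
    static long payloadMegabytes(long totalDataBytes) {
        return (totalDataBytes + BYTES_PER_MB - 1) / BYTES_PER_MB;
    }

    /** Multiplies the rounded payload by an assumed headroom factor (1.0 means no headroom). */
    static long withHeadroomMegabytes(long totalDataBytes, double headroomFactor) {
        if (headroomFactor < 1.0) {
            throw new IllegalArgumentException("headroomFactor must be >= 1.0");
        }
        return (long) Math.ceil(payloadMegabytes(totalDataBytes) * headroomFactor);
    }

    public static void main(String[] args) {
        // 7.75 MB is the total data size reported by the benchmark in this issue.
        long totalDataBytes = (long) (7.75 * BYTES_PER_MB);
        System.out.println("Payload only:  " + payloadMegabytes(totalDataBytes) + " MB");
        System.out.println("2x headroom:   " + withHeadroomMegabytes(totalDataBytes, 2.0) + " MB");
    }
}
```

The result would be passed to ResourcePoolsBuilder.offheap(size, MemoryUnit.MB), as the unit test in this issue does. Payload-only sizing leaves the rounding remainder as the only spare room for keys and metadata.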

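A JMH benchmark also shows heap-offheap performing far worse than every other configuration in both throughput and latency: about 7,716 ops/sec on the Zipfian workload, versus 333,987 for heap-disk and 4,535,330 for heap-only.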

JMH BENCHMARK CONFIGURATION:
------------------------------------------------------------
Forks:             1 (separate JVM processes)
Threads per fork:  8
Warmup:            3 iterations × 1 s
Measurement:       8 iterations × 2 s
Modes:             Throughput, AverageTime
JVM Arguments:
  -Xms2g
  -Xmx2g
  -XX:+UseG1GC
  -XX:MaxDirectMemorySize=256m

============================================================
BENCHMARK: benchmarkCacheMixed (90% Hits / 10% Misses)
============================================================

TIER CONFIGURATIONS:
------------------------------------------------------------
Total Data Size:         7.75 MB (104 entries)
------------------------------------------------------------
Config Name              Heap Entries (Actual Size)      Off-heap  Disk
------------------------------------------------------------
HEAP_ONLY                  104 entries (7.75 MB   )      0 MB      0 MB
HEAP_OFFHEAP                94 entries (1.99 MB   )      6 MB      0 MB
HEAP_DISK                   94 entries (1.99 MB   )      0 MB     16 MB
THREE_TIER_TINY             82 entries (0.99 MB   )      3 MB     16 MB

THROUGHPUT (ops/sec - HIGHER is better):
--------------------------------------------------------------------------------------------
Config Name                                      ZIPFIAN                        UNIFORM
--------------------------------------------------------------------------------------------
HEAP_ONLY                       4,535,330 ±    479,886 (± 10.6%)       2,199,778 ±    130,353 (±  5.9%)
HEAP_OFFHEAP                        7,716 ±        323 (±  4.2%)           6,939 ±        390 (±  5.6%)
HEAP_DISK                         333,987 ±     14,978 (±  4.5%)         260,309 ±     67,632 (± 26.0%)
THREE_TIER_TINY                   139,686 ±      6,038 (±  4.3%)         137,248 ±     10,951 (±  8.0%)

LATENCY (µs/op - LOWER is better):
--------------------------------------------------------------------------------------------
Config Name                                      ZIPFIAN                        UNIFORM
--------------------------------------------------------------------------------------------
HEAP_ONLY                      1.697 ±    0.058 (±  3.4%)      4.167 ±    0.184 (±  4.4%)
HEAP_OFFHEAP                1003.665 ±   39.293 (±  3.9%)   1152.306 ±   53.827 (±  4.7%)
HEAP_DISK                     23.709 ±    1.449 (±  6.1%)     27.531 ±    1.077 (±  3.9%)
THREE_TIER_TINY               55.848 ±    5.310 (±  9.5%)     58.942 ±    6.290 (± 10.7%)
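As a sanity check on the tables above, the throughput and latency figures can be cross-checked against each other. The sketch below assumes that JMH reports throughput summed over all 8 benchmark threads and average time per operation, so throughput should be roughly threads divided by latency. ThroughputCheck is a hypothetical helper, not part of the benchmark. For HEAP_OFFHEAP on the Zipfian workload, 8 threads at 1003.665 µs/op imply roughly 7,970 ops/sec against the measured 7,716, so the two tables agree and the slowdown is not a reporting artifact.

```java
/** Hypothetical helper: cross-checks the reported throughput against the reported latency. */
final class ThroughputCheck {

    /** Ops/sec implied by a per-operation latency in microseconds across concurrent threads. */
    static double impliedOpsPerSecond(int threads, double latencyMicros) {
        if (threads <= 0 || latencyMicros <= 0) {
            throw new IllegalArgumentException("threads and latency must be positive");
        }
        return threads * 1_000_000.0 / latencyMicros;
    }

    /** How many times slower the observed throughput is than the baseline. */
    static double slowdown(double baselineOps, double observedOps) {
        return baselineOps / observedOps;
    }

    public static void main(String[] args) {
        int threads = 8; // "Threads per fork" in the benchmark configuration
        String[] configs = {"HEAP_ONLY", "HEAP_OFFHEAP", "HEAP_DISK", "THREE_TIER_TINY"};
        // Zipfian columns copied from the tables in this issue.
        double[] latencyMicros = {1.697, 1003.665, 23.709, 55.848};
        double[] measuredOps = {4_535_330, 7_716, 333_987, 139_686};

        for (int i = 0; i < configs.length; i++) {
            System.out.printf("%-16s implied %,14.0f ops/s, measured %,14.0f ops/s%n",
                    configs[i], impliedOpsPerSecond(threads, latencyMicros[i]), measuredOps[i]);
        }
        System.out.printf("HEAP_OFFHEAP vs HEAP_DISK: %.1fx slower%n",
                slowdown(measuredOps[2], measuredOps[1]));
        System.out.printf("HEAP_OFFHEAP vs HEAP_ONLY: %.1fx slower%n",
                slowdown(measuredOps[0], measuredOps[1]));
    }
}
```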

java.lang.AssertionError
	at org.ehcache.shadow.org.terracotta.offheapstore.paging.OffHeapStorageArea.release(OffHeapStorageArea.java:592)
	at org.ehcache.shadow.org.terracotta.offheapstore.paging.OffHeapStorageArea.shrink(OffHeapStorageArea.java:696)
	at org.ehcache.shadow.org.terracotta.offheapstore.storage.OffHeapBufferStorageEngine.shrink(OffHeapBufferStorageEngine.java:250)
	at org.ehcache.shadow.org.terracotta.offheapstore.AbstractLockedOffHeapHashMap.shrink(AbstractLockedOffHeapHashMap.java:501)
	at org.ehcache.shadow.org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapMap.handleOversizeMappingException(AbstractConcurrentOffHeapMap.java:714)
	at org.ehcache.shadow.org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapMap.computeWithMetadata(AbstractConcurrentOffHeapMap.java:744)
	at org.ehcache.impl.internal.store.offheap.EhcacheConcurrentOffHeapClockCache.compute(EhcacheConcurrentOffHeapClockCache.java:153)
	at org.ehcache.impl.internal.store.offheap.AbstractOffHeapStore.computeWithRetry(AbstractOffHeapStore.java:1051)
	at org.ehcache.impl.internal.store.offheap.AbstractOffHeapStore.put(AbstractOffHeapStore.java:251)
	at org.ehcache.impl.internal.store.tiering.TieredStore.put(TieredStore.java:114)
	at org.ehcache.core.Ehcache.doPut(Ehcache.java:94)
	at org.ehcache.core.EhcacheBase.put(EhcacheBase.java:188)
	at com.test.benchmark.EhcacheTierBenchmark.benchmarkCacheMixed(EhcacheTierBenchmark.java:397)
	at com.test.benchmark.jmh_generated.EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.benchmarkCacheMixed_thrpt_jmhStub(EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.java:145)
	at com.test.benchmark.jmh_generated.EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.benchmarkCacheMixed_Throughput(EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.java:84)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:527)
	at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:504)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
========================================

<failure>

java.lang.RuntimeException: Benchmark failed - stopping execution
	at com.test.benchmark.EhcacheTierBenchmark.benchmarkCacheMixed(EhcacheTierBenchmark.java:407)
	at com.test.benchmark.jmh_generated.EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.benchmarkCacheMixed_thrpt_jmhStub(EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.java:145)
	at com.test.benchmark.jmh_generated.EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.benchmarkCacheMixed_Throughput(EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.java:84)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:527)
	at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:504)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.AssertionError
	at org.ehcache.shadow.org.terracotta.offheapstore.paging.OffHeapStorageArea.release(OffHeapStorageArea.java:592)
	at org.ehcache.shadow.org.terracotta.offheapstore.paging.OffHeapStorageArea.shrink(OffHeapStorageArea.java:696)
	at org.ehcache.shadow.org.terracotta.offheapstore.storage.OffHeapBufferStorageEngine.shrink(OffHeapBufferStorageEngine.java:250)
	at org.ehcache.shadow.org.terracotta.offheapstore.AbstractLockedOffHeapHashMap.shrink(AbstractLockedOffHeapHashMap.java:501)
	at org.ehcache.shadow.org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapMap.handleOversizeMappingException(AbstractConcurrentOffHeapMap.java:714)
	at org.ehcache.shadow.org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapMap.computeWithMetadata(AbstractConcurrentOffHeapMap.java:744)
	at org.ehcache.impl.internal.store.offheap.EhcacheConcurrentOffHeapClockCache.compute(EhcacheConcurrentOffHeapClockCache.java:153)
	at org.ehcache.impl.internal.store.offheap.AbstractOffHeapStore.computeWithRetry(AbstractOffHeapStore.java:1051)
	at org.ehcache.impl.internal.store.offheap.AbstractOffHeapStore.put(AbstractOffHeapStore.java:251)
	at org.ehcache.impl.internal.store.tiering.TieredStore.put(TieredStore.java:114)
	at org.ehcache.core.Ehcache.doPut(Ehcache.java:94)
	at org.ehcache.core.EhcacheBase.put(EhcacheBase.java:188)
	at com.test.benchmark.EhcacheTierBenchmark.benchmarkCacheMixed(EhcacheTierBenchmark.java:397)
	... 12 more


Benchmark had encountered error, and fail on error was requested
ERROR: org.openjdk.jmh.runner.RunnerException: Benchmark caught the exception
	at org.openjdk.jmh.runner.Runner.runBenchmarks(Runner.java:572)
	at org.openjdk.jmh.runner.Runner.internalRun(Runner.java:309)
	at org.openjdk.jmh.runner.Runner.run(Runner.java:208)
	at org.openjdk.jmh.Main.main(Main.java:71)
Caused by: org.openjdk.jmh.runner.BenchmarkException: Benchmark error during the run
	at org.openjdk.jmh.runner.BenchmarkHandler.runIteration(BenchmarkHandler.java:440)
	at org.openjdk.jmh.runner.BaseRunner.runBenchmark(BaseRunner.java:262)
	at org.openjdk.jmh.runner.BaseRunner.runBenchmark(BaseRunner.java:233)
	at org.openjdk.jmh.runner.BaseRunner.doSingle(BaseRunner.java:138)
	at org.openjdk.jmh.runner.BaseRunner.runBenchmarksForked(BaseRunner.java:75)
	at org.openjdk.jmh.runner.ForkedRunner.run(ForkedRunner.java:72)
	at org.openjdk.jmh.runner.ForkedMain.main(ForkedMain.java:86)
	Suppressed: java.lang.RuntimeException: Benchmark failed - stopping execution
		at com.test.benchmark.EhcacheTierBenchmark.benchmarkCacheMixed(EhcacheTierBenchmark.java:407)
		at com.test.benchmark.jmh_generated.EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.benchmarkCacheMixed_thrpt_jmhStub(EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.java:145)
		at com.test.benchmark.jmh_generated.EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.benchmarkCacheMixed_Throughput(EhcacheTierBenchmark_benchmarkCacheMixed_jmhTest.java:84)
		at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
		at java.base/java.lang.reflect.Method.invoke(Method.java:580)
		at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:527)
		at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:504)
		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
		at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
		at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
		at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
		at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
		at java.base/java.lang.Thread.run(Thread.java:1583)
	Caused by: java.lang.AssertionError
		at org.ehcache.shadow.org.terracotta.offheapstore.paging.OffHeapStorageArea.release(OffHeapStorageArea.java:592)
		at org.ehcache.shadow.org.terracotta.offheapstore.paging.OffHeapStorageArea.shrink(OffHeapStorageArea.java:696)
		at org.ehcache.shadow.org.terracotta.offheapstore.storage.OffHeapBufferStorageEngine.shrink(OffHeapBufferStorageEngine.java:250)
		at org.ehcache.shadow.org.terracotta.offheapstore.AbstractLockedOffHeapHashMap.shrink(AbstractLockedOffHeapHashMap.java:501)
		at org.ehcache.shadow.org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapMap.handleOversizeMappingException(AbstractConcurrentOffHeapMap.java:714)
		at org.ehcache.shadow.org.terracotta.offheapstore.concurrent.AbstractConcurrentOffHeapMap.computeWithMetadata(AbstractConcurrentOffHeapMap.java:744)
		at org.ehcache.impl.internal.store.offheap.EhcacheConcurrentOffHeapClockCache.compute(EhcacheConcurrentOffHeapClockCache.java:153)
		at org.ehcache.impl.internal.store.offheap.AbstractOffHeapStore.computeWithRetry(AbstractOffHeapStore.java:1051)
		at org.ehcache.impl.internal.store.offheap.AbstractOffHeapStore.put(AbstractOffHeapStore.java:251)
		at org.ehcache.impl.internal.store.tiering.TieredStore.put(TieredStore.java:114)
		at org.ehcache.core.Ehcache.doPut(Ehcache.java:94)
		at org.ehcache.core.EhcacheBase.put(EhcacheBase.java:188)
		at com.test.benchmark.EhcacheTierBenchmark.benchmarkCacheMixed(EhcacheTierBenchmark.java:397)
		... 12 more

Here is a unit test that reproduces the race condition:

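import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

import org.ehcache.Cache;
import org.ehcache.CacheManager;
import org.ehcache.config.builders.CacheConfigurationBuilder;
import org.ehcache.config.builders.CacheManagerBuilder;
import org.ehcache.config.builders.ResourcePoolsBuilder;
import org.ehcache.config.units.EntryUnit;
import org.ehcache.config.units.MemoryUnit;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

class EhcacheOffHeapMemoryExceptionTest {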

    private CacheManager cacheManager;
    private Cache<String, CachedData> cache;
    private Map<String, CachedData> testData;
    private ExecutorService executorService;
    private int totalEntries;
    private long totalDataBytes;
	private long offHeapBytes;

    @BeforeEach
    void setUp() {
        Random random = new Random(42);
        List<Integer> bodySizes = generateRealisticDistribution(100, random);

        totalEntries = bodySizes.size();
        totalDataBytes = bodySizes.stream().mapToLong(Integer::longValue).sum();

        System.out.println("=== Realistic HTTP Response Distribution ===");
        System.out.println("Total entries: " + totalEntries);
        System.out.println("Total data size: " + String.format("%.2f MB", totalDataBytes / (1024.0 * 1024.0)));
        System.out.println("Average entry size: " + String.format("%.2f KB", (totalDataBytes / totalEntries) / 1024.0));
        System.out.println("Min size: " + String.format("%.2f KB", bodySizes.stream().min(Integer::compareTo).orElse(0) / 1024.0));
        System.out.println("Max size: " + String.format("%.2f KB", bodySizes.stream().max(Integer::compareTo).orElse(0) / 1024.0));

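        // Despite its name, offHeapBytes holds the data size rounded up to whole megabytes;
        // it is passed to offheap(...) with MemoryUnit.MB below.
        offHeapBytes = (long) Math.ceil(totalDataBytes / (1024.0 * 1024.0));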
        cacheManager = CacheManagerBuilder.newCacheManagerBuilder().build(true);

        // Generate test data with real body sizes
        testData = new HashMap<>();
        Random rnd = new Random(42);
        for (int i = 0; i < bodySizes.size(); i++) {
            String key = "key-" + i;
            byte[] payload = new byte[bodySizes.get(i)];
            rnd.nextBytes(payload); // Random data, but real size distribution
            testData.put(key, new CachedData(payload));
        }

        executorService = Executors.newFixedThreadPool(16);
    }

    @AfterEach
    void tearDown() {
        if (executorService != null) {
            executorService.shutdownNow();
        }
        if (cacheManager != null) {
            cacheManager.close();
        }
    }

	@Test
	void testWithOffHeapExtraHeapSize() throws InterruptedException {
		long offHeapSize = offHeapBytes * 2;
		ResourcePoolsBuilder pools = ResourcePoolsBuilder.newResourcePoolsBuilder()
				.heap(2, EntryUnit.ENTRIES)
				.offheap(offHeapSize, MemoryUnit.MB);

		cache = cacheManager.createCache("benchmark-cache",
				CacheConfigurationBuilder.newCacheConfigurationBuilder(
								String.class,
								CachedData.class,
								pools
						)
						.build()
		);

		for (Map.Entry<String, CachedData> entry : testData.entrySet())
			cache.put(entry.getKey(), entry.getValue());

		int numThreads = 8;
		int operationsPerThread = 100000;
		printConfiguration(numThreads, operationsPerThread, offHeapSize);
		runTest(numThreads, operationsPerThread, offHeapSize, true);
	}

    @Test
    void testWithOffHeapMatchingDataSize() throws InterruptedException {
	    ResourcePoolsBuilder pools = ResourcePoolsBuilder.newResourcePoolsBuilder()
			    .heap(2, EntryUnit.ENTRIES)
			    .offheap(offHeapBytes, MemoryUnit.MB);

	    cache = cacheManager.createCache("benchmark-cache",
			    CacheConfigurationBuilder.newCacheConfigurationBuilder(
							    String.class,
							    CachedData.class,
							    pools
					    )
					    .build()
	    );

	    for (Map.Entry<String, CachedData> entry : testData.entrySet())
		    cache.put(entry.getKey(), entry.getValue());

        int numThreads = 8;
        int operationsPerThread = 100000;
	    printConfiguration(numThreads, operationsPerThread, offHeapBytes);
		runTest(numThreads, operationsPerThread, offHeapBytes, true);
    }

	@Test
	void testWithOffHeapHalfDataSize() throws InterruptedException {
		long offHeapSize = offHeapBytes / 2;
		ResourcePoolsBuilder pools = ResourcePoolsBuilder.newResourcePoolsBuilder()
				.heap(2, EntryUnit.ENTRIES)
				.offheap(offHeapSize, MemoryUnit.MB);

		cache = cacheManager.createCache("benchmark-cache",
				CacheConfigurationBuilder.newCacheConfigurationBuilder(
								String.class,
								CachedData.class,
								pools
						)
						.build()
		);

		for (Map.Entry<String, CachedData> entry : testData.entrySet())
			cache.put(entry.getKey(), entry.getValue());

		int numThreads = 10;
		int operationsPerThread = 200;
		printConfiguration(numThreads, operationsPerThread, offHeapSize);
		for (int i = 0; i < 50; i++)
			runTest(numThreads, operationsPerThread, offHeapSize, false);
	}

	private void runTest(int numThreads, int operationsPerThread, long offHeapBytes, boolean simulateMiss) throws InterruptedException {
		AtomicBoolean stopFlag = new AtomicBoolean(false);
		AtomicReference<Throwable> firstError = new AtomicReference<>(null);
		CountDownLatch startLatch = new CountDownLatch(1);
		CountDownLatch completionLatch = new CountDownLatch(numThreads);

		// Launch threads that mimic benchmarkCacheMixed behavior
		for (int threadId = 0; threadId < numThreads; threadId++) {
		    final int finalThreadId = threadId;
		    executorService.submit(() -> {
		        try {
			        try {
				        startLatch.await();
			        } catch (InterruptedException e) {
				        throw new RuntimeException(e);
			        }

			        Random random = new Random(42 + finalThreadId);
		            String[] keys = testData.keySet().toArray(new String[0]);

		            for (int i = 0; i < operationsPerThread; i++) {
		                if (stopFlag.get())
		                    break;

		                try {
		                    // Select key uniformly (simplified from Zipfian for reproducibility)
		                    String key = keys[random.nextInt(keys.length)];

		                    // 30% of the time, simulate cache miss (3x more aggressive than benchmark's 10%)
		                    if (simulateMiss && random.nextInt(100) < 30)
		                        cache.remove(key);

		                    // Try to get from cache (70% hit, 30% miss)
		                    CachedData data = cache.get(key);

		                    if (data == null) {
		                        // Cache miss - fetch from "backend" and populate cache
		                        data = testData.get(key);
								cache.put(key, data);  // THIS IS WHERE THE EXCEPTION OCCURS

		                    }

		                    if (i > 0 && i % 10000 == 0)
		                        System.out.println("Thread-" + finalThreadId + ": " + i + " operations completed");

		                } catch (AssertionError e) {
		                    firstError.compareAndSet(null, e);
		                    stopFlag.set(true);
		                    executorService.shutdownNow();
		                    break;
		                }
		            }
		        } finally {
		            completionLatch.countDown();
		        }
		    });
		}

		startLatch.countDown();
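		// Longer timeout for more operations; a timeout must not be mistaken for a passing run
		if (!completionLatch.await(300, TimeUnit.SECONDS))
		    throw new IllegalStateException("Worker threads did not finish within 300 seconds");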

		// If an error was detected, fail the test by re-throwing it
		Throwable error = firstError.get();
		if (error != null) {
		    System.err.println("\nTest FAILED due to AssertionError in worker thread.");
		    if (error instanceof AssertionError) {
		        throw (AssertionError) error;
		    } else {
		        throw new AssertionError("Test failed with unexpected error", error);
		    }
		}
	}

	private void printConfiguration(int numThreads, int operationsPerThread, long offHeapBytes) {
		System.out.println("========================================");
		System.out.println("Configuration:");
		System.out.println("  Heap: 2 entries");
		System.out.println("  Off-heap: " + offHeapBytes + " MB");
		System.out.println("  Working set: " + totalEntries + " entries (" +
		                 String.format("%.2f", totalDataBytes / (1024.0 * 1024.0)) + " MB)");
		System.out.println("  Threads: " + numThreads);
		System.out.println("  Operations per thread: " + operationsPerThread);
		System.out.println("  Total operations: " + (numThreads * operationsPerThread));
		System.out.println("========================================\n");
	}

	/**
	 * Generates body sizes with realistic HTTP response distribution.
	 * 40% small (1KB - 20KB) - typical JSON/API responses
	 * 35% medium (20KB - 100KB) - HTML pages, small images
	 * 20% large (100KB - 500KB) - larger responses
	 * 5% very large (500KB - 2MB) - big payloads
	 */
	private List<Integer> generateRealisticDistribution(int count, Random random) {
		List<Integer> sizes = new ArrayList<>();

		for (int i = 0; i < count; i++) {
			int category = random.nextInt(100);
			int size;
			if (category < 40)
				size = 1024 + random.nextInt(19 * 1024);
			else if (category < 75)
				size = 20 * 1024 + random.nextInt(80 * 1024);
			else if (category < 95)
				size = 100 * 1024 + random.nextInt(400 * 1024);
			else
				size = 500 * 1024 + random.nextInt(1536 * 1024);
			sizes.add(size);
		}

		return sizes;
	}

    record CachedData(
        byte[] body
    ) implements java.io.Serializable {
    }
}
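
For reference, the expected working set produced by generateRealisticDistribution can be worked out from its four buckets. The sketch below is only an expectation: the seeded sample in the test (Random(42), 100 entries) will differ, and with about five very large entries the variance is high. ExpectedWorkingSet is a hypothetical helper, and the figures cover the raw byte[] payload only, not serialization or off-heap metadata overhead.

```java
/** Hypothetical helper: expected payload size for the bucketed distribution in the test. */
final class ExpectedWorkingSet {

    /** Mean of lower + random.nextInt(span), which is uniform over [lower, lower + span - 1]. */
    static double bucketMean(int lower, int span) {
        return lower + (span - 1) / 2.0;
    }

    /** Weighted mean entry size in bytes, using the 40/35/20/5 percent split from the test. */
    static double expectedEntryBytes() {
        return 0.40 * bucketMean(1024, 19 * 1024)           // small
             + 0.35 * bucketMean(20 * 1024, 80 * 1024)      // medium
             + 0.20 * bucketMean(100 * 1024, 400 * 1024)    // large
             + 0.05 * bucketMean(500 * 1024, 1536 * 1024);  // very large
    }

    public static void main(String[] args) {
        int entries = 100; // count passed to generateRealisticDistribution in setUp()
        double perEntry = expectedEntryBytes();
        double totalMb = perEntry * entries / (1024.0 * 1024.0);
        System.out.printf("Expected entry size: %.1f KB%n", perEntry / 1024.0);
        System.out.printf("Expected working set for %d entries: %.2f MB%n", entries, totalMb);
    }
}
```

The very large bucket holds only 5% of the entries but contributes the largest share of the expected bytes, so a few entries dominate how full the off-heap tier gets.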
