TSRM: make CG, EG, SCNG and AG compile-time offsets#22287
Conversation
saves an offset load every time they're accessed
| TSRM_ALIGNED_SIZE(sizeof(zend_compiler_globals)) + | ||
| TSRM_ALIGNED_SIZE(sizeof(zend_executor_globals)) + | ||
| TSRM_ALIGNED_SIZE(sizeof(zend_php_scanner_globals)) + | ||
| TSRM_ALIGNED_SIZE(zend_mm_globals_size())); // AG size, exposed through function call |
There was a problem hiding this comment.
really unhappy about this, same issue as below:
| #if defined(__cplusplus) || defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 202311L) | ||
| # define TSRM_STATIC_ASSERT(c, m) static_assert((c), m) | ||
| #elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L) /* C11 */ | ||
| # define TSRM_STATIC_ASSERT(c, m) _Static_assert((c), m) | ||
| #else | ||
| # define TSRM_STATIC_ASSERT(c, m) | ||
| #endif | ||
| TSRM_STATIC_ASSERT(TSRM_ALIGNED_SIZE(sizeof(tsrm_tls_entry)) == TSRM_FAST_RESERVED_BASE, | ||
| "TSRM_FAST_RESERVED_BASE must equal the tsrm_tls_entry header size"); |
There was a problem hiding this comment.
so this is a two-fold problem:
- duplicated ZEND_STATIC_ASSERT copy - tsrm can't use zend_portability, because zend_portability depends on tsrm.h
- tsrm_ls_entry size is only known in this file, not in tsrm.h consumers.
- AG size can be consumed at runtime, so it gets the last slot, but it's structurally the same problem
In C++ this would be a simple consteval function in the header, but I have no idea how to make it work in C. So I kept the definition private, but force it to the specific size it has.
There was a problem hiding this comment.
Possibly, the cache could be set to the end of tsrm_tls_entry so that code outside of TSRM.c do not have to be aware of the size of tsrm_tls_entry. Alternatively, set _tsrm_ls_cache to the start of tsrm_tls_entry, but allocate the front elements before it and use negative offsets.
There was a problem hiding this comment.
I don't think moving it to the end makes sense, it would just move the implementation size leak from the fast path to the slow TSRMG_BULK path.
The alternative might work, I'm not sure yet. Let me explore it.
There was a problem hiding this comment.
It's possible to get rid of the size leak, but the concept is harder for me to wrap my head around. I'll push the commit, but I'm not sure it's a good idea.
There was a problem hiding this comment.
I don't think moving it to the end makes sense, it would just move the implementation size leak from the fast path to the slow TSRMG_BULK path.
Indeed
It's possible to get rid of the size leak, but the concept is harder for me to wrap my head around. I'll push the commit, but I'm not sure it's a good idea.
This seems fine to me, but I let you chose what you prefer.
There was a problem hiding this comment.
That's not my decision to make. I'd probably take a small implementation size leak over the more complicated understanding, but that's subjective to me because I came up with the former x)
| #define ZEND_CGG_DIAGNOSTIC_IGNORED_END ZEND_DIAGNOSTIC_IGNORED_END | ||
|
|
||
| #if defined(__cplusplus) | ||
| #if defined(__cplusplus) || defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 202311L) |
There was a problem hiding this comment.
technically unrelated, but I didn't want to copy an incomplete definition into tsrm.c
There was a problem hiding this comment.
I'd be in favour, getting php-src and extensions to compile in extremely old environments is a pain anyway (I do rhel-7 builds for frankenphp). Should I replace this then?
There was a problem hiding this comment.
Should I replace this then?
Not yet, probably. See: https://news-web.php.net/php.internals/130714
| #if defined(__cplusplus) || defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 202311L) | ||
| # define TSRM_STATIC_ASSERT(c, m) static_assert((c), m) | ||
| #elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L) /* C11 */ | ||
| # define TSRM_STATIC_ASSERT(c, m) _Static_assert((c), m) | ||
| #else | ||
| # define TSRM_STATIC_ASSERT(c, m) | ||
| #endif | ||
| TSRM_STATIC_ASSERT(TSRM_ALIGNED_SIZE(sizeof(tsrm_tls_entry)) == TSRM_FAST_RESERVED_BASE, | ||
| "TSRM_FAST_RESERVED_BASE must equal the tsrm_tls_entry header size"); |
There was a problem hiding this comment.
I don't think moving it to the end makes sense, it would just move the implementation size leak from the fast path to the slow TSRMG_BULK path.
Indeed
It's possible to get rid of the size leak, but the concept is harder for me to wrap my head around. I'll push the commit, but I'm not sure it's a good idea.
This seems fine to me, but I let you chose what you prefer.
…ocating version of ts_allocate_fast_id_at)
arnaud-lb
left a comment
There was a problem hiding this comment.
Looks good to me otherwise!
saves an offset load every time they're accessed
fifth split from #22231
a bit of a different approach (keep everything on the existing heap allocations, just reordered) so we don't need to work around the apache module tls surplus problem and don't need to fall back to global-exec
it's not the entire gain of the other PR, but still something. measurements without the local-exec or -mtls-size change (which will benefit on top, together they restore around half the ZTS penalty over NTS):