39 commits
9811b34
EP PyTorch: NCCL EP backend + autograd ops + tests, route zero_copy v…
phu0ngng Jun 9, 2026
d10f0ce
EP PyTorch: wire maybe_make_window into per-step ops for zero_copy
phu0ngng Jun 9, 2026
3509e91
EP PyTorch: merge EpHandle into EpBuffer; ep_dispatch/ep_combine take…
phu0ngng Jun 9, 2026
8409e2b
EP PyTorch example: drop stale ep_group kwarg from EpBuffer call
phu0ngng Jun 10, 2026
b8480f7
EP PyTorch: drop ep_scope; ep_finalize is optional with atexit fallback
phu0ngng Jun 10, 2026
87a1ba3
EP PyTorch: restrict payload dtype to bf16; refresh stale ep.cpp comm…
phu0ngng Jun 10, 2026
5444954
EP PyTorch: clear pylint warnings in ep.py (broad-except suppression,…
phu0ngng Jun 10, 2026
d5a6c09
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 10, 2026
2e9a89e
EP PyTorch: skip combine_in staging copy in non-zero-copy mode
phu0ngng Jun 10, 2026
e9cdaed
EP PyTorch: gate symm-mem slot allocation on zero-copy; require exper…
phu0ngng Jun 10, 2026
5f29d90
EP PyTorch: rename buffer slots to dispatch_/combine_symm_buf; drop g…
phu0ngng Jun 10, 2026
91edff0
EP PyTorch: warn that ep_bootstrap(zero_copy=True) is experimental
phu0ngng Jun 10, 2026
0415e3a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 11, 2026
bab2bdc
EP PyTorch: wire test_ep.py into L1 distributed QA suite
phu0ngng Jun 11, 2026
6696030
EP PyTorch: validate contiguity of dispatch/combine inputs and topk_w…
phu0ngng Jun 11, 2026
090d05b
EP PyTorch: check topk_idx/topk_weights token count matches tokens in…
phu0ngng Jun 11, 2026
926bef2
EP PyTorch: move symm-mem allocation out of EpBuffer; make ep_dispatc…
phu0ngng Jun 12, 2026
2ca14a7
EP PyTorch: enforce contiguous caller-supplied EP buffers in C++ and …
phu0ngng Jun 12, 2026
8cb815f
EP PyTorch: store autograd ctx tensors via save_for_backward, mark EP…
phu0ngng Jun 23, 2026
53a3834
EP PyTorch: reword EP buffer comments and drop redundant recv_topk_we…
phu0ngng Jun 23, 2026
a4cb0cb
EP PyTorch: rename build env var NVTE_BUILD_WITH_NCCL_EP to NVTE_WITH…
phu0ngng Jun 23, 2026
8c3a222
EP PyTorch: source ep_combine backward grad from EpBuffer symm-mem un…
phu0ngng Jun 24, 2026
ddbfed0
EP PyTorch: add zero-copy test pass over dispatch/combine/1f1b autogr…
phu0ngng Jun 24, 2026
bcf9bb0
EP PyTorch: rename _zero_copy_capable test marker to _zero_copy_test_…
phu0ngng Jun 24, 2026
a6e8768
EP PyTorch: parametrize test_dispatch_autograd over recv-buffer cases…
phu0ngng Jun 24, 2026
05068bc
EP PyTorch: drop unused EpBuffer.from_external; token_counts is alway…
phu0ngng Jun 24, 2026
e33e63f
EP PyTorch: EpBuffer owns dispatch recv-output symm buffers in zero-c…
phu0ngng Jun 24, 2026
31db373
EP PyTorch: stash combine-bwd grad scatter target as plain ctx attrib…
phu0ngng Jun 24, 2026
9496515
EP PyTorch: drop unused EpBuffer.record_stream
phu0ngng Jun 24, 2026
d4f8973
EP PyTorch: add EpBuffer caller_provides_dispatch_recv_tokens and cal…
phu0ngng Jun 24, 2026
21dea50
EP PyTorch: add opt-in caller_provides_dispatch_recv_tokens/caller_pr…
phu0ngng Jun 24, 2026
63d35bd
EP PyTorch: rename combine grad buffer API to grad_expert_out across …
phu0ngng Jun 24, 2026
82aabaf
EP PyTorch: rename eo local to expert_out in tests and examples
phu0ngng Jun 24, 2026
6ec78a3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 24, 2026
3a023ef
EP PyTorch: move symm_mem_alloc to distributed.py and re-export from ep
phu0ngng Jun 25, 2026
3ff8a4e
EP PyTorch: replace caller_provides_* bools with dispatch_recv_tokens…
phu0ngng Jun 25, 2026
9a88584
Cleanup ep_moe.py
phu0ngng Jun 25, 2026
008c3d2
EP PyTorch: drop redundant warnings reimport and add _EpCombine.backw…
phu0ngng Jun 25, 2026
7d1cbc4
Update transformer_engine/pytorch/distributed.py
phu0ngng Jun 25, 2026
10 changes: 10 additions & 0 deletions build_tools/pytorch.py
@@ -77,6 +77,16 @@ def setup_pytorch_extension(

    setup_mpi_flags(include_dirs, cxx_flags)

    # Mirror the NCCL EP gate from setup.py / common CMake. When disabled, the
    # ep.cpp source no-ops at the #ifdef boundary; without the define it would
    # produce undefined references to nvte_ep_*.
    if bool(int(os.getenv("NVTE_WITH_NCCL_EP", "1"))):
        cxx_flags.append("-DNVTE_WITH_NCCL_EP")
        # PyTorch's symm-mem headers gate the NCCL_HAS_SYMMEM_* feature macros on
        # USE_NCCL. The EP extension shares the symm-mem NCCL comm with torch, so
        # it needs those macros visible.
        cxx_flags.append("-DUSE_NCCL")

    library_dirs = []
    libraries = []
    if bool(int(os.getenv("NVTE_ENABLE_NVSHMEM", 0))):
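
For readers who want to check how the gate above behaves for different environment values, here is a minimal standalone sketch. The helper name `nccl_ep_cxx_flags` and its `environ` parameter are hypothetical and not part of the PR. The sketch also assumes that `-DUSE_NCCL` is appended under the same `NVTE_WITH_NCCL_EP` condition, which the flattened diff suggests but does not prove.

```python
import os


def nccl_ep_cxx_flags(environ=None):
    """Hypothetical helper: return the defines implied by NVTE_WITH_NCCL_EP."""
    environ = os.environ if environ is None else environ
    flags = []
    # Same parsing as in the diff: the value is read as an integer and the
    # gate defaults to enabled ("1") when the variable is unset.
    if bool(int(environ.get("NVTE_WITH_NCCL_EP", "1"))):
        flags.append("-DNVTE_WITH_NCCL_EP")
        # Assumption: USE_NCCL is added only when the EP gate is on.
        flags.append("-DUSE_NCCL")
    return flags


if __name__ == "__main__":
    print(nccl_ep_cxx_flags({}))
    print(nccl_ep_cxx_flags({"NVTE_WITH_NCCL_EP": "0"}))
```

Because the value goes through `int()`, a setting such as `NVTE_WITH_NCCL_EP=true` would raise `ValueError` rather than enable the gate; only integer strings are accepted.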