add PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

Aurel Istrate requested to merge pytorch_2.1.2_cuda into master

Created by: bedroge

The installation worked fine on other node types, but on the V100 nodes the test_cuda_expandable_segments test failed. I've ignored it for now by using --ignore-test-failure, which still reports the failure but lets the installation continue:

== testing...

WARNING: Test failure ignored: 'Test ended with failures! Exit code: 1\nFailed tests (suites/files):\n+ test_cuda_expandable_segments 1/1'

== ... (took 9 hours 26 mins 28 secs)

edit: actually, I just noticed that there was a failing test on zen3 as well, but that one was automatically ignored by EB (not sure why that didn't happen for the other one):

WARNING: 1 test failure, 0 test errors (out of 209672):
test_quantization 1/1 (1 failed, 1021 passed, 82 skipped, 3 rerun)

The PyTorch test suite is known to include some flaky tests, which may fail depending on the specifics of the system or the context in which they are run. For this PyTorch installation, EasyBuild allows up to 2 tests to fail. We recommend to double check that the failing tests listed above are known to be flaky, or do not affect your intended usage of PyTorch. In case of doubt, reach out to the EasyBuild community (via GitHub, Slack, or mailing list).
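
For reference, the tolerance check described in that message can be illustrated roughly as follows. This is only a minimal sketch and not EasyBuild's actual implementation: the helper name `within_tolerance`, the regular expression, and the choice to count failures and errors together are assumptions made for illustration. The limit of 2 is taken from the message above. It does not explain why the V100 failure was not ignored automatically.

```python
import re

# Hypothetical pattern for the summary line printed in the log above;
# the real parsing done by EasyBuild may differ.
SUMMARY_RE = re.compile(
    r"(\d+) test failures?, (\d+) test errors? \(out of (\d+)\)"
)


def within_tolerance(summary_line, max_failed_tests=2):
    """Return True if the failure count stays within the allowed limit.

    Assumption: failures and errors are added together before being
    compared against the limit.
    """
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        raise ValueError("no test summary found in: %r" % summary_line)
    failures, errors, _total = (int(group) for group in match.groups())
    return failures + errors <= max_failed_tests


line = "WARNING: 1 test failure, 0 test errors (out of 209672):"
print(within_tolerance(line))
```

With the zen3 summary line from above, one failure falls within the limit of 2, which is consistent with the failure only being reported as a warning there.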
