PyTorch
Classes | |
struct | CUDAGuard |
A variant of DeviceGuard that is specialized for CUDA. More... | |
struct | CUDAMultiStreamGuard |
A variant of MultiStreamGuard that is specialized for CUDA. More... | |
class | CUDAStream |
struct | CUDAStreamGuard |
A variant of StreamGuard that is specialized for CUDA. More... | |
struct | OptionalCUDAGuard |
A variant of OptionalDeviceGuard that is specialized for CUDA. More... | |
struct | OptionalCUDAStreamGuard |
A variant of OptionalStreamGuard that is specialized for CUDA. More... | |
Functions | |
CUDAStream | getStreamFromPool (const bool isHighPriority=false, DeviceIndex device=-1) |
Get a new stream from the CUDA stream pool. More... | |
CUDAStream | getStreamFromExternal (cudaStream_t ext_stream, DeviceIndex device_index) |
Get a CUDAStream from an externally allocated one. More... | |
CUDAStream | getDefaultCUDAStream (DeviceIndex device_index=-1) |
Get the default CUDA stream, for the passed CUDA device, or for the current device if no device index is passed. More... | |
CUDAStream | getCurrentCUDAStream (DeviceIndex device_index=-1) |
Get the current CUDA stream, for the passed CUDA device, or for the current device if no device index is passed. More... | |
void | setCurrentCUDAStream (CUDAStream stream) |
Set the passed-in stream as the current stream on that stream's device. More... | |
std::ostream & | operator<< (std::ostream &stream, const CUDAStream &s) |
CUDAStream c10::cuda::getCurrentCUDAStream(DeviceIndex device_index = -1)
Get the current CUDA stream, for the passed CUDA device, or for the current device if no device index is passed.
The current CUDA stream will usually be the default CUDA stream for the device, but it may be different if someone called 'setCurrentCUDAStream' or used 'StreamGuard' or 'CUDAStreamGuard'.
CUDAStream c10::cuda::getDefaultCUDAStream(DeviceIndex device_index = -1)
Get the default CUDA stream, for the passed CUDA device, or for the current device if no device index is passed.
The default stream is where most computation occurs when you aren't explicitly using streams.
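The relationship between the current and default stream can be illustrated with a small stand-alone model. This is a sketch only, not the c10::cuda implementation: integers stand in for devices and streams, and every name in the `toy_streams` namespace (including the device count and the default stream id of 0) is a hypothetical choice made for illustration.

```cpp
#include <array>
#include <cassert>

// Toy model only: plain integers stand in for devices and streams.
// None of these names are part of c10::cuda.
namespace toy_streams {

constexpr int kNumDevices = 2;       // hypothetical device count
constexpr int kDefaultStreamId = 0;  // hypothetical id of each device's default stream

struct State {
  int current_device = 0;
  // One "current stream" slot per device, zero-initialized, so every
  // device starts with its default stream as the current stream.
  std::array<int, kNumDevices> current_stream{};
};

// Map the sentinel -1 to the current device, in the spirit of the
// device_index = -1 defaults documented above.
inline int resolve(const State& s, int device_index) {
  if (device_index == -1) return s.current_device;
  assert(device_index >= 0 && device_index < kNumDevices);
  return device_index;
}

// The default stream never changes in this model.
inline int get_default_stream(const State& s, int device_index = -1) {
  (void)resolve(s, device_index);  // only validates the index
  return kDefaultStreamId;
}

// The current stream is whatever was last stored for that device.
inline int get_current_stream(const State& s, int device_index = -1) {
  return s.current_stream[resolve(s, device_index)];
}

}  // namespace toy_streams
```

In this model the two getters agree until something overwrites a device's slot, which is the role that 'setCurrentCUDAStream' and the stream guards play in the real API.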
CUDAStream c10::cuda::getStreamFromExternal(cudaStream_t ext_stream, DeviceIndex device_index)
Get a CUDAStream from an externally allocated one.
This is mainly for interoperability with other libraries, where we want to operate on a stream that was not allocated by torch, for data exchange or similar purposes.
CUDAStream c10::cuda::getStreamFromPool(const bool isHighPriority = false, DeviceIndex device = -1)
Get a new stream from the CUDA stream pool.
You can think of this as "creating" a new stream, but no such creation actually happens; instead, streams are preallocated from the pool and returned in a round-robin fashion.
You can request a stream from the high priority pool by setting isHighPriority to true, or a stream for a specific device by setting device (which defaults to the current CUDA device).
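The round-robin behavior described above can be sketched with a stand-alone model. This is an illustration under stated assumptions rather than PyTorch's code: integer ids stand in for preallocated streams, and the class names, pool size of 3, and id ranges are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: integer ids stand in for preallocated streams.
// Pool sizes and names are hypothetical, not c10::cuda values.
class ToyStreamPool {
 public:
  ToyStreamPool(int first_id, std::size_t size) {
    assert(size > 0);
    for (std::size_t i = 0; i < size; ++i) {
      ids_.push_back(first_id + static_cast<int>(i));  // "preallocate" up front
    }
  }

  // Hand out the next id and wrap around at the end of the pool.
  // Nothing is created here; callers only receive existing ids.
  int next() {
    int id = ids_[cursor_];
    cursor_ = (cursor_ + 1) % ids_.size();
    return id;
  }

 private:
  std::vector<int> ids_;
  std::size_t cursor_ = 0;
};

// One low-priority and one high-priority pool, selected by a flag
// in the spirit of the isHighPriority parameter.
struct ToyPools {
  ToyStreamPool low{100, 3};
  ToyStreamPool high{200, 3};

  int get(bool is_high_priority = false) {
    return is_high_priority ? high.next() : low.next();
  }
};
```

One consequence visible in this model: once more requests have been made than the pool holds, a later caller receives an id that an earlier caller may still be using, so a stream obtained this way should not be assumed to be exclusive.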
std::ostream & c10::cuda::operator<<(std::ostream & stream, const CUDAStream & s)
void c10::cuda::setCurrentCUDAStream(CUDAStream stream)
Set the passed-in stream as the current stream on that stream's device.
Yes, you read that right: this function has nothing to do with the current device. It changes the current stream of whichever device the passed stream belongs to.
Confused? Avoid using this function; prefer 'CUDAStreamGuard' instead (which switches both your current device and current stream in the way you expect, and resets them to their original state afterwards).
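The difference between setting a stream directly and using a guard can be sketched with a stand-alone model. This is not the c10::cuda implementation: integers stand in for devices and streams, and every name in the `toy_guard` namespace is hypothetical. It only illustrates the save, switch, and restore pattern that the text attributes to 'CUDAStreamGuard'.

```cpp
#include <array>
#include <cassert>

// Toy model only: integers stand in for devices and streams, and the
// names below are hypothetical, not the c10::cuda implementation.
namespace toy_guard {

constexpr int kNumDevices = 2;  // hypothetical device count

struct Stream {
  int device;  // device the stream belongs to
  int id;      // stream identifier; 0 models the default stream
};

struct State {
  int current_device = 0;
  std::array<int, kNumDevices> current_stream{};  // all slots start at 0
};

// Models the behavior described above: only the slot for the stream's
// own device changes; the current device is left alone.
inline void set_current_stream(State& s, Stream stream) {
  assert(stream.device >= 0 && stream.device < kNumDevices);
  s.current_stream[stream.device] = stream.id;
}

// RAII guard: switches device and stream, restores both on scope exit.
class StreamGuard {
 public:
  StreamGuard(State& s, Stream stream)
      : state_(s),
        saved_device_(s.current_device),
        guarded_device_(stream.device),
        saved_stream_(s.current_stream[stream.device]) {
    state_.current_device = stream.device;
    state_.current_stream[stream.device] = stream.id;
  }

  ~StreamGuard() {
    state_.current_stream[guarded_device_] = saved_stream_;
    state_.current_device = saved_device_;
  }

  // Non-copyable so the saved state is restored exactly once.
  StreamGuard(const StreamGuard&) = delete;
  StreamGuard& operator=(const StreamGuard&) = delete;

 private:
  State& state_;
  int saved_device_;
  int guarded_device_;
  int saved_stream_;
};

}  // namespace toy_guard
```

In this model, calling `set_current_stream` with a stream on device 1 while device 0 is current leaves the current device at 0, so code that then asks for "the current stream" without a device index still sees device 0's stream. The guard avoids that surprise by switching the device as well and undoing both changes when it goes out of scope.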