Better error message for no GPU or incompatible GPU #818

@dkolsen-pgi

Given this small but useless test program:

#include <thrust/device_vector.h>
#include <thrust/sort.h>
int main() {
  thrust::device_vector<int> dv;
  thrust::sort(dv.begin(), dv.end());
}

When compiled with nvcc -arch=sm_80 tiny.cu and then run on a system that doesn't have any GPUs, the error message is:

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  radix_sort: failed on 1st step: cudaErrorInvalidDevice: invalid device ordinal
Aborted

When run on a system with a Volta GPU (sm_70), it fails with:

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  radix_sort: failed on 1st step: cudaErrorInvalidDeviceFunction: invalid device function
Aborted

I don't expect either of those situations to work. The program should crash. What I would like is a better error message, one that gives a naive user some clue about what is happening. The first case should say something about no GPU being available, and the second should mention something about an incompatible GPU (saying that you are trying to run an sm_80 program on an sm_70 GPU would be awesome).

Making such a change will improve the user experience and make it easier for users to troubleshoot problems.
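
As a rough illustration of the kind of message being asked for, here is a minimal host-side sketch. The helper `describe_gpu_problem` is hypothetical and not part of Thrust. It assumes the caller has already obtained the device count (for example from `cudaGetDeviceCount`) and the device's compute capability (for example `10 * major + minor` from `cudaGetDeviceProperties`), and that it knows which architecture the binary was built for. It also assumes the simplified rule that a GPU older than the build architecture cannot run the program, which matches the sm_80 on sm_70 case above but does not cover every build configuration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not a Thrust or CUDA API.
// device_count: number of visible GPUs (0 if none).
// built_sm:     architecture the binary was compiled for, e.g. 80 for sm_80.
// device_sm:    compute capability of the current GPU, e.g. 70 for sm_70.
// Returns an empty string when no problem is detected.
std::string describe_gpu_problem(int device_count, int built_sm, int device_sm) {
  std::ostringstream msg;
  if (device_count <= 0) {
    // Corresponds to the cudaErrorInvalidDevice case in the report.
    msg << "no CUDA-capable GPU is available on this system";
  } else if (device_sm < built_sm) {
    // Corresponds to the cudaErrorInvalidDeviceFunction case in the report.
    // Simplifying assumption: older GPUs cannot run code built for a newer one.
    msg << "incompatible GPU: this program was built for sm_" << built_sm
        << " but the GPU is sm_" << device_sm;
  }
  return msg.str();
}
```

A library could append such a string to the existing `what()` text when the first kernel launch fails, so the original CUDA error name stays visible alongside the friendlier explanation.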
