Using device 3 (rank 3, local rank 3, local size 4) : Tesla K80
Using device 2 (rank 2, local rank 2, local size 4) : Tesla K80
Using device 0 (rank 0, local rank 0, local size 4) : Tesla K80
Using device 1 (rank 1, local rank 1, local size 4) : Tesla K80
 running on    4 total cores
 distrk:  each k-point on    4 cores,    1 groups
 distr:  one band on    1 cores,    4 groups
 using from now: INCAR     
  
 *******************************************************************************
  You are running the GPU port of VASP! When publishing results obtained with
  this version, please cite:
   - M. Hacene et al., http://dx.doi.org/10.1002/jcc.23096
   - M. Hutchinson and M. Widom, http://dx.doi.org/10.1016/j.cpc.2012.02.017
  
  in addition to the usual required citations (see manual).
  
  GPU developers: A. Anciaux-Sedrakian, C. Angerer, and M. Hutchinson.
 *******************************************************************************
  
 vasp.6.1.1 19Jun20 (build Jun 25 2020 02:42:34) complex                        
  
 MD_VERSION_INFO: Compiled 2020-06-25T09:42:34-UTC in devlin.sd.materialsdesign.
 com:/home/medea2/data/build/wwolf/vasp6.1.1/13539/x86_64/src/src/build/gpu from
  svn 13539
 
 This VASP executable licensed from Materials Design, Inc.
 
 POSCAR found :  2 types and     241 ions
 LDA part: xc-table for Pade appr. of Perdew
  
 WARNING: The GPU port of VASP has been extensively
 tested for: ALGO=Normal, Fast, and VeryFast.
 Other algorithms may produce incorrect results or
 yield suboptimal performance. Handle with care!
  
 POSCAR, INCAR and KPOINTS ok, starting setup
creating 32 CUDA streams...
creating 32 CUFFT plans with grid size 160 x 150 x 150...
creating 32 CUDA streams...
creating 32 CUDA streams...
creating 32 CUDA streams...
creating 32 CUFFT plans with grid size 160 x 150 x 150...
creating 32 CUFFT plans with grid size 160 x 150 x 150...
creating 32 CUFFT plans with grid size 160 x 150 x 150...

CUFFT Error in cuda_fft.cu, line 99: CUFFT_ALLOC_FAILED
 Failed to create CUFFT plan!

CUFFT Error in cuda_fft.cu, line 105: CUFFT_ALLOC_FAILED
 Failed to create CUFFT plan!
*****************************
Error running VASP parallel with MPI

#!/bin/bash
cd "/home/user/MD/TaskServer/Tasks/172.16.0.48-32000-task10539"
export PATH="/home/user/MD/Linux-x86_64/IntelMPI5/bin:$PATH"
export LD_LIBRARY_PATH="/home/user/MD/Linux-x86_64/IntelMPI5/lib:/home/user/MD/TaskServer/Tools/vasp-gpu6.1.1/Linux-x86_64:$LD_LIBRARY_PATH"
"/home/user/MD/Linux-x86_64/IntelMPI5/bin/mpirun" -r ssh  -np 4 "/home/user/MD/TaskServer/Tools/vasp-gpu6.1.1/Linux-x86_64/vasp_gpu"

child process exited abnormally
*****************************