Calculate a picture with CUDA and display it directly with OpenGL
I would like to write a program that calculates a picture (actually a hologram for a Spatial Light Modulator (SLM)) in real time. The picture should be calculated on the GPU and then displayed from there directly on the screen (800x600 pixels). I would like to use CUDA and OpenGL. I wrote a small example program that is supposed to display a checkerboard on the screen. It does not work because I don't know how to pass the picture from CUDA to OpenGL. In particular, I do not know what an image resource is. How do I declare it? How do I assign the calculated picture to it?
Here is my code:
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
#include <GL/glew.h>
#include <GL/freeglut.h>
#include "cuda_gl_interop.h"
/* Create checkerboard texture */
#define checkImageWidth 1024
#define checkImageHeight 1024
#define SIZE_X 1024
#define SIZE_Y 1024
static GLubyte checkImage[ 1024 ][ 1024 ][ 1 ];
/*static GLubyte checkImage[1024][1024][1];*/
static GLuint texName;
// Texture reference for 2D float texture
float tex[ 1024 ][ 1024 ];
float dA[ 1024 * 1024 ];
// 2D float texture
texture<float, cudaTextureType2D, cudaReadModeElementType> texRef;
float *d_A;
size_t dsize = 1024 * 1024 * sizeof( float );
struct mystruct
{
    int x;
    int y;
};
void makeCheckImage( void )
{
    int i, j, c;
    for( i = 0; i < 600; i++ )
    {
        for( j = 0; j < 800; j++ )
        {
            c = ( ( ( ( i % 2 ) == 0 ) ) ^ ( j % 2 == 0 ) ) * 255;
            checkImage[ i ][ j ][ 0 ] = (GLubyte)c;
        }
    }
}
__global__ void cudaMakeCheckImage( float *c )
{
    int col = threadIdx.x + blockIdx.x * blockDim.x;
    int row = threadIdx.y + blockIdx.y * blockDim.y;
    int index = col + row * 1024;
    if( col < 1024 && row < 1024 )
    {
        c[ index ] = ( ( ( ( col % 2 ) == 0 ) ) ^ ( row % 2 == 0 ) ) * 255;
    }
}
void init( void )
{
    glClearColor( 0.0, 0.0, 0.0, 0.0 );
    glShadeModel( GL_FLAT );
    glEnable( GL_DEPTH_TEST );
    cudaMakeCheckImage<<<1024, 1024>>>( d_A );
    glPixelStorei( GL_UNPACK_ALIGNMENT, 1 );
    //makeCheckImage();
    glGenTextures( 1, &texName );
    glBindTexture( GL_TEXTURE_2D, texName );
    // set basic parameters
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_NEAREST );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_NEAREST );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST );
    glTexParameteri( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST );
    // Create texture data
    glTexImage2D( GL_TEXTURE_2D, 0, GL_RGB, checkImageWidth, checkImageHeight, 0, GL_LUMINANCE, GL_UNSIGNED_BYTE, checkImage );
    // Unbind the texture
    glBindTexture( GL_TEXTURE_2D, 0 );
    cudaMalloc( &d_A, dsize );
    cudaGraphicsResource* Res;
    // Allocate CUDA array in device memory
    cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc( 32, 0, 0, 0, cudaChannelFormatKindFloat );
    cudaArray* CuArr;
    cudaMallocArray( &CuArr, &channelDesc, 1024, 1024 );
    cudaError_t eError = cudaGraphicsGLRegisterImage( &Res, texName, GL_TEXTURE_2D, cudaGraphicsMapFlagsWriteDiscard );
    cudaGraphicsMapResources( 1, &Res, 0 );
    cudaMemcpy2DToArray( CuArr, 0, 0, d_A, 1024, 1024, 1024, cudaMemcpyDeviceToDevice );
    cudaGraphicsSubResourceGetMappedArray( &CuArr, Res, 0, 0 );
    cudaGraphicsUnmapResources( 1, &Res, 0 );
}
void display( void )
glClear( GL_COLOR_BUFFER_BIT
void keyboard( unsigned char key, int x, int y )
{
    switch( key )
    {
    case 27:
        exit( 0 );
        break;
    default:
        break;
    }
}
int main( int argc, char** argv )
GLUT_RGB
Is this the right way to do it, or do I have to use framebuffers? I would rather not, because I would like to keep it as simple as possible. What do I have to change to make it work?
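Independent of the interop question, one detail in the code above can be checked on the CPU. The kernel derives `row` from `threadIdx.y` and `blockIdx.y`, but the launch uses 1024 blocks of 1024 threads in a single dimension, so the `y` indices are always 0. The sketch below is plain C++, not CUDA: it replays the kernel's index arithmetic thread by thread and counts how many pixels get written. `Dim3`, `gridFor`, and `coveredPixels` are hypothetical names made up for this illustration, and the 16x16 block size is only an assumed, commonly used choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's dim3, reduced to two dimensions.
struct Dim3 { unsigned x; unsigned y; };

constexpr unsigned kWidth = 1024;   // image size hard-coded in the question's kernel
constexpr unsigned kHeight = 1024;

// Ceil-division: the smallest grid whose blocks cover the whole image.
Dim3 gridFor(Dim3 block) {
    assert(block.x > 0 && block.y > 0);
    return { (kWidth + block.x - 1) / block.x, (kHeight + block.y - 1) / block.y };
}

// Replays the kernel body for every thread of a launch, one after another,
// and returns how many distinct pixels were written.
std::size_t coveredPixels(Dim3 grid, Dim3 block) {
    std::vector<float> img(std::size_t(kWidth) * kHeight, -1.0f);  // -1 marks "untouched"
    for (unsigned by = 0; by < grid.y; ++by)
    for (unsigned bx = 0; bx < grid.x; ++bx)
    for (unsigned ty = 0; ty < block.y; ++ty)
    for (unsigned tx = 0; tx < block.x; ++tx) {
        unsigned col = tx + bx * block.x;   // threadIdx.x + blockIdx.x * blockDim.x
        unsigned row = ty + by * block.y;   // threadIdx.y + blockIdx.y * blockDim.y
        if (col < kWidth && row < kHeight)
            img[col + std::size_t(row) * kWidth] =
                float(((col % 2 == 0) ^ (row % 2 == 0)) * 255);
    }
    std::size_t written = 0;
    for (float v : img)
        if (v >= 0.0f) ++written;
    return written;
}
```

With the one-dimensional geometry from the question, `coveredPixels({1024, 1}, {1024, 1})`, only the 1024 pixels of row 0 are written. With a two-dimensional geometry, `coveredPixels(gridFor({16, 16}), {16, 16})`, all 1048576 pixels are covered. In CUDA the equivalent launch would pass two `dim3` values for the grid and block instead of two plain integers.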
1 Answer
I think you could look at DRM/DRI or the Linux flat framebuffer. See the DirectFB project: http://www.webos-internals.org/wiki/Directfb. You will need the fbDev0 module for that, and you may need to recompile your kernel with that module. I am assuming you are using Linux.
What you are trying to do here is bypass the whole API layer and manipulate the framebuffer directly. DRM is the kernel module that manages access to GPU resources, so you might be able to use that.
On Windows you can write a mini-filter driver that writes directly to the framebuffer, or you can use something like http://www.blackhat.com/presentations/win-usa-04/bh-win-04-butler.pdf. This is Direct Kernel Object Manipulation.
OK, I think this is a bit too advanced for me. At the moment I just want to get it working at all. I managed to get the code running and posted it as an answer to my question, but I only get a black frame instead of the black-and-white pattern I wanted. Can you please tell me what I am doing wrong?
– Marcel Rudolf
Sep 17 '18 at 15:30
I am not sure what you tried. Are you saying that you tried what I suggested and are not getting the right output, or that you are not getting the right output from the code in your original post?
– Paritosh Kulkarni
Sep 17 '18 at 15:46
I was trying to get the right output from the code I posted as an answer. The links you sent me cover topics where I don't know what to do. What I would actually like is working code... I don't think I should have to write drivers. There must be a solution using just CUDA and OpenGL, don't you think?
– Marcel Rudolf
Sep 17 '18 at 16:08
Basically, I was responding to your remark that you want it as fast as possible and without using framebuffers in OpenGL. I did not know that your current code doesn't work.
– Paritosh Kulkarni
Sep 17 '18 at 16:15
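On the CUDA-and-OpenGL-only route raised in the comments above, one detail of the question's copy step can be checked with plain arithmetic. `cudaMemcpy2DToArray` takes its source pitch and copy width in bytes and its height in rows. The question passes 1024 for both pitch and width, which for `float` data is only a quarter of each 1024-element row. The sketch below is plain C++ and calls no CUDA function. `Copy2DParams` and `packedCopyParams` are hypothetical helpers invented here, and they assume a tightly packed source buffer with no row padding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container for the size arguments of a 2D copy.
struct Copy2DParams {
    std::size_t pitchBytes;   // distance between the starts of two source rows, in bytes
    std::size_t widthBytes;   // bytes to copy per row
    std::size_t heightRows;   // number of rows to copy
};

// Size arguments for a tightly packed image of element type T.
// Assumption: rows are contiguous, so pitch equals the row width in bytes.
template <typename T>
Copy2DParams packedCopyParams(std::size_t widthElems, std::size_t heightRows) {
    assert(widthElems > 0 && heightRows > 0);
    const std::size_t rowBytes = widthElems * sizeof(T);
    return { rowBytes, rowBytes, heightRows };
}
```

For the 1024x1024 `float` buffer in the question, `packedCopyParams<float>(1024, 1024)` gives a pitch and width of 4096 bytes each and a height of 1024 rows. For an 800x600 image with one byte per pixel, the pitch and width would both be 800.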
There are plenty of similar questions on SO. You will almost certainly find an answer there that solves your problem.
– Ripi2
Sep 17 '18 at 23:25