
Understanding how textures work with CUDA

I am confused about how textures work with CUDA.

When I run deviceQuery on my GTX 780, I get this:

Maximum Texture Dimension Size (x,y,z)  1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)

Then, when I looked at the CUDA "particles" sample, I found this:

checkCudaErrors(cudaBindTexture(0, oldPosTex, sortedPos, numParticles*sizeof(float4)));

where, in my case, I have raised numParticles to 1024 * 1024 * 2 (about 2.1 million).

How does this fit in the 1D texture?

Inside the kernels I also found this (I need more explanation, please, as everything here is connected):

texture<float4, 1, cudaReadModeElementType> oldPosTex;
#define FETCH(t, i) tex1Dfetch(t##Tex, i)

and in the kernel:

float4 pos = FETCH(oldPos, sortedIndex); 

What I also need to know: can I use this texture, with its defined size of numParticles*sizeof(float4), in a frame buffer draw instead of drawing a VBO?

how does this fit in the 1D texture?

The texture hardware consists of two main parts: the texture filtering hardware and the texture cache. Texture filtering includes functionality such as interpolation, addressing by normalized floating-point coordinates, and handling out-of-bounds addresses (clamp, wrap, mirror and border addressing modes). The texture cache can store data along a space-filling curve to maximize 2D spatial locality (and thereby the cache hit rate). It can also store data in a regular flat array.

The Maximum Texture Dimension Size refers to limitations in the texture filtering hardware, not the texture caching hardware. It therefore describes limits you may hit when using functions like tex2D(), but not when using functions like tex1Dfetch(), which performs an unfiltered texture lookup on linear memory. The cudaBindTexture() call you quoted binds a plain device buffer, and the FETCH macro in the kernel expands to tex1Dfetch(), so the 65536 figure does not apply to it.
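To make the arithmetic concrete, here is a minimal host-side sketch of the size check. It rests on an assumption that is not in the original answer: as far as I recall, the CUDA Programming Guide lists a separate limit of 2^27 elements for a 1D texture bound to linear memory on compute capability 3.x devices such as the GTX 780, while the 65536 printed by deviceQuery is the 1D limit for textures backed by CUDA arrays. The names Float4, texelCount, fitsLinear and fitsArray are hypothetical helpers for illustration; check the limits against the documentation for your own device.

```cpp
#include <cassert>
#include <cstdint>

// Assumed limits for a compute capability 3.5 device (GTX 780).
// kMaxTex1DArray is the 1D value deviceQuery prints.
// kMaxTex1DLinear is the assumed limit for 1D textures bound to linear memory.
constexpr std::uint64_t kMaxTex1DArray  = 65536;       // elements
constexpr std::uint64_t kMaxTex1DLinear = 1ull << 27;  // elements (assumption)

// Stand-in for CUDA's float4 so the sketch compiles without CUDA headers.
struct Float4 { float x, y, z, w; };

// Particle count from the question.
constexpr std::uint64_t kNumParticles = 1024ull * 1024ull * 2ull;

// Number of float4 texels covered when `bytes` of linear memory are bound,
// i.e. the size argument passed to cudaBindTexture divided by the texel size.
constexpr std::uint64_t texelCount(std::uint64_t bytes) {
    return bytes / sizeof(Float4);
}

// True if the texel count is within the assumed linear-memory limit.
constexpr bool fitsLinear(std::uint64_t texels) {
    return texels <= kMaxTex1DLinear;
}

// True if the texel count is within the 1D limit reported by deviceQuery.
constexpr bool fitsArray(std::uint64_t texels) {
    return texels <= kMaxTex1DArray;
}

// The buffer from the question passes the linear-memory check at compile time.
static_assert(fitsLinear(texelCount(kNumParticles * sizeof(Float4))),
              "particle buffer exceeds the assumed linear-memory limit");
```

Under these assumptions, the 2,097,152 float4 elements bound in the question are far below the linear-memory limit, even though they exceed the 65536 figure that applies to 1D CUDA arrays.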

I need more explanation, please, as everything here is connected

This request is too broad, which may be why your question was downvoted.

What I also need to know: can I use this texture, with its defined size of numParticles*sizeof(float4), in a frame buffer draw instead of drawing a VBO?

This is not a CUDA question, as CUDA cannot draw anything. You should look into CUDA/OpenGL interop to see whether your question is answered there. If it is not, create a new question and describe the problem more clearly.
