
LWJGL glLinkProgram takes a long time to process

My old computer (Lenovo Y40) had a dual graphics setup, an AMD Radeon R9 M275 plus an Intel integrated GPU, and I am not sure which of the two it was actually using. My new computer (HP Spectre) has Intel HD Graphics 620. I have been developing my own game library on the old computer for a while and never had any issues, but when I transferred the code to the new computer it ran significantly slower. I am using LWJGL 3. I timed it: glLinkProgram takes about 400 ms on the new computer and about 5 ms on the old one. It could just be the hardware difference, but would a different graphics card alone really change the time by 395 ms? I am new to OpenGL and graphics cards, so I'm not sure. I don't believe my own code matters here, because the time is spent inside the glLinkProgram method in GL20 of LWJGL, not in my code. Is there anything I can do, or is this entirely hardware-dependent?
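For reference, a measurement like the one described can be sketched as follows. This is only a minimal sketch: `LinkTimer` and `timeMillis` are hypothetical names, and the `Thread.sleep` call is a stand-in so the snippet runs without a GL context. In the real program the action would be something like `() -> GL20.glLinkProgram(program)`; querying `GL20.glGetProgrami(program, GL20.GL_LINK_STATUS)` inside the timed action may also be needed if the driver defers the actual link work.

```java
final class LinkTimer {
    /** Runs the action once and returns the elapsed wall-clock time in milliseconds. */
    static double timeMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption): replace with the GL call being measured,
        // e.g. () -> GL20.glLinkProgram(program), once a GL context is current.
        double ms = timeMillis(() -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.printf("action took %.2f ms%n", ms);
    }
}
```

Timing glCompileShader for each shader the same way would show whether the cost really sits in the link step or is spread over compilation as well.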

EDIT

Code

Fragment Shader

#version 330 core

layout (location = 0) out vec4 color;


in DATA
{
    vec2 tc;
    vec3 position;
} fs_in;

struct Light
{
    vec2 pos;
    float size;
    float lowLightValue;
};

uniform Light lights[100];
uniform sampler2D tex;
uniform int enabled =0;

float high = 0;
float average =0;
bool isInsideLight = false;
vec4 highcol = vec4(0);


bool greater(vec4 l, vec4 r)
{
    // The square root covers the whole weighted sum, not just the red term.
    float lbright = sqrt(0.2126*pow(l.r,2) + 0.7152*pow(l.g,2) + 0.0722*pow(l.b,2));
    float rbright = sqrt(0.2126*pow(r.r,2) + 0.7152*pow(r.g,2) + 0.0722*pow(r.b,2));
    if(lbright > rbright)
    {
        return true;
    }
    return false;
}

void main()
{
    color = texture(tex,fs_in.tc);
    if(enabled == 1)
    {
//      float len = length(fs_in.position.xy-lights[0].pos);
//      float lenr = len/lights[0].size;
//      float llv = lights[0].lowLightValue;
//      if(len > lights[0].size)
//      {
//          color *= llv;
//      }
//      else
//      {
//          color *= 1-((1 - llv)/lights[0].size)*len;
//      }
//      vec4 color2;
        for(int i =0;i<lights.length();i++)
        {
            if(lights[i].lowLightValue != 0)
            {
                float len = length(fs_in.position.xy-lights[i].pos);
                if(len <= lights[i].size)
                {
                    isInsideLight = true;
                    break;
                }
            }
        }
        int numLights=0;
        average =0;
        for(int i = 0;i < lights.length();i++)
        {

            if(lights[i].lowLightValue != 0)
            {
                float len = length(fs_in.position.xy-lights[i].pos);
                float llv = lights[i].lowLightValue;
                if(!isInsideLight)
                {
                    average += llv;
                    numLights++;
                }
                else
                {
                    if(len <= lights[i].size)
                    {
                        float num = 1-((1-llv)/lights[i].size)*len;
                        if(num > average)//Getting the highest
                        {
                            average = num;
                        }
                    }
                }
//              if((1/lenr) > 1)
//              {
//                  lenr = 0;
//              }
//              float col = (lenr*llv)+llv;
//              vec4 ncol = color*col;
//              if(greater(ncol,highcol))
//              {
//                  highcol = ncol;
//              }
                //if(col>high)
                //{
                //  high = col;
                //}
            }
            else
            {
                break;
            }
        }

        if(!isInsideLight)
            color *= average/numLights;
        else
            color *= average;
//      color = highcol;
    }
}

Vertex Shader

#version 330 core

layout (location = 0) in vec4 position;
layout (location = 1) in vec2 tc;

uniform mat4 pr_matrix;
uniform mat4 ml_matrix = mat4(1.0);
uniform mat4 vw_matrix = mat4(1.0);

out DATA
{
    vec2 tc;
    vec3 position;
} vs_out;

void main()
{
    gl_Position = pr_matrix * vw_matrix * ml_matrix * position;
    vs_out.tc = tc;
    vs_out.position = vec3(ml_matrix*position);
}

GLSL compilers are part of the graphics driver, so their implementations definitely differ between vendors. The Intel compiler might not be able to do the optimizations that the AMD one does. This could be down to the hardware you have. Intel's GPUs are still not discrete GPUs, so the number of cores, processors, and memory units is limited, and that limits the optimizations the compiler can do. Not being discrete also means they have no dedicated video memory for the vertex/fragment/texture processors to talk to, so all of that traffic goes over the bus on the motherboard, and a small portion of system RAM is used as video memory. (I am not sure about Intel's newer GPUs, but that is what an on-chip GPU means.)

You have a uniform array of Light structures, each of which is internally 4 floats (a vec2 plus two floats). For every variable, uniform or attribute, the compiler allocates some slots, which are nothing but memory locations. At 4 floats per element, that is at least 100 * 4 = 400 float components, or 100 * 4 * sizeof(float) = 1600 bytes of actual memory.
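The arithmetic can be checked with a few lines of Java. This is a rough sketch that takes the 4 floats per Light at face value and assumes tight packing; `UniformFootprint` is a hypothetical helper, and a real driver may well reserve more per struct member than this lower bound.

```java
final class UniformFootprint {
    // vec2 pos + float size + float lowLightValue, assuming no padding.
    static final int FLOATS_PER_LIGHT = 2 + 1 + 1;

    /** Number of float components needed for an array of the given length. */
    static int floatComponents(int lightCount) {
        return lightCount * FLOATS_PER_LIGHT;
    }

    /** Size in bytes of those components (4 bytes per float). */
    static int bytes(int lightCount) {
        return floatComponents(lightCount) * Float.BYTES;
    }

    public static void main(String[] args) {
        int count = 100; // matches "uniform Light lights[100]" in the fragment shader
        System.out.println(count + " lights -> " + floatComponents(count)
                + " floats, " + bytes(count) + " bytes");
        // prints "100 lights -> 400 floats, 1600 bytes"
    }
}
```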

Because the array is passed as a uniform, the compiler and driver can't optimize it away: its values are only known at runtime. So even if you only use 2 lights, space for all 100 is still reserved. I think this is a driver limitation that comes from the hardware, which keeps it from linking the program optimally.

You could try profiling it on different hardware. Also try reducing the size of the lights array to just 1 and see whether the link time improves.
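One way to run that experiment without editing the shader file by hand is to patch the array size in the source string before handing it to glShaderSource. The following is only a sketch: `LightArrayResizer` and `withLightCount` are hypothetical names, and the pattern assumes the declaration looks like `uniform Light lights[100];` as in the question.

```java
final class LightArrayResizer {
    /** Returns the shader source with the size of the "lights" uniform array replaced. */
    static String withLightCount(String fragmentSource, int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be at least 1");
        }
        // Matches "uniform Light lights[<digits>]" with flexible whitespace.
        return fragmentSource.replaceFirst(
                "uniform\\s+Light\\s+lights\\s*\\[\\s*\\d+\\s*\\]",
                "uniform Light lights[" + count + "]");
    }

    public static void main(String[] args) {
        String source = "uniform Light lights[100];\nuniform sampler2D tex;\n";
        // Try a few sizes and time the link step for each resulting program.
        for (int n : new int[] {1, 10, 100}) {
            System.out.print(withLightCount(source, n));
        }
    }
}
```

Because the loops in the fragment shader iterate up to `lights.length()`, they adapt to the new size on their own, so link times for sizes such as 1, 10, and 100 can be compared directly.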
