I have gone through this https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-custom-probe-overview but I haven't found an answer.
Problem: I have TensorFlow applications running on individual VMs, each served by a gunicorn + Flask application. The intention is to ensure that every VM gets only one request at a time. So we have configured our app so that if another request arrives while one is being processed, we simply send a BUSY code back (a non-200 response). This fails the health probe, but we have no idea when or how the load balancer adds this VM back to the pool, since in reality the VM was just busy and NOT in poor health. Since Azure LB doesn't understand the application running on the VMs, we didn't know how else to solve this.
With this approach we are seeing a lot of timeouts and poor utilisation of the existing VMs, which makes us wonder whether the VMs marked as "poor health" are ever being added back. Azure documentation and support is really poor. Any pointers, please?
As per the documentation here, Load Balancer operates at layer 4 and doesn't provide application layer gateway functionality. You could try the following steps to better understand the workflow and configure your LB accordingly for better efficiency.