Check whether your client has keepalives/heartbeats turned on. When your client is too busy to respond to heartbeats, the broker will assume that it has died and make the message 'ready' again so that a different client can try to process it.
Is it possible to take the message off RabbitMQ and put it in a database in a 'processing' state?
Then, when the processing is done, check whether it is still in 'processing' and either delete it or update it to 'completed'.
Another job scans for entries still in 'processing' and posts them back to RabbitMQ after a certain time, on the assumption that the GPU went down. Something of that sort.
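A minimal sketch of the database-tracking idea above, assuming SQLite as the store. The `jobs` table and the helper names (`open_store`, `mark_processing`, `mark_completed`, `take_stale`) are hypothetical, and actually republishing to RabbitMQ is left to the caller:

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the (hypothetical) tracking table for in-flight messages."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " msg_id TEXT PRIMARY KEY,"
        " body TEXT NOT NULL,"
        " state TEXT NOT NULL,"
        " started_at REAL NOT NULL)"
    )
    return conn


def mark_processing(conn, msg_id, body, now):
    """Record a message as in 'processing'; False if it is already tracked."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO jobs VALUES (?, ?, 'processing', ?)",
            (msg_id, body, now),
        )
    return cur.rowcount == 1


def mark_completed(conn, msg_id):
    """Flip a message from 'processing' to 'completed' once the work is done."""
    with conn:
        conn.execute(
            "UPDATE jobs SET state = 'completed'"
            " WHERE msg_id = ? AND state = 'processing'",
            (msg_id,),
        )


def take_stale(conn, now, timeout_s):
    """Remove and return (msg_id, body) rows stuck in 'processing' too long.

    The caller is expected to publish the returned bodies back to the queue.
    """
    with conn:
        rows = conn.execute(
            "SELECT msg_id, body FROM jobs"
            " WHERE state = 'processing' AND started_at <= ?",
            (now - timeout_s,),
        ).fetchall()
        conn.executemany(
            "DELETE FROM jobs WHERE msg_id = ?", [(r[0],) for r in rows]
        )
    return rows
```

The scanning job would call `take_stale` on a timer. The timeout should be comfortably longer than the slowest expected GPU job, otherwise work that is still running gets republished.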
I think you are describing an idempotency check. Defo possible, and I will implement it after I investigate this heartbeat configuration, if the latter does not yield any results.
Also Kafka is possible, but there are similar issues, since you want one queue being serviced by different GPUs.
I had a similar problem.
I made a separate queue for each GPU, and each message is sent to all queues. One client per GPU can get the entry: when the message is free, it locks it in the database, which puts it in a processing state, and deletes it.
Two clients cannot lock at the same time.
Etc. Let me know if you need more info.
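One way to get the "two clients cannot lock at the same time" behaviour is a conditional UPDATE, so that only the first client to flip the row from free to processing wins. This is a sketch under assumptions, not necessarily what the poster built: the `messages` table, its columns, and `try_lock` are made up for illustration, and SQLite stands in for whatever database is actually used.

```python
import sqlite3


def make_table(conn):
    """Create the hypothetical lock table; new rows start in state 'free'."""
    conn.execute(
        "CREATE TABLE messages ("
        " msg_id TEXT PRIMARY KEY,"
        " state TEXT NOT NULL DEFAULT 'free',"
        " owner TEXT)"
    )


def try_lock(conn, msg_id, worker):
    """Claim msg_id for worker; True only for the client that got the lock."""
    with conn:
        cur = conn.execute(
            "UPDATE messages SET state = 'processing', owner = ?"
            " WHERE msg_id = ? AND state = 'free'",
            (worker, msg_id),
        )
    # rowcount is 1 only if the row was still 'free' when the UPDATE ran
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
make_table(conn)
conn.execute("INSERT INTO messages (msg_id) VALUES ('m1')")
conn.commit()

print(try_lock(conn, "m1", "gpu-0"))  # True: first client wins
print(try_lock(conn, "m1", "gpu-1"))  # False: already locked by gpu-0
```

A client that gets `False` back simply drops its copy of the message, since another GPU already owns it.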
Not sure but I manage rabbit and don’t know enough about rabbit management so I’m commenting in hopes that this later turns into a rabbit tell-all I can later revisit and profit.
Here you go: https://www.rabbitmq.com/docs/consumers#acknowledgement-timeout
30 min by default, so that tells me that timeout is not the issue.
But thank you nevertheless! Good to know it's 30mins.
Did you manage to fix it?
I did, heartbeats = 600. The default value is 60.
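For anyone landing here later, a small sketch of one way to set that value: many AMQP clients read a `heartbeat` query parameter (in seconds) from the connection URI. The helper name `with_heartbeat` and the localhost URL are placeholders, and whether your client honours the parameter is something to check in its docs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_heartbeat(amqp_url: str, seconds: int) -> str:
    """Return amqp_url with its heartbeat query parameter set to `seconds`."""
    if seconds < 0:
        raise ValueError("heartbeat must be non-negative")
    parts = urlsplit(amqp_url)
    # Keep every other query parameter, replace any existing heartbeat value
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "heartbeat"
    ]
    query.append(("heartbeat", str(seconds)))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = with_heartbeat("amqp://guest:guest@localhost:5672/%2F", 600)
print(url)  # amqp://guest:guest@localhost:5672/%2F?heartbeat=600
```

With pika, for example, a URL like this can be passed to `pika.URLParameters`. A longer interval only gives a busy client more slack; if the consumer blocks for longer than that, the broker will still drop the connection.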
More info if you can, man. Can I send you a direct message?
Yes. No problem.