
It's not necessarily that "we cannot do your thing," just that "we cannot do your thing using your lock. To get around this, simply make a new resource to get a new lock."

Think of how, in e.g. an IaaS control plane, when you delete a VM, it may take an arbitrarily long time before you can create another VM with the same ID. (Maybe forever!) But you can always create a VM with a different ID that otherwise fulfills all the same purposes (e.g. has the old instance's IP, FQDN, etc.). The old ID essentially has a distributed lock on its use, with an unbounded release time, and that's perfectly fine for the use case.
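To make that concrete, here's a rough sketch of the idea in Python. Everything in it (the ControlPlane class, the tombstone set, the confirm_released method) is made up for illustration and isn't how any particular IaaS actually does it. The point is just that a deleted ID stays held for as long as it takes, while a fresh ID is always available.

```python
import uuid


class IdStillHeld(Exception):
    """Raised when a requested ID is live or not yet confirmed released."""


class ControlPlane:
    """Toy control plane (hypothetical): deleted IDs stay reserved indefinitely."""

    def __init__(self):
        self.live = {}            # vm_id -> attributes (IP, FQDN, ...)
        self.tombstoned = set()   # deleted IDs whose release is unconfirmed

    def create(self, attrs, vm_id=None):
        # With no ID requested, mint a fresh one; this path never blocks.
        vm_id = vm_id or uuid.uuid4().hex
        if vm_id in self.live or vm_id in self.tombstoned:
            raise IdStillHeld(f"{vm_id} is still held; use a fresh ID")
        self.live[vm_id] = dict(attrs)
        return vm_id

    def delete(self, vm_id):
        # The ID becomes a tombstone. There is no timer that frees it.
        attrs = self.live.pop(vm_id)
        self.tombstoned.add(vm_id)
        return attrs

    def confirm_released(self, vm_id):
        # Called only once cleanup is known to be complete (maybe never).
        self.tombstoned.discard(vm_id)


cp = ControlPlane()
old_id = cp.create({"ip": "10.0.0.5", "fqdn": "db.example.internal"}, vm_id="vm-1")
attrs = cp.delete(old_id)
new_id = cp.create(attrs)   # different ID, same IP and FQDN
```

Asking for "vm-1" again raises IdStillHeld until confirm_released("vm-1") happens, but nothing the caller actually cares about is blocked in the meantime.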

For an example of fail-stalled being not only practical but preferred, consider tag-out locking systems (exclusive-access locks used to prevent machines from being turned on while maintenance is being performed on them). If there were a digital lock of that type, you wouldn't ever want to time it out automatically. A human put that lock there to keep themselves alive. They'll take it off when they're done. If you really suspect someone forgot to remove the tag-out lock, you can always go and check with the lock's acquirer. But if you can't get in contact with them, you can't know that they don't still have their hands up in the gears of the machine. And in this case, the cost of failing to auto-restart the assembly line (until the "partition" is over and you can just ask the maintenance worker why they're still holding the lock) is much less than the cost of said maintenance worker's life.
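If you wanted to model a digital tag-out lock, a minimal sketch might look like the following. The TagOutLock class and can_start_machine function are hypothetical names, not taken from any real system. What matters is what's missing: there is no TTL, no lease expiry, and no force-release path.

```python
class TagOutLock:
    """Hypothetical tag-out lock: only the holder can ever release it."""

    def __init__(self):
        self.holder = None

    def acquire(self, who):
        # Fails (returns False) if anyone already holds the lock.
        if self.holder is not None:
            return False
        self.holder = who
        return True

    def release(self, who):
        # Deliberately no timeout and no admin override.
        if self.holder != who:
            raise PermissionError(f"{who} does not hold this lock")
        self.holder = None


def can_start_machine(lock):
    # The machine stays off for as long as the lock is held, however long.
    return lock.holder is None


lock = TagOutLock()
lock.acquire("maintenance-worker")
assert not can_start_machine(lock)   # stalled until the holder releases
lock.release("maintenance-worker")
assert can_start_machine(lock)
```

If the holder is unreachable, the system just stays stalled, which is the whole point.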


