author | Ross Burton <ross.burton@arm.com> | 2024-10-14 19:38:15 +0100 |
---|---|---|
committer | Richard Purdie <richard.purdie@linuxfoundation.org> | 2024-10-15 11:47:24 +0100 |
commit | f85c68cb9cb93ef55f69620fa56ecc2a67408aa7 (patch) | |
tree | e976e17dd817497ce2f6521c300d4734bc2f674a /scripts/lib/devtool/sdk.py | |
parent | 01d0ef0bcc822cace4364e5e68e984ba794ce743 (diff) | |
insane: avoid race condition when DEBIAN/CONTROL entries are removed
There is a race condition when iterating directories which are being
altered whilst iterating, which is something that can and does happen
when do_package_qa runs at the same time as e.g. do_package_write_ipkg
(the opkg metadata is written inside the build tree). The race is that
naive code will list a directory's contents and then stat() each name to
determine if it's a directory or file. The classic failure that we see
is that CONTROL/ is found on a listdir but deleted by the time the stat
happens, so it is incorrectly listed as a file (because it is not a
directory).
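
A minimal sketch of that failure mode, not the actual cachedpath code: the
helper name classify_naive is hypothetical, and the race is simulated
deterministically by removing the directory between the listing and the
type check.

```python
import os
import tempfile

def classify_naive(top, names):
    """Split names into (dirs, files) by checking each one after the listing."""
    dirs, files = [], []
    for name in names:
        # os.path.isdir() returns False when the underlying stat() fails,
        # so a directory that has vanished ends up in the "files" list.
        if os.path.isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            files.append(name)
    return dirs, files

with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "CONTROL"))
    names = os.listdir(top)                 # snapshot still contains "CONTROL"
    os.rmdir(os.path.join(top, "CONTROL"))  # stands in for the concurrent task
    naive_dirs, naive_files = classify_naive(top, names)
```

Here CONTROL ends up in naive_files even though it was a directory when it
was listed.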
Since Python 3.5, os.walk() uses scandir() instead of listdir() which
mitigates this race by returning the file type alongside the name, so
a stat is no longer needed to identify the type.
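
As an illustration only (the helper name classify_scandir is hypothetical),
a scandir()-based listing can categorise entries from the DirEntry objects
themselves. On most filesystems the type comes from the directory entry; if
the filesystem does not report it, or the entry is a symlink, is_dir() may
still fall back to a stat().

```python
import os
import tempfile

def classify_scandir(top):
    """Split the entries of top into (dirs, files) using DirEntry type info."""
    dirs, files = [], []
    with os.scandir(top) as entries:
        for entry in entries:
            # is_dir() normally uses the type returned with the directory
            # entry, so no separate stat() on the name is required.
            if entry.is_dir():
                dirs.append(entry.name)
            else:
                files.append(entry.name)
    return sorted(dirs), sorted(files)

with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "CONTROL"))
    with open(os.path.join(top, "data.txt"), "w") as fh:
        fh.write("")
    scan_dirs, scan_files = classify_scandir(top)
```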
However, cachedpath.walk() was copied from Python before this, so it
uses listdir() and has this race condition. When I changed insane to
use cachedpath.walk() [1], I inadvertently reintroduced this race.
I believe there's actually no need to use cachedpath.walk() and a
logical fix is to simply use os.walk():
With os.walk() each directory is listed and categorised in a single
os.scandir() call, because the underlying syscall, getdents64, returns
the type.
However, cachedpath.walk() uses os.listdir() which ignores the type
field returned and has to do a stat() on every file to determine the
type.
Thus, we should switch users of cachedpath.walk() to os.walk(): there's
no real gain in what is effectively just a prefetch for the stat cache,
and depending on what the calling code does it may result in more stat()
calls than needed.
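
A rough sketch of what a caller looks like after the switch, assuming it
only needs the (root, dirs, files) tuples that os.walk() yields. The helper
list_tree and the example layout are hypothetical and not taken from
insane.bbclass.

```python
import os
import tempfile

def list_tree(top):
    """Return (dirs, files) as sorted paths relative to top, via os.walk()."""
    found_dirs, found_files = [], []
    for root, dirs, files in os.walk(top):
        rel = os.path.relpath(root, top)
        for d in dirs:
            found_dirs.append(os.path.normpath(os.path.join(rel, d)))
        for f in files:
            found_files.append(os.path.normpath(os.path.join(rel, f)))
    return sorted(found_dirs), sorted(found_files)

with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "usr", "bin"))
    with open(os.path.join(top, "usr", "bin", "tool"), "w") as fh:
        fh.write("")
    tree_dirs, tree_files = list_tree(top)
```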
In the future we may want to redesign cachedpath to reimplement walk so
that it can also cache the DirEntry instances as returned by scandir(),
as that will avoid needing to call stat() at all in many cases. However,
I believe we should use a caching pathlib.Path instance instead.
[1] cad3c8 insane: use oe.cachedpath.CachedPath instead of os.path
(From OE-Core rev: 22e4486d65e4874bf48d89160d69118f318278e8)
Signed-off-by: Ross Burton <ross.burton@arm.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Diffstat (limited to 'scripts/lib/devtool/sdk.py')
0 files changed, 0 insertions, 0 deletions