I think this is nicer in general, and here in particular we have a lot
of code like:
static inline IteratedCache* hashmap_iterated_cache_new(Hashmap *h) {
return (IteratedCache*) _hashmap_iterated_cache_new(HASHMAP_BASE(h));
}
and it's visually appealing to use the same whitespace in the function
signature and the cast in the body of the function.
The compiler would do this to, esp. with LTO, but we can short-circuit the
whole process and make everything a bit simpler by avoiding the separate
definition.
(It would be nice to do the same for _set_new(), _set_ensure_allocated()
and other similar functions which are one-line trivial wrappers too. Unfortunately
that would require enum HashmapType to be made public, which we don't want
to do.)
Up to now the capability CAP_SETPCAP was raised implicitly in the
function capability_bounding_set_drop.
This functionality is moved into a new function
(capability_gain_cap_setpcap).
The new function optionally provides the capability set as it was
before raisining CAP_SETPCAP.
I think it's nicer to move it to the left, since the function
is already a pointer by itself, and it just happens to return a pointer,
and the two concepts are completely separate.
Instead of assuming that more-recently modified directories have higher mtime,
just look for any mtime changes, up or down. Since we don't want to remember
individual mtimes, hash them to obtain a single value.
This should help us behave properly in the case when the time jumps backwards
during boot: various files might have mtimes that in the future, but we won't
care. This fixes the following scenario:
We have /etc/systemd/system with T1. T1 is initially far in the past.
We have /run/systemd/generator with time T2.
The time is adjusted backwards, so T2 will be always in the future for a while.
Now the user writes new files to /etc/systemd/system, and T1 is updated to T1'.
Nevertheless, T1 < T1' << T2.
We would consider our cache to be up-to-date, falsely.
This allows us to properly detect mount points, for free. (Also, allows
us to respect btimes that are newer than the cutoff, which should be
useful when people untar file trees in /var/tmp)
Fixes: #16848
There is little point in #defining and #undefining CAP_LAST_CAP multiple times.
The check is only done in developer mode. After all, it's not an error to
compile on a newer kernel, and we shouldn't even warn in that case.
Switch from security_getenforce() and netlink notifications to the
SELinux status page.
This usage saves system calls and will also be the default in
libselinux > 3.1 [1].
[1]: 05bdc03130
Previously:
1. last_error wouldn't be updated with errors from is_dir;
2. We'd always issue a stat(), even for binaries without execute;
3. We used stat() instead of access(), which is cheaper.
This change avoids all of those, by only checking inside X_OK-positive
case whether access() works on the path with an extra slash appended.
Thanks to Lennart for the suggestion.
It's pretty useful to be able to access the bytes generically, without
acknowledging a specific family, hence let's a third way to access an
in_addr_union.
Imagine $PATH /a:/b. There is an echo command at /b/echo. Under this
configuration, this works fine:
% systemd-run --user --scope echo .
Running scope as unit: run-rfe98e0574b424d63a641644af511ff30.scope
.
However, if I do `mkdir /a/echo`, this happens:
% systemd-run --user --scope echo .
Running scope as unit: run-rcbe9369537ed47f282ee12ce9f692046.scope
Failed to execute: Permission denied
We check whether the resulting file is executable for the performing
user, but of course, most directories are anyway, since that's needed to
list within it. As such, another is_dir() check is needed prior to
considering the search result final.
Another approach might be to check S_ISREG, but there may be more gnarly
edge cases there than just eliminating this obviously pathological
example, so let's just do this for now.
When removing a directory tree as unprivileged user we might encounter
files owned by us but not deletable since the containing directory might
have the "r" bit missing in its access mode. Let's try to deal with
this: optionally if we get EACCES try to set the bit and see if it works
then.
Instead of defining the numbers only as fallback, always define our fallback
number, and if we have the real __NR_foo define, assert that our number matches
the real one.
This will result in warnings when our fallback number is not defined, even if
the kernel headers are new enough to define __NR_foo. This will probably annoy
people compiling for seldom-used architectures, but hopefully it'll provide
motivation to add the missing fallback defines.
The upside is that we have a higher chance of catching the cases where we got
the number wrong. Calling the wrong syscall is quite problematic, and with some
back luck, it might take us a long time to notice that we got the number wrong
on some rarely used architecture.
Also, rework some of the fallback wrappers to not call the syscall with a
negative number (that'd fail, but we'd got to the kernel and back). It seems
nicer to let the compiler know that this can never succeed.
Similarly to "setup" vs. "set up", "fallback" is a noun, and "fall back"
is the verb. (This is pretty clear when we construct a sentence in the
present continous: "we are falling back" not "we are fallbacking").
This has the major benefit that the entire payload of the container can
access these files there. Previously, we'd set them only as env vars,
but that meant only PID 1 could read them directly or other privileged
payload code with access to /run/1/environ.