Commit Graph

798 Commits

Author SHA1 Message Date
Yann E. MORIN
1d9ed17178 basic/fileio: we always have O_TMPFILE now
fileio makes use of O_TMPFILE when it is available.

We now always have O_TMPFILE, defined in missing.h if missing
from the toolchain headers.

Have fileio include missing.h and drop the guards around the
use of O_TMPFILE.
2016-08-29 12:49:10 +02:00
Yann E. MORIN
daad709a7c missing.h: add missing definitions for __O_TMPFILE
Currently, a missing __O_TMPFILE was only defined for i386 and x86_64,
leaving any other architectures with an "old" toolchain fail miserably
at build time:
    src/import/export-raw.c: In function 'reflink_snapshot':
    src/import/export-raw.c:271:26: error: 'O_TMPFILE' undeclared (first use in this function)
             new_fd = open(d, O_TMPFILE|O_CLOEXEC|O_NOCTTY|O_RDWR, 0600);
                              ^

__O_TMPFILE (and O_TMPFILE) are available since glibc 2.19. However, a
lot of existing toolchains are still using glibc-2.18, and some even
before that, and it is not really possible to update those toolchains.

Instead of defining it only for i386 and x86_64, define __O_TMPFILE
with the specific values for those archs where it is different from the
generic value. Use the values as found in the Linux kernel (v4.8-rc3,
current as of time of commit).

---
Note: tested on ARM (build+run), with glibc-2.18 and linux headers 3.12.
Untested on other archs, though (I have no board to test this).

Changes v1 -> v2:
  - add a comment specifying some are hexa, others are octal.
2016-08-29 12:40:22 +02:00
Zbigniew Jędrzejewski-Szmek
2056ec1927 Merge pull request #3965 from htejun/systemd-controller-on-unified 2016-08-19 19:58:01 -04:00
0xAX
e6c9fa74a5 terminal-util: remove unnecessary check of result of isatty() (#4000)
After the call of the isatty() we check its result twice in the
open_terminal(). There are no sense to check result of isatty() that
it is less than zero and return -errno, because as described in
documentation:

isatty() returns 1 if fd is an open file descriptor referring to a
terminal;  otherwise 0 is returned, and errno is set to indicate the
error.

So it can't be less than zero.
2016-08-19 18:51:54 -04:00
Evgeny Vereshchagin
29272c04a7 Merge pull request #3909 from poettering/mount-tool
add a new tool for creating transient mount and automount units
2016-08-19 23:33:49 +03:00
Lennart Poettering
16d901e251 Merge pull request #3987 from keszybz/console-color-setup
Rework console color setup
2016-08-19 19:36:09 +02:00
Zbigniew Jędrzejewski-Szmek
acf553b04d terminal-util: use getenv_bool for $SYSTEMD_COLORS
This changes the semantics a bit: before, SYSTEMD_COLORS= would be treated as
"yes", same as SYSTEMD_COLORS=xxx and SYSTEMD_COLORS=1, and only
SYSTEMD_COLORS=0 would be treated as "no". Now, only valid booleans are treated
as "yes". This actually matches how $SYSTEMD_COLORS was announced in NEWS.
2016-08-19 11:57:37 -04:00
Zbigniew Jędrzejewski-Szmek
158fbf7661 systemd: ignore lack of tty when checking whether colors should be enabled
When started by the kernel, we are connected to the console, and we'll set TERM
properly to some value in fixup_environment(). We'll then enable or disable
colors based on the value of $SYSTEMD_COLORS and $TERM.

When reexecuting, TERM should be already set, so we can use this value.
Effectively, behaviour is the same as before affd7ed1a was reverted, but instead
of reopening the console before configuring color output, we just ignore what
stdout is connected to and decide based on the variables only.
2016-08-19 11:34:22 -04:00
Lennart Poettering
cbf138ebef Merge pull request #3988 from keszybz/journald-dynamic-users
Journald dynamic users
2016-08-19 10:41:26 +02:00
Zbigniew Jędrzejewski-Szmek
61755fdae0 journald: do not create split journals for dynamic users
Dynamic users should be treated like system users, and their logs
should end up in the main system journal.
2016-08-18 23:34:40 -04:00
Tejun Heo
f50582649f logind: update empty and "infinity" handling for [User]TasksMax (#3835)
The parsing functions for [User]TasksMax were inconsistent.  Empty string and
"infinity" were interpreted as no limit for TasksMax but not accepted for
UserTasksMax.  Update them so that they're consistent with other knobs.

* Empty string indicates the default value.
* "infinity" indicates no limit.

While at it, replace opencoded (uint64_t) -1 with CGROUP_LIMIT_MAX in TasksMax
handling.

v2: Update empty string to indicate the default value as suggested by Zbigniew
    Jędrzejewski-Szmek.

v3: Fixed empty UserTasksMax handling.
2016-08-18 22:57:53 -04:00
Lennart Poettering
2ae0858e6c hostnamectl: rework pretty hostname validation (#3985)
Rework 17eb9a9ddb a bit.

Let's make sure we don't clobber the input parameter args[1], following our
coding style to not clobber parameters unless explicitly indicated. (in
particular, as we don't want to have our changes appear in the command line
shown in "ps"...)

No functional change.
2016-08-18 21:16:16 -04:00
Lennart Poettering
450442cf93 add a new tool for creating transient mount and automount units
This adds "systemd-mount" which is for transient mount and automount units what
"systemd-run" is for transient service, scope and timer units.

The tool allows establishing mounts and automounts during runtime. It is very
similar to the usual /bin/mount commands, but can pull in additional
dependenices on access (for example, it pulls in fsck automatically), an take
benefit of the automount logic.

This tool is particularly useful for mount removable file systems (such as USB
sticks), as the automount logic (together with automatic unmount-on-idle), as
well as automatic fsck on first access ensure that the removable file system
has a high chance to remain in a fully clean state even when it is unplugged
abruptly, and returns to a clean state on the next re-plug.

This is a follow-up for #2471, as it adds a simple client-side for the
transient automount logic added in that PR.

In later work it might make sense to invoke this tool automatically from udev
rules in order to implement a simpler and safer version of removable media
management á la udisks.
2016-08-18 22:41:19 +02:00
Tejun Heo
5da38d0768 core: use the unified hierarchy for the systemd cgroup controller hierarchy
Currently, systemd uses either the legacy hierarchies or the unified hierarchy.
When the legacy hierarchies are used, systemd uses a named legacy hierarchy
mounted on /sys/fs/cgroup/systemd without any kernel controllers for process
management.  Due to the shortcomings in the legacy hierarchy, this involves a
lot of workarounds and complexities.

Because the unified hierarchy can be mounted and used in parallel to legacy
hierarchies, there's no reason for systemd to use a legacy hierarchy for
management even if the kernel resource controllers need to be mounted on legacy
hierarchies.  It can simply mount the unified hierarchy under
/sys/fs/cgroup/systemd and use it without affecting other legacy hierarchies.
This disables a significant amount of fragile workaround logics and would allow
using features which depend on the unified hierarchy membership such bpf cgroup
v2 membership test.  In time, this would also allow deleting the said
complexities.

This patch updates systemd so that it prefers the unified hierarchy for the
systemd cgroup controller hierarchy when legacy hierarchies are used for kernel
resource controllers.

* cg_unified(@controller) is introduced which tests whether the specific
  controller in on unified hierarchy and used to choose the unified hierarchy
  code path for process and service management when available.  Kernel
  controller specific operations remain gated by cg_all_unified().

* "systemd.legacy_systemd_cgroup_controller" kernel argument can be used to
  force the use of legacy hierarchy for systemd cgroup controller.

* nspawn: By default nspawn uses the same hierarchies as the host.  If
  UNIFIED_CGROUP_HIERARCHY is set to 1, unified hierarchy is used for all.  If
  0, legacy for all.

* nspawn: arg_unified_cgroup_hierarchy is made an enum and now encodes one of
  three options - legacy, only systemd controller on unified, and unified.  The
  value is passed into mount setup functions and controls cgroup configuration.

* nspawn: Interpretation of SYSTEMD_CGROUP_CONTROLLER to the actual mount
  option is moved to mount_legacy_cgroup_hierarchy() so that it can take an
  appropriate action depending on the configuration of the host.

v2: - CGroupUnified enum replaces open coded integer values to indicate the
      cgroup operation mode.
    - Various style updates.

v3: Fixed a bug in detect_unified_cgroup_hierarchy() introduced during v2.

v4: Restored legacy container on unified host support and fixed another bug in
    detect_unified_cgroup_hierarchy().
2016-08-17 17:44:36 -04:00
Tejun Heo
ca2f6384aa core: rename cg_unified() to cg_all_unified()
A following patch will update cgroup handling so that the systemd controller
(/sys/fs/cgroup/systemd) can use the unified hierarchy even if the kernel
resource controllers are on the legacy hierarchies.  This would require
distinguishing whether all controllers are on cgroup v2 or only the systemd
controller is.  In preparation, this patch renames cg_unified() to
cg_all_unified().

This patch doesn't cause any functional changes.
2016-08-15 18:13:36 -04:00
Zbigniew Jędrzejewski-Szmek
5f9a610ad2 Merge pull request #3905 from htejun/cgroup-v2-cpu
core: add cgroup CPU controller support on the unified hierarchy

(zj: merging not squashing to make it clear against which upstream this patch was developed.)
2016-08-14 18:03:35 -04:00
Tejun Heo
66ebf6c0a1 core: add cgroup CPU controller support on the unified hierarchy
Unfortunately, due to the disagreements in the kernel development community,
CPU controller cgroup v2 support has not been merged and enabling it requires
applying two small out-of-tree kernel patches.  The situation is explained in
the following documentation.

 https://git.kernel.org/cgit/linux/kernel/git/tj/cgroup.git/tree/Documentation/cgroup-v2-cpu.txt?h=cgroup-v2-cpu

While it isn't clear what will happen with CPU controller cgroup v2 support,
there are critical features which are possible only on cgroup v2 such as
buffered write control making cgroup v2 essential for a lot of workloads.  This
commit implements systemd CPU controller support on the unified hierarchy so
that users who choose to deploy CPU controller cgroup v2 support can easily
take advantage of it.

On the unified hierarchy, "cpu.weight" knob replaces "cpu.shares" and "cpu.max"
replaces "cpu.cfs_period_us" and "cpu.cfs_quota_us".  [Startup]CPUWeight config
options are added with the usual compat translation.  CPU quota settings remain
unchanged and apply to both legacy and unified hierarchies.

v2: - Error in man page corrected.
    - CPU config application in cgroup_context_apply() refactored.
    - CPU accounting now works on unified hierarchy.
2016-08-07 09:45:39 -04:00
Zbigniew Jędrzejewski-Szmek
3bb81a80bd Merge pull request #3818 from poettering/exit-status-env
beef up /var/tmp and /tmp handling; set $SERVICE_RESULT/$EXIT_CODE/$EXIT_STATUS on ExecStop= and make sure root/nobody are always resolvable
2016-08-05 20:55:08 -04:00
Lennart Poettering
ceab9e2dee Merge pull request #3900 from keszybz/fix-3607
Fix 3607
2016-08-05 17:03:09 +02:00
Lennart Poettering
41bf0590cc util-lib: unify parsing of nice level values
This adds parse_nice() that parses a nice level and ensures it is in the right
range, via a new nice_is_valid() helper. It then ports over a number of users
to this.

No functional changes.
2016-08-05 11:18:32 +02:00
Vito Caputo
82b103a7ce fileio: fix MIN/MAX mixup (#3896)
The intention is to clamp the value to READ_FULL_BYTES_MAX, which
would be the minimum of the two.
2016-08-05 00:09:23 -04:00
Zbigniew Jędrzejewski-Szmek
163f561e2a basic/set: remove some spurious spaces 2016-08-04 23:53:07 -04:00
Vito Caputo
c2d11a6302 fileio: fix read_full_stream() bugs (#3887)
read_full_stream() _always_ allocated twice the memory needed, due to
only breaking the realloc() && fread() loop when fread() returned 0,
requiring another iteration and exponentially enlarged buffer just to
discover the EOF condition.

This also caused file sizes >2MiB && <= 4MiB to erroneously be treated
as E2BIG, due to the inappropriately doubled buffer size exceeding
4*1024*1024.

Also made the 4*1024*1024 magic number a READ_FULL_BYTES_MAX constant.
2016-08-04 22:52:02 +02:00
Lennart Poettering
992e8f224b util-lib: rework /tmp and /var/tmp handling code
Beef up the existing var_tmp() call, rename it to var_tmp_dir() and add a
matching tmp_dir() call (the former looks for the place for /var/tmp, the
latter for /tmp).

Both calls check $TMPDIR, $TEMP, $TMP, following the algorithm Python3 uses.
All dirs are validated before use. secure_getenv() is used in order to limite
exposure in suid binaries.

This also ports a couple of users over to these new APIs.

The var_tmp() return parameter is changed from an allocated buffer the caller
will own to a const string either pointing into environ[], or into a static
const buffer. Given that environ[] is mostly considered constant (and this is
exposed in the very well-known getenv() call), this should be OK behaviour and
allows us to avoid memory allocations in most cases.

Note that $TMPDIR and friends override both /var/tmp and /tmp usage if set.
2016-08-04 16:27:07 +02:00
David Michael
5124866d73 util-lib: add parse_percent_unbounded() for percentages over 100% (#3886)
This permits CPUQuota to accept greater values as documented.
2016-08-04 13:09:54 +02:00
Zbigniew Jędrzejewski-Szmek
19c8201744 Merge pull request #3820 from poettering/nspawn-resolvconf
nspawn resolv.conf handling improvements, and inherit $TERM all the way through nspawn → console login
2016-08-03 20:14:32 -04:00
Lennart Poettering
21b3a0fcd1 util-lib: make timestamp generation and parsing reversible (#3869)
This patch improves parsing and generation of timestamps and calendar
specifications in two ways:

- The week day is now always printed in the abbreviated English form, instead
  of the locale's setting. This makes sure we can always parse the week day
  again, even if the locale is changed. Given that we don't follow locale
  settings for printing timestamps in any other way either (for example, we
  always use 24h syntax in order to make uniform parsing possible), it only
  makes sense to also stick to a generic, non-localized form for the timestamp,
  too.

- When parsing a timestamp, the local timezone (in its DST or non-DST name)
  may be specified, in addition to "UTC". Other timezones are still not
  supported however (not because we wouldn't want to, but mostly because libc
  offers no nice API for that). In itself this brings no new features, however
  it ensures that any locally formatted timestamp's timezone is also parsable
  again.

These two changes ensure that the output of format_timestamp() may always be
passed to parse_timestamp() and results in the original input. The related
flavours for usec/UTC also work accordingly. Calendar specifications are
extended in a similar way.

The man page is updated accordingly, in particular this removes the claim that
timestamps systemd prints wouldn't be parsable by systemd. They are now.

The man page previously showed invalid timestamps as examples. This has been
removed, as the man page shouldn't be a unit test, where such negative examples
would be useful. The man page also no longer mentions the names of internal
functions, such as format_timestamp_us() or UNIX error codes such as EINVAL.
2016-08-03 19:04:53 -04:00
Lennart Poettering
6af760f3b2 core: inherit TERM from PID 1 for all services started on /dev/console
This way, invoking nspawn from a shell in the best case inherits the TERM
setting all the way down into the login shell spawned in the container.

Fixes: #3697
2016-08-03 14:52:16 +02:00
Lennart Poettering
5e0bb1a628 Merge pull request #3828 from keszybz/drop-systemd-vconsole-setup-service
Update documentation for systemd-vconsole-setup
2016-08-03 14:38:36 +02:00
Leonardo Brondani Schenkel
aa0c34279e virt: detect bhyve (FreeBSD hypervisor) (#3840)
The CPUID and DMI vendor strings do not seem to be documented.
Values were found experimentally and by inspecting the source code.
2016-08-01 09:04:49 -04:00
Zbigniew Jędrzejewski-Szmek
2d37cd5356 Add enable_disable() helper
In this patch "enabled" and "disabled" is used exclusively, but "enable" and
"disable" forms are need for the following patch.
2016-07-31 22:48:22 -04:00
Michael Biebl
b6b609dbc2 string-util: rework memory_erase() to not use GCC optimize attribute (#3812)
"#pragma GCC optimize" is merely a convenience to decorate multiple
functions with attribute optimize. And the manual has this to say about
this attribute:

  This attribute should be used for debugging purposes only. It
  is not suitable in production code.

Some versions of GCC also seem to have a problem with this pragma in
combination with LTO, resulting in ICEs.

So use a different approach (indirect the memset call via a volatile
function pointer) as implemented in openssl's crypto/mem_clr.c.

Closes: #3811
2016-07-26 23:32:37 -04:00
Zbigniew Jędrzejewski-Szmek
dadd6ecfa5 Merge pull request #3728 from poettering/dynamic-users 2016-07-25 16:40:26 -04:00
Zbigniew Jędrzejewski-Szmek
e28973ee18 Merge pull request #3757 from poettering/efi-search 2016-07-25 16:34:18 -04:00
Lennart Poettering
1a0b98c437 Merge pull request #3589 from brauner/cgroup_namespace
Cgroup namespace
2016-07-25 22:23:00 +02:00
Lennart Poettering
87410f166e fileio: imply /tmp as directory if passed as NULL to open_tmpfile_unlinkable()
We can make this smarter one day, to honour $TMPDIR and friends, but for now,
let's just use /tmp.
2016-07-25 20:35:04 +02:00
Alban Crequy
98df8089be namespace: don't fail on masked mounts (#3794)
Before this patch, a service file with ReadWriteDirectories=/file...
could fail if the file exists but is not a mountpoint, despite being
listed in /proc/self/mountinfo. It could happen with masked mounts.

Fixes https://github.com/systemd/systemd/issues/3793
2016-07-25 15:39:46 +02:00
Zbigniew Jędrzejewski-Szmek
31b14fdb6f Merge pull request #3777 from poettering/id128-rework
uuid/id128 code rework
2016-07-22 21:18:41 -04:00
Lennart Poettering
5052c4eadd Merge pull request #3753 from poettering/tasks-max-scale
Add support for relative TasksMax= specifications, and bump default for services
2016-07-22 17:40:12 +02:00
Alessandro Puccetti
b3d1d51603 namespace: ensure to return a valid inaccessible nodes (#3778)
Because /run/systemd/inaccessible/{chr,blk} are devices with
major=0 and minor=0 it might be possible that these devices cannot be created
so we use /run/systemd/inaccessible/sock instead to map them.
2016-07-22 15:59:14 +02:00
Lennart Poettering
29206d4619 core: add a concept of "dynamic" user ids, that are allocated as long as a service is running
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).

For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.

A simple way to test this is:

        systemd-run -p DynamicUser=1 /bin/sleep 99999
2016-07-22 15:53:45 +02:00
Lennart Poettering
e4631b48e1 sysusers: move various user credential validity checks to src/basic/
This way we can reuse them for validating User=/Group= settings in unit files
(to be added in a later commit).

Also, add some tests for them.
2016-07-22 15:53:45 +02:00
Lennart Poettering
83f8e80857 core: support percentage specifications on TasksMax=
This adds support for a TasksMax=40% syntax for specifying values relative to
the system's configured maximum number of processes. This is useful in order to
neatly subdivide the available room for tasks within containers.
2016-07-22 15:33:12 +02:00
Lennart Poettering
910fd145f4 sd-id128: split UUID file read/write code into new id128-util.[ch]
We currently have code to read and write files containing UUIDs at various
places. Unify this in id128-util.[ch], and move some other stuff there too.

The new files are located in src/libsystemd/sd-id128/ (instead of src/shared/),
because they are actually the backend of sd_id128_get_machine() and
sd_id128_get_boot().

In follow-up patches we can use this reduce the code in nspawn and
machine-id-setup by adopted the common implementation.
2016-07-22 12:59:36 +02:00
Martin Pitt
bf3dd08a81 Merge pull request #3762 from poettering/sigkill-log
log about all processes we forcibly kill
2016-07-22 09:18:30 +02:00
Martin Pitt
5c3c778014 Merge pull request #3764 from poettering/assorted-stuff-2
Assorted fixes
2016-07-22 09:10:04 +02:00
Alessio Igor Bogani
4d07c8d386 missing_syscall: add __NR_copy_file_range for powerpc architecture (#3772) 2016-07-21 11:40:35 +02:00
Lennart Poettering
846b8fc30d bootctl: move toupper() implementation to string-util.h
We already have tolower() calls there, hence let's unify this at one place.
Also, update the code to only use ASCII operations, so that we don't end up
being locale dependant.
2016-07-21 11:37:58 +02:00
Lennart Poettering
c0f81393d1 basic: fix macro definition in nss-util.h
Fix a copy/paste mistake.
2016-07-20 14:53:15 +02:00
Lennart Poettering
0d5b481092 cgroup: suppress sending follow-up SIGCONT after sending SIGCONT/SIGKILL anyway 2016-07-20 14:35:15 +02:00