3834 commits

Author SHA1 Message Date
Thomas Lamprecht
78e5d8a8d0 bump version to 9.0.23
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-10-03 22:12:27 +02:00
Fiona Ebner
86e52f8660 tests: cfg2cmd: regenerate with QEMU 10.1 binary
For the rationale about the change see "cfg2cmd: turn off hpet for
Linux VMs running at least kernel 2.6 and machine type >= 10.1".

Bump the build dependency in d/control.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250929125852.102343-7-f.ebner@proxmox.com
2025-10-03 21:56:18 +02:00
Fiona Ebner
3a4e014474 cfg2cmd: turn off hpet for Linux VMs running at least kernel 2.6 and machine type >= 10.1
Recent enough Linux versions already use 'kvm-clock' rather than
'hpet' as the default clock source [0][1]. Changes in QEMU [2] led to
slightly increased CPU usage when using hpet [3][4]:

> the timer must be kept running even if not enabled, in
> order to set the ISR flag, so writes to HPET_TN_CFG must
> not call hpet_del_timer()

Upstream suggested not using hpet if possible [5][6]:

> That said, if you can disable the HPET timer by default without
> problems with e.g. live migration I strongly suggest you do. And in
> the mean time you can also revert these patches, they were actually
> reported as bugs but it's not clear what guest OS was affected.

> No, the bug reports are really just for corner cases and there are no
> huge issues. However, both Linux and Windows give the HPET a
> relatively high priority that it probably does not deserve. :)

There were more changes in QEMU, so it would require more reverts.
Thus, disable the timer. People having a Linux VM pinned to an older
machine version or using other os types will see the increased usage
again if installing the new QEMU 10.1 binary, but that seems like a
fair trade-off for reducing CPU load for everybody else and being able
to move forward.

The is_linux() helper does not include the 'l24' os type by default,
because all but one of the existing checks, as well as the newly
introduced check, are specifically for 'l26' and most future features
are not worth considering for 'l24' either.

Users of Linux 2.6.x before v2.6.26 might need to pin the machine
version or manually enable hpet if they want to continue using HPET.
Otherwise, there is acpi_pm since v2.6.18 that should be automatically
picked.

[0]: /sys/devices/system/clocksource/clocksource0/current_clocksource
[1]: Kernel commit 790c73f6289a ("x86: KVM guest: paravirtualized clocksource") in v2.6.26+
[2]: QEMU commit f0ccf77078 ("hpet: fix and cleanup persistence of interrupt status")
[3]: https://lore.kernel.org/qemu-devel/8183674f-a9cc-4727-bb52-fe3d3e44804b@proxmox.com/
[4]: https://forum.proxmox.com/threads/161849/post-756793
[5]: https://lore.kernel.org/qemu-devel/CABgObfaKJ5NFVKmYLFmu4C0iZZLJJtcWksLCzyA0tBoz0koZ4A@mail.gmail.com/
[6]: https://lore.kernel.org/qemu-devel/CABgObfYnOzg=BPeG5BjSmGEV_Q0pR7xGg6L3XNQCONtU_GiuGA@mail.gmail.com/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250929125852.102343-6-f.ebner@proxmox.com
2025-10-03 21:56:18 +02:00
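
The rule described in commit 3a4e014474 can be sketched as follows. This is a minimal Python illustration, not the Perl code from the repository; the function name `hpet_disabled` and the representation of the machine version as a (major, minor) tuple are assumptions made for the example.

```python
def hpet_disabled(ostype: str, machine_version: tuple) -> bool:
    """Return True if hpet would be turned off under the rule in the commit message.

    Assumptions: only the 'l26' OS type counts as Linux here (not 'l24'),
    and the machine version is passed as a (major, minor) tuple.
    """
    is_linux = ostype == "l26"
    # Tuples compare element by element, so (10, 1) >= (10, 1) and (11, 0) >= (10, 1).
    return is_linux and machine_version >= (10, 1)
```

A VM pinned to an older machine version, or one using another OS type, keeps hpet under this rule, which matches the trade-off described in the message.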
Fiona Ebner
dcaf59736b config: schema: define default OS type
Like this, the Cfg2Cmd module can fill in the default and only needs
to check for definedness once.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250929125852.102343-5-f.ebner@proxmox.com
2025-10-03 21:56:18 +02:00
Fiona Ebner
7b6cb366bf introduce dedicated cfg2cmd module
Having a dedicated Cfg2Cmd class allows having a cleaner interface by
only calling into pre-defined methods. Important, global information
about the VM like machine type or OS version will be recorded by the
object and can be queried via methods. For now, there is only
windows_version(). There will be sub-classes, each concerning a
dedicated part of the configuration. The first one is for the timer.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250929125852.102343-4-f.ebner@proxmox.com
2025-10-03 21:56:18 +02:00
Fiona Ebner
3c46c7ddb3 tests: add tests for non-{Linux, Windows} OS types
For 'l24', use 'vga: qxl' since that is the only place where a Linux
check is done not only for 'l26'.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250929125852.102343-3-f.ebner@proxmox.com
2025-10-03 21:56:18 +02:00
Fiona Ebner
b795c0a390 tests: cfg2cmd: add tests for startdate parameter and disabled time drift fix
No tests currently cover these settings.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250929125852.102343-2-f.ebner@proxmox.com
2025-10-03 21:56:18 +02:00
Fiona Ebner
33c0169fef backup: fleecing: avoid warning when querying block node size for TPM state
Currently, there only is a warning that the fallback, being the size
queried from the storage, is used. This should work in all cases, but
there are plans for supporting TPM state as a FUSE/NBD export from an
underlying qcow2 image where it might still work, but the
correspondence of size between the attached block node in QEMU and the
storage layer already becomes much more blurry. Avoid the warning and
future-proof by also querying the size for the TPM state directly from
the attached block node in QEMU.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20251003103319.44974-3-f.ebner@proxmox.com
2025-10-03 13:48:39 +02:00
Fiona Ebner
cd82e3f270 fix #6882: backup provider api: fix backup with TPM state by correctly generating node name
The backup-access API in QEMU expects the '-backup' suffix to be
present for the TPM state fleecing image too. This is a regression of
the switch to using blockdev for fleecing images with commit f92c1fa0
("backup: use blockdev for fleecing images"). Add special handling to
the fleecing_node_name() helper to fix it.

Fleecing backups to native plugins do not use a dedicated image for
the TPM state, so this only affected the backup provider API.

Fixes: f92c1fa0 ("backup: use blockdev for fleecing images")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20251003103319.44974-2-f.ebner@proxmox.com
2025-10-03 13:48:39 +02:00
Fiona Ebner
14c57934ed fix #6828: remote migration: bump timeout for writing configuration to accommodate volume activation
The 'config' command will lead to volume activation being done for the
referenced volumes. This is because the 'config' handler in the
mtunnel API endpoint calls into the update_vm_api() function, which
uses the create_disks() function, which is also used for existing
disks. In create_disks(), each volume is activated to do an
existence/basic sanity check by querying its size.

There is no requirement to be fast when handling the 'config' command
during remote migration. Since there could be many disks for a given
VM, allow for up to 2 minutes instead of just 10 seconds.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20251001150118.136976-2-f.ebner@proxmox.com
2025-10-03 13:47:48 +02:00
Fiona Ebner
26717ab3b8 migration: conntrack: work around systemd issue where scope for VM might become blocked
Because of a systemd issue [0], when a service that's 'partOf' a scope
fails, the scope itself might end up being left-over, even after all
processes in the scope exit. In particular, this can happen for the
'$vmid.scope' when the 'pve-dbus-vmstate@$vmid.service' fails.

Doing a 'reset-failed' of the failed 'partOf' service leads to the
left-over scope being cleaned up too. Without that, users in that
situation would get a difficult-to-make-sense-of "timeout waiting on
systemd" error message.

[0]: https://github.com/systemd/systemd/issues/39141

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20250929122529.90484-3-f.ebner@proxmox.com
2025-10-03 13:27:23 +02:00
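
The workaround in commit 26717ab3b8 amounts to running `systemctl reset-failed` on the failed service. The Python sketch below only builds the command line; the helper name `reset_failed_command` is hypothetical, and how the repository actually invokes systemd is not shown in the commit message.

```python
def reset_failed_command(vmid: int) -> list:
    """Build the argv that resets the failed dbus-vmstate service for a VM.

    The unit name pattern is taken from the commit message; the helper
    itself is an illustrative assumption.
    """
    unit = f"pve-dbus-vmstate@{vmid}.service"
    return ["systemctl", "reset-failed", unit]
```

The resulting list could be passed to `subprocess.run()` on a host where the unit exists.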
Fiona Ebner
5595b5af80 dbus vmstate: add missing includes
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20250929122529.90484-2-f.ebner@proxmox.com
2025-10-03 13:27:23 +02:00
Fiona Ebner
87fa886ffe agent: move guest agent format and parsing to agent module
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20250909132613.96402-6-f.ebner@proxmox.com
2025-10-03 13:11:22 +02:00
Fiona Ebner
65c834d980 agent: prefer usage of get_qga_key() helper
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20250909132613.96402-5-f.ebner@proxmox.com
2025-10-03 13:11:22 +02:00
Fiona Ebner
82fd82c165 agent: implement fsfreeze helper to better handle lost commands
As reported via enterprise support, it can happen that a guest
agent command is read, but then the guest agent never sends an answer,
because the service in the guest is stopped/killed. For example, if a
guest reboot happens before the command can be successfully executed.
This is usually not problematic, but the fsfreeze-freeze command has a
timeout of 1 hour, so the guest agent socket would be blocked for that
amount of time, waiting on a command that is not being executed
anymore.

Use a lower timeout for the initial fsfreeze-freeze command, and issue
an fsfreeze-status command afterwards, which will return immediately
if the fsfreeze-freeze command already finished, and which will be
queued if not. This is used as a proxy to determine whether the
fsfreeze-freeze command is still running and to check whether it was
successful. Using a too low timeout would mean stuffing/queuing many
fsfreeze-status commands while the guest agent might still be busy
actually doing the freeze. In total, fsfreeze-freeze is still allowed
to take 1 hour, but the time the socket is blocked after a
"lost command" is at most 10 minutes.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20250909132613.96402-4-f.ebner@proxmox.com
2025-10-03 13:11:22 +02:00
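
The approach in commit 82fd82c165 can be sketched as a short freeze attempt followed by status polling. This is a simplified Python illustration under stated assumptions: `send(command, timeout)` is a hypothetical transport callback that returns the agent's reply or raises `TimeoutError`, the 10-minute step is inferred from the numbers in the message, and time is accounted for by summing timeouts rather than reading a clock.

```python
class FreezeTimeout(Exception):
    """The freeze did not finish within the overall time budget."""


def fsfreeze(send, total_timeout=3600, step_timeout=600):
    """Freeze guest filesystems without blocking on one long command.

    Returns True if the freeze is known to have succeeded, False if the
    guest reports that it is not frozen (for example after a lost command).
    """
    waited = 0
    try:
        send("fsfreeze-freeze", step_timeout)
        return True  # the agent answered within the first step
    except TimeoutError:
        waited += step_timeout

    # The status command is answered immediately once the freeze command
    # is no longer running, and is queued behind it otherwise.
    while waited < total_timeout:
        try:
            status = send("fsfreeze-status", step_timeout)
        except TimeoutError:
            waited += step_timeout
            continue
        return status == "frozen"

    raise FreezeTimeout(f"no answer from guest agent after {waited} seconds")
```

With the defaults, the freeze may still take up to one hour in total, while a single lost command blocks the caller for at most one step.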
Fiona Ebner
480af34c65 qmp client: remove erroneous comment
This is most likely a left-over from copy-pasting from an example in
the 'IO::Multiplex' man page. In particular, the method is not called
every second.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20250909132613.96402-3-f.ebner@proxmox.com
2025-10-03 13:11:22 +02:00
Fiona Ebner
66e46cb670 api: agent: improve module imports
Order module imports according to the style guide and add missing
PVE::QemuConfig import.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/all/20250909132613.96402-2-f.ebner@proxmox.com
2025-10-03 13:11:22 +02:00
Fiona Ebner
2af6b68301 drive device: precise arguments for scsihw_infos() helper
Avoid passing in the whole config and whole drive to a helper that
requires only very specific information.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-10-02 14:09:20 +02:00
Alexandre Derumier via pve-devel
87ad8fd1b6 introduce DriveDevice module
and move print_drivedevice_full

Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
Link: https://lore.proxmox.com/mailman.312.1755872353.385.pve-devel@lists.proxmox.com
[FE: run make tidy]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-10-02 14:09:20 +02:00
Fiona Ebner
fc8065547c tests: cfg2cmd: add tests for different SCSI controllers
All existing test cases only used virtio-scsi-pci or default, add
some coverage for others.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-10-02 14:08:54 +02:00
Wolfgang Bumiller
d8dafa7f02 api: add missing snapshot info to get-config return schema
These are included but missing in the schema.
Add them so they are generated for the rust types.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2025-10-01 10:29:31 +02:00
Fiona Ebner
31d6f5f63b vm status: also queue query-proxmox-support QMP commands
The vmstatus() function is used by pvestatd and needs to be fast.
However, the 'query-proxmox-support' querying is done sequentially for
each VM and each query has its own timeout (it's the default 5
seconds). If QMP is blocked for some reason for a single VM, that
already adds 5 seconds to the whole operation. Compared with the whole
stats querying queue, which is allowed to use 3 seconds in total, this
is rather extreme and needs to be fixed.

Back when commit 6891fd70 ("print query-proxmox-support result in
'full' status") was implemented, not all supported QEMU versions in
Proxmox VE implemented the 'query-proxmox-support' QMP command.
Because of this, the queue might be interrupted if ordering this
command too early. It still could've been ordered before the
'query-balloon' one, which also can fail. Nowadays, all supported QEMU
versions do implement the command and this just returns static
information which cannot fail (as long as QMP communication itself
works), so it can also be ordered at the beginning of the queue (after
the main 'query-status').

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250925122829.70121-3-f.ebner@proxmox.com
2025-09-25 15:55:55 +02:00
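
The timing argument in commit 31d6f5f63b can be made concrete with a small worst-case model. This Python sketch is only arithmetic on the numbers quoted in the message (5 seconds per sequential query, 3 seconds for the whole queue); the function name and the simplification that every blocked VM costs a full timeout are assumptions.

```python
def worst_case_delay(blocked_vms: int, per_query_timeout: float = 5.0,
                     queue_budget: float = 3.0) -> tuple:
    """Return (before, after) worst-case seconds spent on QMP queries.

    before: the queue budget plus one full timeout per blocked VM, since the
            support queries ran sequentially outside the queue.
    after:  the support queries are part of the queue, so only the queue
            budget applies.
    """
    before = queue_budget + blocked_vms * per_query_timeout
    after = queue_budget
    return before, after
```

Under this model a single blocked VM raises the bound from 3 to 8 seconds before the change, and each further blocked VM adds another 5 seconds.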
Fiona Ebner
528df52316 fix #6207: vm status: return undef values when disk{read, write} cannot be queried
If disk read/write cannot be queried because of QMP timeout, they
should not be reported as 0, because a consumer of the RRD stats
cannot distinguish between 0 being an actual 0 value and 0 being an
indicator for the absence of the real value. The RRD graphs in the UI
will already show this correctly.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250925122829.70121-2-f.ebner@proxmox.com
2025-09-25 15:54:34 +02:00
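
The distinction made in commit 528df52316 between "zero" and "unknown" can be sketched as follows. This is an illustrative Python version, with `None` standing in for Perl's undef; the function name and the input shape (a list of per-device dictionaries with `rd_bytes` and `wr_bytes`, or `None` when the query timed out) are assumptions for the example.

```python
from typing import Optional


def disk_counters(blockstats: Optional[list]) -> dict:
    """Sum per-device counters, or report None if they could not be queried."""
    if blockstats is None:
        # Query failed: report "unknown" rather than a misleading 0.
        return {"diskread": None, "diskwrite": None}
    return {
        "diskread": sum(dev["rd_bytes"] for dev in blockstats),
        "diskwrite": sum(dev["wr_bytes"] for dev in blockstats),
    }
```

A consumer can then treat `None` as a gap in the data, while an empty device list still yields a genuine 0.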
Christoph Heiss
d44663ce6d api: dbus-vmstate: fix return property name
It's called `has-dbus-vmstate` in the actual returned object, not
`dbus-vmstate`.

Signed-off-by: Christoph Heiss <c.heiss@proxmox.com>
Link: https://lore.proxmox.com/20250924090158.132032-1-c.heiss@proxmox.com
2025-09-24 11:07:01 +02:00
Fiona Ebner
c5cfa92ebb fix #6713: snapshot volume chain: fix snapshot after disk move with zeroinit
After mirror, the node below throttle might not be the format node,
but can also be a zeroinit node. In particular, this is the node that
needs to be used when replacing the blockdev for volume-chain
snapshots for the 'current' snapshot. Look up the actually inserted
node below throttle, rather than assuming that it's the format node.

Also removes the $src_file_blockdev_name variable that has been unused
since commit e7cf7c00 ("blockdev: delete/replace: re-use detach()
helper").

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-By: Aaron Lauterer <a.lauterer@proxmox.com>
Link: https://lore.proxmox.com/20250919120848.60751-1-f.ebner@proxmox.com
2025-09-22 19:27:02 +02:00
Thomas Lamprecht
5208318f81 bump version to 9.0.22
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-09-17 18:37:58 +02:00
Fiona Ebner
0db14e6010 vm start: remove left-over VM-state-related properties
When cloning from a snapshot, VM-state-related properties would be
accidentally copied, see commit "partially fix #6805: api: clone:
properly remove all snapshot-related info". Detect and remove such
left-over properties upon VM start.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250917163038.488710-7-f.ebner@proxmox.com
2025-09-17 18:35:14 +02:00
Fiona Ebner
bd1a858f43 vm commandline: handle 'nets-host-mtu' property in snapshot
Fixes: 7ceb6b72 ("snapshot: introduce running-nets-host-mtu property")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250917163038.488710-6-f.ebner@proxmox.com
2025-09-17 18:35:14 +02:00
Fiona Ebner
80236b100e resume from suspended: properly handle 'nets-host-mtu'
Fixes: 7ceb6b72 ("snapshot: introduce running-nets-host-mtu property")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250917163038.488710-5-f.ebner@proxmox.com
2025-09-17 18:35:14 +02:00
Fiona Ebner
f1fe3b7489 api: create/update: disallow setting 'running-nets-host-mtu' via API
Like the other snapshot-related properties, it should not be possible
to set 'running-nets-host-mtu' via the API.

Fixes: 7ceb6b72 ("snapshot: introduce running-nets-host-mtu property")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250917163038.488710-4-f.ebner@proxmox.com
2025-09-17 18:35:14 +02:00
Fiona Ebner
7c05f0be98 partially fix #6805: api: modify vm config: privilege checks for VM-state-related properties
Currently, the VM-state-related properties 'runningcpu',
'runningmachine' and 'running-nets-host-mtu' are not supposed to end
up in the VM configuration of a remote-migratable VM, because a
suspended VM is not yet migratable. However, there was a bug and the
properties were not removed after cloning from a snapshot, see commit
"partially fix #6805: api: clone: properly remove all snapshot-related
info". Upon remote migration, the property would be encountered and
would be limited to root@pam only. Also, migrating suspended VMs might
be implemented in the future, i.e. BZ issue #2252.

To aid fixing bug #6805 and preparing for issue #2252 in the future,
do proper privilege checking for configuration properties related to
the running VM state.

Note that the 'vmstate' property is already checked for in the
check_vm_modify_config_perm() helper. Note that VM-state-related
properties cannot be set via API by a user.

Originally-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250917163038.488710-3-f.ebner@proxmox.com
2025-09-17 18:35:14 +02:00
Fiona Ebner
6572ff5784 partially fix #6805: api: clone: properly remove all snapshot-related info
When cloning from a snapshot, the current VM state is not copied. Not
all relevant properties were dropped, leading to some left-overs in
the configuration of the clone target.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250917163038.488710-2-f.ebner@proxmox.com
2025-09-17 18:35:14 +02:00
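
Dropping snapshot-related state from a clone target, as described in commit 6572ff5784, can be sketched like this. The property names are those mentioned in the neighbouring commit messages; whether they form the complete set handled by the real code is an assumption, as is the helper name.

```python
# Properties tied to the running VM state, as named in the commit messages.
VM_STATE_PROPERTIES = (
    "vmstate",
    "runningcpu",
    "runningmachine",
    "running-nets-host-mtu",
)


def strip_vm_state_properties(conf: dict) -> dict:
    """Return a copy of the configuration without VM-state-related keys."""
    return {key: value for key, value in conf.items()
            if key not in VM_STATE_PROPERTIES}
```

Returning a copy leaves the snapshot's own configuration untouched, so only the clone target loses the properties.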
Thomas Lamprecht
e19ee1bf71 fix bad line continuations with mismatched quote characters
We had some bad line continuations of longer string literals using a
pattern like:

'foo"
."bar'

I.e., a 'line1"."line2' instead of an actually working 'line1'.'line2'
code.

This still resulted in valid perl by luck, making it go unnoticed, but
the resulting string was rather broken as it included the newline and
.' code part for the line continuation.

I noticed them due to bad indentation still using tabs, which was due
to perltidy not touching string literals.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-09-15 13:06:09 +02:00
Thomas Lamprecht
be406571a2 bump version to 9.0.21
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-09-10 15:17:14 +02:00
Thomas Lamprecht
fbf6a480da migration: make error for too-old target node even more explicit
Make it explicit that the target node is meant. While we printed the
node name itself, a user might miss that it refers to the target node
when reading this error, especially in bigger clusters with rather
similar node names like pve1, ..., pve11, pve12, ..., pve21.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-09-10 13:11:20 +02:00
Fiona Ebner
096a1ac228 migration: remove unused variable
The $version variable has been unused since commit 898e9296 ("migrate:
drop outdated PVE 7.2 check guarding cloudinit config section").

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250909091918.32254-3-f.ebner@proxmox.com
2025-09-10 13:11:20 +02:00
Fiona Ebner
67d8d092b1 migration: tell users to upgrade if nets-host-mtu is not supported
In Proxmox VE 9, the default behavior for VirtIO network devices is to
inherit the MTU from the bridge. This means that most migrations are
potentially problematic when the nets-host-mtu parameter is not set,
see commit 20c91f7f ("migration: preserve host_mtu for virtio-net
devices"). While setting the parameter could be avoided in some cases,
the information about which MTU the target node bridges have is not readily
available. Upgrading is already required to avoid actual problematic
cases, so just tell people to upgrade when the target does not support
preserving the VirtIO-net MTU yet in all cases.

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250909091918.32254-2-f.ebner@proxmox.com
2025-09-10 13:11:20 +02:00
Fiona Ebner
73897abcbd run make tidy
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-09-08 11:35:07 +02:00
Fiona Ebner
05eb8e6394 cfg2cmd: inform users that setting guest-phys-bits might be necessary when setting aw-bits
Until QEMU warns about this itself, inform the users here. Commit
message below copied from [1].

If a virtual machine is set up with an intel-iommu device, QEMU
allocates and maps the (virtual) I/O address space (IOAS) for a VFIO
passthrough device with iommufd.

In case of a mismatch of the address width of the host CPU and IOMMU
CPU, the guest physical address space (GPAS) and memory-type range
registers (MTRRs) are set up to the host CPU's address width, which
causes IOAS to be allocated and mapped outside of the IOMMU's maximum
guest address width (MGAW) and causes the following error from QEMU
(the error message is copied from the user forum [0]):

    kvm: vfio_container_dma_map(0x5c9222494280, 0x380000000000, 0x10000, 0x78075ee70000) = -22 (Invalid argument)

[0]: https://forum.proxmox.com/threads/169586/page-3#post-795717
[1]: https://lore.proxmox.com/pve-devel/20250902112307.124706-5-d.kral@proxmox.com/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-09-08 11:34:11 +02:00
Daniel Kral
dc52c006ce fix #6608: expose viommu driver aw-bits option
Since QEMU 9.2 [0], the default I/O address space bit width was raised
from 39 bits to 48 bits for the Intel vIOMMU driver, which makes the
aw-bits check introduced in [1] trip for host CPUs with less than 48
bits physical address width from QEMU 9.2 onwards:

vfio 0000:XX:YY.Z: Failed to set vIOMMU: aw-bits 48 > host aw-bits 39

For VFIO devices where a vIOMMU is in-use, QEMU fetches the IOVA ranges
with the iommufd ioctl IOMMU_IOAS_IOVA_RANGES or the vfio_iommu_type1's
VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE info, so 'phys-bits' doesn't change
the behavior of the check.

Therefore, expose the 'aw-bits' option of the intel-iommu and
virtio-iommu QEMU drivers to allow users to set the value.

[0] qemu ddd84fd0c1 ("intel_iommu: Set default aw_bits to 48 starting from QEMU 9.2")
[1] qemu 77f6efc0ab ("intel_iommu: Check compatibility with host IOMMU capabilities")

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
Link: https://lore.proxmox.com/20250905141529.215689-1-d.kral@proxmox.com
2025-09-08 10:09:48 +02:00
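
The compatibility check behind the error in commit dc52c006ce can be sketched as a simple comparison. This Python version only mirrors the message format quoted above; the function name is hypothetical and the real check lives in QEMU, not in this repository.

```python
def check_aw_bits(aw_bits: int, host_aw_bits: int) -> None:
    """Raise if the vIOMMU address width exceeds what the host IOMMU supports."""
    if aw_bits > host_aw_bits:
        raise ValueError(
            f"Failed to set vIOMMU: aw-bits {aw_bits} > host aw-bits {host_aw_bits}"
        )
```

With the exposed option, a user on a 39-bit host can set `aw-bits` to 39 so that the comparison no longer fails.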
Thomas Lamprecht
4ce8581958 bump version to 9.0.20
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2025-09-04 19:49:31 +02:00
Fiona Ebner
7ceb6b7255 snapshot: introduce running-nets-host-mtu property
For VirtIO network devices, it is necessary to preserve the values and
presence of the host_mtu setting when restoring a snapshot. See commit
"migration: preserve host_mtu for virtio-net devices" for details.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Link: https://lore.proxmox.com/20250904124113.81772-7-f.ebner@proxmox.com
2025-09-04 19:47:58 +02:00
Fiona Ebner
0bcf41ed42 snapshot: save vmstate: die when PID cannot be obtained
The call get_current_qemu_machine() already depends on the virtual
machine running, so not being able to obtain the PID is very
unexpected. Quietly not including the running CPU in the snapshot can
lead to not being able to restore the snapshot later, so die early
instead.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Link: https://lore.proxmox.com/20250904124113.81772-6-f.ebner@proxmox.com
2025-09-04 19:47:58 +02:00
Fiona Ebner
8595594d38 snapshot: save vmstate: avoid using deprecated check_running() function
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Link: https://lore.proxmox.com/20250904124113.81772-5-f.ebner@proxmox.com
2025-09-04 19:47:58 +02:00
Fiona Ebner
20c91f7f3a migration: preserve host_mtu for virtio-net devices
The virtual hardware is generated differently (at least for i440fx
machines) when host_mtu is set or not set on the netdev command line
[0]. When the MTU is the same value as the default 1500, Proxmox VE
did not add a host_mtu parameter. This is problematic for migration
where host_mtu is present on one end of the migration, but not on the
other [1]. Moreover, the effective setting in the guest (state) will
still be the host_mtu from the source side, even if a different value
is used for host_mtu on the target instance's commandline. This will
not lead to an error loading the migration stream in QEMU, but having
a larger host_mtu than the bridge MTU is still problematic for certain
network traffic like
> iperf3 -c 10.10.10.11 -u -l 2k
when host_mtu=9000 and bridge MTU=1500.

Pass the values from the source to the target during migration to be
able to preserve them.

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346
[1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Link: https://lore.proxmox.com/20250904124113.81772-4-f.ebner@proxmox.com
2025-09-04 19:47:58 +02:00
Fiona Ebner
c4d2ee0610 api: vm start: introduce nets-host-mtu parameter for migration compat
The virtual hardware is generated differently (at least for i440fx
machines) when host_mtu is set or not set on the netdev command line
[0]. When the MTU is the same value as the default 1500, Proxmox VE
did not add a host_mtu parameter. This is problematic for migration
where host_mtu is present on one end of the migration, but not on the
other [1]. Moreover, the effective setting in the guest (state) will
still be the host_mtu from the source side, even if a different value
is used for host_mtu on the target instance's commandline. This will
not lead to an error loading the migration stream in QEMU, but having
a larger host_mtu than the bridge MTU is still problematic for certain
network traffic like
> iperf3 -c 10.10.10.11 -u -l 2k
when host_mtu=9000 and bridge MTU=1500. Starting a VM cold with such a
configuration is already prohibited, so also prevent it for migration.

Add the necessary parameter for VM start to allow preserving the
values going forward.

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346
[1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Link: https://lore.proxmox.com/20250904124113.81772-3-f.ebner@proxmox.com
2025-09-04 19:47:58 +02:00
Fiona Ebner
ffa2fd05e1 virtio-net: fix migration between default/non-default MTUs starting with machine version 10.0+pve1
The virtual hardware is generated differently (at least for i440fx
machines) when host_mtu is set or not set on the netdev command line
[0]. When the MTU is the same value as the default 1500, Proxmox VE
did not add a host_mtu parameter. This is problematic for migration
where host_mtu is present on one end of the migration, but not on the
other [1].

Always set the host_mtu parameter starting with machine version
10.0+pve1 to avoid this issue going forward. Handling migrations with
older machine versions is more involved and will be done in separate
patches. Thanks to Stefan Hanreich and Fabian Grünbichler for
discussing this with me!

Since print_netdevice_full() is also called for hotplug, it cannot
always use the $version_guard helper and needs to fall back to
min_version() then.

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346
[1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Link: https://lore.proxmox.com/20250904124113.81772-2-f.ebner@proxmox.com
2025-09-04 19:47:58 +02:00
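
The version guard described in commit ffa2fd05e1 can be sketched as follows. This is a simplified Python illustration, not the repository's Perl helpers: it assumes the machine version is given as a bare string such as "10.0+pve1" (without the machine type prefix), and the function names are hypothetical.

```python
import re


def parse_machine_version(version: str) -> tuple:
    """Parse 'MAJOR.MINOR' or 'MAJOR.MINOR+pveN' into (major, minor, pve)."""
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\+pve(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported machine version '{version}'")
    # A missing '+pveN' suffix is treated as pve revision 0.
    return int(match[1]), int(match[2]), int(match[3] or 0)


def host_mtu_option(machine_version: str, mtu: int = 1500):
    """Return the host_mtu option string, or None if it would be omitted."""
    if parse_machine_version(machine_version) >= (10, 0, 1):
        return f"host_mtu={mtu}"  # always set from 10.0+pve1 onwards
    # Older machine versions: only set for a non-default MTU.
    return f"host_mtu={mtu}" if mtu != 1500 else None
```

Because the option is always present from 10.0+pve1 onwards, both ends of a migration generate the same virtual hardware regardless of whether the MTU equals the default.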
Aaron Lauterer
35a1828bdc api: rrd: add missing ds parameter for png graph
Otherwise, no png would be generated if that parameter is omitted.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Tested-by: Michael Köppl <m.koeppl@proxmox.com>
Link: https://lore.proxmox.com/20250828125810.3642601-2-a.lauterer@proxmox.com
2025-09-03 18:37:02 +02:00
Fiona Ebner
b4c7f9b0f7 run make tidy
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2025-09-03 10:01:12 +02:00
Wolfgang Bumiller
96c6e79b25 api: minor schema fixup: 'string' is a type, not a format
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2025-09-01 15:32:14 +02:00