[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 20/24] replay: simple auto-snapshot mode for record
From: |
Nicholas Piggin |
Subject: |
Re: [PATCH v4 20/24] replay: simple auto-snapshot mode for record |
Date: |
Tue, 12 Mar 2024 20:43:21 +1000 |
On Tue Mar 12, 2024 at 7:00 PM AEST, Pavel Dovgalyuk wrote:
> On 11.03.2024 20:40, Nicholas Piggin wrote:
> > record makes an initial snapshot when the machine is created, to enable
> > reverse-debugging. Often the issue being debugged appears near the end of
> > the trace, so it is important for performance to keep snapshots close to
> > the end.
> >
> > This implements a periodic snapshot mode that keeps a rolling set of
> > recent snapshots. This could be done by the debugger or other program
> > that talks QMP, but for setting up simple scenarios and tests, this is
> > more convenient.
> >
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> > docs/system/replay.rst | 5 ++++
> > include/sysemu/replay.h | 11 ++++++++
> > replay/replay-snapshot.c | 57 ++++++++++++++++++++++++++++++++++++++++
> > replay/replay.c | 27 +++++++++++++++++--
> > system/vl.c | 9 +++++++
> > qemu-options.hx | 9 +++++--
> > 6 files changed, 114 insertions(+), 4 deletions(-)
> >
> > diff --git a/docs/system/replay.rst b/docs/system/replay.rst
> > index ca7c17c63d..1ae8614475 100644
> > --- a/docs/system/replay.rst
> > +++ b/docs/system/replay.rst
> > @@ -156,6 +156,11 @@ for storing VM snapshots. Here is the example of the
> > command line for this:
> > ``empty.qcow2`` drive does not connected to any virtual block device and
> > used
> > for VM snapshots only.
> >
> > +``rrsnapmode`` can be used to select just an initial snapshot or periodic
> > +snapshots, with ``rrsnapcount`` specifying the number of periodic snapshots
> > +to maintain, and ``rrsnaptime`` the amount of run time in seconds between
> > +periodic snapshots.
> > +
> > .. _network-label:
> >
> > Network devices
> > diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
> > index 8102fa54f0..92fa82842b 100644
> > --- a/include/sysemu/replay.h
> > +++ b/include/sysemu/replay.h
> > @@ -48,6 +48,17 @@ typedef enum ReplayCheckpoint ReplayCheckpoint;
> >
> > typedef struct ReplayNetState ReplayNetState;
> >
> > +enum ReplaySnapshotMode {
> > + REPLAY_SNAPSHOT_MODE_INITIAL,
> > + REPLAY_SNAPSHOT_MODE_PERIODIC,
> > +};
>
> This should be defined in replay-internal.h, because it is internal for
> replay.
>
> > +typedef enum ReplaySnapshotMode ReplaySnapshotMode;
> > +
> > +extern ReplaySnapshotMode replay_snapshot_mode;
> > +
> > +extern uint64_t replay_snapshot_periodic_delay;
> > +extern int replay_snapshot_periodic_nr_keep;
>
> These ones are internal too.
Okay for both.
>
> > +
> > /* Name of the initial VM snapshot */
> > extern char *replay_snapshot;
> >
> > diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
> > index ccb4d89dda..762555feaa 100644
> > --- a/replay/replay-snapshot.c
> > +++ b/replay/replay-snapshot.c
> > @@ -70,6 +70,53 @@ void replay_vmstate_register(void)
> > vmstate_register(NULL, 0, &vmstate_replay, &replay_state);
> > }
> >
> > +static QEMUTimer *replay_snapshot_timer;
> > +static int replay_snapshot_count;
> > +
> > +static void replay_snapshot_timer_cb(void *opaque)
> > +{
> > + Error *err = NULL;
> > + char *name;
> > +
> > + if (!replay_can_snapshot()) {
> > + /* Try again soon */
> > + timer_mod(replay_snapshot_timer,
> > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> > + replay_snapshot_periodic_delay / 10);
> > + return;
> > + }
> > +
> > + name = g_strdup_printf("%s-%d", replay_snapshot,
> > replay_snapshot_count);
> > + if (!save_snapshot(name,
> > + true, NULL, false, NULL, &err)) {
> > + error_report_err(err);
> > + error_report("Could not create periodic snapshot "
> > + "for icount record, disabling");
> > + g_free(name);
> > + return;
> > + }
> > + g_free(name);
> > + replay_snapshot_count++;
> > +
> > + if (replay_snapshot_periodic_nr_keep >= 1 &&
> > + replay_snapshot_count > replay_snapshot_periodic_nr_keep) {
> > + int del_nr;
> > +
> > + del_nr = replay_snapshot_count - replay_snapshot_periodic_nr_keep
> > - 1;
> > + name = g_strdup_printf("%s-%d", replay_snapshot, del_nr);
>
> Copy-paste of snapshot name format.
Yes good catch.
>
> > + if (!delete_snapshot(name, false, NULL, &err)) {
> > + error_report_err(err);
> > + error_report("Could not delete periodic snapshot "
> > + "for icount record");
> > + }
> > + g_free(name);
> > + }
> > +
> > + timer_mod(replay_snapshot_timer,
> > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> > + replay_snapshot_periodic_delay);
> > +}
> > +
> > void replay_vmstate_init(void)
> > {
> > Error *err = NULL;
> > @@ -82,6 +129,16 @@ void replay_vmstate_init(void)
> > error_report("Could not create snapshot for icount
> > record");
> > exit(1);
> > }
> > +
> > + if (replay_snapshot_mode == REPLAY_SNAPSHOT_MODE_PERIODIC) {
> > + replay_snapshot_timer = timer_new_ms(QEMU_CLOCK_REALTIME,
> > +
> > replay_snapshot_timer_cb,
> > + NULL);
> > + timer_mod(replay_snapshot_timer,
> > + qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
> > + replay_snapshot_periodic_delay);
> > + }
> > +
> > } else if (replay_mode == REPLAY_MODE_PLAY) {
> > if (!load_snapshot(replay_snapshot, NULL, false, NULL, &err))
> > {
> > error_report_err(err);
> > diff --git a/replay/replay.c b/replay/replay.c
> > index 895fa6b67a..c916e71d30 100644
> > --- a/replay/replay.c
> > +++ b/replay/replay.c
> > @@ -29,6 +29,10 @@
> > ReplayMode replay_mode = REPLAY_MODE_NONE;
> > char *replay_snapshot;
> >
> > +ReplaySnapshotMode replay_snapshot_mode;
> > +uint64_t replay_snapshot_periodic_delay;
> > +int replay_snapshot_periodic_nr_keep;
> > +
> > /* Name of replay file */
> > static char *replay_filename;
> > ReplayState replay_state;
> > @@ -424,6 +428,27 @@ void replay_configure(QemuOpts *opts)
> > }
> >
> > replay_snapshot = g_strdup(qemu_opt_get(opts, "rrsnapshot"));
> > + if (replay_snapshot && mode == REPLAY_MODE_RECORD) {
> > + const char *snapmode;
> > +
> > + snapmode = qemu_opt_get(opts, "rrsnapmode");
> > + if (!snapmode || !strcmp(snapmode, "initial")) {
> > + replay_snapshot_mode = REPLAY_SNAPSHOT_MODE_INITIAL;
> > + } else if (!strcmp(snapmode, "periodic")) {
> > + replay_snapshot_mode = REPLAY_SNAPSHOT_MODE_PERIODIC;
> > + } else {
> > + error_report("Invalid rrsnapmode option: %s", snapmode);
> > + exit(1);
> > + }
> > +
> > + /* Default 10 host seconds of machine runtime per snapshot. */
> > + replay_snapshot_periodic_delay =
> > + qemu_opt_get_number(opts, "rrsnaptime", 10) *
> > 1000;
>
> Can user set it to zero here?
I guess so. It would just continually snapshot, which is probably okay
if that's what you want.
Thanks,
Nick
- [PATCH v4 13/24] tests/avocado: replay_linux.py remove the timeout expected guards, (continued)
- [PATCH v4 13/24] tests/avocado: replay_linux.py remove the timeout expected guards, Nicholas Piggin, 2024/03/11
- [PATCH v4 15/24] tests/avocado: reverse_debugging.py add test for x86-64 q35 machine, Nicholas Piggin, 2024/03/11
- [PATCH v4 17/24] tests/avocado: reverse_debugging.py stop VM before sampling icount, Nicholas Piggin, 2024/03/11
- [PATCH v4 18/24] tests/avocado: reverse_debugging reverse-step at the end of the trace, Nicholas Piggin, 2024/03/11
- [PATCH v4 14/24] tests/avocado/reverse_debugging.py: mark aarch64 and pseries as not flaky, Nicholas Piggin, 2024/03/11
- [PATCH v4 16/24] tests/avocado: reverse_debugging.py verify addresses between record and replay, Nicholas Piggin, 2024/03/11
- [PATCH v4 19/24] tests/avocado: reverse_debugging.py add snapshot testing, Nicholas Piggin, 2024/03/11
- [PATCH v4 21/24] tests/avocado: reverse_debugging.py test auto-snapshot mode, Nicholas Piggin, 2024/03/11
- [PATCH v4 20/24] replay: simple auto-snapshot mode for record, Nicholas Piggin, 2024/03/11
- [PATCH v4 22/24] target/ppc: fix timebase register reset state, Nicholas Piggin, 2024/03/11
- [PATCH v4 23/24] spapr: Fix vpa dispatch count for record-replay, Nicholas Piggin, 2024/03/11
- [PATCH v4 24/24] tests/avocado: replay_linux.py add ppc64 pseries test, Nicholas Piggin, 2024/03/11