emacs-devel

Re: Some experience with the igc branch


From: Gerd Möllmann
Subject: Re: Some experience with the igc branch
Date: Wed, 25 Dec 2024 13:50:37 +0100
User-agent: Gnus/5.13 (Gnus v5.13)

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Gerd Möllmann <gerd.moellmann@gmail.com>
>> Cc: pipcet@protonmail.com,  ofv@wanadoo.es,  emacs-devel@gnu.org,
>>   eller.helmut@gmail.com,  acorallo@gnu.org
>> Date: Wed, 25 Dec 2024 05:56:26 +0100
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> The SIGPROF handler does two things: (1) get the current backtrace,
>> >> which does not trip on memory barriers, and (2) build a summary, i.e.
>> >> count same backtraces using a hash table. (2) trips on memory barriers.
>> >
>> > Can you elaborate on (2) and why it trips?  I guess I'm missing
>> > something because I don't understand which code in record_backtrace
>> > does trip on memory barriers and why.
>> 
>> Ok, (2) begins as shown below.
>> 
>>   static void
>>   record_backtrace (struct profiler_log *plog, EMACS_INT count)
>>   {
>>     log_t *log = plog->log;
>>     get_backtrace (log->trace, log->depth);
>>   --- (2) begins after this line -------------------------------
>>     EMACS_UINT hash = trace_hash (log->trace, log->depth);
>> 
>> The SIGPROF can have interrupted Emacs at any point, both the MPS thread
>> and all others. MPS may have been doing arbitrary stuff when
>> interrupted, and Emacs threads too. Memory barriers may be on
>> unpredictable segments of memory, as they usually are, as part of MPS'
>> GC implementation. Do you agree with this picture?
>> 
>> Elsewhere I tried to explain why I think this works up to the line
>> marked (2) above. Now enter trace_hash. Current implementation:
>> 
>>   static EMACS_UINT
>>   trace_hash (Lisp_Object *trace, int depth)
>>   {
>>     EMACS_UINT hash = 0;
>>     for (int i = 0; i < depth; i++)
>>       {
>>         Lisp_Object f = trace[i];
>>         EMACS_UINT hash1;
>>   #ifdef HAVE_MPS
>>         hash1 = (CLOSUREP (f) ? igc_hash (AREF (f, CLOSURE_CODE)) : igc_hash (f));
>>                  ^^^^^^^^       ^^^^^^^^  ^^^^
>> 
>> The constructs I marked with ^^^ all access the memory of F. F is a
>> vectorlike, its memory is managed by MPS in an MPS pool that uses
>> memory barriers, so the memory of F can currently be behind a barrier.
>> It doesn't have to, but it can.
>> 
>> When we access F's memory and it is behind a barrier, the result is a
>> nested SIGSEGV while handling SIGPROF.
>
> Two followup questions:
>
>   . how is accessing F different from accessing the specpdl stack?

F's memory is allocated from an MPS pool via alloc_impl in igc.c. Most
objects are allocated from a pool that uses barriers (I think except
PVEC_THREAD). The specpdl stacks are malloc'd (see
grow_specpdl_allocation) and are used as roots. There are currently no
barriers on roots.
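
To make the difference concrete, here is a toy sketch, assuming a POSIX
system. It is not Emacs or MPS code, and every toy_* name is made up for
illustration. One mmap'ed page plays the role of a pool segment with a
barrier on it, and one malloc'd int plays the role of a root. A SIGSEGV
handler stands in for what the collector does when a barrier is hit.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Toy model only; none of these names exist in Emacs or MPS.  */
static char *toy_segment;        /* stands in for a pool segment */
static int *toy_root;            /* stands in for a malloc'd root */
static size_t toy_page_size;
static volatile sig_atomic_t toy_barrier_hits;

/* Stand-in for the collector's barrier handling: count the hit and lift
   the protection, so the faulting access is retried and succeeds.
   (mprotect is not formally async-signal-safe under POSIX; this is the
   usual pragmatic use.)  */
static void
toy_barrier_handler (int sig, siginfo_t *info, void *ctx)
{
  (void) sig; (void) info; (void) ctx;
  toy_barrier_hits++;
  mprotect (toy_segment, toy_page_size, PROT_READ | PROT_WRITE);
}

/* Store VALUE in both the "pool" page and the "root", then put a
   barrier on the page only.  Return 0 on success.  */
static int
toy_setup (int value)
{
  toy_page_size = (size_t) sysconf (_SC_PAGESIZE);
  toy_segment = mmap (NULL, toy_page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  toy_root = malloc (sizeof *toy_root);
  if (toy_segment == MAP_FAILED || toy_root == NULL)
    return -1;
  *(int *) toy_segment = value;
  *toy_root = value;

  struct sigaction sa;
  memset (&sa, 0, sizeof sa);
  sa.sa_sigaction = toy_barrier_handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGSEGV, &sa, NULL) != 0)
    return -1;
  sigaction (SIGBUS, &sa, NULL); /* some systems raise SIGBUS instead */

  toy_barrier_hits = 0;
  return mprotect (toy_segment, toy_page_size, PROT_NONE);
}

/* Reading the "pool object" may fault, like the accesses to F.  */
static int
toy_read_pool_object (void)
{
  return *(volatile int *) toy_segment;
}

/* Reading the "root" never faults, like reading a specpdl stack.  */
static int
toy_read_root (void)
{
  return *toy_root;
}
```

After toy_setup, toy_read_root returns without any signal, while the
first toy_read_pool_object goes through the handler once. In the
profiler case the corresponding fault would arrive while SIGPROF is
already being handled, which is the nesting described above.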

>   . how does this work with the current GC, where F could have been
>     collected and its memory freed?

I think when we find F in a specpdl stack, GC should have seen it and
marked it too in mark_specpdl. So it wouldn't be freed.

(Same for igc, where the stacks are roots, so it should have seen F
that way in scan_specpdl.)

> The first question is more important, from where I stand.  Looking
> forward beyond the point where we land igc on master, I wonder how
> we will be able to tell, for a random non-trivial change on the C level,
> whether what it does can cause trouble with MPS?  That is, how can a
> mere mortal determine whether a given data structure in igc Emacs can
> or cannot be safely touched when MPS happens to do its thing, whether
> synchronously or asynchronously?  We must have some reasonably
> practical way of telling this, or else we will be breaking Emacs high
> and low.
>
>> More code accessing memory that is potentially behind a barrier follows
>> in record_backtrace.
>
> Which code is that?  (It's a serious question: I tried to identify
> that code, but couldn't.  I'm probably missing something.)

The example I saw, with ^^^^ marking the call sites:

static void
record_backtrace (struct profiler_log *plog, EMACS_INT count)
{
  log_t *log = plog->log;
  get_backtrace (log->trace, log->depth);
  EMACS_UINT hash = trace_hash (log->trace, log->depth);
  int hidx = log_hash_index (log, hash);
  int idx = log->index[hidx];
  while (idx >= 0)
    {
      if (log->hash[idx] == hash
          && trace_equal (log->trace, get_key_vector (log, idx), log->depth))
             ^^^^^^^^^^^

static bool
trace_equal (Lisp_Object *bt1, Lisp_Object *bt2, int depth)
{
  for (int i = 0; i < depth; i++)
    if (!BASE_EQ (bt1[i], bt2[i]) && NILP (Ffunction_equal (bt1[i], bt2[i])))
                                           ^^^^^^^^^^^^^^^

DEFUN ("function-equal", Ffunction_equal, Sfunction_equal, 2, 2, 0,
       doc: /* Return non-nil if F1 and F2 come from the same source.
Used to determine if different closures are just different instances of
the same lambda expression, or are really unrelated function.  */)
     (Lisp_Object f1, Lisp_Object f2)
{
  bool res;
  if (EQ (f1, f2))
    res = true;
  else if (CLOSUREP (f1) && CLOSUREP (f2))
           ^^^^^^^^         ^^^^^^^^
    res = EQ (AREF (f1, CLOSURE_CODE), AREF (f2, CLOSURE_CODE));
              ^^^^                     ^^^^

Didn't look further than that, though.
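
As a rough way to see which of the marked operations touch object memory
and which only compare words, here is a toy model. It is a sketch only:
the toy_* names, the struct layout, and the deref counter are all made
up, and real Emacs also looks at tag bits in the word before it reads an
object's header, which this model skips.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in Emacs.  */
typedef uintptr_t toy_obj;              /* stand-in for Lisp_Object */

enum { TOY_CLOSURE = 1, TOY_CODE_SLOT = 0 };

struct toy_vectorlike
{
  int type;                             /* stand-in for the header */
  toy_obj slots[2];
};

static int toy_derefs;                  /* accesses to object memory */

/* Like BASE_EQ: compares the two words, never follows them.  */
static bool
toy_base_eq (toy_obj a, toy_obj b)
{
  return a == b;
}

/* Every access to an object's memory goes through here, so the counter
   shows where a barrier could be hit.  */
static struct toy_vectorlike *
toy_deref (toy_obj o)
{
  toy_derefs++;
  return (struct toy_vectorlike *) o;
}

static bool
toy_closurep (toy_obj o)
{
  return toy_deref (o)->type == TOY_CLOSURE;
}

static toy_obj
toy_aref (toy_obj o, int i)
{
  return toy_deref (o)->slots[i];
}

/* Same shape as the part of Ffunction_equal quoted above.  */
static bool
toy_function_equal (toy_obj f1, toy_obj f2)
{
  if (toy_base_eq (f1, f2))
    return true;
  if (toy_closurep (f1) && toy_closurep (f2))
    return toy_base_eq (toy_aref (f1, TOY_CODE_SLOT),
                        toy_aref (f2, TOY_CODE_SLOT));
  return false;
}

/* Same shape as trace_equal.  */
static bool
toy_trace_equal (const toy_obj *bt1, const toy_obj *bt2, int depth)
{
  for (int i = 0; i < depth; i++)
    if (!toy_base_eq (bt1[i], bt2[i])
        && !toy_function_equal (bt1[i], bt2[i]))
      return false;
  return true;
}

/* Compare two one-frame traces that are equal, and return how often
   object memory was touched while doing so (-1 if they compare
   unequal, which should not happen here).  */
static int
toy_count_derefs (bool identical_frames)
{
  static struct toy_vectorlike c1 = { TOY_CLOSURE, { 100, 0 } };
  static struct toy_vectorlike c2 = { TOY_CLOSURE, { 100, 0 } };
  toy_obj t1[1] = { (toy_obj) &c1 };
  toy_obj t2[1] = { (toy_obj) (identical_frames ? &c1 : &c2) };
  toy_derefs = 0;
  return toy_trace_equal (t1, t2, 1) ? toy_derefs : -1;
}
```

With identical frames the word comparison short-circuits and nothing is
dereferenced. With two distinct closures that share their code slot, the
comparison has to read both objects, and each of those reads is a place
where a barrier could be in the way.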


