guile-devel
Re: GC Warning related to large mem block allocation - Help needed


From: Freja Nordsiek
Subject: Re: GC Warning related to large mem block allocation - Help needed
Date: Mon, 1 Jan 2018 15:11:32 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

David Pirotte and Guile Devel,


Changing GC_LARGE_ALLOC_WARN_INTERVAL seems like the best solution to
me. Since changing the allocator was suggested earlier in the
discussion, I looked into whether that alternative was feasible, to
avoid having to change GC_LARGE_ALLOC_WARN_INTERVAL. I did some digging
around on https://github.com/ivmai/bdwgc and http://www.hboehm.info/gc/
and have dug around a bit inside bytevectors.c in the past. My
conclusion is that while changing the allocator properly would fix the
warning, it could introduce more problems. So, sadly, it seems changing
GC_LARGE_ALLOC_WARN_INTERVAL is the only solution, unless my analysis is
wrong (it would be nice if it were, because changing the allocator would
help performance too). My analysis follows:



The warning is thrown because large arrays can cause major performance
problems for garbage collectors that work like BDWGC. Such a collector
decides to keep or collect an object based on whether anything on the
stack, or in other objects that are being kept, points to it (either at
its head or somewhere in its interior). The standard BDWGC malloc
(GC_MALLOC)
allocates objects that could potentially have pointers in them and thus
need to be searched for pointers to other objects. Such searching can be
expensive. The BDWGC atomic malloc (GC_MALLOC_ATOMIC) is basically
declaring that the object does not contain pointers and thus does not
need to be searched, which saves a lot of effort for the GC. But,
regardless of whether the allocation is atomic (no pointers) or not,
BDWGC still needs to search everything else for pointers to the objects.
Structs and similar objects that hold a pointer or two need to be
declared as containing pointers even if the rest is not pointers, but
the rest of their data is effectively random pointers when BDWGC looks
at it. The same goes for everything on the stack that is not a pointer.
The larger an
allocated array is, the more likely that some non-pointer data will
accidentally point to its interior if it is looked at through the lens
of a pointer. This is why BDWGC throws the warning when large arrays are
allocated repeatedly with GC_MALLOC and GC_MALLOC_ATOMIC. There is a
high probability that many of them will be kept around beyond what is
required (and thus they take up RAM) due to non-pointers accidentally
pointing to them, so BDWGC lets the programmer and user know.
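
As a rough illustration of why size matters, here is a toy model. It is
not taken from BDWGC; it assumes that each stray non-pointer word is a
uniformly random value over the whole address space, which real data is
not. The function name and parameters are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (an assumption, not BDWGC code): probability that at least
 * one of `words` uniformly random word values lands inside an object of
 * `object_bytes` bytes, in an address space of `space_bytes` bytes. */
double false_retention_probability(double object_bytes,
                                   double space_bytes,
                                   size_t words)
{
    assert(object_bytes >= 0.0 && space_bytes > 0.0);
    assert(object_bytes <= space_bytes);

    /* Chance that a single random word misses the object. */
    double miss_one = 1.0 - object_bytes / space_bytes;

    /* Chance that every one of the words misses it. */
    double miss_all = 1.0;
    for (size_t i = 0; i < words; ++i)
        miss_all *= miss_one;

    return 1.0 - miss_all;
}
```

Under these assumptions, with a 4 GiB address space and 10000 stray
words, a 100 MB object is almost certain to be hit at least once, while
a 512 byte object is hit with a probability of roughly 0.001.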

To help mitigate this, BDWGC offers GC_MALLOC_IGNORE_OFF_PAGE and
GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE (note, the latter function is not
mentioned in the readme on the git page but is mentioned at
http://www.hboehm.info/gc/gcinterface.html), which do the same
allocations but only consider the objects pointed to if pointers, or
data BDWGC thinks might be pointers, point to the first 512 bytes of
the objects. Since they then look like 512 byte long objects to BDWGC
for the purpose of deciding whether to keep or collect them, there is a
much lower probability of them being accidentally kept a long time.
There is one major catch. If one is still using the object but one's
only pointer to it points somewhere after the 512 byte mark, it could
get prematurely collected.
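
The rule described above can be written down as a small predicate. This
is only a model of the behaviour as described in this message, not
BDWGC's actual marking code; the function name is hypothetical, and the
512 byte window is the figure used above (the real value may depend on
how the collector is configured).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Window at the head of the object that still counts as "pointing to
 * it" for the IGNORE_OFF_PAGE variants. Assumed value, see above. */
#define HEAD_WINDOW_BYTES 512u

/* Model: does a word with value `candidate` keep alive an object that
 * starts at `base` and is `size` bytes long? */
bool keeps_object_alive(uintptr_t base, size_t size,
                        uintptr_t candidate, bool ignore_off_page)
{
    assert(size > 0);

    /* Words that do not point into the object never keep it alive. */
    if (candidate < base || candidate - base >= size)
        return false;

    /* With IGNORE_OFF_PAGE, only the head window counts. */
    if (ignore_off_page)
        return candidate - base < HEAD_WINDOW_BYTES;

    /* Otherwise any interior pointer counts. */
    return true;
}
```

In this model, a pointer 600 bytes into a 1 MB object keeps it alive
under the ordinary allocators but not under the IGNORE_OFF_PAGE ones,
which is both the benefit and the catch.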

Now, going to SRFI-4 vectors and R6RS bytevectors, which underneath use
mostly the same code in Guile, they are allocated in make_bytevector
with GC_MALLOC_ATOMIC (indirectly through SCM_GC_MALLOC_POINTERLESS) and
an SCM with a pointer to the head is returned by the function. In
principle, that could be changed to do a size check and then use
GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE if it is larger than 100 kB (note,
changing it to the non-atomic version, while it would get rid of the
warning and make sure it doesn't get kept too long by accident, would
mean that it is searched inside for pointers, which could then keep
other stuff by accident).
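
The size check could look something like the sketch below. It is not
Guile's code: the enum, the function name and the exact threshold
constant are hypothetical, and the sketch only picks which BDWGC
allocator would be used rather than calling it, so that it stands on its
own without the collector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical threshold: the "100 kB" figure mentioned above. */
#define LARGE_BYTEVECTOR_BYTES ((size_t)100 * 1024)

/* Which BDWGC allocator a bytevector of a given size would use. */
enum bv_allocator {
    BV_ALLOC_ATOMIC,                /* would call GC_MALLOC_ATOMIC */
    BV_ALLOC_ATOMIC_IGNORE_OFF_PAGE /* would call
                                       GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE */
};

/* `total_bytes` is the full allocation size (header plus contents).
 * Both choices are pointer-free, so the contents are never scanned. */
enum bv_allocator choose_bytevector_allocator(size_t total_bytes)
{
    return total_bytes > LARGE_BYTEVECTOR_BYTES
               ? BV_ALLOC_ATOMIC_IGNORE_OFF_PAGE
               : BV_ALLOC_ATOMIC;
}
```

Small bytevectors would keep today's behaviour, and only those over the
threshold would pick up the "keep a pointer near the head" requirement
discussed next.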

The only worry then would be that it would get collected while still
being used. I think in most cases this would not be a problem. However,
if someone makes a new bytevector from somewhere in the middle of an
existing one, it is possible that the new one would only point to the
middle and not the head, and thus the underlying buffer could be
collected prematurely (I would need to do some more digging to see if
the new one would be allocated using make_bytevector_from_buffer). Or,
if someone was using C code to, say, take the norm of the vector (a very
common operation, often done with BLAS) and the scheme code wasn't going
to use the bytevector anymore, there might only be a pointer on the
stack pointing to the current element that the C code is reading. As
soon as it gets past the 512 byte mark, the bytevector might get
collected while it is still being worked on, which would be a disaster.
So I am not sure that the allocation could be safely changed to use
GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE if the bytevector is large. I do not
know enough about Guile internals yet to know if typical pure scheme
operations would run into problems. I think it is definitely possible
that there are FFI cases where problems could be run into, which would
then mean the coder has to take extra precautions to prevent collection.
That could be a major problem for changing the allocation in Guile 2.0.x
and 2.2.x, since it would be a major API change. It wouldn't be such an
issue for the 3.x series, since the API could be changed, but it would
be a surprising thing for people to have to worry about when using FFI.
I could be wrong on this - a pointer to the head might still be kept on
the stack and then there is no problem.
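
To make the FFI worry concrete, here is a sketch of the kind of extra
precaution C code might have to take. It is plain C with no collector
involved, the function name is made up, and it assumes (as I believe the
BDWGC readme suggests for the IGNORE_OFF_PAGE allocators) that keeping a
volatile pointer to the head on the stack is enough for a conservative
stack scan to find it.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of squares of n doubles starting at `head`. The loop walks with a
 * moving pointer, which is the situation described above: without the
 * volatile copy, the compiler is free to keep only `p` around. */
double sum_of_squares(const double *head, size_t n)
{
    assert(head != NULL);

    /* Volatile copy of the head pointer; it must live in memory. */
    const double *volatile pinned_head = head;

    double sum = 0.0;
    for (const double *p = head; p != head + n; ++p)
        sum += *p * *p;

    /* Reading the volatile here keeps it live until the loop is done. */
    const double *still_here = pinned_head;
    (void)still_here;

    return sum;
}
```

Code that holds the SCM object itself could presumably use
scm_remember_upto_here_1 for the same purpose, but either way it is
something every FFI user would have to know about.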

So it seems that disabling the warning through
GC_LARGE_ALLOC_WARN_INTERVAL or some other method is the only safe
solution, unless my analysis above is wrong and the allocation code
could be safely changed.
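
For a C program that embeds Guile, one way to apply the environment
variable from the quoted IRC message below (GC_LARGE_ALLOC_WARN_INTERVAL)
might be to set it in the process itself. This is a sketch under
assumptions: the function name is hypothetical, it relies on POSIX
setenv, and it assumes the collector reads the variable when it
initializes, so the call would have to come before Guile or BDWGC is
started.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for setenv */
#endif
#include <assert.h>
#include <stdlib.h>

/* Set GC_LARGE_ALLOC_WARN_INTERVAL to `interval` (a decimal string)
 * unless the user already set it. Returns 0 on success, like setenv. */
int raise_large_alloc_warn_interval(const char *interval)
{
    assert(interval != NULL && interval[0] != '\0');

    /* Last argument 0: do not overwrite a value chosen by the user. */
    return setenv("GC_LARGE_ALLOC_WARN_INTERVAL", interval, 0);
}
```

A caller would do something like
raise_large_alloc_warn_interval("1000000") at the very top of main; the
value is arbitrary, just "much larger than its default of 5".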


Freja Nordsiek


On 12/31/2017 02:22 PM, David Pirotte wrote:
> Hello,
>
>>> If all you are doing is trying to get Guile not to issue warnings about big
>>> allocations, I think all you need to do is put -DGC_IGNORE_WARN in the
>>> CFLAGS when you build Guile.  
>> Thanks for the suggestion, but it does not work.
> For those interested, Mike did find a way to get rid of those warnings, and 
> posted it
> in #guile:
>
>       <spk121> daviid: to quell BDW-GC large alloc warnings via environment
>         variables, you can set GC_LARGE_ALLOC_WARN_INTERVAL to something much
>         larger than its default of 5
>
> which works perfectly, thanks Mike!
>
> David.
>
>



