On 18 July 2016 at 15:45, Maxim Ostapenko <address@hidden> wrote:
1) AddressSanitizer mmaps quite large regions of memory for redzones and
the shadow gap. In particular, for a 39-bit address space it mmaps:
|| `[0x1400000000, 0x1fffffffff]` || HighShadow || - 48 GB
|| `[0x1200000000, 0x13ffffffff]` || ShadowGap || - 8 GB
|| `[0x1000000000, 0x11ffffffff]` || LowShadow || - 4 GB
2) In QEMU, page_set_flags is called for these ranges. It cuts the given
range into individual pages and sets flags on each one. Given a 4 KB page
size, the 8 GB range takes 2097152 iterations of the inner loop and the
48 GB range takes 12582912. This is obviously a performance bottleneck.
Mmm, the algorithm here is pretty simple and basically assumes the
guest isn't going to be doing enormous allocations like that.
(If the host process doesn't happen to have a suitably big lump of its
VA space free then the mmap will fail anyway.)
3) The same issue may happen when ASan tries to read /proc/self/maps later
in the page_check_range function, after it has already mmapped the
HighShadow, ShadowGap and LowShadow regions.
Could someone help me: how can I mitigate this performance issue? Do we
really need to set flags for each page of the entire (quite large) memory
region?
Well, we do need to do some things:
* we're populating the PageDesc data structure which we later use
to cache generated code
* if we're marking the range as writeable and it wasn't previously
writeable, we need to check whether there's already generated code
anywhere in this memory range and invalidate those translations
This could probably be done in a way that doesn't iterate naively
through every page, though.