make-w32

Re: GNU make 3.81beta4 released


From: Markus Mauhart
Subject: Re: GNU make 3.81beta4 released
Date: Thu, 19 Jan 2006 00:55:46 +0100

"Eli Zaretskii" <address@hidden> wrote ...
>
>> Date: Wed, 18 Jan 2006 06:35:12 +0200
>> From: Eli Zaretskii <address@hidden>
>>
>> > 261 ? proc_index is allmost 33M ? !
>> >
>> > -->
>> >
>> > static sub_process *proc_array[256];
>> > static int proc_index = 0;
>> >
>> > proc_array[proc_index++] = (sub_process *) proc;
>> >
>> > ... looks like we just left the bounds of our static array :-)
>>
>> What happens if you use -j 200 instead of just -j?  Does it work
>> successfully then?
>
> Actually, in addition to the obvious bug in sub_proc.c whereby it
> never checks that the number of processes exceeds the fixed 256-slot
> array, there's one more problem that I see: the Win32 API function
> WaitForMultipleObjects that sub_proc.c uses to wait for child
> processes' demise is documented to be limited to a maximum of
> MAXIMUM_WAIT_OBJECTS objects.  MAXIMUM_WAIT_OBJECTS's value is 64, way
> less than 256.  So please try to run your build with -j 64 in the
> sub-Make's command line, and see if that works without hanging and
> without crashing.

I have interesting results.

First, the "hang": I found that it happens (the mentioned loop loops forever)
because make thinks that the dependencies of the single goal ("all") are still
being built. Inside the loop, the following call chain fails, IIRC:
reap_children()
  process_wait_for_any()
    process_wait_for_any_private()
      failed = WaitForMultipleObjects(too many handles)
      return NULL
    return NULL
  doesn't propagate the error correctly.
loop loops again
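
To illustrate what "propagating the error" could look like, here is a minimal, portable sketch. It is not the sub_proc.c code: stub_wait() merely stands in for the real wait call, and all names (reap_status, wait_for_any_sketch, WAIT_LIMIT) are made up for this example. The only fact taken from the thread is the limit of 64 objects.

```c
#include <assert.h>
#include <stddef.h>

#define WAIT_LIMIT 64   /* value of MAXIMUM_WAIT_OBJECTS cited in this thread */

/* Hypothetical status codes: a bare NULL cannot tell these cases apart. */
enum reap_status { REAP_GOT_CHILD, REAP_NO_CHILDREN, REAP_ERROR };

/* Stub standing in for the real wait call: fails when asked to wait on
   more than WAIT_LIMIT handles, otherwise reports slot 0 as finished. */
static int
stub_wait (size_t count)
{
  return count > WAIT_LIMIT ? -1 : 0;
}

/* Returns a status and stores the finished child (or NULL) in *finished. */
static enum reap_status
wait_for_any_sketch (void **children, size_t count, void **finished)
{
  int slot;

  assert (finished != NULL);
  *finished = NULL;
  if (count == 0)
    return REAP_NO_CHILDREN;
  slot = stub_wait (count);
  if (slot < 0)
    return REAP_ERROR;           /* caller should stop looping, not retry */
  *finished = children[slot];
  return REAP_GOT_CHILD;
}
```

With a distinct error status, the outer loop could give up (or wait in smaller batches) instead of assuming the dependencies are still building.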

Then I added array-overflow handling to process_register() and the
corresponding failure handling to its callers.
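
A rough sketch of the kind of bounds check meant here, not the actual patch. The array and index declarations follow the snippet quoted above; the function name process_register_checked, the MAX_PROCS macro and the 0 / -1 return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROCS 256            /* size of the static table quoted above */

typedef struct sub_process sub_process;   /* opaque here; layout is irrelevant */

static sub_process *proc_array[MAX_PROCS];
static int proc_index = 0;

/* Hypothetical variant of process_register(): returns 0 on success and
   -1 when the table is full, so the caller can fail or postpone the job. */
static int
process_register_checked (sub_process *proc)
{
  assert (proc_index >= 0 && proc_index <= MAX_PROCS);
  if (proc_index >= MAX_PROCS)
    return -1;                   /* table full: do not write past the end */
  proc_array[proc_index++] = proc;
  return 0;
}
```

Every caller then has to check the return value; otherwise the overflow is only turned into a silently lost child process.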

Now I immediately got an exception (probably the same one I reported
previously for the non-hanging "mode 2").

The reason is a bug in process_last_err(x): it doesn't correctly handle
x == INVALID_HANDLE_VALUE.

After I had fixed this, make "-j noNumber" ran for some minutes. In the
beginning many process_register() calls failed, but make recovered
successfully. After some minutes the well-known loop again looped forever
and I had to stop this experiment.


Next I continued with your suggestion, "-j 64". AFAICS it ran for almost
1m without errors until I got ...
    Assertion failed: a == g->changed, file .\remake.c, line 169
... this comes from an assertion I had inserted around this bug:
    g->changed += commands_started - ocommands_started;
(g->changed is only 8 bits wide).
Today, before my first test, I thought this bug could be the cause of the
infinite loop, hence the assertion.
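
A small self-contained sketch of the truncation that the assertion catches. The struct below is a stand-in, not the real one; I assume here that the member is an 8-bit unsigned bit-field, and the names goal_sketch and add_started are invented for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for the real structure; only the width matters. */
struct goal_sketch
{
  unsigned int changed : 8;      /* assumed: an 8-bit unsigned bit-field */
};

/* Add DELTA to the 8-bit counter and report whether the stored value
   still equals the full-width sum: 1 if it fits, 0 if it was truncated. */
static int
add_started (struct goal_sketch *g, unsigned int delta)
{
  unsigned int a = g->changed + delta;   /* full-width result */
  g->changed = a;                        /* stored modulo 256 */
  return a == g->changed;
}
```

Starting from 0, adding 200 fits, but adding another 100 stores 300 modulo 256, i.e. 44, which is exactly the situation where "a == g->changed" fails.
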
Fixing this one is not so easy, because this byte is also used as a flag
and is sometimes combined bitwise with some pseudo-bools, hence extending
it to a bigger width has side effects to consider; please see my
postings "some snippets from make381" in gnu.make.bugs from March 2005.
The ChangeLog entry from 2005-09-16 mentions another bool-related bug I had
reported then. IMHO one should replace all pseudo-bools in our code path
with real bools, and their arithmetic and bitwise operations with logical
operations. This concerns global and local variables, structure members,
function parameters and return types ... about 500 changes IIRC :-)
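
To show why bitwise operations on pseudo-bools are fragile, here is a tiny illustration. It is not taken from the make sources; the function names are invented, and using <stdbool.h> assumes a C99 compiler.

```c
#include <assert.h>
#include <stdbool.h>

/* Pseudo-bool style: any nonzero int counts as "true". */
static int
both_bitwise (int a, int b)
{
  return a & b;                  /* 2 & 1 is 0, although both are "true" */
}

/* Logical style: the result depends only on zero versus nonzero. */
static bool
both_logical (int a, int b)
{
  return a && b;
}
```

As long as every pseudo-bool holds only 0 or 1 the two agree; once a counter such as "changed" is mixed in, they no longer do.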


Best Regards,
Markus.
