Re: [Bug-gnucap] gnucap 0.31 fourier segfault / convergence failure
From: Al Davis
Date: Sun, 30 Jun 2002 16:31:29 -0600
On Saturday 29 June 2002 05:37 pm, Nuno Miguel Fernandes Sucena
Almeida wrote:
> Hello,
> i get a segfault with the attached schematic when i comment
> out the .fourier and set the C2 value from 100pF to 47pF .
> see the gdb log attachment for GDB output
At first it worked for me, but after playing with it I was
able to reproduce the problem and fix it.
To understand the problem, first some background.
The "fourier" command does a transient analysis, then a Fourier
transform on the results, giving a spectrum in the frequency
domain. It saves the transient results, then does the Fourier
transform all at once. Before beginning, it determines what
points are needed, and makes sure to calculate those. The
frequencies and number of points are determined up front, and
not changed.
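As a rough illustration of fixing the sample points up front, here is a
minimal sketch. It assumes uniform sampling over one period of the
fundamental, which is the usual arrangement for a discrete Fourier
transform. The function name and its parameters are hypothetical and
are not taken from the gnucap source.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not gnucap code: returns the times of the n
// equally spaced samples that cover one period of the fundamental,
// starting at t_start. The end of the period itself is not included,
// because it duplicates the first sample of the next period.
std::vector<double> fourier_sample_times(double t_start,
                                         double fundamental_hz,
                                         std::size_t n)
{
  std::vector<double> times;
  times.reserve(n);
  const double period = 1.0 / fundamental_hz;
  for (std::size_t k = 0; k < n; ++k) {
    times.push_back(t_start
                    + period * static_cast<double>(k) / static_cast<double>(n));
  }
  return times;
}
```

The point of the sketch is only that the list has a fixed length that is
known before the transient run starts, so a buffer sized from it has no
room for extra entries.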
These points are fixed. The transient step control can then add
additional points as required by truncation error or iteration
count; those extra points are not used by the Fourier transform.
Occasionally, it can reject a point, and back up and try again
with smaller time steps.
Now, the problem ....
The problem is that it is possible to reject a point that is
required for the Fourier transform. When this happens, it
backs up, inserts additional time points (not used by FT), and
recomputes the required point. The bug is that the counter
would be incremented twice: once for the correct point and
once for the rejected one. This caused two symptoms. The
first is that it would store past the end of the array, which
later caused the seg-fault. The second is that the time points
in the array would be shifted, with a few samples of bogus data
added, which contaminates the results. In some cases, the
array overrun may not cause a crash, and the only error would
be the time shift and a few bogus samples, which would result
in an incorrect transform. In the cases I have seen, the
incorrect transform is close enough to be believable, which is
the worst kind of error.
The fix is simple:
======================================
diff -c -r22.1 s_tr_swp.cc
*** s_tr_swp.cc 2002/04/28 05:19:52 22.1
--- s_tr_swp.cc 2002/06/30 17:55:37
***************
*** 106,112 ****
}else if (!converged && OPT::quitconvfail) {
printnow = true;
}else if (!converged || approxtime <= time0) {
! printnow = false;
}} // else (usual case) use the value set in next
if (printnow) {
--- 106,115 ----
}else if (!converged && OPT::quitconvfail) {
printnow = true;
}else if (!converged || approxtime <= time0) {
! if (printnow) {
! --stepno;
! printnow = false;
! }
}} // else (usual case) use the value set in next
if (printnow) {
=========================================
Just decrement stepno when such a rejection occurs.
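To make the effect of the patch concrete, here is a toy model of the
counter, not the gnucap code itself. It assumes the counter is advanced
at the moment a required point is scheduled (in the real code, printnow
is set in next, per the comment in the patch), and that a rejection
clears printnow afterwards. The Attempt struct, the function name, and
the apply_fix flag are hypothetical.

```cpp
#include <vector>

// One attempted time step in the toy model.
struct Attempt {
  bool required;  // lands on a point the Fourier transform needs
  bool accepted;  // step control accepted the result
};

// Returns how many slots of the Fourier sample buffer would be consumed.
// With apply_fix == false this mimics the double count described above;
// with apply_fix == true it mimics the patched behavior.
int count_stored_points(const std::vector<Attempt>& attempts, bool apply_fix)
{
  int stepno = 0;
  for (const Attempt& a : attempts) {
    bool printnow = a.required;
    if (printnow) {
      ++stepno;  // assumed: counted when the required point is scheduled
    }
    if (!a.accepted) {
      if (printnow && apply_fix) {
        --stepno;  // the patch: un-count the rejected required point
      }
      printnow = false;  // a rejected point is never stored
    }
  }
  return stepno;
}
```

For a run with two required points where the second is rejected once
and then recomputed, the unpatched count comes out as three, one more
than the buffer was sized for, while the patched count is two.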
> With the same schematic and and another simpler schematic,
> if i comment out the BFR96 Philips model parameter ITF =
> 2.48030E-001 i get this:
>
> (...)
> vco_tran_42pF.dat exists. replace? y
> very backward time step
> convergence failure (itl4)
> zero time step
> internal error: step control (adt=1e-12,rdt=-7.8125e-13)
> time0=2.0466e-07 time1=2.04659e-07 rtime=2.04659e-07
Without the actual files, I cannot reproduce this, but it looks
like the classic convergence failure. I cannot tell whether it
is a program bug or circuit problem.
The "internal error" is known. This happens when it tries
several methods to recover, then gives up. Spice would say
"internal time step too small".