[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: float matrix multiplication problem
From: |
Jaroslav Hajek |
Subject: |
Re: float matrix multiplication problem |
Date: |
Thu, 14 Jan 2010 10:21:58 +0100 |
On Thu, Jan 14, 2010 at 10:21 AM, Jaroslav Hajek <address@hidden> wrote:
> 2010/1/14 John W. Eaton <address@hidden>:
>> On 14-Jan-2010, Jaroslav Hajek wrote:
>>
>> | On Thu, Jan 14, 2010 at 9:18 AM, John W. Eaton <address@hidden> wrote:
>> | > Using the current sources, running Octave with valgrind, and given
>> | >
>> | > x = single (rand (4, 4) + i*rand (4, 4));
>> | >
>> | > then the expression
>> | >
>> | > x'*x;
>> | >
>> | > generates the following diagnostics:
>> | >
>> | > ==12836== Conditional jump or move depends on uninitialised value(s)
>> | > ==12836== at 0x681224F: bool mx_inline_all_real<float>(unsigned
>> long, std::complex<float> const*) (mx-inlines.cc:214)
>> | > ==12836== by 0x6826504: FloatComplexNDArray::all_elements_are_real()
>> const (fCNDArray.cc:536)
>> | > ==12836== by 0x567F587:
>> octave_float_complex_matrix::try_narrowing_conversion() (ov-flt-cx-mat.cc:83)
>> | > ==12836== by 0x56BF551: octave_value::maybe_mutate() (ov.cc:1148)
>> | > ==12836== by 0x56CD9B3:
>> octave_value::octave_value(FloatComplexMatrix const&, MatrixType const&)
>> (ov.cc:689)
>> | > ==12836== by 0x57C7A79: oct_binop_herm_mul(octave_base_value const&,
>> octave_base_value const&) (op-fcm-fcm.cc:137)
>> | > ==12836== by 0x56C50E0:
>> do_binary_op(octave_value::compound_binary_op, octave_value const&,
>> octave_value const&) (ov.cc:2103)
>> | > ==12836== by 0x574E833:
>> tree_compound_binary_expression::rvalue1(int) (pt-cbinop.cc:212)
>> | > ==12836== by 0x5755FB4:
>> tree_evaluator::visit_statement(tree_statement&) (pt-eval.cc:689)
>> | > ==12836== by 0x57728F3: tree_statement::accept(tree_walker&)
>> (pt-stmt.cc:152)
>> | > ==12836== by 0x5755DBB:
>> tree_evaluator::visit_statement_list(tree_statement_list&) (pt-eval.cc:725)
>> | > ==12836== by 0x577291F: tree_statement_list::accept(tree_walker&)
>> (pt-stmt.cc:216)
>> | >
>> | > As far as I can tell, this is happening at the point where the result
>> | > from xgemm is being stuffed into an octave_value object and
>> | > maybe_mutate is called to try to narrow the result.
>> | >
>> | > I don't understand why something would be uninitialized here, but the
>> | > following change avoids the problem for me:
>> | >
>> | > diff --git a/liboctave/fCMatrix.cc b/liboctave/fCMatrix.cc
>> | > --- a/liboctave/fCMatrix.cc
>> | > +++ b/liboctave/fCMatrix.cc
>> | > @@ -3769,7 +3769,7 @@
>> | > {
>> | > octave_idx_type lda = a.rows ();
>> | >
>> | > - retval = FloatComplexMatrix (a_nr, b_nc);
>> | > + retval = FloatComplexMatrix (a_nr, b_nc, 0.0);
>> | > FloatComplex *c = retval.fortran_vec ();
>> | >
>> | > const char *ctra = get_blas_trans_arg (tra, cja);
>> | >
>> | > I wouldn't think this should be necessary, because I thought CHERK
>> | > and the following code to fill in the lower part of the symmetric
>> | > matrix would properly initialize everything.
>> | >
>> | > jwe
>> | >
>> |
>> | You may have hit the nail on the head. The problem is the following
>> | statement in the ZHERK docs of reference BLAS:
>> |
>> | * Note that the imaginary parts of the diagonal elements need
>> | * not be set, they are assumed to be zero, and on exit they
>> | * are set to zero.
>> |
>> | Probably ATLAS does not do the latter. I think it's an ATLAS bug,
>> | then.
>>
>> Yes, I agree. I removed the ATLAS library on my system and the
>> uninitialized value problem disappeared.
>>
>> | It is not needed to initialize the whole Matrix with zeros, but
>> | what about something like the attached? This won't have a noticeable
>> | performance impact:
>>
>> Unfortunately, it did not work. I'm seeing the same error. I don't
>> see how that can happen, as it would seem that everything must have
>> been initialized.
>
> That would mean the bug is deeper in the computations.
>
>> The only thing that seems to be working is to
>> initialize the entire C matrix prior to the call to cherk.
>>
>
> Surely you meant zherk? You're working in double by default.
>
Aha, sorry, not. Please ignore this.
--
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz