help-octave
Re: Rounding floats from 64bit to 32bit (double to single) with 0.5 rule


From: Sergei Steshenko
Subject: Re: Rounding floats from 64bit to 32bit (double to single) with 0.5 rule
Date: Mon, 26 Dec 2016 10:25:29 +0000 (UTC)





>________________________________
> From: hale812 <address@hidden>
>To: address@hidden 
>Sent: Monday, December 26, 2016 9:25 AM
>Subject: Rounding floats from 64bit to 32bit (double to single) with 0.5 rule
> 
>
>It seems that the single() function truncates an IEEE 754 double by simply
>dropping the bits that do not fit.
>
>This becomes a problem of error accumulation when converting data for a
>32-bit DSP with a long computation path.
>
>For better results, the number should be rounded to the single-precision
>layout (1 sign bit, 8 exponent bits, 23 significand bits) before truncating.
>
>Is there a tool in Octave for rounded conversion to single, or just for binary
>rounding (keeping the value a double, with the discarded bits set to zero)?
>
>
>
>--
>View this message in context: 
>http://octave.1599824.n4.nabble.com/Rounding-floats-from-64bit-to-32bit-double-to-single-with-0-5-rule-tp4681146.html
>Sent from the Octave - General mailing list archive at Nabble.com.

How about first adding 0.5 * first_to_be_discarded_bit_value and then using the
existing mechanism?
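
A minimal sketch of that idea, written in Python rather than Octave purely for
illustration. It reads the suggestion as "add half a unit of the last retained
bit, then clear the discarded bits", working on the raw 64-bit pattern of the
double. The function name and constants are hypothetical, and the sketch
assumes the value lies in single's normal exponent range (overflow and
subnormals are not handled). Ties round away from zero, which is the 0.5 rule
from the subject line and not the IEEE default of ties-to-even.

```python
import math
import struct

# A double has 52 fraction bits, a single has 23, so 29 bits get discarded.
DISCARDED_BITS = 52 - 23
HALF_STEP = 1 << (DISCARDED_BITS - 1)            # half of the last retained bit
KEEP_MASK = 0xFFFFFFFFFFFFFFFF ^ ((1 << DISCARDED_BITS) - 1)
SIGN_BIT = 1 << 63


def round_to_single_precision(x):
    """Return x as a double whose low 29 fraction bits are zero.

    Hypothetical helper: rounds the magnitude to 24 significant bits,
    ties away from zero. Assumes x fits single's normal exponent range.
    """
    if not math.isfinite(x):
        return x                                  # leave inf and NaN untouched
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits & SIGN_BIT
    magnitude = bits & (SIGN_BIT - 1)
    # Add the half step, then truncate. A carry out of the fraction
    # increments the exponent field, which is the desired behaviour.
    magnitude = (magnitude + HALF_STEP) & KEEP_MASK
    return struct.unpack("<d", struct.pack("<Q", sign | magnitude))[0]
```

For in-range values the result is a double that is exactly representable as a
single, so a subsequent conversion to single loses nothing regardless of how
that conversion rounds. The same bit manipulation should be expressible in
Octave with typecast and the integer bit functions, though that is not shown
here.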

--Sergei.

