Re: msgfmt -cv aborts with no message at all
From: Bruno Haible
Subject: Re: msgfmt -cv aborts with no message at all
Date: Thu, 11 Sep 2003 18:56:23 +0200
User-agent: KMail/1.5
Jochen Hein wrote:
> I'm doing a "msgfmt -cv " on the attached file. This file probably
> contains broken UTF-8 encoding, which I tried to catch that way.
> ...
> #1 0x40075872 in raise () from /lib/libc.so.6
> #2 0x40076986 in abort () from /lib/libc.so.6
> #3 0x40023986 in po_callback_comment () from
Thanks for reporting this bug. The appended patch fixes it.
> msgfmt should at least give an informative message and point to the
> file (and character) where the encoding is broken.
When a program calls abort(), it's an indicator of a bug in the program.
This was the case here. The encoding of the file is not broken; it
"just" contains a UTF-8 encoded character > 0x10FFFF, which is valid
according to the system's (glibc's) iconv module for UTF-8.
Bruno
diff -c -3 -r1.7 -r1.8
*** gettext-tools/src/po-lex.c 9 Sep 2003 13:40:36 -0000 1.7
--- gettext-tools/src/po-lex.c 11 Sep 2003 16:40:32 -0000 1.8
***************
*** 511,518 ****
abort ();
/* Convert it from UTF-8 to UCS-4. */
mbc->uc_valid = true;
! if (u8_mbtouc (&mbc->uc, scratchbuf, outbytes) != outbytes)
! abort ();
break;
}
}
--- 511,521 ----
abort ();
/* Convert it from UTF-8 to UCS-4. */
mbc->uc_valid = true;
! /* We ignore the return value of u8_mbtouc(): Usually it returns
! outbytes, but if scratchbuf contains an out-of-range Unicode
! character (> 0x10ffff), it can also return 1 and set mbc->uc
! to 0xfffd. This is precisely what we need. */
! u8_mbtouc (&mbc->uc, scratchbuf, outbytes);
break;
}
}
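To illustrate the behaviour the new comment relies on, here is a minimal sketch of a decoder that consumes one byte and yields U+FFFD when a sequence encodes a value above 0x10ffff. It is not the gettext or libunistring source: the name sketch_u8_decode and its exact signature are assumptions made for this example. With such a decoder, the 4-byte input F4 90 80 80 (which would encode 0x110000) gives a return value of 1 rather than 4, which is why the old "!= outbytes" check ended in abort().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the real u8_mbtouc): decode one UTF-8
   sequence from the first n bytes of s into *puc and return the
   number of bytes consumed.  For a malformed, truncated, overlong,
   surrogate, or out-of-range (> 0x10ffff) sequence, store U+FFFD
   and return 1.  The caller must pass n > 0.  */
static int
sketch_u8_decode (uint32_t *puc, const unsigned char *s, size_t n)
{
  uint32_t c = s[0];
  int len, i;

  if (c < 0x80)
    {
      /* Plain ASCII byte.  */
      *puc = c;
      return 1;
    }
  else if (c >= 0xc2 && c < 0xe0)
    {
      len = 2;
      c &= 0x1f;
    }
  else if (c >= 0xe0 && c < 0xf0)
    {
      len = 3;
      c &= 0x0f;
    }
  else if (c >= 0xf0 && c < 0xf8)
    {
      len = 4;
      c &= 0x07;
    }
  else
    goto invalid;

  if ((size_t) len > n)
    goto invalid;

  /* Accumulate the 6 payload bits of each continuation byte.  */
  for (i = 1; i < len; i++)
    {
      if ((s[i] & 0xc0) != 0x80)
        goto invalid;
      c = (c << 6) | (s[i] & 0x3f);
    }

  /* Reject overlong forms, surrogates, and values above 0x10ffff.  */
  if ((len == 3 && c < 0x800)
      || (len == 4 && c < 0x10000)
      || (c >= 0xd800 && c < 0xe000)
      || c > 0x10ffff)
    goto invalid;

  *puc = c;
  return len;

 invalid:
  *puc = 0xfffd;
  return 1;
}
```

A caller that only needs the decoded character, as in the patched code, can ignore the return value and use the stored 0xfffd directly. A caller that compares the return value against the input length has to treat a mismatch as "replacement character produced", not as an internal error.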