From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] difference in iconv for EBCDIC SBCS conversion from z/OS OS-provided iconv
Date: Sun, 02 Apr 2023 03:08:43 +0200
Hi,
Mike Fulton wrote:
> I have hit an issue where the conversion for EBCDIC SBCS conversion is not
> consistent between the two utilities, and am wondering if:
> - this has come up before
It has not been reported before.
> - there is interest in providing consistent behaviour
Only in case of blatant mistakes.
There are many variations of encoding tables. For the non-EBCDIC ones, I
created this archive:
https://haible.de/bruno/charsets/conversion-tables/
These many variations pose problems mostly for East Asian charsets.
As implementor of GNU libiconv, I am careful to choose mappings that
are
1) as close to standards, de-facto standards, and glibc iconv mapping
tables as possible,
2) not going to cause big practical trouble.
Some differences are practically unimportant, for example this one:
$ ./table-diff ebcdic/glibc-iconv/IBM-273.TXT ebcdic/mine/IBM-273.TXT
***************
*** 188,190 ****
0xBB 0x007C # VERTICAL LINE
! 0xBC 0x203E # OVERLINE
0xBD 0x00A8 # DIAERESIS
--- 188,190 ----
0xBB 0x007C # VERTICAL LINE
! 0xBC 0x00AF # MACRON
0xBD 0x00A8 # DIAERESIS
This is unimportant, because OVERLINE and MACRON are interchangeable for
most users; you need to be a Unicode expert in order to understand the
difference.
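The comparison above can be reproduced on the three-line excerpt with a minimal sketch like the following. It assumes the table format shown in the diff (byte value, code point, then a "#" comment); the names parse_table, diff_tables and describe are hypothetical and this is not the table-diff script used above.

```python
import unicodedata

def parse_table(text):
    """Parse lines like '0xBC 0x203E # OVERLINE' into {byte: code point}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        byte, code_point = line.split()
        table[int(byte, 16)] = int(code_point, 16)
    return table

def diff_tables(a, b):
    """Return (byte, code point in a, code point in b) wherever a and b differ."""
    keys = sorted(set(a) | set(b))
    return [(k, a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)]

def describe(code_point):
    """Format a code point with its Unicode name, or mark it as unmapped."""
    if code_point is None:
        return "(unmapped)"
    return "U+%04X %s" % (code_point, unicodedata.name(chr(code_point), "?"))

# Excerpts copied from the diff output above.
GLIBC = """0xBB 0x007C # VERTICAL LINE
0xBC 0x203E # OVERLINE
0xBD 0x00A8 # DIAERESIS"""
MINE = """0xBB 0x007C # VERTICAL LINE
0xBC 0x00AF # MACRON
0xBD 0x00A8 # DIAERESIS"""

for byte, left, right in diff_tables(parse_table(GLIBC), parse_table(MINE)):
    print("0x%02X: %s vs %s" % (byte, describe(left), describe(right)))
```

On this excerpt the only differing entry is byte 0xBC.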
> I have comparisons for the various code pages, but if we look at the most
> common code page conversion, it is likely IBM-1047 to/from ISO8859-1.
> Here is what I see:
> The numbers are the output from 'cmp -l' so the output is read as:
> <byte-number> <byte value file 1> <byte value file 2>
Your first column appears to be decimal, the second and third columns octal.
I prefer to use hexadecimal throughout.
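For reference, a small sketch that rewrites 'cmp -l' lines in hexadecimal, assuming the first field is a decimal byte number and the other two are octal byte values, as read above. The function name cmp_l_to_hex is hypothetical; the sample lines are the ones quoted in this message.

```python
def cmp_l_to_hex(line):
    """Convert one 'cmp -l' line (decimal byte number, two octal values) to hex."""
    number, left, right = line.split()
    return "0x%02X: 0x%02X vs 0x%02X" % (int(number, 10), int(left, 8), int(right, 8))

# Lines quoted in this message.
for sample in ("21 205 12", "37 12 205", "133 25 45"):
    print(cmp_l_to_hex(sample))
```

With this reading, octal 205 is 0x85, octal 12 is 0x0A, octal 25 is 0x15, and octal 45 is 0x25.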
> Convert from IBM-1047 to ISO8859-1: compare open source iconv to IBM z/OS
> iconv
> 21 205 12
> 37 12 205
Does EBCDIC 0x15 map to U+0085 or to U+000A? The table entry for NL in
https://en.wikipedia.org/wiki/EBCDIC#Definitions_of_non-ASCII_EBCDIC_controls
is not conclusive:
"Line break. Default mapping (0085) matches ISO/IEC 6429's NEL.
Mappings sometimes swapped with Line Feed (EBCDIC 0x25) in accordance
with UNIX line breaking convention."
I made a guess as to which choice will create the fewest interoperability
problems. glibc makes the same choice, by the way.
> Convert from ISO8859-1 to IBM-1047: compare open source iconv to IBM z/OS
> iconv
> 133 25 45
That's the mapping of U+0085: 0x15 or 0x25? It's just the inverse facet
of the difference discussed above.
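To illustrate why the two directions are the same difference, here is a minimal sketch restricted to the two control characters in question. It assumes the byte numbers in the 'cmp -l' output equal the input byte values, so that the first convention maps 0x15 to U+0085 and 0x25 to U+000A and the second swaps them. The names NEL_AT_15, LF_AT_15 and transcode are hypothetical labels, not identifiers from either implementation.

```python
# Two conventions for the EBCDIC NL / LF pair (assumption: read off the
# 'cmp -l' lines quoted in this message, byte number == input byte value).
NEL_AT_15 = {0x15: 0x0085, 0x25: 0x000A}  # first file in the comparison
LF_AT_15 = {0x15: 0x000A, 0x25: 0x0085}   # second file in the comparison

def invert(mapping):
    """Build the code point -> byte table for the reverse conversion."""
    return {code_point: byte for byte, code_point in mapping.items()}

def transcode(data, decode_map, encode_map):
    """Decode EBCDIC bytes with one table, re-encode them with another."""
    encode = invert(encode_map)
    return bytes(encode[decode_map[b]] for b in data)

# The same convention in both directions round-trips unchanged.
print(transcode(b"\x15\x25", NEL_AT_15, NEL_AT_15).hex())
# Mixing the conventions swaps the two bytes.
print(transcode(b"\x15\x25", NEL_AT_15, LF_AT_15).hex())
# The reverse direction differs at U+0085 only because the forward one does.
print(hex(invert(NEL_AT_15)[0x0085]), hex(invert(LF_AT_15)[0x0085]))
```

Data stays intact as long as both ends use the same convention; only a mix of the two swaps 0x15 and 0x25.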
> z/OS is not the only platform that has EBCDIC files. IBM i also uses EBCDIC
> as does z/VM, z/VSE.
Good to know. But as I said above, I'm interested in the differences between
them only if there is a significant practical relevance.
Bruno