Re: Is there a way to print unicode characters and the actual code?
From: Peng Yu
Subject: Re: Is there a way to print unicode characters and the actual code?
Date: Sat, 24 Feb 2018 20:12:01 -0600
> $ od -An -tx1 -ta -tc <<< 'exámple'
> 65 78 c3 a1 6d 70 6c 65 0a
> e x C ! m p l e nl
> e x 303 241 m p l e \n
For now, I have written some Python code to do this. It prints both
the code point and its UTF-8 encoding, each in hex and binary, in TSV
format.
`c if ord(c)>31 else repr(str(c)).strip("'")` is hacky. I am not sure
if there is a good way to get things like \f and \b the way `od` does.
$ cat dumpunicode0.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:
import sys
for line in sys.stdin:
	for c in line.decode('utf-8'):
		utf8_encode = '0x' + ''.join(
			['%x' % ord(x) for x in reversed(c.encode('utf-8'))]
		)
		print '\t'.join(
			(
				c if ord(c)>31 else repr(str(c)).strip("'")
				, '0x%x' % ord(c)
				, bin(ord(c)).strip("'")
				, utf8_encode
				, bin(int(utf8_encode, base=16)).strip("'")
			)
		)
$ ./dumpunicode0.py <<< á
á 0xe1 0b11100001 0xa1c3 0b1010000111000011
\n 0xa 0b1010 0xa 0b1010
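Note that the fourth column lists the UTF-8 bytes in reversed order
(0xa1c3) because of the reversed() call in the script, whereas od shows
them in stream order (c3 a1). A minimal Python 3 sketch that can produce
either order; the name utf8_hex and its reverse parameter are
hypothetical, and unlike the script it zero-pads each byte to two hex
digits:

```python
def utf8_hex(c, reverse=False):
    """Hex string of the UTF-8 bytes of c, in stream order unless reverse is set."""
    data = c.encode('utf-8')
    if reverse:
        # Same byte order as the reversed() call in dumpunicode0.py
        data = data[::-1]
    # bytes.hex() always emits two hex digits per byte
    return '0x' + data.hex()


if __name__ == '__main__':
    print(utf8_hex('á'))                # stream order, as od -tx1 shows it
    print(utf8_hex('á', reverse=True))  # reversed order, as the script shows it
```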
$ printf '\f'| od -xc
0000000 000c
\f
0000001
$ printf '\f'| ./dumpunicode0.py
\x0c 0xc 0b1100 0xc 0b1100
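One possible way to get od-style names such as \f and \b instead of
\x0c is a small lookup table of the C escapes that od -c prints, with an
octal fallback for the remaining control characters. This is only a
sketch in Python 3, not part of the script above; the names C_ESCAPES
and od_escape are hypothetical:

```python
# Control characters that od -c prints as C-style escapes.
C_ESCAPES = {
    0x00: r'\0',
    0x07: r'\a',
    0x08: r'\b',
    0x09: r'\t',
    0x0a: r'\n',
    0x0b: r'\v',
    0x0c: r'\f',
    0x0d: r'\r',
}


def od_escape(c):
    """Return a printable form of the single character c."""
    code = ord(c)
    if code in C_ESCAPES:
        return C_ESCAPES[code]
    if code < 0x20 or code == 0x7f:
        # od -c prints a bare three-digit octal value here; a backslash is
        # added (an assumption of this sketch) to keep the TSV field unambiguous.
        return '\\%03o' % code
    return c


if __name__ == '__main__':
    for ch in '\f\b\x01á':
        print(od_escape(ch))
```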
--
Regards,
Peng