sed segfaults
From: Alex J. Dam
Subject: sed segfaults
Date: Sun, 24 Aug 2003 18:13:38 -0300
User-agent: Opera7.11/Linux M2 build 406
Hello.
I adapted a sed script that converts input to uppercase to
work with some accented characters of Portuguese.
Here it is (it's called mai):
#!/usr/local/bin/sed -f
s/$/aAáÁàÀâÂãÃbBcCçÇdDeEéÉêÊfFgGhHiIíÍjJkKlLmMnNoOóÓôÔõÕpPqQrRsStTuUúÚüÜvVwWxXyYzZ/
:a
s/\([aáàâãbcçdeéêfghiíjklmnoóôõpqrstuúüvwxyz]\)\(.*\1\)\(.\)/\3\2\3/
ta
s/.\{78\}$//
Maybe it's not a nice script, but that doesn't matter.
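For readers unfamiliar with the trick: the script appends a lookup table of lowercase/uppercase pairs to each line, repeatedly swaps the leftmost lowercase letter for the character that follows its copy in the table, and finally strips the 78-character table again. A minimal ASCII-only sketch of the same idea follows. The function name to_upper_ascii, the 52-character table, and the LC_ALL=C setting are assumptions made for illustration, and nothing here establishes whether this reduced form triggers the same crash.

```shell
#!/bin/sh
# Hypothetical ASCII-only reduction of the lookup-table technique.
# to_upper_ascii is an illustrative name, not part of the original script.
to_upper_ascii() {
  # LC_ALL=C is an assumption: it keeps sed in single-byte mode.
  # 1. Append the table of lowercase/uppercase pairs to the line.
  # 2. Loop: replace the leftmost lowercase letter with the character
  #    that follows its last occurrence (the one inside the table).
  # 3. Strip the 52-character table from the end of the line.
  LC_ALL=C sed \
    -e 's/$/aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ/' \
    -e ':a' \
    -e 's/\([abcdefghijklmnopqrstuvwxyz]\)\(.*\1\)\(.\)/\3\2\3/' \
    -e 'ta' \
    -e 's/.\{52\}$//'
}

printf 'abcde fcggfahh\n' | to_upper_ascii
```

The number in the final substitution has to equal the length of the appended table (78 in the full script, 52 in this sketch), otherwise part of the table is left on the line or part of the input is cut off.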
I put it under ~/bin. When I was testing it, it segfaulted
when processing some inputs. Example:
$ ~/bin/mai
abcde fcggfahh
Segmentation fault
Really strange. Now, running it from inside ~/bin with the same input:
$ cd ~/bin
$ ./mai
abcde fcggfahh
ABCDE FCGGFAHH
Not all strings cause sed to crash. In fact, I was (un)lucky to discover
that the one above does.
I'm not used to debugging programs, but here's the stack trace gdb gave
me when sed crashed:
Program received signal SIGSEGV, Segmentation fault.
0x080589c8 in prune_impossible_nodes (preg=0x807dd48, mctx=0xbfffd720) at
regexec.c:865
865 } while (!mctx->state_log[match_last]->halt);
(gdb) backtrace
#0 0x080589c8 in prune_impossible_nodes (preg=0x807dd48, mctx=0xbfffd720)
at regexec.c:865
#1 0x0805867d in re_search_internal (preg=0x807dd48,
string=0x807c608 "ABCDE FCGGFAHHaA<...>zZn/\200", length=118, start=0,
range=118, stop=118, nmatch=4,
pmatch=0x8080b50, eflags=0) at regexec.c:758
#2 0x08057c3a in re_search_stub (bufp=0x807dd48,
string=0x807c608 "ABCDE FCGGFAHHaA<...>zZn/\200", length=118, start=0,
range=118, stop=118, regs=0x80613f4,
ret_len=0) at regexec.c:405
#3 0x080578b1 in re_search (bufp=0x807dd48,
string=0x807c608 "ABCDE FCGGFAHHaA<...>zZn/\200", length=118, start=0,
range=118, regs=0x80613f4) at regexec.c:275
#4 0x0804e925 in match_regex (regex=0x807dd48,
buf=0x807c608 "ABCDE FCGGFAHHaA<...>zZn/\200", buflen=118,
buf_start_offset=0, regarray=0x80613f4, regsize=10)
at regex.c:232
#5 0x0804d5ba in do_subst (sub=0x807c6e0) at execute.c:898
#6 0x0804e0d0 in execute_program (vec=0x807c688, input=0xbffff9b0) at
execute.c:1304
#7 0x0804e664 in process_files (the_program=0x807c688, argv=0xbffffa70) at
execute.c:1518
#8 0x08049991 in main (argc=3, argv=0xbffffa64) at sed.c:288
#9 0x40036552 in __libc_start_main (main=0x804962c <main>, argc=3,
ubp_av=0xbffffa64, init=0x8049044 <_init>,
fini=0x4001562c <_dl_debug_mask>, rtld_fini=0xbfffd720, stack_end=0x4)
at ../sysdeps/generic/libc-start.c:129
I replaced part of the strings with <...> because gdb seems to have some
problems with UTF-8.
My locale is: en_US.UTF-8
Sed: 4.0.7
Is this really a bug in sed?
Thanks