bug-gnu-utils

sed segfaults


From: Alex J. Dam
Subject: sed segfaults
Date: Sun, 24 Aug 2003 18:13:38 -0300
User-agent: Opera7.11/Linux M2 build 406


 Hello.

 I adapted a sed script that converts its input to uppercase so that
it also handles some accented Portuguese characters.
 Here it is (it's called mai):

#!/usr/local/bin/sed -f
s/$/aAáÁàÀâÂãÃbBcCçÇdDeEéÉêÊfFgGhHiIíÍjJkKlLmMnNoOóÓôÔõÕpPqQrRsStTuUúÚüÜvVwWxXyYzZ/
:a
s/\([aáàâãbcçdeéêfghiíjklmnoóôõpqrstuúüvwxyz]\)\(.*\1\)\(.\)/\3\2\3/
ta
s/.\{78\}$//
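
 For readers less used to sed, here is a rough Python sketch of what the script is meant to do: append the lookup table to the line, repeatedly replace the leftmost lowercase letter with the character that follows its last occurrence (which is in the table), then strip the 78 table characters. The names TABLE, LOWER, STEP and the function mai() are illustrative only, and Python's regex engine is not the one sed uses, so this says nothing about the crash itself.

```python
import re

# Lookup table copied from the sed script: each lowercase letter is
# immediately followed by its uppercase counterpart (39 pairs, 78 chars).
TABLE = ("aAáÁàÀâÂãÃbBcCçÇdDeEéÉêÊfFgGhHiIíÍjJkKlLmMnNoOóÓôÔõÕ"
         "pPqQrRsStTuUúÚüÜvVwWxXyYzZ")
LOWER = TABLE[0::2]  # every other character: the lowercase letters

# Same shape as the sed regex \([...]\)\(.*\1\)\(.\)
STEP = re.compile(r"([" + LOWER + r"])(.*\1)(.)")


def mai(line):
    """Uppercase `line` the way the sed script does (sketch)."""
    buf = line + TABLE                       # s/$/aA...zZ/
    while True:                              # :a ... ta
        buf, n = STEP.subn(r"\3\2\3", buf, count=1)
        if n == 0:                           # no substitution: leave the loop
            break
    return buf[:-len(TABLE)]                 # s/.\{78\}$//


if __name__ == "__main__":
    print(mai("abcde fcggfahh"))
```

 Characters that are not in the table (spaces, digits, letters that are already uppercase) pass through unchanged.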

 Maybe it's not a nice script, but that doesn't matter.
 I put it under ~/bin. When I was testing it, it segfaulted
when processing some inputs. Example:

 $ ~/bin/mai
 abcde fcggfahh
 Segmentation fault

 Really strange. Now, running it from inside ~/bin:

 $ cd ~/bin
 $ ./mai

 abcde fcggfahh
 ABCDE FCGGFAHH

 Not all strings cause sed to crash. In fact, I was (un)lucky to discover
that the one above does.

 I'm not used to debugging programs, but here's the stack trace gdb gave
me when sed crashed:


Program received signal SIGSEGV, Segmentation fault.
0x080589c8 in prune_impossible_nodes (preg=0x807dd48, mctx=0xbfffd720) at regexec.c:865
865                 } while (!mctx->state_log[match_last]->halt);
(gdb) backtrace
#0  0x080589c8 in prune_impossible_nodes (preg=0x807dd48, mctx=0xbfffd720) at regexec.c:865
#1  0x0805867d in re_search_internal (preg=0x807dd48,
string=0x807c608 "ABCDE FCGGFAHHaA<...>zZn/\200", length=118, start=0, range=118, stop=118, nmatch=4,
   pmatch=0x8080b50, eflags=0) at regexec.c:758
#2  0x08057c3a in re_search_stub (bufp=0x807dd48,
string=0x807c608 "ABCDE FCGGFAHHaA<...>zZn/\200", length=118, start=0, range=118, stop=118, regs=0x80613f4,
   ret_len=0) at regexec.c:405
#3  0x080578b1 in re_search (bufp=0x807dd48,
string=0x807c608 "ABCDE FCGGFAHHaA<...>zZn/\200", length=118, start=0, range=118, regs=0x80613f4) at regexec.c:275
#4  0x0804e925 in match_regex (regex=0x807dd48,
buf=0x807c608 "ABCDE FCGGFAHHaA<...>zZn/\200", buflen=118, buf_start_offset=0, regarray=0x80613f4, regsize=10)
   at regex.c:232
#5  0x0804d5ba in do_subst (sub=0x807c6e0) at execute.c:898
#6  0x0804e0d0 in execute_program (vec=0x807c688, input=0xbffff9b0) at execute.c:1304
#7  0x0804e664 in process_files (the_program=0x807c688, argv=0xbffffa70) at execute.c:1518
#8  0x08049991 in main (argc=3, argv=0xbffffa64) at sed.c:288
#9  0x40036552 in __libc_start_main (main=0x804962c <main>, argc=3, ubp_av=0xbffffa64, init=0x8049044 <_init>, fini=0x4001562c <_dl_debug_mask>, rtld_fini=0xbfffd720, stack_end=0x4) at ../sysdeps/generic/libc-start.c:129

 I replaced part of the strings with <...> because gdb seems to have some problems with UTF-8.
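
 As a sanity check on the trace, the length=118 reported by re_search is consistent with the 14-byte input line plus the 78-character table encoded as UTF-8, so the crash happens on the first substitution pass over the full buffer. A small Python sketch of the arithmetic (the variable names are illustrative, and this only checks the sizes, nothing about sed internals):

```python
# Table and input copied from the report above.
TABLE = ("aAáÁàÀâÂãÃbBcCçÇdDeEéÉêÊfFgGhHiIíÍjJkKlLmMnNoOóÓôÔõÕ"
         "pPqQrRsStTuUúÚüÜvVwWxXyYzZ")
LINE = "abcde fcggfahh"

table_chars = len(TABLE)                  # characters, as counted by .\{78\}
table_bytes = len(TABLE.encode("utf-8"))  # bytes: accented letters take 2 each
total_bytes = len(LINE.encode("utf-8")) + table_bytes

print(table_chars, table_bytes, total_bytes)  # 78 104 118
```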

 My locale is: en_US.UTF-8
 Sed: 4.0.7

 Is this really a bug in sed?

 Thanks




