nmh-workers

Re: Sort and delete duplicate messages


From: Ralph Corderoy
Subject: Re: Sort and delete duplicate messages
Date: Mon, 04 May 2020 09:55:39 +0100

Hi,

Ken wrote:
> > I know that 'sortm -textfield Subject' will sort messages according
> > to the subject field. Having run that command, is there a way to
> > then delete the first duplicate of each message in the list such
> > that if 1 and 2 are duplicates and 6 and 7 are duplicates you would
> > delete messages 2 and 7 leaving 1 and 6?
>
> I want to say you could do something with piping the output of scan
> into "uniq -d -f <num>".  Might require a custom scan format, but that
> seems relatively simple.
>
> Hm, a quick test:
>
> % scan -format '%(msg) %{subject}' | uniq -d -f 1
>
> suggests that it prints the first one, not later ones, so that isn't
> exactly what you want.  Might be a good starting point, though?  You
> could probably do something with uniq -c and pipe that to an awk
> script that did what you wanted.

awk's probably easiest, after deciding what counts as an equivalent
subject field.

    $ ls
    1  2  3  4
    $ sed -n l *
    subject: foo bar$
    subject: foo$
     bar$
    subject: xyzzy $
    subject: fo=?utf-8?Q?=6f?= bar$
    $
    $ scan -width 0 -format '%(decode{subject}):%{subject}:%(putlit{subject}):' +.
    foo bar:foo bar: foo bar:
    foo bar:foo bar: foo
     bar:
    xyzzy:xyzzy: xyzzy:
    foo bar:fo=?utf-8?Q?=6f?= bar: fo=?utf-8?Q?=6f?= bar:
    $
    $ scan -width 0 -format '%(msg) %(decode{subject})' +.
    1 foo bar
    2 foo bar
    3 xyzzy
    4 foo bar
    $
    $ scan -width 0 -format '%(msg) %(decode{subject})' +. |
    > awk '{m=$1; sub(/[^ ]* /, "", $0)} NR>1 && $0==l {print m} {l=$0}'
    2
    $
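
The awk above only compares each line with the one before it, so it
relies on the folder having been sorted by subject first; that's why
message 4 isn't reported here even though it's another `foo bar'.  If
sorting isn't wanted, a rough sketch that remembers every subject seen
so far might look like the following.  dup_msgs is a made-up name, not
an nmh command, and folding case with tolower() is just one assumption
about what counts as an equivalent subject.

```shell
# Hypothetical helper, not part of nmh: reads "<msg> <subject>" lines on
# stdin and prints the number of every message whose subject has already
# been seen, whether or not the duplicates are adjacent.
dup_msgs() {
    awk '{
        m = $1                      # message number is the first field
        sub(/^[^ ]* /, "")          # strip it, leaving only the subject
        key = tolower($0)           # assumption: case differences do not matter
        if (seen[key]++) print m    # second and later occurrences only
    }'
}

# Same four subjects as the transcript above, without needing a folder.
printf '%s\n' '1 foo bar' '2 foo bar' '3 xyzzy' '4 foo bar' | dup_msgs
```

With that input it prints 2 and 4.  Fed from the same scan command as
above, the resulting list could then be checked by eye before handing
it to rmm.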

-- 
Cheers, Ralph.


