bug-coreutils

Re: [PATCH] Makes sort create random order


From: Frederik Eaton
Subject: Re: [PATCH] Makes sort create random order
Date: Mon, 31 Jan 2005 15:55:49 -0800
User-agent: Mutt/1.5.6+20040907i

On Mon, Jan 31, 2005 at 11:37:21AM -0800, Paul Eggert wrote:
> Frederik Eaton <address@hidden> writes:
> 
> > I've given many examples - can you give an example of a situation
> > where people would put (a) differently-formatted numbers in a column
> > of a file (how would they become differently-formatted?) and then sort
> > randomly based on their values, (b) insisting that ties stay together?
> 
> No, but that's because I don't know what the phrase "sort randomly
> based on their values" means.  If it's really a random process, it
> will ignore their values; then it's not a sort at all.

For example:

$ sort -k 1,1R -k 2n
foo    1
foo    2
foo    3
bar    1
bar    2
bar    3
baz    1
baz    2
baz    3

Artificial? Maybe. Then again, it's the albums/songs example.
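
A minimal sketch of the behavior the example above is after, under two assumptions: that the sort in use is a later GNU sort which accepts R as a per-key ordering option (not the patch discussed in this thread), and that the numeric key is the second field, since the sample has only two columns. The function name group_shuffle is hypothetical and used only for illustration.

```shell
# Assumes GNU sort with the R (random) key option, which postdates this thread.
# group_shuffle is a hypothetical helper name, not part of any tool.
group_shuffle() {
    # Key 1 with R: names come out in a random order, but lines with
    #               equal names compare equal and so stay adjacent.
    # Key 2 with n: numeric order within each group of equal names.
    sort -k 1,1R -k 2,2n
}

# Sample input, deliberately jumbled.
printf '%s\n' 'foo 1' 'bar 2' 'foo 3' 'baz 1' 'bar 1' \
              'foo 2' 'baz 3' 'bar 3' 'baz 2' | group_shuffle
```

The order of the three groups may differ from run to run; within each group the second column should always read 1, 2, 3.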

> > you might want to sort on one key and randomize on another
> 
> That can be done easily by combining "permute" and "sort", no?  You
> permute the input, then use a stable sort on the key that you want to
> sort by.

Sure, but that is less convenient and less efficient.
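
For comparison, the two-step approach quoted above might look roughly like this. It is only a sketch: shuf from later GNU coreutils stands in for the 'permute' program, which is an assumption, and permute_then_sort is a hypothetical name.

```shell
# Assumes GNU shuf (a stand-in for 'permute') and sort's -s (stable) option.
# permute_then_sort is a hypothetical helper name, not part of any tool.
permute_then_sort() {
    # Step 1: put all input lines in random order.
    # Step 2: stable sort on field 1 only, so lines with equal names
    #         keep the random relative order from step 1.
    shuf | sort -s -k 1,1
}

printf '%s\n' 'foo 1' 'bar 2' 'foo 3' 'baz 1' 'bar 1' \
              'foo 2' 'baz 3' 'bar 3' 'baz 2' | permute_then_sort
```

This sorts by name and randomizes within each name. It needs two passes over the data, and on its own it does not put the names themselves in random order while keeping equal names adjacent.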

> > none of the existing programs handle large files as well as 'sort'
> > does.
> 
> "tac" does.

I'm talking about line-randomizing programs.

> >> That's OK in many applications.  (You have 30 black balls and 20 white
> >> balls in an urn, and want to select 7 balls without replacement....)
> >
> > OK, after some thought I agree with you. Do you think it would be too
> > confusing to have both alternatives available?
> 
> It'd be nicer if we had just one alternative.  And it's pretty easy:
> just do "sort -u | permute" if you want to avoid duplicates.

What's this 'permute'? Perhaps I should have given an example earlier,
but see above. I don't want to avoid duplicates. My point is that
keeping them sorted together is good in some applications, and isn't
necessarily a disadvantage in others (where they often don't exist).
You then said that they often do exist, and I suggested having options
to do it both ways. I just want to make sure we're on the same page.

Your proposal's behavior is probably more intuitive and more useful
for the common case of people who want to shuffle a list of songs
which may, for whatever reason, contain duplicates (although duplicates
are likely to be quite rare). But it can't achieve the functionality
of the above example, which has a better connection to the other
features of 'sort', and therefore perhaps more justification for
inclusion, in a certain bottom-up sense.

Frederik

-- 
----------------------------------------------------------------
Frederik Eaton                         http://ofb.net/~frederik/
