
Re: Spaces added ... and line endings in general


From: Eric Siegerman
Subject: Re: Spaces added ... and line endings in general
Date: Tue, 23 Jan 2001 21:30:07 -0500
User-agent: Mutt/1.2.5i

On Tue, Jan 23, 2001 at 06:40:19PM -0500, Laine Stump wrote:
> > Strikes me this might actually make the CVS code simpler :-) ... instead
> > of converting to local conventions (thus needing to know what they
> > are) you would simply pass the canonical file through a filter which
> > 'localised' it as requested.
> 
> Right now CVS is relying on the C library file functions (fopen,
> fputs, etc) to do the "filtering".

Yup; the CVS code is already as simple as it could possibly be:
open the working file with or without a "b" in the fopen() mode
string, and let stdio handle it from there.


> [...] the only tool I
> use that doesn't accept/deal with multiple line-end conventions is _CVS
> itself_.

Your other tools deal with two line-end conventions, maybe three
(i.e. Mac), if you're really lucky.  They don't deal with *all*
relevant conventions.

CVS does, by definition (where the definition of "relevant" is
"CVS has been ported to it").  Automatically (see above).  As
long as you don't try to fool it.

How should a *single* CVS executable "accept/deal with" all of
the following, which it *must* do if it's to defend itself
against the kinds of abuse you want to throw at it?
  - Unix format: <LF>
  - DOS format:  <CR><LF>
  - Mac format:  <CR>
  - Files in which some lines use one of the above conventions,
    and some use another (because you edited a DOS-format file in
    vi on a Unix box, and didn't religiously type the ^v^m's)
  - Unix-format files that contain <CR>s as actual formatting
    characters -- perhaps even at the ends of lines, for doing
    overstriking, so looking specifically for <CR><LF> is unsafe
  - Record-oriented formats which use length words and have no
    terminator at all.  This is old mainframe stuff -- dying, but
    alas not dead yet.  (For an example, see below.)
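
To make the ambiguity concrete, here is a minimal Python sketch of the kind of guessing a single executable would have to do.  The function names are hypothetical; nothing like this exists in CVS.  It only counts terminator-like byte sequences, so it cannot tell a DOS-format line from a Unix-format line that happens to end in a formatting <CR>, and it knows nothing about record-oriented files.

```python
import re

def count_line_endings(data: bytes) -> dict:
    """Count terminator-like byte sequences (hypothetical helper)."""
    return {
        "crlf": len(re.findall(rb"\r\n", data)),
        "cr": len(re.findall(rb"\r(?!\n)", data)),   # CR not followed by LF
        "lf": len(re.findall(rb"(?<!\r)\n", data)),  # LF not preceded by CR
    }

def guess_convention(data: bytes) -> str:
    """Return 'lf', 'crlf', 'cr', 'mixed' or 'none'.  This is only a guess."""
    present = [name for name, n in count_line_endings(data).items() if n > 0]
    if not present:
        return "none"
    if len(present) == 1:
        return present[0]
    return "mixed"

# The easy cases come out as expected.
print(guess_convention(b"one\ntwo\n"))        # lf
print(guess_convention(b"one\r\ntwo\r\n"))    # crlf
print(guess_convention(b"one\rtwo\r"))        # cr

# A DOS-format file partly edited on Unix: no single answer.
print(guess_convention(b"one\r\ntwo\n"))      # mixed

# A Unix-format file whose only line ends in a formatting <CR>
# is indistinguishable from a DOS-format file.
print(guess_convention(b"overstruck\r\n"))    # crlf
```

The last case is the one that makes looking specifically for <CR><LF> unsafe: the bytes alone do not say which interpretation was intended.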


> > But it would need to be set in variety of ways:
> > 
> >     setting on file, 
> >     overridden by .rc file, 
> >     overridden by environment, 
> >     overridden by cmd options
> 
> YES!!!! This is exactly what I'd like to see! (naysayers be damned! ;-)
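
For concreteness, the layered lookup proposed above might be sketched as follows in Python.  Every name here is hypothetical (CVS has no such setting), and the "lf" default is an assumption made only so the function always returns something.

```python
def resolve_line_ending(file_setting=None, rc_setting=None,
                        env_setting=None, cli_setting=None,
                        default="lf"):
    """Pick the highest-priority setting that was actually supplied.

    Priority, highest first: command-line option, environment,
    .rc file, per-file setting.  All parameters are hypothetical.
    """
    for value in (cli_setting, env_setting, rc_setting, file_setting):
        if value is not None:
            return value
    return default

# The per-file setting applies when nothing overrides it...
print(resolve_line_ending(file_setting="crlf"))                     # crlf
# ...and a command-line option beats everything else.
print(resolve_line_ending(file_setting="crlf", rc_setting="lf",
                          cli_setting="cr"))                        # cr
```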

Perhaps I'm being obtuse, but how does this help with the
following use case:

> The only problem is when you do the checkout on platform X,
> then work with that work directory on platform Y.


Semi-aside: here's an example of a record-oriented format, for
those who've never been, ahem, lucky enough to work on a system
that uses one.  This is (as nearly as I can recall; it's been a
*long* time) the text-file format used by VMS.  Each text line
consists of a 16-bit "short" containing the record length, in
binary, followed by that many bytes of ASCII, padded to a word
boundary.

Here's what my favourite Ogden Nash poem would look like in the
format I've described (which may or may not be exactly VMS's).
Notice the blank line between heading and body:

    0008    5468    6520    4c61    6d61    000d    6279    204f
    .  .    T  h    e       L  a    m  a    .  .    b  y       O
    6764    656e    204e    6173    6800    0000    001a    4120
    g  d    e  n       N    a  s    h  .    .  .    .  .    A    
    6f6e    652d    4c20    6c61    6d61    2068    6527    7320
    o  n    e  -    L       l  a    m  a       h    e  '    s    
    6120    7072    6965    7374    001a    4120    7477    6f2d
    a       p  r    i  e    s  t    .  .    A       t  w    o  -
    4c20    6c6c    616d    6120    6865    2773    2061    2062
    L       l  l    a  m    a       h  e    '  s       a       b
    6561    7374    001d    416e    6420    4920    776f    756c
    e  a    s  t    .  .    A  n    d       I       w  o    u  l
    6420    6265    7420    6120    7369    6c6b    2070    616a
    d       b  e    t       a       s  i    l  k       p    a  j
    616d    6100    001e    5468    6572    6520    6973    6e27
    a  m    a  .    .  .    T  h    e  r    e       i  s    n  '
    7420    616e    7920    7468    7265    652d    4c20    6c6c
    t       a  n    y       t  h    r  e    e  -    L       l  l
    6c61    6d61
    l  a    m  a

Here's the cleartext:
    The Lama
    by Ogden Nash

    A one-L lama he's a priest
    A two-L llama he's a beast
    And I would bet a silk pajama
    There isn't any three-L lllama
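
A reader for the format as described takes only a few lines, which also shows why nothing that looks for terminator characters can work on it.  This is a Python sketch under stated assumptions: the length word is read high byte first, as the dump above is printed (a real VMS file may well differ), the pad is a single zero byte after odd-length records, and the function names are made up for illustration.

```python
import struct

def decode_records(data: bytes) -> list:
    """Split length-prefixed records into text lines.

    Assumes a 16-bit big-endian length word followed by that many
    ASCII bytes, padded to an even offset.
    """
    lines = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        lines.append(data[pos:pos + length].decode("ascii"))
        pos += length + (length % 2)  # skip the pad byte after odd lengths
    return lines

def encode_records(lines) -> bytes:
    """Inverse of decode_records, under the same assumptions."""
    out = bytearray()
    for line in lines:
        raw = line.encode("ascii")
        out += struct.pack(">H", len(raw)) + raw
        if len(raw) % 2:
            out += b"\x00"
    return bytes(out)

# The first three records of the dump: title, byline, blank line.
sample = bytes.fromhex(
    "0008 5468 6520 4c61 6d61 000d 6279 204f"
    "6764 656e 204e 6173 6800 0000"
)
print(decode_records(sample))   # ['The Lama', 'by Ogden Nash', '']
```

Note that the blank line is simply a record of length zero; there is no byte in the file that marks the end of a line.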

--

|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.        address@hidden
|  |  /
Interviewer: You've been looking at the stars all your life:
Is there anything in astrology?
Arthur C. Clarke: It's utter nonsense.  But I'm a Sagittarius,
so I'm naturally skeptical.


