RE: When specifying multiple elements with "-e" option of join command
From: Goto, Ryoichi
Date: Fri, 31 Mar 2017 08:37:54 +0900
Dear Gordon,
Thank you for your reply.
First, about the fields of File 2: my editor made it look as if the file has
four fields, but it actually has three:
==> File 2 <==
address@hidden 1 password-1
address@hidden 2 password-2
I wanted the joined output to have three fields as well, matching File 2, so I
used the following option:
"-o 0 2.2 2.3"
The input files and the result I expect are as follows.
[File 1]
address@hidden
address@hidden
address@hidden
-
[File 2]
address@hidden 1 password-1
address@hidden 2 password-2
-
[Expected results]
address@hidden 0 PASSWORD-0
address@hidden 2 password-2
address@hidden 1 password-1
I understand that this can be done with awk or sed. However, I would like to
know about the specification of the "join -e" option: can it fill in two
different values for the two fields that are missing for records found only in
File 1?
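For reference, here is a minimal sketch of the behaviour I am asking about. The addresses are hypothetical (the real ones are hidden in the archive), "X" is an arbitrary placeholder, and I sort into temporary files only so the script also runs in a plain sh.

```shell
#!/bin/sh
# Hypothetical sample data; the addresses below are made up for illustration.
export LC_ALL=C            # keep sort and join collation consistent
tmp=$(mktemp -d)
printf '%s\n' 'jiro@example.com' 'saburo@example.com' 'taro@example.com' > "$tmp/file1"
printf '%s\n' 'taro@example.com 1 password-1' 'saburo@example.com 2 password-2' > "$tmp/file2"
sort "$tmp/file1" > "$tmp/f1.sorted"
sort "$tmp/file2" > "$tmp/f2.sorted"

# -e takes ONE string and writes it once per missing output field,
# so the unpaired record gets the string twice (for 2.2 and for 2.3).
filled=$(join -a 1 -e 'X' -o 0,2.2,2.3 "$tmp/f1.sorted" "$tmp/f2.sorted")
printf '%s\n' "$filled"
rm -rf "$tmp"
```

With this data the unpaired line comes out as "jiro@example.com X X", which is why a multi-word "-e" string is repeated as a whole for each missing field.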
Thank you.
> -----Original Message-----
> From: Assaf Gordon [mailto:address@hidden]
> Sent: Friday, March 31, 2017 12:09 AM
> To: Goto, Ryoichi <address@hidden>
> Cc: address@hidden
> Subject: Re: When specifying multiple elements with "-e" option of join
> command
>
> Hello,
>
> > On Mar 30, 2017, at 01:41, Goto, Ryoichi <address@hidden> wrote:
> >
> > [...]
> > I tried executing the following command, but for the record "address@hidden",
> > which exists only in File 1, the string specified with "-e" is output twice.
> > $ join -1 1 -o 0 2.2 2.3 -a 1 -e "0 PASSWORD 0" <(sort File1) <(sort File2)
> >
> > [Actual result]
> > Jiro @ yahoo.jp 0 PASSWORD 0 0 PASSWORD 0
> > address@hidden 2 password 2
> > address@hidden 1 password 1
> >
> > If I remove the double quotes from the command line, join reports
> > "join: extra operand '/dev/fd/62'", and the form "-e 0 -e PASSWORD 0"
> > is also a syntax error.
>
> The "-e" parameter fills missing fields with the given value. The second
> file has 4 fields, and after the join 3 fields
> are missing - so the string you've set to "-e" appears multiple times.
>
> Notice the following:
>
> $ head *
> ==> file1 <==
> address@hidden
> address@hidden
> address@hidden
>
> ==> file2 <==
> address@hidden 1 password 1
> address@hidden 2 password 2
>
> $ join -o auto -e MISSING -a 1 -j1 <(sort file1) <(sort file2)
> address@hidden 2 password 2
> address@hidden MISSING MISSING MISSING
> address@hidden 1 password 1
>
> I can suggest two work-arounds, which work with your specific files:
>
> Option #1:
> Because 'file1' has only one field, we know implicitly that any joined line
> which still has one field in the output did
> not have a matching record in the second file.
> Then, a simple AWK script can add the needed password:
>
> $ join -a 1 -j1 <(sort file1) <(sort file2)
> address@hidden 2 password 2
> address@hidden
> address@hidden 1 password 1
>
> $ join -a 1 -j1 <(sort file1) <(sort file2) \
> | awk 'NF==1 { print $0, "0 password 0" } NF!=1 { print }'
> address@hidden 2 password 2
> address@hidden 0 password 0
> address@hidden 1 password 1
>
>
> Option #2:
> Use "-e" to mark lines with missing values, then detect and replace them
> with sed:
>
> $ join -o auto -e XX -a 1 -j1 <(sort file1) <(sort file2)
> address@hidden 2 password 2
> address@hidden XX XX XX
> address@hidden 1 password 1
>
> $ join -o auto -e XX -a 1 -j1 <(sort file1) <(sort file2) \
> | sed 's/XX XX XX/0 password 0/'
> address@hidden 2 password 2
> address@hidden 0 password 0
> address@hidden 1 password 1
>
>
> Of course these are just examples which can be used as basis for similar
> variations.
>
> Hope this helps,
>
> regards,
> - assaf
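
Adapting Option #2 above to the three-field File 2, a rough sketch could look like the following. It is only an illustration under assumptions: the addresses are hypothetical, "XX" is an arbitrary marker that must not occur in the real data, and the per-column defaults "0" and "PASSWORD-0" are taken from my expected results.

```shell
#!/bin/sh
# Hypothetical sample data, written already sorted on the join field.
export LC_ALL=C
tmp=$(mktemp -d)
printf '%s\n' 'jiro@example.com' 'saburo@example.com' 'taro@example.com' > "$tmp/f1"
printf '%s\n' 'saburo@example.com 2 password-2' 'taro@example.com 1 password-1' > "$tmp/f2"

# join marks each missing field with XX; awk then gives every column
# its own default value instead of one repeated string.
result=$(join -a 1 -e XX -o 0,2.2,2.3 "$tmp/f1" "$tmp/f2" |
  awk '$2 == "XX" { $2 = "0" }
       $3 == "XX" { $3 = "PASSWORD-0" }
       { print }')
printf '%s\n' "$result"
rm -rf "$tmp"
```

Comparing each column against the marker, rather than matching the whole line, keeps the replacement correct even if only one of the two fields is missing.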