bug#36130: split bug
From: Heather Wick
Subject: bug#36130: split bug
Date: Fri, 7 Jun 2019 14:23:15 -0400
Hello,
I am using split to split up some large, paired fastq files (nearly 4
billion lines each). I am using the -l flag to split them into files of 10
million reads (40 million lines) each. Although the fastq files have
matched and sorted reads, split creates a different number of output
files for each of the two paired files, and the pairing goes out of sync
at some point. The jobs finished without exceeding memory and with an exit
status of 0. I noticed the help text says to email this address about
bugs, so I thought I would mention it.
This is the line I am using to call split on my zipped fastq files:
zcat MH1_R1.fastq.gz | split - -l 40000000 DHT_R1_
zcat MH1_R2.fastq.gz | split - -l 40000000 DHT_R2_
This creates 96 chunks for R1 and 95 chunks for R2, even though the
original fastq files have the same number of reads.
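
At 40,000,000 lines per chunk, 95 chunks can hold at most 3.8 billion lines, so a 96th chunk only appears if split received more lines than that on R1. One way to narrow this down is to count the lines each decompressed stream actually delivers before split is involved. The following is a minimal sketch, not a definitive diagnosis: the helper names count_lines and check_pair are hypothetical, and gzip -dc is used here as an equivalent of zcat.

```shell
# Hypothetical helper: count the lines in a gzip-compressed file.
# gzip -dc decompresses to stdout, as zcat does.
count_lines() {
    gzip -dc "$1" | wc -l | tr -d ' '
}

# Hypothetical helper: compare the line counts of two paired files.
# A mismatch here means the inputs already differ before split runs.
check_pair() {
    r1=$(count_lines "$1")
    r2=$(count_lines "$2")
    if [ "$r1" -eq "$r2" ]; then
        echo "match: $r1 lines each"
    else
        echo "mismatch: $1 has $r1 lines, $2 has $r2 lines"
        return 1
    fi
}

# Usage with the file names from this message:
# check_pair MH1_R1.fastq.gz MH1_R2.fastq.gz
```

If the counts match but the chunk counts still differ, the difference would have to arise between decompression and split; if they do not match, the two inputs are not the same length to begin with.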
Do you have any suggestions for how to proceed? Perhaps zcatting and piping
the files is not the best way to call split?
Thanks,
~ Heather
--
Heather Wick
PhD Candidate, Human Genetics
Labs of Sarah Wheelan and Vasan Yegnasubramanian
Institute of Genetic Medicine
Johns Hopkins University School of Medicine
address@hidden