a story of a ddrescue failure

bug-ddrescue

From:	Todd Brunhoff
Subject:	a story of a ddrescue failure
Date:	Sun, 4 Oct 2020 16:03:27 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0
This weekend I spent my time recovering files from my brother's My Passport Ultra, a 2TB spinning
disk that failed on his Windows laptop after about 4 years of travel around the world. I made four
attempts at recovery, the second of which used ddrescue, which, sadly, made things worse. I'll
describe all four here in hopes of improving ddrescue for the future.

Attempt #1

I have long used dd to recover file systems, mostly on Linux, setting bs=512, which is generally
the sector size. This usually copies the most data, because a single bad sector then costs only
that sector rather than a whole larger read. This approach was able to copy the partition tables
amidst lots of errors. After that initial 1.5MB, it appeared to be copying the rest of the disk
successfully, showing only 2256 bad sectors out of 1.5TB over 20 hours or so.
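
The exact dd command is not given in this post. A minimal sketch of a sector-at-a-time copy,
assuming the common conv=noerror,sync options were used; copy_by_sector is a hypothetical helper
name, and the device and image names in the usage comment are borrowed from the logs and the
ddrescue command elsewhere in this message:

```shell
# Hypothetical helper: copy SRC to DST one 512-byte sector at a time.
# conv=noerror keeps going after a read error; conv=sync pads each short
# or failed input block with NUL bytes so later offsets stay aligned.
# (These options are an assumption; the post does not list them.)
copy_by_sector() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=512 conv=noerror,sync
}

# Example (not run here): copy_by_sector /dev/sdg sdg.bin
```

With these options an unreadable sector shows up in the image as 512 zero bytes, so the image
alone cannot distinguish a failed read from a sector that really contains zeros.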

When I plugged in the disk, the errors began with this:

Oct 02 13:21:28 archive1 kernel: sd 6:0:0:0: [sdg] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
Oct 02 13:21:28 archive1 kernel: sd 6:0:0:0: [sdg] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
Oct 02 13:21:28 archive1 kernel: blk_update_request: I/O error, dev sdg, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
Oct 02 13:21:28 archive1 kernel: usb 4-3: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd

As the copy progressed, this became the pattern every 5 seconds:

Oct 02 13:41:47 archive1 kernel: buffer_io_error: 159019 callbacks suppressed
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288589, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288590, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288590, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288590, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288590, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288590, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288590, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288590, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288590, async page read
Oct 02 13:41:47 archive1 kernel: Buffer I/O error on dev sdg1, logical block 288591, async page read

By that point I had begun reading other articles about what not to do (including not copying 512
bytes at a time), as well as the ddrescue manual. I also started extracting 10MB every 10GB and
piping it into 'od -Ad -tx1z'. This showed that while dd was happy with the reads, the entire
1.5TB read so far was all zeros. So I killed that and started ddrescue.
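
A minimal sketch of that spot check, assuming the samples are read with dd from the image (or
device) being inspected. The function name sample_nonzero and its arguments are hypothetical,
offsets are in MiB rather than the decimal MB/GB mentioned above, and it counts non-zero bytes
per sample instead of dumping them through od:

```shell
# Hypothetical helper: sample SIZE_MIB every STEP_MIB across the first
# TOTAL_MIB of SRC and print "<offset in MiB> <count of non-zero bytes>".
sample_nonzero() {
    src=$1        # image file or block device to inspect
    total_mib=$2  # how far into SRC to scan, in MiB
    step_mib=$3   # distance between sample starts, in MiB
    size_mib=$4   # size of each sample, in MiB
    off=0
    while [ "$off" -lt "$total_mib" ]; do
        # Delete every NUL byte and count whatever is left.
        n=$(dd if="$src" bs=1048576 skip="$off" count="$size_mib" 2>/dev/null \
            | LC_ALL=C tr -d '\000' | wc -c | tr -d ' ')
        echo "$off $n"
        off=$((off + step_mib))
    done
}

# Example (not run here): sample_nonzero IMAGE 1500000 10000 10
```

A count of 0 on every line corresponds to the all-zeros result described above; any non-zero
count means that sample holds real data.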

Attempt #2

Ddrescue seems to be well written, with lots of strategy and options. The doc reads a bit like
rsync's, where you end up starting with the examples in the middle of the doc. After power cycling
the drive, which started out with the same errors as the first ones above, I started with this:

ddrescue --sparse -i30GiB /dev/sdg sdg.bin sdg.mapfile

... so that I could avoid the start of the disk, which is where I thought most of the errors fell.
ddrescue made some progress, but it produced continuous errors that looked like this:

Oct 03 16:01:26 archive1 kernel: usb 4-3: reset SuperSpeed Gen 1 USB device number 14 using xhci_hcd
Oct 03 16:01:26 archive1 kernel: sd 7:0:0:0: [sdg] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
Oct 03 16:01:26 archive1 kernel: sd 7:0:0:0: [sdg] tag#0 CDB: Read(10) 28 00 e8 df 86 78 00 00 80 00
Oct 03 16:01:26 archive1 kernel: blk_update_request: I/O error, dev sdg, sector 3906963064 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0
Oct 03 16:01:26 archive1 kernel: usb 4-3: reset SuperSpeed Gen 1 USB device number 14 using xhci_hcd
Oct 03 16:01:26 archive1 kernel: sd 7:0:0:0: [sdg] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
Oct 03 16:01:26 archive1 kernel: sd 7:0:0:0: [sdg] tag#0 CDB: Read(10) 28 00 e8 df 87 08 00 00 78 00
Oct 03 16:01:26 archive1 kernel: blk_update_request: I/O error, dev sdg, sector 3906963208 op 0x0:(READ) flags 0x80700 phys_seg 15 prio class 0
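
The CDB lines in these logs can be decoded by hand to see which sectors each failed read covered.
A minimal sketch, relying on the standard SCSI Read(10) layout (opcode 0x28, a 4-byte big-endian
LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8); decode_read10 is a hypothetical helper
name, not part of any tool mentioned here:

```shell
# Hypothetical helper: decode the ten hex bytes of a Read(10) CDB as
# printed by the kernel, e.g. "28 00 e8 df 86 78 00 00 80 00".
decode_read10() {
    set -- $1   # split the byte string into $1..${10}
    [ "$#" -eq 10 ] && [ "$1" = 28 ] || {
        echo "not a Read(10) CDB" >&2
        return 1
    }
    lba=$(( 0x$3$4$5$6 ))   # bytes 2-5: starting logical block address
    len=$(( 0x$8$9 ))       # bytes 7-8: number of blocks to transfer
    echo "lba=$lba sectors=$len"
}

# decode_read10 "28 00 e8 df 86 78 00 00 80 00"  -> lba=3906963064 sectors=128
# decode_read10 "28 00 e8 df 87 08 00 00 78 00"  -> lba=3906963208 sectors=120
```

The decoded LBAs match the sector numbers in the blk_update_request lines, so each reported
sector is the start of the failed read. Assuming 512-byte logical sectors, the first request
is a 64KiB read.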

The run eventually ended with this after about 15 minutes:

Oct 03 16:01:15 archive1 kernel: usb 4-3: reset SuperSpeed Gen 1 USB device number 13 using xhci_hcd
Oct 03 16:01:20 archive1 kernel: usb 4-3: device firmware changed
Oct 03 16:01:20 archive1 kernel: usb 4-3: USB disconnect, device number 13
Oct 03 16:01:20 archive1 kernel: scsi 6:0:0:0: rejecting I/O to dead device
Oct 03 16:01:20 archive1 kernel: blk_update_request: I/O error, dev sdg, sector 62934352 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0

I tried another power cycle and the same ddrescue command (using the same mapfile), and it
failed in about 3 minutes with this:

Oct 03 16:23:27 archive1 kernel: usb 4-3: new SuperSpeed Gen 1 USB device number 26 using xhci_hcd
Oct 03 16:23:33 archive1 kernel: usb 4-3: device descriptor read/8, error -110
Oct 03 16:23:33 archive1 kernel: usb 4-3: new SuperSpeed Gen 1 USB device number 26 using xhci_hcd
Oct 03 16:23:38 archive1 kernel: usb 4-3: device descriptor read/8, error -110
Oct 03 16:23:38 archive1 kernel: usb usb4-port3: unable to enumerate USB device

So I gave up on that, since the drive/driver was giving up the ghost.

Attempt #3

I had to switch to Windows 10.

Popular Mechanics had a pretty good article
(https://www.popularmechanics.com/technology/gadgets/how-to/a3086/hard-drive-recovery/)
which, for software, pointed at EaseUS, Recuva and Prosoft Data Rescue. I first tried Prosoft,
which was immediately able to find files. The price of $19 seemed good, but after poking around
and finding that the pro version was $399, it felt like an upsell.

Attempt #4

The last effort was with EaseUS. This was pretty good: it found files immediately and had a clear
policy of $99/year and $67/mo, cancel anytime. So I gritted my teeth and signed up, with an entry
on my calendar to cancel in 3 weeks.

The UI was very bare. Over a period of about 12 hours it found all of the files plus a huge number
of deleted files. I started the restore, going from a usb->sata to a usb->sata 1TB extra drive I
had. It is still running and looks like it will finish in about 12 hours.

Conclusion:

There's something about both Prosoft and EaseUS where they know how to avoid or suppress errors
from the drive and still extract the files, while clearly losing much of the directory structure.
If ddrescue could figure out that algorithm for avoiding the errors, I would use it as my first
choice.

Todd Brunhoff
Media Architect
Portland, OR