Discussion:
NAND device: all blocks bad after restart, how to recover?
Lars Michael
2011-03-09 16:39:27 UTC
Hello,

I repost this issue, with correct subject and hope to have some
feedback. After working with getting UBI images mounted in Linux (it did
work for a sec!) the NAND device suddenly went totally bad after a system
restart! So all blocks are reported bad and the mtd is no longer visible
in Linux. Before it had just a few bad blocks. I am really worried about
what has happened and whether this can occur on our custom board too.

The hardware is Freescale TWR-MCF5441X with 256MB NAND flash.
The kernel is 2.6.29 with 167 patches from the back port tree.

From kernel boot:

NAND device: Manufacturer ID: 0x2c, Chip ID: 0xca (Micron NAND 256MiB 3,3V 16-bit)
Bad block table not found for chip 0
Bad block table not found for chip 0
Scanning device for bad blocks
Bad eraseblock 0 at 0x000000000000
Bad eraseblock 1 at 0x000000020000
Bad eraseblock 2 at 0x000000040000
Bad eraseblock 3 at 0x000000060000
Bad eraseblock 4 at 0x000000080000
Bad eraseblock 5 at 0x0000000a0000
Bad eraseblock 6 at 0x0000000c0000
<cut>
Bad eraseblock 2046 at 0x00000ffc0000
Bad eraseblock 2047 at 0x00000ffe0000
No space left to write bad block table
fsl_nfc: NAND Flash not found !
m25p80 spi1.1: at26df081a (1024 Kbytes)
Creating 1 MTD partitions on "Atmel at26df081a SPI Flash chip":
0x000000000000-0x000000100000 : "at26df081a"
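
The scan output above can be sanity-checked with a short script. This is only a sketch: it assumes the eraseblock size is 0x20000 (128 KiB, the step between the offsets in the log) and the device size is 256 MiB as reported for the chip. The function name `summarize_bad_blocks` and the sample text are made up for illustration.

```python
import re

ERASEBLOCK_SIZE = 0x20000          # 128 KiB, the step between offsets in the log (assumed)
DEVICE_SIZE = 256 * 1024 * 1024    # 256 MiB, as reported for the chip

BAD_LINE = re.compile(r"Bad eraseblock (\d+) at 0x([0-9a-fA-F]+)")


def summarize_bad_blocks(log_text):
    """Return (bad_indices, total_blocks) for a kernel bad-block scan log."""
    bad = []
    for match in BAD_LINE.finditer(log_text):
        index = int(match.group(1))
        offset = int(match.group(2), 16)
        # Each reported offset should equal index * eraseblock size.
        if offset != index * ERASEBLOCK_SIZE:
            raise ValueError(f"inconsistent entry: block {index} at {offset:#x}")
        bad.append(index)
    return bad, DEVICE_SIZE // ERASEBLOCK_SIZE


# A few lines taken from the log above (the <cut> part is not reconstructed).
sample = """\
Bad eraseblock 0 at 0x000000000000
Bad eraseblock 1 at 0x000000020000
Bad eraseblock 2046 at 0x00000ffc0000
Bad eraseblock 2047 at 0x00000ffe0000
"""
bad, total = summarize_bad_blocks(sample)
print(f"{len(bad)} bad entries in sample, {total} eraseblocks on the device")
```

With these sizes the device has 2048 eraseblocks, so a log that runs from block 0 to block 2047 means every single block was flagged. That pattern is more typical of a read path returning garbage than of genuinely worn-out flash.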

I tried to diagnose the flash from U-Boot. When I read from the
flash, for example, I get:

NAND read: device 0 offset 0x1, size 0xfffffff
Skipping bad block 0x00000000
Skipping bad block 0x00020000
Skipping bad block 0x00040000
<cut>
Skipping bad block 0x0ffa0000
Skipping bad block 0x0ffc0000
Skipping bad block 0x0ffe0000
NAND read from offset 10000000 failed -22
0 bytes read: ERROR
-> nand
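
Why the read ends with "0 bytes read" and error -22 can be illustrated with a simplified model of a read that skips bad blocks. This is a sketch under assumptions, not U-Boot's actual code: the function `read_skip_bad`, its return tuple, and the `is_bad` callback are hypothetical, and the sizes are taken from the logs above.

```python
ERASEBLOCK_SIZE = 0x20000   # 128 KiB (assumed from the log offsets)
DEVICE_SIZE = 0x10000000    # 256 MiB
EINVAL = 22                 # -22 in the log corresponds to -EINVAL


def read_skip_bad(offset, size, is_bad):
    """Simplified model of a skip-bad read.

    Returns (bytes_read, status, final_offset). No data is actually read;
    only the offset bookkeeping is modelled.
    """
    remaining = size
    bytes_read = 0
    while remaining > 0:
        if offset >= DEVICE_SIZE:
            # Ran off the end of the device while skipping bad blocks.
            return bytes_read, -EINVAL, offset
        block_start = offset & ~(ERASEBLOCK_SIZE - 1)
        if is_bad(block_start):
            # Skip the whole block and continue at the next one.
            offset = block_start + ERASEBLOCK_SIZE
            continue
        chunk = min(remaining, block_start + ERASEBLOCK_SIZE - offset)
        bytes_read += chunk
        remaining -= chunk
        offset += chunk
    return bytes_read, 0, offset


# Every block reported bad, as in the log above.
print(read_skip_bad(0x1, 0xfffffff, lambda block: True))
```

In this model, when every block is reported bad the offset is pushed forward block by block until it reaches 0x10000000, the end of the device, and the read fails there without returning any data.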

Assuming this is a software issue, can the bbt somehow be corrupt,
reporting all blocks bad even if they are not?

Is the problem likely to be in the UBI layer or in the NAND driver
(I suspect the first) and how can I troubleshoot this?

I tried 'nand scrub', but it still says there is no space to write the bbt.

Thanks and regards,
- Lars
Artem Bityutskiy
2011-03-10 13:06:18 UTC
Post by Lars Michael
Hello,
I repost this issue, with correct subject and hope to have some
feedback. After working with getting UBI images mounted in Linux (it did
work for a sec!) the NAND device suddenly went totally bad after a system
restart! So all blocks are reported bad and the mtd is no longer visible
in Linux.
So do mtd tests work fine with your flash?
Post by Lars Michael
Before it had just a few bad blocks.
Do you have any old logs where you could find the bad block numbers?
Post by Lars Michael
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xca (Micron NAND 256MiB 3,3V 16-bit)
Bad block table not found for chip 0
Bad block table not found for chip 0
Scanning device for bad blocks
Bad eraseblock 0 at 0x000000000000
Bad eraseblock 1 at 0x000000020000
Bad eraseblock 2 at 0x000000040000
Bad eraseblock 3 at 0x000000060000
Bad eraseblock 4 at 0x000000080000
Bad eraseblock 5 at 0x0000000a0000
Bad eraseblock 6 at 0x0000000c0000
<cut>
Bad eraseblock 2046 at 0x00000ffc0000
Bad eraseblock 2047 at 0x00000ffe0000
No space left to write bad block table
fsl_nfc: NAND Flash not found !
m25p80 spi1.1: at26df081a (1024 Kbytes)
Creating 1 MTD partitions on "Atmel at26df081a SPI Flash chip":
0x000000000000-0x000000100000 : "at26df081a"
Do you have any logs when this thing happened?
Post by Lars Michael
Assuming this is a software issue, can the bbt somehow be corrupt,
reporting all blocks bad even if they are not?
I do not know.
Post by Lars Michael
Is the problem likely to be in the UBI layer or in the NAND driver
(I suspect the first) and how can I troubleshoot this?
It might be a broken driver which reports write errors, and then UBI marks
all blocks bad. We probably need some limit in UBI to prevent
situations like this.
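
The idea of a limit on how many blocks may be marked bad can be sketched as follows. This is not UBI code: the class `BadBlockLimiter`, its method names, and the threshold of 20 bad blocks per 1024 are assumptions chosen only for illustration.

```python
class BadBlockLimiter:
    """Refuse to mark further blocks bad once a sanity limit is reached."""

    def __init__(self, total_blocks, max_bad_per_1024=20):
        # Illustrative threshold: allow at most this many bad blocks in total.
        self.limit = total_blocks * max_bad_per_1024 // 1024
        self.bad = set()

    def mark_bad(self, block):
        """Return True if the block is (now) marked bad, False if refused."""
        if block in self.bad:
            return True
        if len(self.bad) >= self.limit:
            # Too many bad blocks: more likely a driver or bus fault than
            # real wear, so stop marking and let the caller report an error.
            return False
        self.bad.add(block)
        return True


limiter = BadBlockLimiter(total_blocks=2048)
results = [limiter.mark_bad(block) for block in range(50)]
print(f"limit={limiter.limit}, accepted={results.count(True)}, refused={results.count(False)}")
```

With 2048 eraseblocks and this threshold, the guard accepts the first 40 bad-block requests and refuses the rest, so a misbehaving driver could not cause the whole device to be marked bad.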
--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
Lars Michael
2011-03-25 14:03:07 UTC
From: Artem Bityutskiy <dedekind1 at gmail.com>
Subject: Re: NAND device: all blocks bad after restart, how to recover?
To: "Lars Michael" <lars.michael at yahoo.com>
Cc: linux-mtd at lists.infradead.org
Date: Thursday, 10 March, 2011, 14:06
On Wed, 2011-03-09 at 08:39 -0800,
Post by Lars Michael
Hello,
I repost this issue, with correct subject and hope to have some
feedback. After working with getting UBI images mounted in Linux (it did
work for a sec!) the NAND device suddenly went totally bad after a system
restart! So all blocks are reported bad and the mtd is no longer visible
in Linux.
So do mtd tests work fine with your flash?
Post by Lars Michael
Before it had just a few bad blocks.
We investigated more and have found that the FlexBus on the Coldfire
can cause problems for the NAND device. When the FlexBus devices are
removed the NAND device comes back to life, and is again accessible.
So now I can get back to the UBI part of the project :-)

Regards,
- Lars
Artem Bityutskiy
2011-03-31 12:11:50 UTC
Post by Lars Michael
We investigated more and have found that the FlexBus on the Coldfire
can cause problems for the NAND device. When the FlexBus devices are
removed the NAND device comes back to life, and is again accessible.
So now I can get back to the UBI part of the project :-)
Good! :-)
--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)