Discussion:
corrupted empty space - again
Michał Przepłata
2013-04-22 14:19:35 UTC
Hello Artem,

I am having issues similar to those reported earlier by others on this
and other lists (corrupted empty space), but I cannot find a reliable
solution, hence I'm writing here for help.
I hope you can point me to one.

To the point:
- I am using kernel 2.6.35 (with some patches up to 12.2010); I know, it's old
- chip is Samsung K9F1G08U0D (128MB SLC, 2kB pages, layout is using
ECC for 4x512B subpages),
SoC is Freescale i.MX28
- our device must finish booting and be running apps in under 10s; if
this is not met we are powered down (by the backend device)
- I haven't run any MTD tests/bonnie++ yet to test this chip/MTD driver
(they could be useful for discovering other issues, but I don't think
it matters for the case described below)
- on one of our devices we got UBIFS corruption on the R/W data partition
(in the empty space area);
below is the original bug report log (without further debug messages):
(cut)
[ 0.230000] UBI: attaching mtd1 to ubi0
[ 0.230000] UBI: physical eraseblock size: 131072 bytes (128 KiB)
[ 0.230000] UBI: logical eraseblock size: 126976 bytes
[ 0.230000] UBI: smallest flash I/O unit: 2048
[ 0.230000] UBI: VID header offset: 2048 (aligned 2048)
[ 0.230000] UBI: data offset: 4096
[ 0.770000] UBI: attached mtd1 to ubi0
[ 0.770000] UBI: MTD device name: "gpmi-nfc-general-use"
[ 0.770000] UBI: MTD device size: 117 MiB
[ 0.770000] UBI: number of good PEBs: 940
[ 0.770000] UBI: number of bad PEBs: 0
[ 0.770000] UBI: max. allowed volumes: 128
[ 0.770000] UBI: wear-leveling threshold: 4096
[ 0.770000] UBI: number of internal volumes: 1
[ 0.770000] UBI: number of user volumes: 3
[ 0.780000] UBI: available PEBs: 0
[ 0.780000] UBI: total number of reserved PEBs: 940
[ 0.780000] UBI: number of PEBs reserved for bad PEB handling: 9
[ 0.780000] UBI: max/mean erase counter: 407/123
[ 0.780000] UBI: image sequence number: 0
[ 0.780000] UBI: background thread "ubi_bgt0d" started, PID 30
(cut)
[ 0.940000] VFS: Mounted root (squashfs filesystem) readonly on device 254:0.
[ 1.760000] UBIFS: recovery needed
[ 1.810000] UBIFS: recovery completed
[ 1.810000] UBIFS: mounted UBI device 0, volume 2, name "data"
[ 1.810000] UBIFS: file system size: 76947456 bytes (75144 KiB, 73 MiB, 606 LEBs)
[ 1.810000] UBIFS: journal size: 3809280 bytes (3720 KiB, 3 MiB, 30 LEBs)
[ 1.810000] UBIFS: media format: w4/r0 (latest is w4/r0)
[ 1.810000] UBIFS: default compressor: zlib
[ 1.810000] UBIFS: reserved for root: 3634417 bytes (3549 KiB)
[ 3.460000] UBI error: ubi_io_read: error -74 while reading 126976 bytes from PEB 97:4096, read 126976 bytes
[ 3.470000] UBIFS error (pid 86): ubifs_scan: corrupt empty space at LEB 318:116009
[ 3.490000] UBIFS error (pid 86): ubifs_scanned_corruption: corruption at LEB 318:116009
[ 3.490000] 00000000: ffffffdf ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ................................
[ 3.490000] 00000020: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ................................
(cut)
[ 3.560000] 00001fe0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ................................
[ 3.560000] UBIFS error (pid 86): ubifs_scan: LEB 318 scanning failed
[ 3.560000] UBIFS warning (pid 86): ubifs_ro_mode: switched to read-only mode, error -117
[ 3.560000] UBIFS error (pid 86): make_reservation: cannot reserve 137 bytes in jhead 2, error -117
[ 3.580000] UBIFS error (pid 86): do_writepage: cannot write page 1 of inode 1218, error -117
(cut)
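
As a side check, the sizes reported in the log above are consistent with each other. The following is a minimal sketch in Python that uses only numbers copied from the log; the variable and function names are made up for illustration, and the page index is plain arithmetic, not something the log states.

```python
# Values copied from the UBI/UBIFS log above.
PEB_SIZE = 131072      # physical eraseblock size
DATA_OFFSET = 4096     # data starts after the EC header page and the VID header page
PAGE_SIZE = 2048       # smallest flash I/O unit

# A logical eraseblock is what remains of a PEB after the two header pages.
leb_size = PEB_SIZE - DATA_OFFSET


def leb_to_peb_offset(leb_offset: int) -> int:
    """Translate a byte offset inside a LEB into a byte offset inside its PEB."""
    return DATA_OFFSET + leb_offset


fs_size = 606 * leb_size                  # "file system size ... 606 LEBs"
journal_size = 30 * leb_size              # "journal size ... 30 LEBs"
peb_offset = leb_to_peb_offset(116009)    # "corrupt empty space at LEB 318:116009"
page_in_peb = peb_offset // PAGE_SIZE     # index of the NAND page holding that offset

print(leb_size, fs_size, journal_size, peb_offset, page_in_peb)
```

The first three values match the "logical eraseblock size", "file system size" and "journal size" lines of the log.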

It seems that:
a) the error is a single bit-flip (read decay); I don't currently
suspect unstable bits from erasing/writing
b) our NAND driver doesn't protect our empty space (no wonder, as the
13 ECC bytes used per 512B subpage should be left 0xFF until written
with real data)
c) as checked, this is the first empty page (2kB) in this PEB; the
previous page contains some data (and nothing shows that we have more
than one page corrupted)
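
The single bit-flip in point a) can be read off the hex dump in the log: the first word is ffffffdf where an erased area should read ffffffff. Below is a small Python sketch that counts cleared bits in such a dump line. The helper name is hypothetical, and it only looks at the lines visible in the report, not at the part that was cut.

```python
def cleared_bits(words_hex: str) -> int:
    """Count bits that read as 0 in a whitespace-separated list of hex words."""
    total = 0
    for word in words_hex.split():
        value = int(word, 16)
        width = 4 * len(word)                  # number of bits this hex word represents
        total += width - bin(value).count("1")
    return total


# First dump line from the log (offset 00000000).
first_line = "ffffffdf ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff"
print(cleared_bits(first_line))
```

For the line above the count is 1, which is one bit away from a fully erased area.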

I have tried changing the NAND/MTD driver to return -EUCLEAN instead of
-EBADMSG (to fix the problem below the UBI layer, pretending that we
have a correctable bit-flip). Results (with UBI debug turned on):
FAIL#1 - the error was still there (UBIFS corruption when mounting the
data partition, which is required for booting);
scrubbing for this PEB was initiated (ubi_wl_scrub_peb), but happened
only some time later, when the device was left running after
artificially disconnecting the backend (I guess it was scheduled to the
ubi_bgt0d task)
FAIL#2 - it seems that PEB 97 was rewritten to PEB 89, however the
corrupted empty space was also preserved (sic!) at the very same
offset, hence the error is still there (confirmed with nanddump)

That means that further trying to fix that in NAND/MTD driver is
futile. Am I right?

Questions:
1) is there any chance that merging recent UBI/UBIFS sources will make
it go away? That is 2.5 years of code development, and besides
UBI/UBIFS I would probably be forced to merge other subsystems as
well, which could turn into a merging hell I would like to avoid.
I have browsed through the Git tree and I see that some 2 years ago
a set of patches introduced a 'corrupted PEBs list' that, from what I
understand, would make this (and only this!) PEB read-only
(unfortunately forever, which will deplete the pool of reserved PEBs
sooner, which is also not that nice).
Would merging those patches make UBIFS continue with scanning, or
will this still be scheduled to the bg task (i.e. useless in this
case)?
2) should I try to change UBI/UBIFS to deal with this problem? Ideally,
rewriting/recovering this PEB would happen immediately at the time of
discovery (in the UBI layer). Alternatively, immediately at the UBIFS
layer (in the ubifs_scan function, when the page is checked for
containing only 0xffffffff).
Could you point me to an example that would be proper for this?
3) what about a band-aid solution (commenting out the "goto corrupted;"
line) explained in:
http://e2e.ti.com/support/embedded/linux/f/354/t/171839.aspx
Does UBI/UBIFS also check for all 0xFF in a page before writing
(not as part of any "extra checks" debugging)? If so, then maybe such
a quick fix could be used (fixing the ubifs_scan issue), but best
followed by some soon-to-happen recovery that will recover this
LEB and erase/reuse the PEB?

Thanks for your help and time!

Regards,
Michal Przeplata
Gupta, Pekon
2013-04-23 05:21:04 UTC
Post by Michał Przepłata
a) the error is a single bit-flip (read decay); I don't currently
suspect unstable bits from erasing/writing
b) our NAND driver doesn't protect our empty space (no wonder, as the
13 ECC bytes used per 512B subpage should be left 0xFF until written
with real data)
[Pekon]: This should not matter. Suppose you wanted to write
0x77 to an erased-page byte bit 0. Then 0x76 would actually be written
to the device, and the corresponding ECC stored in the OOB/spare area.
When you read the byte, it would read as 0x76, but the ECC stored in
the spare/OOB area was calculated for 0x77, so the driver would detect
a bit-flip and should correct it too.
So whether bit-flips occur on an erased page or after writing,
both can be detected and corrected by your driver.

[Pekon]: Another thing your driver should handle is whether it can
detect bit-flips in the OOB/spare area (ECC syndrome) itself.
The reason is that if the ECC you read back is itself corrupted, you
should not mistakenly "fix" data that was correct.
Post by Michał Przepłata
c) as checked, this is the first empty-page (2kB) in this PEB,
previous page contains
some data (and nothing shows that we have more than one page corrupted)
I have tried changing the NAND/MTD driver to return -EUCLEAN instead of
-EBADMSG (to fix
the problem below UBI layer, pretending that we have correctable
[Pekon]: I think this is the wrong approach:
(a) -EBADMSG means the driver encountered more bit-flips than your
ECC scheme can correct, so it is returning corrupted data to you.
(b) -EUCLEAN indicates that the data had bit-flips, but they were
_already_ corrected by the driver, so the data is correct.
In addition, the upper file-system layer can take preventive action
(like scrubbing in UBIFS) to avoid accumulation of bit-flips in future.

So replacing -EBADMSG with -EUCLEAN would not fix the corrupted
data; rather, it would fool the upper FS layer into using the corrupted
data as if it had been fixed.
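
The convention described in (a) and (b) can be sketched as a small Python function. This is only an illustration of the return-code contract, not driver code: the function name and the `ecc_strength` parameter are hypothetical, and the numeric values are the Linux errno numbers.

```python
EBADMSG = 74     # Linux errno value: uncorrectable ECC error
EUCLEAN = 117    # Linux errno value: bit-flips were found and corrected


def read_status(bitflips: int, ecc_strength: int) -> int:
    """Status a page read would report, given the bit-flips seen in one ECC chunk."""
    if bitflips == 0:
        return 0                 # clean read
    if bitflips <= ecc_strength:
        return -EUCLEAN          # data was corrected; upper layers may scrub
    return -EBADMSG              # too many flips; the returned data is corrupt


print(read_status(1, 8), read_status(9, 8))
```

Under this contract, reporting -EUCLEAN for a chunk that was not actually corrected hands corrupt data to UBI labelled as good.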
Post by Michał Przepłata
FAIL#1 - the error was still there (UBIFS corruption when mounting the
data partition, which is required for booting);
scrubbing for this PEB was initiated (ubi_wl_scrub_peb), but happened
only some time later, when the device was left running after
artificially disconnecting the backend (I guess it was scheduled to the
ubi_bgt0d task)
FAIL#2 - it seems that PEB 97 was rewritten to PEB 89, however the
corrupted empty space was also preserved (sic!) at the very same
offset, hence the error is still there (confirmed with nanddump)
[Pekon]: This is why you are seeing the bit-flip copied to the new PEB:
you fooled the FS layer by returning -EUCLEAN.
Post by Michał Przepłata
That means that further trying to fix that in NAND/MTD driver is
futile. Am I right?
[Pekon]: First identify the root cause of the problem.
Possibility-1: are you actually seeing multiple bit-flips within the
same page that your ECC scheme is unable to handle?
I assume you are using the BCH8 algorithm, so check whether your
number_of_bit-flips > 8 per 512B subpage.
You can check this by dumping the page with and without ECC correction:
nanddump -s <offset> -p <device> -f "file1.hex" (with correction)
nanddump -s <offset> -p -n <device> -f "file2.hex" (without correction)
Solution-1: upgrade your ECC scheme to BCH-16 or similar.

Possibility-2: your NAND driver is not catching the bit-flips correctly.
Solution-2: fix your NAND driver, not MTD or UBI.
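
The comparison suggested under Possibility-1 can be automated. The sketch below, in Python, counts differing bits per 512-byte chunk between two dumps. It assumes the two dumps were saved as raw binary of equal length (that is, taken without the -p pretty-print option and containing page data only); the function names and the 512-byte default are illustrative choices.

```python
def count_bitflips(raw: bytes, corrected: bytes) -> int:
    """Number of bit positions in which two equally long buffers differ."""
    if len(raw) != len(corrected):
        raise ValueError("dumps must have the same length")
    return sum(bin(a ^ b).count("1") for a, b in zip(raw, corrected))


def bitflips_per_chunk(raw: bytes, corrected: bytes, chunk: int = 512) -> list:
    """Bit-flip count for each ECC chunk, so it can be compared to the ECC strength."""
    if len(raw) != len(corrected):
        raise ValueError("dumps must have the same length")
    return [
        count_bitflips(raw[i:i + chunk], corrected[i:i + chunk])
        for i in range(0, len(raw), chunk)
    ]
```

A chunk whose count exceeds what the ECC scheme can correct points to Possibility-1; small counts that the driver still reports as uncorrectable point to Possibility-2.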


With regards, pekon
Gupta, Pekon
2013-04-23 05:27:03 UTC
Fixed a typo
Post by Michał Przepłata
a) the error is a single bit-flip (read decay); I don't currently
suspect unstable bits from erasing/writing
b) our NAND driver doesn't protect our empty space (no wonder, as the
13 ECC bytes used per 512B subpage should be left 0xFF until written
with real data)
[Pekon]: This should not matter. Suppose you wanted to write
0x77 to an erased-page byte that already has a bit-flip at bit 0.
Then 0x76 would actually be written to the device, and the
corresponding calculated ECC gets stored in the OOB/spare area.

Now when you read the byte, it would read as 0x76, but the ECC stored
in the spare/OOB area was calculated for 0x77, so the NAND driver would
detect a bit-flip and should correct it too.
So whether bit-flips occur on an erased page or after writing,
both can be detected and corrected by your driver.
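
The 0x77 / 0x76 example above follows from the fact that programming NAND can only change bits from 1 to 0. A minimal Python sketch of that arithmetic (the function name is made up for illustration):

```python
def program_byte(current: int, data: int) -> int:
    """Programming can only clear bits (1 -> 0), so the stored value is an AND."""
    return current & data


erased_with_flip = 0xFE                      # erased byte (0xFF) whose bit 0 already reads 0
stored = program_byte(erased_with_flip, 0x77)
wrong_bits = stored ^ 0x77                   # bits that differ from the intended value

print(hex(stored), bin(wrong_bits).count("1"))
```

The stored byte differs from the intended one in exactly one bit, which is the error the ECC computed over 0x77 is expected to catch on read.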
Artem Bityutskiy
2013-05-15 08:07:57 UTC
Post by Michał Przepłata
1) is there any chance that merging UBI/UBIFS recent source will make
it go away?
No.
Post by Michał Przepłata
2) should I try to change UBI/UBIFS to deal with this problem? Ideally,
rewriting/recovering this PEB would happen immediately at the time of
discovery (in the UBI layer). Alternatively, immediately at the UBIFS
layer (in the ubifs_scan function, when the page is checked for
containing only 0xffffffff).
Could you point me to an example that would be proper for this?
Currently UBI/UBIFS assume that drivers fix-up bit-flips in erased
areas.

Why does the fact that the driver does not protect the empty space not
worry you? Isn't it a flaw? If you have empty space with too many
bit-flips, and write useful data there which you then cannot read, isn't
that a problem?

To me it really sounds like it is the job of the MTD layer and/or the
driver to protect the empty space. Perhaps MTD could provide some
generic methods for those drivers which do not have their own
protection?
Post by Michał Przepłata
3) what about a band-aid solution (commenting out "goto corrupted;"
http://e2e.ti.com/support/embedded/linux/f/354/t/171839.aspx
Does UBI/UBIFS also check for all 0xFF in a page before writing
(not as part of any "extra checks" debugging)? If so, then maybe such
a quick fix could be used (fixing the ubifs_scan issue), but best
followed by some soon-to-happen recovery that will recover this
LEB and erase/reuse the PEB?
I think this is a bad idea, because this way you also ignore real
corruptions, instead of noticing them right away.
--
Best Regards,
Artem Bityutskiy
Artem Bityutskiy
2013-05-15 09:58:49 UTC
Post by Michał Przepłata
My assumption is that when UBIFS writes a 2kB page to empty space
with bit-flips, it will (right after writing) verify the ECC of that
data to be sure it is correct, and if it isn't, it will write to a
different LEB and scrub this one. That way there will be no data loss.
Isn't that the way things are done?
No, we do not do this - it would be too inefficient if UBIFS read
everything it just wrote. Besides, the MTD layer may do some caching,
and reading may return results from a cache...

The driver can implement this functionality, though. It could then
return an error, and then UBI would have to do recovery, which is not
a very fast process, because it involves moving all the data from the
faulty eraseblock somewhere else, and then torturing the faulty one.

I think it is more efficient to do something in the driver on the read
path. I really do not know what would be a robust and fast solution,
though. You could check if ECC's bytes in OOB are all 0xFFs, which would
mean that the page is empty, and you need to verify the bit-flips. But
the OOB area itself may have bit-flips in the ECC positions...
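
One way to approach the check described above while tolerating flipped bits in the ECC positions is to count zero bits over data and ECC bytes together and compare against a threshold. The Python sketch below only illustrates that idea; it is not taken from any driver, and the function names and the `max_flips` parameter are hypothetical.

```python
from typing import Optional


def zero_bits(buf: bytes) -> int:
    """Number of bits that read as 0 in a buffer."""
    return sum(8 - bin(b).count("1") for b in buf)


def fix_erased_chunk(data: bytes, ecc: bytes, max_flips: int) -> Optional[bytes]:
    """Return clean all-0xFF data if the chunk looks erased, otherwise None.

    The chunk is treated as erased when data and ECC bytes together contain
    at most max_flips zero bits, so a flip in the ECC area does not by itself
    make an empty chunk look written.
    """
    if zero_bits(data) + zero_bits(ecc) <= max_flips:
        return b"\xff" * len(data)
    return None
```

With 512-byte chunks, 13 ECC bytes and a threshold equal to the ECC strength, a chunk like the one in the report (a single 0xdf byte among 0xff) would be handed upwards as clean empty space, while a really written chunk would fall through to normal ECC handling.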

But there are people on this list who are more knowledgeable in the
area. I have been out of the MTD business for several years already and
have forgotten a lot of things. You just need to somehow get their
attention :-)
Post by Artem Bityutskiy
I think this is a bad idea, because this way you also ignore real
corruptions, instead of noticing them right away.
Post by Michał Przepłata
The UBI homepage mentions that one should write a process that
periodically re-reads all UBI blocks, so that bit-flips in data blocks
do not accumulate over long periods of time.
So it is still not there?
No, no one implemented this, although it is easy to do, I think.
Post by Michał Przepłata
Must this be done in kernel-space, or is it simpler in user-space?
(I guess reading all *files* from mounted volumes will not be enough,
as some UBI blocks (journaling, other structures) will not be re-read
this way, right?)
Probably a kernel-space thread would do better. And if doing this in
user-space, you should do this by reading UBI volumes directly
(/dev/ubiX_Y), not from the file-system.
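
For the user-space variant, a minimal sketch in Python of reading a whole volume device and discarding the data is shown below. The function name and chunk size are arbitrary choices, and how often to run it is left open; the only purpose of the read is to give the lower layers a chance to notice correctable bit-flips.

```python
def read_whole_device(path: str, chunk_size: int = 128 * 1024) -> int:
    """Read a device or file from start to end, discard the data, return the byte count."""
    total = 0
    # buffering=0 gives unbuffered reads; each read() may return less than chunk_size.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

On the system from the log this might be called as read_whole_device("/dev/ubi0_2") for the "data" volume, for example from a periodic job; an OSError raised by the read would indicate a block that could not be read back.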
Post by Michał Przepłata
BTW, for some reason my previous mail (response to Pekon Gupta) is on
hold by the mailing list's owner. Am I considered a spammer for some
reason? :)
Probably. The mailing list does not like HTML and "Re: in subject but no
In-Reply-To: in headers".
--
Best Regards,
Artem Bityutskiy