hujianyang
2014-04-26 09:25:42 UTC
Hi,
I am hitting the same assertion failures that Laurence Withers reported
in February. They look like this:
[69440.577932] UBIFS assert failed in ubifs_set_page_dirty at 1421 (pid 2067)
[69440.658567] UBIFS assert failed in ubifs_writepage at 1009 (pid 2069)
[69440.735605] UBIFS assert failed in do_writepage at 936 (pid 2069)
[69440.735881] UBIFS assert failed in ubifs_release_budget at 567 (pid 2069)
[69440.736360] UBIFS assert failed in ubifs_release_budget at 567 (pid 2069)
[69441.581541] UBIFS assert failed in ubifs_budget_space at 464 (pid 2070)
[69441.659118] UBIFS assert failed in ubifs_release_budget at 567 (pid 2072)
[69441.740405] UBIFS assert failed in ubifs_set_page_dirty at 1421 (pid 2070)
[69441.822369] UBIFS assert failed in ubifs_writepage at 1009 (pid 2072)
[69441.899574] UBIFS assert failed in do_writepage at 936 (pid 2072)
[69441.899853] UBIFS assert failed in ubifs_release_budget at 567 (pid 2072)
[69450.677679] UBIFS assert failed in ubifs_release_budget at 567 (pid 6)
[69455.757670] UBIFS assert failed in ubifs_release_budget at 567 (pid 6)
[69458.147075] UBIFS assert failed in ubifs_put_super at 1776 (pid 2077)
After communicating with Laurence, I found that these assertion failures
can be easily reproduced by writing through an mmap(PROT_WRITE, MAP_SHARED)
mapping and calling fsync on the same file at the same time.
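The reproduction described above can be sketched in userspace roughly as
follows. This is only a minimal sketch, not the test program that was
actually used: the function name run_mmap_fsync_race, the single-page
mapping, and the use of fork() to get two concurrent contexts are all
assumptions made for illustration. It merely creates the concurrent load;
whether the race is actually hit depends on timing, and the assertions show
up in the kernel log, not in the return value.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAP_LEN 4096	/* one page is assumed to be enough */

/*
 * Hypothetical helper: dirty a shared mapping in the parent while a
 * child process calls fsync() on the same file in a loop.
 * Returns 0 if both sides ran to completion, -1 on a setup error.
 */
int run_mmap_fsync_race(const char *path, int iterations)
{
	char *map;
	pid_t child;
	int status = 0;
	int fd, i;

	assert(path != NULL && iterations > 0);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, MAP_LEN) < 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, MAP_LEN, PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	child = fork();
	if (child < 0) {
		munmap(map, MAP_LEN);
		close(fd);
		return -1;
	}
	if (child == 0) {
		/* Child: keep forcing writeback of the file. */
		for (i = 0; i < iterations; i++)
			fsync(fd);
		_exit(0);
	}

	/* Parent: keep dirtying the page through the mapping. */
	for (i = 0; i < iterations; i++)
		map[i % MAP_LEN] = (char)i;

	waitpid(child, &status, 0);
	munmap(map, MAP_LEN);
	close(fd);
	unlink(path);

	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

To try it, call run_mmap_fsync_race() with a path on the UBIFS mount and a
large iteration count, then check the kernel log for the assertions.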
I think there is a race between __do_fault and ubifs_writepage.
In __do_fault we call ->page_mkwrite, which budgets space, sets
PagePrivate, and then calls __set_page_dirty_nobuffers to make the
page dirty. But at the end of ubifs_vm_page_mkwrite we release the
page lock.
At the same time, the fsync process may take the lock and call ->writepage,
because the page was already marked dirty in ->page_mkwrite. There we clear
the dirty bit, clear PagePrivate, and release the budget, and finally unlock
the page.
The mmap side then takes the lock and calls ->set_page_dirty. This is where
the first assertion fails, because the dirty bit has been cleared by fsync.
"UBIFS assert failed in ubifs_set_page_dirty at 1421"
static int ubifs_set_page_dirty(struct page *page)
{
	int ret;

	ret = __set_page_dirty_nobuffers(page); /* Here reset dirty bit */
	/*
	 * An attempt to dirty a page without budgeting for it - should not
	 * happen.
	 */
	ubifs_assert(ret == 0);
	return ret;
}
When this assertion fails, ->set_page_dirty has set the dirty bit again
without a space budget and without SetPagePrivate. When this page is later
written back, the PagePrivate assertions fail:
"UBIFS assert failed in ubifs_writepage at 1009 (pid 2069)
UBIFS assert failed in do_writepage at 936 (pid 2069)"
Then the budget is released without ever having been taken, and the budget
assertions fail:
"UBIFS assert failed in ubifs_release_budget at 567
UBIFS assert failed in ubifs_budget_space at 464"
c->bi.dd_growth is now less than zero.
I would like to fix this problem, but I do not know this filesystem well
enough. In my fix, I remove the __set_page_dirty_nobuffers call from
->page_mkwrite and only set the dirty bit in ->set_page_dirty.
I do not know whether my fix is correct or whether it causes other problems.
I could only test it on linux-3.10, where it appears to work.
-------------------------------------------------------------------