LinuxQuestions.org
Old 02-08-2012, 12:37 PM   #1
sarbeswar
LQ Newbie
 
Registered: Feb 2012
Posts: 2

Rep: Reputation: Disabled
Kernel crash when skb_copy/skb_copy_expand is followed by dev_kfree_skb


Hello,
I have written a kernel module that hooks into the Ethernet controller driver (gianfar) on a PowerPC platform. The driver's tx and rx functions call the corresponding tx and rx functions of my module. My module adds a header and a trailer to each frame in its tx function and removes them in its rx function.

Here is the pseudocode of the tx function:
my module: tx function(struct sk_buff *skb)
{
    struct sk_buff *new_skb = NULL;

    /* If the buffer is cloned, work on a private copy */
    if (skb_cloned(skb))
    {
        new_skb = skb_copy(skb, GFP_ATOMIC);

        if (new_skb == NULL)
        {
            dev_kfree_skb(skb);
            return;
        }

        dev_kfree_skb(skb);
        skb = new_skb;
    }

    // add header
    skb_cow(skb, extra_hdr_len);
    skb_push(...)
    skb_reset_mac_header(skb);

    // add trailer
    if (skb_tailroom(skb) < extra_tail_room)
    {
        struct sk_buff *skb2 = NULL;
        skb2 = skb_copy_expand(skb, 0, extra_tail_room, GFP_ATOMIC);
        if (skb2 == NULL)
        {
            dev_kfree_skb(skb);
            return;
        }
        dev_kfree_skb(skb);
        skb2 = skb;
    }
}
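
One thing the pseudocode above suggests: the tx function receives skb by value, so when skb_copy or skb_copy_expand allocates a new sk_buff and the original is freed, the caller still holds the old pointer. In the trailer branch the last assignment also goes from skb to skb2, so skb keeps pointing at the buffer that was just freed. That would fit the observation that pskb_expand_head (which reallocates the data area but keeps the same struct sk_buff) does not crash. Whether the gianfar hook really continues with its own pointer is an assumption, since the calling code is not shown. The following is only a user-space sketch of the ownership pattern, not kernel code: struct buf, buf_copy and tx_hook are hypothetical stand-ins, and the hook returns the buffer the caller must continue with (or NULL if it was consumed).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff: a length plus payload bytes. */
struct buf {
    size_t len;
    unsigned char data[64];
};

/* Stand-in for skb_copy()/skb_copy_expand(): returns a NEW object and
 * leaves the old one untouched. */
static struct buf *buf_copy(const struct buf *old)
{
    struct buf *n = malloc(sizeof(*n));

    if (n)
        memcpy(n, old, sizeof(*n));
    return n;
}

/*
 * Hook that may replace the buffer it was given.
 * Returns the buffer the caller must use from now on, or NULL if the
 * buffer was consumed (freed) and the caller must not touch it again.
 */
static struct buf *tx_hook(struct buf *b, int must_copy)
{
    if (must_copy) {
        struct buf *n = buf_copy(b);

        free(b);            /* the old buffer is gone in both outcomes */
        if (!n)
            return NULL;
        b = n;              /* continue with the copy, not the original */
    }

    if (b->len >= sizeof(b->data)) {
        free(b);            /* no room for the trailer: consume and drop */
        return NULL;
    }
    b->data[b->len++] = 0xAA;   /* one-byte "trailer" for illustration */
    return b;
}
```

A caller would then write something like b = tx_hook(b, 1); and stop using b if the result is NULL. In a real driver hook the same idea could be expressed by returning the sk_buff pointer or by passing a struct sk_buff ** so the driver sees the replacement.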

my module: rx function(struct sk_buff *skb)
{
    // local variable definition

    // remove trailer
    skb_trim(skb, skb->len - tail_room);

    // remove header
    skb_pull(skb, extra_hdr_len);
}
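
For reference, the header/trailer bookkeeping that skb_push, skb_put, skb_pull and skb_trim perform can be modelled in user space. This is a simplified sketch, not the kernel implementation: struct mini_skb and the mini_* helpers are hypothetical names, offsets replace the kernel's pointers, and the helpers return NULL on overflow where the real skb_push/skb_put would panic.

```c
#include <stddef.h>
#include <string.h>

#define MINI_CAP 128

/* Hypothetical model of an sk_buff data area, using offsets into buf. */
struct mini_skb {
    unsigned char buf[MINI_CAP];
    size_t data;    /* offset of the first payload byte */
    size_t tail;    /* offset one past the last payload byte */
};

/* Start empty, with 'headroom' bytes reserved in front (like skb_reserve). */
static void mini_init(struct mini_skb *s, size_t headroom)
{
    memset(s->buf, 0, sizeof(s->buf));
    s->data = s->tail = headroom;
}

static size_t mini_len(const struct mini_skb *s)      { return s->tail - s->data; }
static size_t mini_headroom(const struct mini_skb *s) { return s->data; }
static size_t mini_tailroom(const struct mini_skb *s) { return MINI_CAP - s->tail; }

/* Like skb_push: prepend n bytes by moving 'data' back into the headroom. */
static unsigned char *mini_push(struct mini_skb *s, size_t n)
{
    if (n > mini_headroom(s))
        return NULL;
    s->data -= n;
    return s->buf + s->data;
}

/* Like skb_put: append n bytes by moving 'tail' into the tailroom. */
static unsigned char *mini_put(struct mini_skb *s, size_t n)
{
    unsigned char *p;

    if (n > mini_tailroom(s))
        return NULL;
    p = s->buf + s->tail;
    s->tail += n;
    return p;
}

/* Like skb_pull: drop n bytes from the front. */
static unsigned char *mini_pull(struct mini_skb *s, size_t n)
{
    if (n > mini_len(s))
        return NULL;
    s->data += n;
    return s->buf + s->data;
}

/* Like skb_trim: shrink the payload to 'len' bytes; never grows it. */
static void mini_trim(struct mini_skb *s, size_t len)
{
    if (len < mini_len(s))
        s->tail = s->data + len;
}
```

In this model the tx side corresponds to mini_push (header) followed by mini_put (trailer), and the rx side to mini_trim(s, mini_len(s) - trailer_len) followed by mini_pull(s, header_len), which restores the original payload length.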

I am using kernel 2.6.33.4 with the gianfar Ethernet driver on a PowerPC platform.

I am seeing kernel crashes in two scenarios.
Issue #1) In the tx function, if I call skb_copy_expand directly (without checking skb_cloned and without calling skb_copy first), the kernel crashes with the stack trace below. If I use pskb_expand_head instead of skb_copy_expand, the kernel does not crash.

Unable to handle kernel paging request for data at address 0x0080ea13
Faulting instruction address: 0xc0091f58
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT MPC831x FSP150
Modules linked in: ptp(P) ptpts(P) tdif(P) elmi(P) lagpdu(P) diag(P) ssm(P) cfm(P) efm(P) esareflector(P) esaprobe(P) pppmgmttnl(P) mgmttnl(P) pktdemux(P) mdio hardware_version(P) clipresent(P) monotonic restartcause(P) panic_buffer
NIP: c0091f58 LR: c0091f3c CTR: c0226950
REGS: c0459870 TRAP: 0300 Tainted: P (2.6.33.4-dev_sarbeswarm-64982)
MSR: 00001032 <ME,IR,DR> CR: 22002044 XER: 20000000
DAR: 0080ea13, DSISR: 20000000
TASK = c04383e8[0] 'swapper' THREAD: c0458000
GPR00: 00000000 c0459920 c04383e8 00000000 00000020 c0226988 c0226848 c045d3ac
GPR08: 013a0000 c0460000 00340010 c045d360 42008088 10093070 cfaa6b20 00000000
GPR16: 00000001 000000c0 cfaa6800 cfaa6ae0 0000000a ce5b725e ce5b725e ce5b726c
GPR24: ce5b7280 00000004 00000200 c0226988 00000020 0080ea13 00001032 c045d360
NIP [c0091f58] __kmalloc_track_caller+0x68/0xf0
LR [c0091f3c] __kmalloc_track_caller+0x4c/0xf0
Call Trace:
[c0459920] [c0091f3c] __kmalloc_track_caller+0x4c/0xf0 (unreliable)
[c0459940] [c0226870] __alloc_skb+0x68/0x148
[c0459960] [c0226988] skb_copy_expand+0x38/0xf4
[c0459980] [d211687c] tdif_tx+0xcd0/0xf54 [tdif]
[c04599e0] [c0205348] gfar_start_xmit+0x58/0x528
[c0459a30] [c022ff8c] dev_hard_start_xmit+0x2c0/0x374
[c0459a60] [c0242d74] sch_direct_xmit+0x11c/0x214
[c0459a90] [c0233f8c] dev_queue_xmit+0x394/0x518
[c0459ac0] [c02672cc] ip_finish_output+0x128/0x32c
[c0459ae0] [c02675a4] ip_local_out+0x38/0x54
[c0459af0] [c0267d90] ip_queue_xmit+0x1bc/0x370
[c0459b60] [c027bc70] tcp_transmit_skb+0x374/0x884
[c0459bc0] [c0278fec] __tcp_ack_snd_check+0x64/0xbc
[c0459bd0] [c027ab40] tcp_rcv_established+0x3d0/0x5ec
[c0459c00] [c02810ac] tcp_v4_do_rcv+0xc0/0x1e0
[c0459c30] [c0283304] tcp_v4_rcv+0x6ac/0x804
[c0459c60] [c0262e1c] ip_local_deliver_finish+0xe8/0x268
[c0459c80] [c0262a90] ip_rcv_finish+0x170/0x414
[c0459cc0] [c022ef9c] netif_receive_skb+0x2d8/0x3e4
[c0459cf0] [c0202060] gfar_clean_rx_ring+0x248/0x790
[c0459d80] [c0204d34] gfar_poll+0x384/0x51c
[c0459e20] [c0232410] net_rx_action+0xdc/0x1b8
[c0459e50] [c002b520] __do_softirq+0xc4/0x144
[c0459e90] [c00061ac] do_softirq+0x78/0x80

Issue #2) With pskb_expand_head in place, if I also call skb_copy at the beginning of the tx function, I hit a different kernel crash, with the stack trace below. If I do not free the old skb, there is no crash, but memory leaks.

bootAttempt to release alive inet socket cd5244a0
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc02144a4
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT MPC8313 SCU
Modules linked in: hardware_version(PF) tdif(PF) restartcause(F) monotonic(F)
NIP: c02144a4 LR: c02144dc CTR: c007a068
REGS: cd5679d0 TRAP: 0300 Tainted: PF (2.6.23.9-dev_sarbeswarm-42685*)
MSR: 00009032 <EE,ME,IR,DR> CR: 44228424 XER: 00000000
DAR: 00000000, DSISR: 22000000
TASK = cd565b20[890] 'UserProcess' THREAD: cd566000
GPR00: c02144dc cd567a80 cd565b20 cd4cac80 00000000 4f3b13a8 00000000 00000000
GPR08: 00000000 00000000 c080fa60 00000000 84224422 10023dd8 00000000 cd567dac
GPR16: cd567db0 cd567db4 00000000 00000000 00001000 cd567da4 cd567da8 cd567dac
GPR24: 0000000d 00000000 cd566000 00000000 00000000 cd4cac80 00000345 cd524700
Call Trace:
[cd567a80] [c02144dc] (unreliable)
[cd567ab0] [c01b1b28]
[cd567ac0] [c00796ac]
[cd567d80] [c0079af4]
[cd567ee0] [c007a2c4]
[cd567f30] [c0004b6c]
[cd567f40] [c000efd0]
--- Exception: c01
Instruction dump:
81495a20 816a000c 396b0001 916a000c 813f0008 7fa3eb78 3929ffff 913f0008
817d0000 813d0004 939d0000 939d0004 <91690000> 912b0004 4bfa4fc5 83bf0000
Kernel panic - not syncing: Fatal exception in interrupt
Rebooting in 5 seconds..

I am not sure what is wrong, but I think I am missing some fundamentals of how to use the skb structure. I tried adding spin_lock and spin_unlock around the tx and rx functions and saw no difference. Please provide some input.

Appreciate your help.

-Sarbeswar
 
Old 02-08-2012, 12:49 PM   #2
sarbeswar
LQ Newbie
 
Registered: Feb 2012
Posts: 2

Original Poster
Rep: Reputation: Disabled
A couple of notes:

1) The second crash shows 2.6.23.9 as the kernel version; I get the same exception on 2.6.33.4.
2) "UserProcess" is a user-space process that sends and receives UDP packets through a netlink socket.

-Sarbeswar
 
  

