  1. Jun 28, 2008
    • af_unix: fix 'poll for write'/connected DGRAM sockets · ec0d215f
      Rainer Weikusat authored
      
      
      For n:1 'datagram connections' (eg /dev/log), the unix_dgram_sendmsg
      routine implements a form of receiver-imposed flow control by
      comparing the length of the receive queue of the 'peer socket' with
      the max_ack_backlog value stored in the corresponding sock structure,
      either blocking the thread which caused the send-routine to be called
      or returning EAGAIN. This routine is used by both SOCK_DGRAM and
      SOCK_SEQPACKET sockets. The poll-implementation for these socket types
      is datagram_poll from core/datagram.c. A socket is deemed to be
      writeable by this routine when the memory presently consumed by
      datagrams owned by it is less than the configured socket send buffer
      size. This is always wrong for PF_UNIX non-stream sockets connected to
      server sockets dealing with (potentially) multiple clients if the
      abovementioned receive queue is currently considered to be full.
      'poll' will then return, indicating that the socket is writeable, but
      a subsequent write results in EAGAIN, effectively causing a typical
      application to 'poll for writeability by repeated send requests with
      O_NONBLOCK set' until it has consumed its time quantum.
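
      To illustrate the failure mode, a sketch of the usual non-blocking send
      loop is shown below (hypothetical userspace code, not part of the
      patch): poll() reports POLLOUT, send() immediately fails with EAGAIN,
      and the loop spins.

          /* Illustrative only: 'fd' is assumed to be a connected AF_UNIX
           * SOCK_DGRAM socket with O_NONBLOCK set whose peer's receive
           * queue is full. poll() wrongly reports POLLOUT, send() returns
           * EAGAIN, and the loop busy-spins. */
          #include <poll.h>
          #include <errno.h>
          #include <sys/socket.h>

          static int send_when_writable(int fd, const void *buf, size_t len)
          {
                  struct pollfd pfd = { .fd = fd, .events = POLLOUT };

                  for (;;) {
                          if (poll(&pfd, 1, -1) < 0)
                                  return -1;      /* poll() claims writeable... */
                          if (send(fd, buf, len, 0) >= 0)
                                  return 0;       /* ...but send() may still fail */
                          if (errno != EAGAIN)
                                  return -1;      /* real error */
                          /* EAGAIN: retry immediately -> busy loop */
                  }
          }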
      
      The change below uses a suitably modified variant of the datagram_poll
      routine for both types of PF_UNIX sockets, which tests whether the
      recv-queue of the peer a socket is connected to is presently
      considered to be 'full' as part of the 'is this socket
      writeable'-checking code. The socket being polled is additionally
      put onto the peer_wait wait queue associated with its peer, because the
      unix_dgram_recvmsg routine does a wake up on this queue after a
      datagram has been received and the 'other wakeup call' is done implicitly
      as part of skb destruction; this means that a process blocked in poll
      because of a full peer receive queue could otherwise sleep forever
      if no datagram owned by its socket was already sitting on this queue.
      Part of this change is a small (inline) helper routine named
      'unix_recvq_full', which consolidates the actual testing code (in three
      different places) into a single location.
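
      For reference, the consolidated check amounts to comparing the length
      of the peer's receive queue with its sk_max_ack_backlog; a sketch of
      such an inline helper (simplified, not necessarily the exact patch
      text):

          /* Sketch: 'full' means the receive queue already holds more
           * datagrams than the configured max_ack_backlog. */
          static inline int unix_recvq_full(struct sock const *sk)
          {
                  return skb_queue_len(&sk->sk_receive_queue) >
                         sk->sk_max_ack_backlog;
          }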
      
      Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix for splice receive when used with software LRO · db43a282
      Octavian Purdila authored
      
      
      If an skb has nr_frags set to zero but its frag_list is not empty (as
      it can happen if software LRO is enabled), and a previous
      tcp_read_sock has consumed the linear part of the skb, then
      __skb_splice_bits:
      
      (a) incorrectly reports an error and
      
      (b) forgets to update the offset to account for the linear part
      
      Either of the two problems will cause the subsequent __skb_splice_bits
      call (the one that handles the frag_list skbs) to either skip data
      or, if the unadjusted offset is greater than the size of the next skb
      in the frag_list, make tcp_splice_read loop forever.
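
      For clarity, a simplified sketch of the intended control flow in
      __skb_splice_bits is shown below (the helper names are hypothetical
      and this is not the actual kernel diff): the linear part has to be
      handled first, keeping 'offset' consistent, so that nr_frags == 0
      with a non-empty frag_list is neither treated as an error nor left
      with a stale offset.

          /* Illustrative pseudocode only, not the patch itself. */
          unsigned int seg;

          if (splice_linear_area(skb, &offset, &len, spd))   /* hypothetical helper */
                  return 1;                                  /* pipe is full */

          for (seg = 0; seg < skb_shinfo(skb)->nr_frags; seg++)  /* may run zero times */
                  if (splice_frag(&skb_shinfo(skb)->frags[seg],  /* hypothetical helper */
                                  &offset, &len, spd))
                          return 1;

          return 0;   /* caller then walks skb_shinfo(skb)->frag_list */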
      
      Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: calculate tcp_mem based on low memory instead of all memory · 57413ebc
      Miquel van Smoorenburg authored
      
      
      The tcp_mem array, which contains limits on the total amount of memory
      used by TCP sockets, is calculated based on nr_all_pages.  On a 32-bit
      x86 system, we should base this on the number of lowmem pages.
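
      A hedged sketch of the kind of computation involved (the scaling below
      is illustrative of tcp_init() in that era, not the literal patch text):

          /* Sketch: derive the tcp_mem pressure limits from lowmem pages
           * only, instead of from all pages (which includes highmem that
           * cannot back kernel socket buffers). */
          unsigned long nr_pages = totalram_pages - totalhigh_pages;
          unsigned long limit = min(nr_pages, 1UL << (28 - PAGE_SHIFT)) >>
                                (PAGE_SHIFT - 7);

          sysctl_tcp_mem[0] = limit / 4 * 3;
          sysctl_tcp_mem[1] = limit;
          sysctl_tcp_mem[2] = sysctl_tcp_mem[0] * 2;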
      
      Signed-off-by: Miquel van Smoorenburg <miquels@cistron.nl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • hamradio: remove unused variable · 47979821
      Andre Haupt authored
      
      
      Signed-off-by: Andre Haupt <andre@bitwigglers.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Jun 27, 2008
  3. Jun 25, 2008
  4. Jun 24, 2008
  5. Jun 21, 2008
    • netns: Don't receive new packets in a dead network namespace. · b9f75f45
      Eric W. Biederman authored
      
      
      Alexey Dobriyan <adobriyan@gmail.com> writes:
      > Subject: ICMP sockets destruction vs ICMP packets oops
      
      > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
      > icmp_reply() wants ICMP socket.
      >
      > Steps to reproduce:
      >
      > 	launch shell in new netns
      > 	move real NIC to netns
      > 	setup routing
      > 	ping -i 0
      > 	exit from shell
      >
      > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      > IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > PGD 17f3cd067 PUD 17f3ce067 PMD 0 
      > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
      > CPU 0 
      > Modules linked in: usblp usbcore
      > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
      > RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      > RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
      > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
      > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
      > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
      > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
      > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
      > FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
      > CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
      > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
      > Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
      >  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
      >  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
      > Call Trace:
      >  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
      >  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
      >  [<ffffffff803fd645>] icmp_echo+0x65/0x70
      >  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
      >  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
      >  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
      >  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
      >  [<ffffffff803be57d>] process_backlog+0x9d/0x100
      >  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
      >  [<ffffffff80237be5>] __do_softirq+0x75/0x100
      >  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
      >  [<ffffffff8020f085>] do_softirq+0x65/0xa0
      >  [<ffffffff80237af7>] irq_exit+0x97/0xa0
      >  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
      >  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
      >  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
      >  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
      >  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
      >  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
      >  [<ffffffff8040f380>] ? rest_init+0x70/0x80
      > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
      > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
      > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
      > RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
      >  RSP <ffffffff8057fc30>
      > CR2: 0000000000000000
      > ---[ end trace ea161157b76b33e8 ]---
      > Kernel panic - not syncing: Aiee, killing interrupt handler!
      
      Receiving packets while we are cleaning up a network namespace is a
      racy proposition. It is possible when the packet arrives that we have
      removed some but not all of the state we need to fully process it.  We
      have the choice of either playing whack-a-mole with the cleanup routines
      or simply dropping packets when we don't have a network namespace to
      handle them.
      
      Since the check looks inexpensive, let's just drop the incoming
      packets in netif_receive_skb.
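
      A minimal sketch of such a check (the helper name and exact hook point
      are assumptions based on the description above, not the literal diff):

          /* Sketch: consider a namespace dead once its refcount has hit zero. */
          static inline int net_alive(struct net *net)
          {
                  return net && atomic_read(&net->count);
          }

          /* in netif_receive_skb(), before any per-namespace state is used: */
          if (!net_alive(dev_net(skb->dev))) {
                  kfree_skb(skb);
                  goto out;
          }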
      
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: Make sure N * sizeof(union sctp_addr) does not overflow. · 735ce972
      David S. Miller authored
      
      
      As noticed by Gabriel Campana, the kmalloc() length arg
      passed in by sctp_getsockopt_local_addrs_old() can overflow
      if ->addr_num is large enough.
      
      Therefore, enforce an appropriate limit.
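
      A sketch of such a limit (the exact bound and error path are
      assumptions, not the literal patch):

          /* Sketch: reject addr_num values for which
           * addr_num * sizeof(union sctp_addr) would overflow the
           * kmalloc() length argument. */
          if (getaddrs.addr_num < 1 ||
              getaddrs.addr_num > (INT_MAX / sizeof(union sctp_addr)))
                  return -EINVAL;

          addrs = kmalloc(getaddrs.addr_num * sizeof(union sctp_addr),
                          GFP_KERNEL);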
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pppoe: warning fix · 2645a3c3
      Stephen Hemminger authored
      
      
      Fix warning:
      drivers/net/pppoe.c: In function 'pppoe_recvmsg':
      drivers/net/pppoe.c:945: warning: comparison of distinct pointer types lacks a cast
      because skb->len is unsigned int and total_len is size_t
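
      The usual fix for this class of warning is to force a common type with
      min_t(); a sketch (illustrative, not necessarily the exact patch text):

          /* Before: skb->len is unsigned int, total_len is size_t, so the
           * type check inside min() warns about distinct pointer types. */
          len = min(skb->len, total_len);

          /* After: pick the comparison type explicitly. */
          len = min_t(size_t, skb->len, total_len);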
      
      Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. Jun 20, 2008
  7. Jun 19, 2008
  8. Jun 18, 2008
    • netlink: genl: fix circular locking · 6d1a3fb5
      Patrick McHardy authored
      
      
      genetlink has a circular locking dependency when dumping the registered
      families:
      
      - dump start:
      genl_rcv()            : take genl_mutex
      genl_rcv_msg()        : call netlink_dump_start() while holding genl_mutex
      netlink_dump_start(),
      netlink_dump()        : take nlk->cb_mutex
      ctrl_dumpfamily()     : try to detect this case and not take genl_mutex a
                              second time
      
      - dump continuance:
      netlink_rcv()         : call netlink_dump
      netlink_dump          : take nlk->cb_mutex
      ctrl_dumpfamily()     : take genl_mutex
      
      Register genl_lock as the callback mutex with netlink to fix this. This
      slightly widens an already existing module unload race: the genl ops
      used during the dump might go away when the module is unloaded. Thomas
      Graf is working on a separate fix for this.
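
      In the netlink API of that era the callback mutex is supplied when the
      kernel socket is created; a sketch of registering genl_mutex that way
      (shown as an assumption, not the literal patch text):

          /* Sketch: pass genl_mutex as netlink's cb_mutex so dump callbacks
           * run under the same lock as the genetlink receive path. */
          genl_sock = netlink_kernel_create(&init_net, NETLINK_GENERIC,
                                            GENL_MAX_ID, genl_rcv,
                                            &genl_mutex, THIS_MODULE);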
      
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>