  1. Sep 13, 2015
    • Linux 3.10.88 · 2ec14270
      Greg Kroah-Hartman authored
      v3.10.88
      2ec14270
    • arm64/mm: Remove hack in mmap randomize layout · 82c9aed3
      Yann Droneaud authored
      commit d6c763af upstream.
      
      Since commit 8a0a9bd4 ('random: make get_random_int() more
      random'), get_random_int() returns a random value for each call,
      so the comment and hack introduced in mmap_rnd() as part of commit
      1d18c47c ('arm64: MMU fault handling and page table management')
      are incorrect.

      Commit 1d18c47c seems to use the same hack introduced by
      commit a5adc91a ('powerpc: Ensure random space between stack
      and mmaps'), later copied in commit 5a0efea0 ('sparc64: Sharpen
      address space randomization calculations.').

      But both architectures were cleaned up as part of commit
      fa8cbaaf ('powerpc+sparc64/mm: Remove hack in mmap randomize
      layout'), as the hack is no longer needed since commit 8a0a9bd4.
      
      So the present patch removes the comment and the hack around
      get_random_int() on AArch64's mmap_rnd().
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Dan McGee <dpmcgee@gmail.com>
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: Matthias Brugger <mbrugger@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      82c9aed3
    • crypto: caam - fix memory corruption in ahash_final_ctx · d51689e2
      Horia Geantă authored
      commit b310c178 upstream.
      
      When doing pointer arithmetic to access the HW S/G table,
      a value representing the number of entries (and not the number
      of bytes) must be used.
      
      Fixes: 045e3678 ("crypto: caam - ahash hmac support")
      Signed-off-by: Horia Geantă <horia.geanta@freescale.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d51689e2
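The entries-versus-bytes distinction in the caam fix above can be illustrated with a small standalone sketch. It is not the driver's code: `struct demo_sg_entry` and both helpers are hypothetical names invented for this example, and the entry layout is an assumption. The point is only that C pointer arithmetic on a typed pointer already scales by the element size, so passing a byte count as the index overshoots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one hardware scatter/gather table entry. */
struct demo_sg_entry {
    uint64_t addr;
    uint32_t len;
    uint32_t flags;
};

/*
 * Index the table by a number of ENTRIES. The compiler multiplies
 * n_entries by sizeof(struct demo_sg_entry) on our behalf.
 */
static struct demo_sg_entry *demo_entry_at(struct demo_sg_entry *table,
                                           size_t n_entries)
{
    return table + n_entries;
}

/* Distance in bytes between the table start and entry number n. */
static size_t demo_byte_offset(struct demo_sg_entry *table, size_t n)
{
    return (size_t)((char *)demo_entry_at(table, n) - (char *)table);
}
```

With a table of 64 entries, `demo_entry_at(table, 2)` is `&table[2]`, whereas passing the byte offset `2 * sizeof(struct demo_sg_entry)` as the index would land on `&table[32]`, far past the intended entry.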
    • libfc: Fix fc_fcp_cleanup_each_cmd() · 0a1b7269
      Bart Van Assche authored
      commit 8f2777f5 upstream.

      Since fc_fcp_cleanup_cmd() can sleep, this function must not
      be called while holding a spinlock. This patch prevents
      fc_fcp_cleanup_each_cmd() from triggering the following bug:
      
      BUG: scheduling while atomic: sg_reset/1512/0x00000202
      1 lock held by sg_reset/1512:
       #0:  (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
      Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
      Call Trace:
       [<ffffffff816c612c>] dump_stack+0x4f/0x7b
       [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
       [<ffffffff816c87aa>] __schedule+0x71a/0xa10
       [<ffffffff816c8ad2>] schedule+0x32/0x80
       [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
       [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
       [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
       [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
       [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
       [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
       [<ffffffff814a2650>] scsi_ioctl+0x150/0x440
       [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
       [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
       [<ffffffff811da608>] block_ioctl+0x38/0x40
       [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
       [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
       [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a
      
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a1b7269
    • drm/radeon: add new OLAND pci id · e7e8231c
      Alex Deucher authored
      commit e037239e upstream.

      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7e8231c
    • EDAC, ppc4xx: Access mci->csrows array elements properly · 94c37975
      Michael Walle authored
      commit 5c16179b upstream.
      
      The commit de3910eb ("edac: change the mem allocation scheme to
      make Documentation/kobject.txt happy") changed the memory allocation
      for the csrows member. But ppc4xx_edac was forgotten in the patch.
      Fix it.
      
      Signed-off-by: Michael Walle <michael@walle.cc>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      94c37975
    • localmodconfig: Use Kbuild files too · 98e5059b
      Richard Weinberger authored
      commit c0ddc8c7 upstream.
      
      In kbuild it is allowed to define objects in files named "Makefile"
      and "Kbuild".
      Currently localmodconfig reads objects only from "Makefile"s and misses
      modules like nouveau.
      
      Link: http://lkml.kernel.org/r/1437948415-16290-1-git-send-email-richard@nod.at
      Reported-and-tested-by: Leonidas Spyropoulos <artafinde@gmail.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      98e5059b
    • dm thin metadata: delete btrees when releasing metadata snapshot · 9cdd5586
      Joe Thornber authored
      commit 7f518ad0 upstream.
      
      The device details and mapping trees were just being decremented
      before.  Now btree_del() is called to do a deep delete.
      
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9cdd5586
    • perf: Fix fasync handling on inherited events · 1f6661e2
      Peter Zijlstra authored
      commit fed66e2c upstream.

      Vince reported that the fasync signal stuff doesn't work properly for
      inherited events. So fix that.
      
      Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
      which upon the demise of the file descriptor ensures the allocation is
      freed and state is updated.
      
      Now for perf, we can have the events stick around for a while after the
      original FD is dead because of references from child events. So we
      cannot copy the fasync pointer around. We can however consistently use
      the parent's fasync, as that will be updated.
      
      Reported-and-Tested-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1434011521.1495.71.camel@twins
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f6661e2
    • mm/hwpoison: fix page refcount of unknown non LRU page · 50deac6c
      Wanpeng Li authored
      commit 4f32be67 upstream.
      
      After trying to drain pages from pagevec/pageset, we try to get reference
      count of the page again, however, the reference count of the page is not
      reduced if the page is still not on LRU list.
      
      Fix it by adding the put_page() to drop the page reference which is from
      __get_any_page().
      
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      50deac6c
    • ipc/sem.c: update/correct memory barriers · 30e5bc30
      Manfred Spraul authored
      commit 3ed1f8a9 upstream.
      
      sem_lock() did not properly pair memory barriers:
      
      !spin_is_locked() and spin_unlock_wait() are both only control barriers.
      The code needs an acquire barrier, otherwise the cpu might perform read
      operations before the lock test.
      
      As no primitive exists inside <include/spinlock.h> and since it seems
      no one wants another primitive, the code creates a local primitive within
      ipc/sem.c.
      
      With regards to -stable:
      
      The change of sem_wait_array() is a bugfix, the change to sem_lock() is a
      nop (just a preprocessor redefinition to improve the readability).  The
      bugfix is necessary for all kernels that use sem_wait_array() (i.e.:
      starting from 3.10).
      
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Reported-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Kirill Tkhai <ktkhai@parallels.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      30e5bc30
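The barrier pairing described in the ipc/sem.c commit above can be sketched in userspace with C11 atomics. This is an analogue under stated assumptions, not the kernel primitive the patch adds: `struct demo_guarded` and `demo_read_if_unlocked()` are hypothetical names, and the atomic flag merely stands in for the "is the lock held?" test. The idea shown is that a plain test of the lock state needs an acquire barrier after it, so that later reads are not performed before the test.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object: a lock flag guarding a payload. */
struct demo_guarded {
    atomic_bool locked;
    int payload;
};

/*
 * Read the payload only if the lock is observed to be free.
 * Returns true and fills *out on success, false if the lock is held.
 */
static bool demo_read_if_unlocked(struct demo_guarded *g, int *out)
{
    /* The test itself imposes no ordering on the reads that follow. */
    if (atomic_load_explicit(&g->locked, memory_order_relaxed))
        return false;

    /*
     * Acquire fence: reads after this point cannot be reordered
     * before the lock test above.
     */
    atomic_thread_fence(memory_order_acquire);

    *out = g->payload;
    return true;
}
```

For the ordering to mean anything, the writer side has to release the flag with a matching release operation, for example `atomic_store_explicit(&g->locked, false, memory_order_release)`.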
    • ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits · 04d2af28
      Herton R. Krzesinski authored
      commit 602b8593 upstream.
      
      The current semaphore code allows a potential use after free: in
      exit_sem we may free the task's sem_undo_list while there is still
      another task looping through the same semaphore set and cleaning the
      sem_undo list at freeary function (the task called IPC_RMID for the same
      semaphore set).
      
      For example, with a test program [1] running which keeps forking a lot
      of processes (which then do a semop call with SEM_UNDO flag), and with
      the parent right after removing the semaphore set with IPC_RMID, and a
      kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
      CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
      in the kernel log:
      
         Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
         000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
         010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
         Prev obj: start=ffff88003b45c180, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
         Next obj: start=ffff88003b45c200, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
         BUG: spinlock wrong CPU on CPU#2, test/18028
         general protection fault: 0000 [#1] SMP
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: spin_dump+0x53/0xc0
         Call Trace:
           spin_bug+0x30/0x40
           do_raw_spin_unlock+0x71/0xa0
           _raw_spin_unlock+0xe/0x10
           freeary+0x82/0x2a0
           ? _raw_spin_lock+0xe/0x10
           semctl_down.clone.0+0xce/0x160
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           SyS_semctl+0x236/0x2c0
           ? syscall_trace_leave+0xde/0x130
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
         RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
          RSP <ffff88003750fd68>
         ---[ end trace 783ebb76612867a0 ]---
         NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: native_read_tsc+0x0/0x20
         Call Trace:
           ? delay_tsc+0x40/0x70
           __delay+0xf/0x20
           do_raw_spin_lock+0x96/0x140
           _raw_spin_lock+0xe/0x10
           sem_lock_and_putref+0x11/0x70
           SYSC_semtimedop+0x7bf/0x960
           ? handle_mm_fault+0xbf6/0x1880
           ? dequeue_task_fair+0x79/0x4a0
           ? __do_page_fault+0x19a/0x430
           ? kfree_debugcheck+0x16/0x40
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           ? do_audit_syscall_entry+0x66/0x70
           ? syscall_trace_enter_phase1+0x139/0x160
           SyS_semtimedop+0xe/0x10
           SyS_semop+0x10/0x20
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
         Kernel panic - not syncing: softlockup: hung tasks
      
      I wasn't able to trigger any badness on a recent kernel without the
      proper debug config options enabled. However, I have softlockup reports
      on some kernel versions, in the semaphore code, which are similar to the
      above (the scenario is seen on some servers running IBM DB2, which uses
      semaphore syscalls).
      
      The patch here fixes the race against freeary, by acquiring or waiting
      on the sem_undo_list lock as necessary (exit_sem can race with freeary,
      while freeary sets un->semid to -1 and removes the same sem_undo from
      list_proc or when it removes the last sem_undo).
      
      After the patch I'm unable to reproduce the problem using the test case
      [1].
      
      [1] Test case used below:
      
          #include <stdio.h>
          #include <sys/types.h>
          #include <sys/ipc.h>
          #include <sys/sem.h>
          #include <sys/wait.h>
          #include <stdlib.h>
          #include <time.h>
          #include <unistd.h>
          #include <errno.h>
      
          #define NSEM 1
          #define NSET 5
      
          int sid[NSET];
      
          void thread()
          {
                  struct sembuf op;
                  int s;
                  uid_t pid = getuid();
      
                  s = rand() % NSET;
                  op.sem_num = pid % NSEM;
                  op.sem_op = 1;
                  op.sem_flg = SEM_UNDO;
      
                  semop(sid[s], &op, 1);
                  exit(EXIT_SUCCESS);
          }
      
          void create_set()
          {
                  int i, j;
                  pid_t p;
                  union {
                          int val;
                          struct semid_ds *buf;
                          unsigned short int *array;
                          struct seminfo *__buf;
                  } un;
      
                  /* Create and initialize semaphore set */
                  for (i = 0; i < NSET; i++) {
                          sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                          if (sid[i] < 0) {
                                  perror("semget");
                                  exit(EXIT_FAILURE);
                          }
                  }
                  un.val = 0;
                  for (i = 0; i < NSET; i++) {
                          for (j = 0; j < NSEM; j++) {
                                  if (semctl(sid[i], j, SETVAL, un) < 0)
                                          perror("semctl");
                          }
                  }
      
                  /* Launch threads that operate on semaphore set */
                  for (i = 0; i < NSEM * NSET * NSET; i++) {
                          p = fork();
                          if (p < 0)
                                  perror("fork");
                          if (p == 0)
                                  thread();
                  }
      
                  /* Free semaphore set */
                  for (i = 0; i < NSET; i++) {
                          if (semctl(sid[i], NSEM, IPC_RMID))
                                  perror("IPC_RMID");
                  }
      
                  /* Wait for forked processes to exit */
                  while (wait(NULL)) {
                          if (errno == ECHILD)
                                  break;
                  };
          }
      
          int main(int argc, char **argv)
          {
                  pid_t p;
      
                  srand(time(NULL));
      
                  while (1) {
                          p = fork();
                          if (p < 0) {
                                  perror("fork");
                                  exit(EXIT_FAILURE);
                          }
                          if (p == 0) {
                                  create_set();
                                  goto end;
                          }
      
                          /* Wait for forked processes to exit */
                          while (wait(NULL)) {
                                  if (errno == ECHILD)
                                          break;
                          };
                  }
          end:
                  return 0;
          }
      
      [akpm@linux-foundation.org: use normal comment layout]
      Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
      Acked-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: David Jeffery <djeffery@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      04d2af28
  2. Aug 17, 2015
    • Linux 3.10.87 · 5a427ce1
      Greg Kroah-Hartman authored
      v3.10.87
      5a427ce1
    • mm, vmscan: Do not wait for page writeback for GFP_NOFS allocations · 022d35a6
      Michal Hocko authored
      commit ecf5fc6e upstream.
      
      Nikolay has reported a hang when a memcg reclaim got stuck with the
      following backtrace:
      
      PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
        #0 __schedule at ffffffff815ab152
        #1 schedule at ffffffff815ab76e
        #2 schedule_timeout at ffffffff815ae5e5
        #3 io_schedule_timeout at ffffffff815aad6a
        #4 bit_wait_io at ffffffff815abfc6
        #5 __wait_on_bit at ffffffff815abda5
        #6 wait_on_page_bit at ffffffff8111fd4f
        #7 shrink_page_list at ffffffff81135445
        #8 shrink_inactive_list at ffffffff81135845
        #9 shrink_lruvec at ffffffff81135ead
       #10 shrink_zone at ffffffff811360c3
       #11 shrink_zones at ffffffff81136eff
       #12 do_try_to_free_pages at ffffffff8113712f
       #13 try_to_free_mem_cgroup_pages at ffffffff811372be
       #14 try_charge at ffffffff81189423
       #15 mem_cgroup_try_charge at ffffffff8118c6f5
       #16 __add_to_page_cache_locked at ffffffff8112137d
       #17 add_to_page_cache_lru at ffffffff81121618
       #18 pagecache_get_page at ffffffff8112170b
       #19 grow_dev_page at ffffffff811c8297
       #20 __getblk_slow at ffffffff811c91d6
       #21 __getblk_gfp at ffffffff811c92c1
       #22 ext4_ext_grow_indepth at ffffffff8124565c
       #23 ext4_ext_create_new_leaf at ffffffff81246ca8
       #24 ext4_ext_insert_extent at ffffffff81246f09
       #25 ext4_ext_map_blocks at ffffffff8124a848
       #26 ext4_map_blocks at ffffffff8121a5b7
       #27 mpage_map_one_extent at ffffffff8121b1fa
       #28 mpage_map_and_submit_extent at ffffffff8121f07b
       #29 ext4_writepages at ffffffff8121f6d5
       #30 do_writepages at ffffffff8112c490
       #31 __filemap_fdatawrite_range at ffffffff81120199
       #32 filemap_flush at ffffffff8112041c
       #33 ext4_alloc_da_blocks at ffffffff81219da1
       #34 ext4_rename at ffffffff81229b91
       #35 ext4_rename2 at ffffffff81229e32
       #36 vfs_rename at ffffffff811a08a5
       #37 SYSC_renameat2 at ffffffff811a3ffc
       #38 sys_renameat2 at ffffffff811a408e
       #39 sys_rename at ffffffff8119e51e
       #40 system_call_fastpath at ffffffff815afa89
      
      Dave Chinner has properly pointed out that this is a deadlock in the
      reclaim code because ext4 doesn't submit pages which are marked by
      PG_writeback right away.
      
      The heuristic was introduced by commit e62e384e ("memcg: prevent OOM
      with too many dirty pages") and it was applied only when may_enter_fs
      was specified.  The code has been changed by c3b94f44 ("memcg:
      further prevent OOM with too many dirty pages") which has removed the
      __GFP_FS restriction with a reasoning that we do not get into the fs
      code.  But this is not sufficient apparently because the fs doesn't
      necessarily submit pages marked PG_writeback for IO right away.
      
      ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
      submit the bio.  Instead it tries to map more pages into the bio and
      mpage_map_one_extent might trigger memcg charge which might end up
      waiting on a page which is marked PG_writeback but hasn't been submitted
      yet so we would end up waiting for something that never finishes.
      
      Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
      before we go to wait on the writeback.  The page fault path, which is
      the only path that triggers memcg oom killer since 3.12, shouldn't
      require GFP_NOFS and so we shouldn't reintroduce the premature OOM
      killer issue which was originally addressed by the heuristic.
      
      According to David Chinner, xfs has been doing a similar thing since
      2.6.15, so ext4 is not the only affected filesystem.  Moreover he notes:
      
      : For example: IO completion might require unwritten extent conversion
      : which executes filesystem transactions and GFP_NOFS allocations. The
      : writeback flag on the pages can not be cleared until unwritten
      : extent conversion completes. Hence memory reclaim cannot wait on
      : page writeback to complete in GFP_NOFS context because it is not
      : safe to do so, memcg reclaim or otherwise.
      
      [tytso@mit.edu: corrected the control flow]
      Fixes: c3b94f44 ("memcg: further prevent OOM with too many dirty pages")
      Reported-by: Nikolay Borisov <kernel@kyup.com>
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      022d35a6
    • md/bitmap: return an error when bitmap superblock is corrupt. · bc0a524c
      NeilBrown authored
      commit b97e9257 upstream
          Use separate bitmaps for each nodes in the cluster
      
      bitmap_read_sb() validates the bitmap superblock that it reads in.
      If it finds an inconsistency like a bad magic number or out-of-range
      version number, it prints an error and returns, but it incorrectly
      returns zero, so the array is still assembled with the (invalid) bitmap.
      
      This means it could try to use a bitmap with a new version number which
      it therefore does not understand.
      
      This bug was introduced in 3.5 and fixed as part of a larger patch in 4.1.
      So the patch is suitable for any -stable kernel in that range.
      
      Fixes: 27581e5a ("md/bitmap: centralise allocation of bitmap file pages.")
      Signed-off-by: NeilBrown <neilb@suse.com>
      Reported-by: GuoQing Jiang <gqjiang@suse.com>
      bc0a524c
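The failure mode in the md/bitmap commit above (a validator that detects a bad superblock yet still returns zero) can be sketched as follows. This is a simplified illustration, not the md code: `struct demo_sb`, the `DEMO_*` constants and `demo_validate_sb()` are all hypothetical, and the magic and version values are placeholders chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder values for this sketch only. */
#define DEMO_MAGIC      0x12345678u
#define DEMO_VERSION_LO 3u
#define DEMO_VERSION_HI 4u

/* Hypothetical, heavily reduced on-disk superblock. */
struct demo_sb {
    uint32_t magic;
    uint32_t version;
};

/*
 * Validate a superblock. Every detected inconsistency must produce a
 * non-zero error code; returning 0 here would let the caller carry on
 * with a structure it has just found to be invalid.
 */
static int demo_validate_sb(const struct demo_sb *sb)
{
    if (sb->magic != DEMO_MAGIC)
        return -EINVAL;
    if (sb->version < DEMO_VERSION_LO || sb->version > DEMO_VERSION_HI)
        return -EINVAL;
    return 0;
}
```

A caller would then refuse to assemble anything when the result is non-zero, which is the behaviour the commit message says was missing.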
    • kvm: x86: fix kvm_apic_has_events to check for NULL pointer · d7a681b7
      Paolo Bonzini authored
      commit ce40cd3f upstream.
      
      Malicious (or egregiously buggy) userspace can trigger it, but it
      should never happen in normal operation.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Wang Kai <morgan.wang@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7a681b7
    • signal: fix information leak in copy_siginfo_from_user32 · a6bb9353
      Amanieu d'Antras authored
      commit 3c00cb5e upstream.
      
      This function can leak kernel stack data when the user siginfo_t has a
      positive si_code value.  The top 16 bits of si_code describe which fields
      in the siginfo_t union are active, but they are treated inconsistently
      between copy_siginfo_from_user32, copy_siginfo_to_user32 and
      copy_siginfo_to_user.

      copy_siginfo_from_user32 is called from rt_sigqueueinfo and
      rt_tgsigqueueinfo in which the user has full control over the top 16 bits
      of si_code.
      
      This fixes the following information leaks:
      x86:   8 bytes leaked when sending a signal from a 32-bit process to
             itself. This leak grows to 16 bytes if the process uses x32.
             (si_code = __SI_CHLD)
      x86:   100 bytes leaked when sending a signal from a 32-bit process to
             a 64-bit process. (si_code = -1)
      sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
             64-bit process. (si_code = any)
      
      parisc and s390 have similar bugs, but they are not vulnerable because
      rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
      to a different process.  These bugs are also fixed for consistency.
      
      Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a6bb9353
    • signal: fix information leak in copy_siginfo_to_user · 16a49557
      Amanieu d'Antras authored
      commit 26135022 upstream.
      
      This function may copy the si_addr_lsb, si_lower and si_upper fields to
      user mode when they haven't been initialized, which can leak kernel
      stack data to user mode.
      
      Just checking the value of si_code is insufficient because the same
      si_code value is shared between multiple signals.  This is solved by
      checking the value of si_signo in addition to si_code.
      
      Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      16a49557
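The reasoning in the copy_siginfo_to_user commit above (the same si_code value is shared between multiple signals, so si_signo must be checked too) can be shown with a small predicate. This is only a sketch: both function names and `DEMO_CODE` are hypothetical, the numeric code is an arbitrary illustrative value rather than one taken from a system header, and the "extra field" stands in for a member such as si_addr_lsb.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Arbitrary illustrative si_code value, assumed for this sketch. */
#define DEMO_CODE 4

/*
 * Code-only check: claims the extra field is valid for ANY signal that
 * happens to carry this code, which is where uninitialized data could
 * be copied out.
 */
static bool demo_extra_field_valid_code_only(int signo, int code)
{
    (void)signo;
    return code == DEMO_CODE;
}

/* Check both the signal number and the code before trusting the field. */
static bool demo_extra_field_valid(int signo, int code)
{
    return signo == SIGBUS && code == DEMO_CODE;
}
```

With the stricter predicate, a SIGSEGV that carries the same numeric code no longer causes the extra field to be copied.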
    • signalfd: fix information leak in signalfd_copyinfo · 5c233bff
      Amanieu d'Antras authored
      commit 3ead7c52 upstream.
      
      This function may copy the si_addr_lsb field to user mode when it hasn't
      been initialized, which can leak kernel stack data to user mode.
      
      Just checking the value of si_code is insufficient because the same
      si_code value is shared between multiple signals.  This is solved by
      checking the value of si_signo in addition to si_code.
      
      Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5c233bff
    • ARM: 7819/1: fiq: Cast the first argument of flush_icache_range() · 22ab6a2b
      Fabio Estevam authored
      commit 7cb3be0a upstream.
      
      Commit 2ba85e7a (ARM: Fix FIQ code on VIVT CPUs) causes the following build warning:
      
      arch/arm/kernel/fiq.c:92:3: warning: passing argument 1 of 'cpu_cache.coherent_kern_range' makes integer from pointer without a cast [enabled by default]
      
      Cast it as '(unsigned long)base' to avoid the warning.
      
      Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Martin Kaiser <lists@kaiser.cx>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      22ab6a2b
    • Russell King's avatar
      ARM: Fix FIQ code on VIVT CPUs · 627cd157
      Russell King authored
      commit 2ba85e7a upstream.
      
      Aaro Koskinen reports the following oops:
      Installing fiq handler from c001b110, length 0x164
      Unable to handle kernel paging request at virtual address ffff1224
      pgd = c0004000
      [ffff1224] *pgd=00000000, *pte=11fff0cb, *ppte=11fff00a
      ...
      [<c0013154>] (set_fiq_handler+0x0/0x6c) from [<c0365d38>] (ams_delta_init_fiq+0xa8/0x160)
       r6:00000164 r5:c001b110 r4:00000000 r3:fefecb4c
      [<c0365c90>] (ams_delta_init_fiq+0x0/0x160) from [<c0365b14>] (ams_delta_init+0xd4/0x114)
       r6:00000000 r5:fffece10 r4:c037a9e0
      [<c0365a40>] (ams_delta_init+0x0/0x114) from [<c03613b4>] (customize_machine+0x24/0x30)
      
      This is because the vectors page is now write-protected, and to change
      code in there we must write to its original alias.  Make that change,
      and adjust the cache flushing such that the code will become visible
      to the instruction stream on VIVT CPUs.
      
      Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Martin Kaiser <lists@kaiser.cx>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      627cd157
    • Russell King's avatar
      ARM: Fix !kuser helpers case · 28d4d6e9
      Russell King authored
      commit 1b16c4bc upstream.
      
      Fix yet another build failure caused by a weird set of configuration
      settings:
      
        LD      init/built-in.o
      arch/arm/kernel/built-in.o: In function `__dabt_usr':
      /home/tom3q/kernel/arch/arm/kernel/entry-armv.S:377: undefined reference to `kuser_cmpxchg64_fixup'
      arch/arm/kernel/built-in.o: In function `__irq_usr':
      /home/tom3q/kernel/arch/arm/kernel/entry-armv.S:387: undefined reference to `kuser_cmpxchg64_fixup'
      
      caused by:
      CONFIG_KUSER_HELPERS=n
      CONFIG_CPU_32v6K=n
      CONFIG_NEEDS_SYSCALL_FOR_CMPXCHG=n
      
      Reported-by: Tomasz Figa <tomasz.figa@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Martin Kaiser <lists@kaiser.cx>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      28d4d6e9
    • Al Viro's avatar
      sg_start_req(): make sure that there's not too many elements in iovec · 4d0dd435
      Al Viro authored
      commit 451a2886 upstream.
      
      Unfortunately, allowing an arbitrary 16-bit value means a possibility of
      overflow in the calculation of the total number of pages in
      bio_map_user_iov() - we rely on there being no more than PAGE_SIZE
      members of the sum in the first loop there.  If that sum wraps around,
      we end up allocating too small an array of pointers to pages and it's
      easy to overflow it in the second loop.
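
The wraparound can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the 32-bit accumulator, the `demo_` names, and the per-segment page count are hypothetical, and the cap of 1024 is assumed to match the usual value of UIO_MAXIOV.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cap for this sketch; chosen to match the usual UIO_MAXIOV value. */
#define DEMO_MAX_SEGS 1024u

/* Sum the page count of every segment in a 32-bit accumulator.
 * Unsigned arithmetic wraps modulo 2^32, which is the failure mode
 * this model is meant to show. */
uint32_t demo_total_pages(uint16_t nsegs, uint32_t pages_per_seg)
{
    uint32_t sum = 0;
    for (uint16_t i = 0; i < nsegs; i++)
        sum += pages_per_seg;
    return sum;
}

/* Reject element counts above the cap before any summing takes place. */
bool demo_count_ok(uint16_t nsegs)
{
    return nsegs <= DEMO_MAX_SEGS;
}
```

In this model, 65535 segments of 65538 pages each really cover 4295032830 pages, but the wrapped sum is only 65534, so an array sized from it would be far too small. Capping the element count first keeps the sum well inside the 32-bit range.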
      
      X-Coverup: TINC (and there's no lumber cartel either)
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      [bwh: s/MAX_UIOVEC/UIO_MAXIOV/. This was fixed upstream by commit
       fdc81f45 ("sg_start_req(): use import_iovec()"), but we don't have
       that function.]
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d0dd435
    • NeilBrown's avatar
      md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies · c4a6d3f3
      NeilBrown authored
      commit 423f04d6 upstream.
      
      raid1_end_read_request() assumes that the In_sync bits are consistent
      with the ->degraded count.
      raid1_spare_active() updates the In_sync bit before the ->degraded count
      and so exposes an inconsistency, as does error().
      So extend the spinlock in raid1_spare_active() and error() to hide those
      inconsistencies.
      
      This should probably be part of
        Commit: 34cab6f4 ("md/raid1: fix test for 'was read error from
        last working device'.")
      as it addresses the same issue.  It fixes the same bug and should go
      to -stable for same reasons.
      
      Fixes: 76073054 ("md/raid1: clean up read_balance.")
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c4a6d3f3
    • Joseph Qi's avatar
      ocfs2: fix BUG in ocfs2_downconvert_thread_do_work() · 2a4cb7b5
      Joseph Qi authored
      commit 209f7512 upstream.
      
      The "BUG_ON(list_empty(&osb->blocked_lock_list))" in
      ocfs2_downconvert_thread_do_work can be triggered in the following case:
      
      ocfs2dc first saves osb->blocked_lock_count to the local variable
      `processed', and then processes the dentry lockres.  During the dentry
      put, it calls iput and then deletes the rw, inode and open lockres from
      the blocked list in ocfs2_mark_lockres_freeing.  This causes the
      variable `processed' to no longer reflect the number of blocked lockres
      to be processed, which triggers the BUG.
      
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2a4cb7b5
    • Marcus Gelderie's avatar
      ipc: modify message queue accounting to not take kernel data structures into account · 2934eb36
      Marcus Gelderie authored
      commit de54b9ac upstream.
      
      A while back, the message queue implementation in the kernel was
      improved to use btrees to speed up retrieval of messages, in commit
      d6629859 ("ipc/mqueue: improve performance of send/recv").
      
      That patch introducing the improved kernel handling of message queues
      (using btrees) has, as a by-product, changed the meaning of the QSIZE
      field in the pseudo-file created for the queue.  Before, this field
      reflected the size of the user-data in the queue.  Since, it also takes
      kernel data structures into account.  For example, if 13 bytes of user
      data are in the queue, on my machine the file reports a size of 61
      bytes.
      
      There was some discussion on this topic before (for example
      https://lkml.org/lkml/2014/10/1/115).  Commenting on an lkml thread,
      Michael Kerrisk gave the following background
      (https://lkml.org/lkml/2015/6/16/74):
      
          The pseudofiles in the mqueue filesystem (usually mounted at
          /dev/mqueue) expose fields with metadata describing a message
          queue. One of these fields, QSIZE, as originally implemented,
          showed the total number of bytes of user data in all messages in
          the message queue, and this feature was documented from the
          beginning in the mq_overview(7) page. In 3.5, some other (useful)
          work happened to break the user-space API in a couple of places,
          including the value exposed via QSIZE, which now includes a measure
          of kernel overhead bytes for the queue, a figure that renders QSIZE
          useless for its original purpose, since there's no way to deduce
          the number of overhead bytes consumed by the implementation.
          (The other user-space breakage was subsequently fixed.)
      
      This patch removes the accounting of kernel data structures in the
      queue.  Reporting the size of these data-structures in the QSIZE field
      was a breaking change (see Michael's comment above).  Without the QSIZE
      field reporting the total size of user-data in the queue, there is no
      way to deduce this number.
      
      It should be noted that the resource limit RLIMIT_MSGQUEUE is counted
      against the worst-case size of the queue (in both the old and the new
      implementation).  Therefore, the kernel overhead accounting in QSIZE is
      not necessary to help the user understand the limitations RLIMIT imposes
      on the processes.
      
      Signed-off-by: Marcus Gelderie <redmnic@gmail.com>
      Acked-by: Doug Ledford <dledford@redhat.com>
      Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: John Duffy <jb_duffy@btinternet.com>
      Cc: Arto Bendiken <arto@bendiken.net>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2934eb36
    • Dan Carpenter's avatar
      ALSA: hda - fix cs4210_spdif_automute() · 5d6e5895
      Dan Carpenter authored
      commit 44008f08 upstream.
      
      Smatch complains that we have nested checks for "spdif_present".  It
      turns out the current behavior isn't correct; we should remove the first
      check and keep the second.
      
      Fixes: 1077a024 ('ALSA: hda - Use generic parser for Cirrus codec driver')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d6e5895
    • Nicholas Bellinger's avatar
      iscsi-target: Fix iscsit_start_kthreads failure OOPs · 621468a3
      Nicholas Bellinger authored
      commit e5419865 upstream.
      
      This patch fixes a regression introduced with the following commit
      in v4.0-rc1 code, where an iscsit_start_kthreads() failure triggers
      a NULL pointer dereference OOPs:
      
          commit 88dcd2da
          Author: Nicholas Bellinger <nab@linux-iscsi.org>
          Date:   Thu Feb 26 22:19:15 2015 -0800
      
              iscsi-target: Convert iscsi_thread_set usage to kthread.h
      
      To address this bug, move iscsit_start_kthreads() to immediately
      precede the transmit of the last login response, before signaling
      a successful transition into full-feature-phase within existing
      iscsi_target_do_tx_login_io() logic.
      
      This ensures that no target-side resource allocation failures can
      occur after the final login response has been successfully sent.
      
      Also, it adds an iscsi_conn->rx_login_comp to allow the RX thread
      to sleep to prevent other socket related failures until the final
      iscsi_post_login_handler() call is able to complete.
      
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Nicholas Bellinger <nab@daterainc.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      621468a3
    • Ilya Dryomov's avatar
      rbd: fix copyup completion race · dff252b8
      Ilya Dryomov authored
      commit 2761713d upstream.
      
      For write/discard obj_requests that involved a copyup method call, the
      opcode of the first op is CEPH_OSD_OP_CALL and the ->callback is
      rbd_img_obj_copyup_callback().  The latter frees copyup pages, sets
      ->xferred and delegates to rbd_img_obj_callback(), the "normal" image
      object callback, for reporting to block layer and putting refs.
      
      rbd_osd_req_callback() however treats CEPH_OSD_OP_CALL as a trivial op,
      which means obj_request is marked done in rbd_osd_trivial_callback(),
      *before* ->callback is invoked and rbd_img_obj_copyup_callback() has
      a chance to run.  Marking obj_request done essentially means giving
      rbd_img_obj_callback() a license to end it at any moment, so if another
      obj_request from the same img_request is being completed concurrently,
      rbd_img_obj_end_request() may very well be called on such a prematurely
      marked-done request:
      
      <obj_request-1/2 reply>
      handle_reply()
        rbd_osd_req_callback()
          rbd_osd_trivial_callback()
          rbd_obj_request_complete()
          rbd_img_obj_copyup_callback()
          rbd_img_obj_callback()
                                          <obj_request-2/2 reply>
                                          handle_reply()
                                            rbd_osd_req_callback()
                                              rbd_osd_trivial_callback()
            for_each_obj_request(obj_request->img_request) {
              rbd_img_obj_end_request(obj_request-1/2)
              rbd_img_obj_end_request(obj_request-2/2) <--
            }
      
      Calling rbd_img_obj_end_request() on such a request leads to trouble,
      in particular because its ->xferred is 0.  We report 0 to the block
      layer with blk_update_request(), get back 1 for "this request has more
      data in flight" and then trip on
      
          rbd_assert(more ^ (which == img_request->obj_request_count));
      
      with rhs (which == ...) being 1 because rbd_img_obj_end_request() has
      been called for both requests and lhs (more) being 1 because we haven't
      got a chance to set ->xferred in rbd_img_obj_copyup_callback() yet.
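
The exclusive-or in that assertion can be read as "exactly one of the two conditions holds". A minimal sketch of the predicate follows; the function name is hypothetical and the arguments stand in for the values named in the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper modelling the asserted condition: either the block
 * layer still expects more data, or every object request has been ended,
 * but never both and never neither. */
bool demo_end_request_invariant(bool more, unsigned which,
                                unsigned obj_request_count)
{
    return more ^ (which == obj_request_count);
}
```

For an image request with two object requests, ending only the first (more set, which at 1) satisfies the predicate, and so does ending both with nothing left in flight. The race described above produces more set together with which at 2, which makes the predicate false.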
      
      To fix this, leverage that rbd wants to call class methods in only two
      cases: one is a generic method call wrapper (obj_request is standalone)
      and the other is a copyup (obj_request is part of an img_request).  So
      make a dedicated handler for CEPH_OSD_OP_CALL and directly invoke
      rbd_img_obj_copyup_callback() from it if obj_request is part of an
      img_request, similar to how CEPH_OSD_OP_READ handler invokes
      rbd_img_obj_request_read_callback().
      
      Since rbd_img_obj_copyup_callback() is now being called from the OSD
      request callback (only), it is renamed to rbd_osd_copyup_callback().
      
      Cc: Alex Elder <elder@linaro.org>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dff252b8
    • Herbert Xu's avatar
      crypto: ixp4xx - Remove bogus BUG_ON on scattered dst buffer · d3646ba7
      Herbert Xu authored
      commit f898c522 upstream.
      
      This patch removes a bogus BUG_ON in the ablkcipher path that
      triggers when the destination buffer is different from the source
      buffer and is scattered.
      
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d3646ba7
    • Marek Marczykowski-Górecki's avatar
      xen/gntdevt: Fix race condition in gntdev_release() · 292f5367
      Marek Marczykowski-Górecki authored
      commit 30b03d05 upstream.
      
      While gntdev_release() is called the MMU notifier is still registered
      and can traverse priv->maps list even if no pages are mapped (which is
      the case -- gntdev_release() is called after all). But
      gntdev_release() will clear that list, so make sure that only one of
      those things happens at the same time.
      
      Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      292f5367
    • Andy Lutomirski's avatar
      x86/xen: Probe target addresses in set_aliased_prot() before the hypercall · 3f2c206a
      Andy Lutomirski authored
      commit aa1acff3 upstream.
      
      The update_va_mapping hypercall can fail if the VA isn't present
      in the guest's page tables.  Under certain loads, this can
      result in an OOPS when the target address is in unpopulated vmap
      space.
      
      While we're at it, add comments to help explain what's going on.
      
      This isn't a great long-term fix.  This code should probably be
      changed to use something like set_memory_ro.
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <dvrabel@cantab.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f2c206a
    • David S. Miller's avatar
      sparc64: Fix userspace FPU register corruptions. · 683d1a7f
      David S. Miller authored
      [ Upstream commit 44922150 ]
      
      If we have a series of events from userspace, with %fprs=FPRS_FEF,
      as follows:
      
      ETRAP
      	ETRAP
      		VIS_ENTRY(fprs=0x4)
      		VIS_EXIT
      		RTRAP (kernel FPU restore with fpu_saved=0x4)
      	RTRAP
      
      We will not restore the user registers that were clobbered by the FPU
      using kernel code in the inner-most trap.
      
      Traps allocate FPU save slots in the thread struct, and FPU using
      sequences save the "dirty" FPU registers only.
      
      This works at the initial trap level because all of the registers
      get recorded into the top-level FPU save area, and we'll return
      to userspace with the FPU disabled so that any FPU use by the user
      will take an FPU disabled trap wherein we'll load the registers
      back up properly.
      
      But this is not how trap returns from kernel to kernel operate.
      
      The simplest fix for this bug is to always save all FPU register state
      for anything other than the top-most FPU save area.
      
      Getting rid of the optimized inner-slot FPU saving code ends up
      making VISEntryHalf degenerate into plain VISEntry.
      
      Longer term we need to do something smarter to reinstate the partial
      save optimizations.  Perhaps the fundamental error is having trap entry
      and exit allocate FPU save slots and restore register state.  Instead,
      the VISEntry et al. calls should be doing that work.
      
      This bug is about two decades old.
      
      Reported-by: James Y Knight <jyknight@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      683d1a7f
    • David S. Miller's avatar
      sparc64: Fix FPU register corruption with AES crypto offload. · 2312fd49
      David S. Miller authored
      [ Upstream commit f4da3628 ]
      
      The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the
      key material is preloaded into the FPU registers, and then we loop
      over and over doing the crypt operation, reusing those pre-cooked key
      registers.
      
      There are intervening blkcipher*() calls between the crypt operation
      calls.  And those might perform memcpy() and thus also try to use the
      FPU.
      
      The sparc64 kernel FPU usage mechanism is designed to allow such
      recursive uses, but with a catch.
      
      There has to be a trap between the two FPU using threads of control.
      
      The mechanism works by, when the FPU is already in use by the kernel,
      allocating a slot for FPU saving at trap time.  Then if, within the
      trap handler, we try to use the FPU registers, the pre-trap FPU
      register state is saved into the slot.  Then at trap return time we
      notice this and restore the pre-trap FPU state.
      
      Over the long term there are various more involved ways we can make
      this work, but for a quick fix let's take advantage of the fact that
      the situation where this happens is very limited.
      
      All sparc64 chips that support the crypto instructions also use
      the Niagara4 memcpy routine, and that routine only uses the FPU for
      large copies where we can't get the source aligned properly to a
      multiple of 8 bytes.
      
      We look to see if the FPU is already in use in this context, and if so
      we use the non-large copy path which only uses integer registers.
      
      Furthermore, we also limit this special logic to when we are doing
      kernel copy, rather than a user copy.
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2312fd49
    • Peter Zijlstra's avatar
      perf/x86/amd: Rework AMD PMU init code · 3d823198
      Peter Zijlstra authored
      commit 1b45adcd upstream.
      
      Josh reported that his QEMU is a bad hardware emulator and trips a
      WARN in the AMD PMU init code. He requested the WARN be turned into a
      pr_err() or similar.
      
      While there, rework the code a little.
      
      Reported-by: Josh Boyer <jwboyer@redhat.com>
      Acked-by: Robert Richter <rric@kernel.org>
      Acked-by: Jacob Shin <jacob.shin@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130521110537.GG26912@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d823198
    • Guenter Roeck's avatar
      mfd: sm501: dbg_regs attribute must be read-only · 14c99cd5
      Guenter Roeck authored
      commit 8a8320c2 upstream.
      
      Fix:
      
      sm501 sm501: SM501 At b3e00000: Version 050100a0, 8 Mb, IRQ 100
      Attribute dbg_regs: write permission without 'store'
      ------------[ cut here ]------------
      WARNING: at drivers/base/core.c:620
      
      dbg_regs does not have a write function and must therefore be marked
      as read-only.
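
The condition behind that warning can be modelled as a simple consistency check between an attribute's permission bits and its callbacks. This is a hedged userspace sketch, not the driver core code: the `demo_` names, the callback type, and the modes used with it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Write permission bits for user, group and other. */
#define DEMO_WRITE_BITS 0222u

/* Hypothetical stand-in for an attribute's write ('store') callback. */
typedef int (*demo_store_fn)(const char *buf, size_t len);

/* True when the mode advertises write permission but there is no
 * callback that could service a write. */
bool demo_attr_inconsistent(unsigned mode, demo_store_fn store)
{
    return (mode & DEMO_WRITE_BITS) != 0 && store == NULL;
}
```

Under this model, an attribute with mode 0600 and no store callback is flagged, while the same attribute with mode 0400 is not.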
      
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      14c99cd5
    • Xie XiuQi's avatar
      ipmi: fix timeout calculation when bmc is disconnected · 471bfba8
      Xie XiuQi authored
      commit e21404dc upstream.
      
      When loading the ipmi_si module while the BMC is disconnected, we found
      that the timeout is much longer than 5 secs.  It actually takes about
      3 mins and 20 secs (HZ=250).

      The error messages are as below:
        Dec 12 19:08:59 linux kernel: IPMI BT: timeout in RD_WAIT [ ] 1 retries left
        Dec 12 19:08:59 linux kernel: BT: write 4 bytes seq=0x01 03 18 00 01
        [...]
        Dec 12 19:12:19 linux kernel: IPMI BT: timeout in RD_WAIT [ ]
        Dec 12 19:12:19 linux kernel: failed 2 retries, sending error response
        Dec 12 19:12:19 linux kernel: IPMI: BT reset (takes 5 secs)
        Dec 12 19:12:19 linux kernel: IPMI BT: flag reset [ ]
      
      Function wait_for_msg_done() uses schedule_timeout_uninterruptible(1) to
      sleep 1 tick, so we should subtract jiffies_to_usecs(1) instead of 100
      usecs from timeout.
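
The arithmetic can be checked with a small userspace model of the wait loop. This is a sketch under assumptions: the `demo_` names are hypothetical, the timeout is taken to be exactly 5 secs, and each iteration is assumed to sleep exactly one tick.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds in one scheduler tick (assumes hz divides 1000000). */
uint64_t demo_tick_usecs(unsigned hz)
{
    return 1000000u / hz;
}

/* Wall-clock microseconds spent when every loop iteration sleeps one
 * tick but only step_usecs is subtracted from the remaining timeout. */
uint64_t demo_wall_usecs(uint64_t timeout_usecs, uint64_t step_usecs,
                         unsigned hz)
{
    uint64_t elapsed = 0;
    int64_t remaining = (int64_t)timeout_usecs;

    while (remaining > 0) {
        elapsed += demo_tick_usecs(hz);
        remaining -= (int64_t)step_usecs;
    }
    return elapsed;
}
```

With HZ=250 a tick is 4000 usecs. Subtracting 100 usecs per tick needs 50000 iterations to use up 5 secs, which is 200 secs of wall time, matching the 3 mins and 20 secs between the first and last log lines. Subtracting a full tick per iteration needs 1250 iterations, which is 5 secs.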
      
      Reported-by: Hu Shiyuan <hushiyuan@huawei.com>
      Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      471bfba8
    • Benjamin Randazzo's avatar
      md: use kzalloc() when bitmap is disabled · 21c7d380
      Benjamin Randazzo authored
      commit b6878d9e upstream.
      
      In drivers/md/md.c get_bitmap_file() uses kmalloc() for creating a
      mdu_bitmap_file_t called "file".
      
      5769         file = kmalloc(sizeof(*file), GFP_NOIO);
      5770         if (!file)
      5771                 return -ENOMEM;
      
      This structure is copied to user space at the end of the function.
      
      5786         if (err == 0 &&
      5787             copy_to_user(arg, file, sizeof(*file)))
      5788                 err = -EFAULT;
      
      But if bitmap is disabled only the first byte of "file" is initialized
      with zero, so it's possible to read some bytes (up to 4095) of kernel
      space memory from user space. This is an information leak.
      
      5775         /* bitmap disabled, zero the first byte and copy out */
      5776         if (!mddev->bitmap_info.file)
      5777                 file->pathname[0] = '\0';
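
The difference between the two allocators can be modelled in userspace. This is an illustrative sketch only: the struct, the `demo_` names and the 0xAA fill (standing in for stale heap contents) are assumptions, with malloc() and calloc() playing the roles of kmalloc() and kzalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the structure that is copied to user space. */
struct demo_bitmap_file {
    char pathname[4096];
};

/* Model of a non-zeroing allocation: the buffer holds stale bytes. */
struct demo_bitmap_file *demo_alloc_dirty(void)
{
    struct demo_bitmap_file *f = malloc(sizeof(*f));

    if (f)
        memset(f, 0xAA, sizeof(*f));
    return f;
}

/* Model of a zeroing allocation. */
struct demo_bitmap_file *demo_alloc_zeroed(void)
{
    return calloc(1, sizeof(struct demo_bitmap_file));
}

/* Count the bytes after pathname[0] that still hold non-zero data. */
size_t demo_leaked_bytes(const struct demo_bitmap_file *f)
{
    size_t n = 0;

    for (size_t i = 1; i < sizeof(f->pathname); i++)
        if (f->pathname[i] != 0)
            n++;
    return n;
}
```

After clearing only pathname[0], the dirty buffer still carries 4095 stale bytes that a full-size copy would expose, while the zeroed buffer carries none.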
      
      Signed-off-by: Benjamin Randazzo <benjamin@randazzo.fr>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      21c7d380
    • Dirk Behme's avatar
      USB: sierra: add 1199:68AB device ID · e850ac8e
      Dirk Behme authored
      commit 74472233 upstream.
      
      Add support for the Sierra Wireless AR8550 device with
      USB descriptor 0x1199, 0x68AB.
      
      It has much in common with the MC879x modules 1199:683c/683d, which
      are also composite devices with 7 interfaces (0..6) and are also
      MDM62xx based, like the AR8550.

      The only major difference is the interface attributes 02/02/01 on
      interfaces 3 and 4 of the AR8550. They are vendor specific ff/ff/ff
      on MC879x modules.
      
      lsusb reports:
      
      Bus 001 Device 004: ID 1199:68ab Sierra Wireless, Inc.
      Device Descriptor:
        bLength                18
        bDescriptorType         1
        bcdUSB               2.00
        bDeviceClass            0 (Defined at Interface level)
        bDeviceSubClass         0
        bDeviceProtocol         0
        bMaxPacketSize0        64
        idVendor           0x1199 Sierra Wireless, Inc.
        idProduct          0x68ab
        bcdDevice            0.06
        iManufacturer           3 Sierra Wireless, Incorporated
        iProduct                2 AR8550
        iSerial                 0
        bNumConfigurations      1
        Configuration Descriptor:
          bLength                 9
          bDescriptorType         2
          wTotalLength          198
          bNumInterfaces          7
          bConfigurationValue     1
          iConfiguration          1 Sierra Configuration
          bmAttributes         0xe0
            Self Powered
            Remote Wakeup
          MaxPower                0mA
          Interface Descriptor:
            bLength                 9
            bDescriptorType         4
            bInterfaceNumber        0
            bAlternateSetting       0
            bNumEndpoints           2
            bInterfaceClass       255 Vendor Specific Class
            bInterfaceSubClass    255 Vendor Specific Subclass
            bInterfaceProtocol    255 Vendor Specific Protocol
            iInterface              0
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x81  EP 1 IN
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x01  EP 1 OUT
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
          Interface Descriptor:
            bLength                 9
            bDescriptorType         4
            bInterfaceNumber        1
            bAlternateSetting       0
            bNumEndpoints           2
            bInterfaceClass       255 Vendor Specific Class
            bInterfaceSubClass    255 Vendor Specific Subclass
            bInterfaceProtocol    255 Vendor Specific Protocol
            iInterface              0
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x82  EP 2 IN
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x02  EP 2 OUT
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
          Interface Descriptor:
            bLength                 9
            bDescriptorType         4
            bInterfaceNumber        2
            bAlternateSetting       0
            bNumEndpoints           2
            bInterfaceClass       255 Vendor Specific Class
            bInterfaceSubClass    255 Vendor Specific Subclass
            bInterfaceProtocol    255 Vendor Specific Protocol
            iInterface              0
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x83  EP 3 IN
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x03  EP 3 OUT
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
          Interface Descriptor:
            bLength                 9
            bDescriptorType         4
            bInterfaceNumber        3
            bAlternateSetting       0
            bNumEndpoints           3
            bInterfaceClass         2 Communications
            bInterfaceSubClass      2 Abstract (modem)
            bInterfaceProtocol      1 AT-commands (v.25ter)
            iInterface              0
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x84  EP 4 IN
              bmAttributes            3
                Transfer Type            Interrupt
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0040  1x 64 bytes
              bInterval               5
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x85  EP 5 IN
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x04  EP 4 OUT
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
          Interface Descriptor:
            bLength                 9
            bDescriptorType         4
            bInterfaceNumber        4
            bAlternateSetting       0
            bNumEndpoints           3
            bInterfaceClass         2 Communications
            bInterfaceSubClass      2 Abstract (modem)
            bInterfaceProtocol      1 AT-commands (v.25ter)
            iInterface              0
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x86  EP 6 IN
              bmAttributes            3
                Transfer Type            Interrupt
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0040  1x 64 bytes
              bInterval               5
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x87  EP 7 IN
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x05  EP 5 OUT
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
          Interface Descriptor:
            bLength                 9
            bDescriptorType         4
            bInterfaceNumber        5
            bAlternateSetting       0
            bNumEndpoints           3
            bInterfaceClass       255 Vendor Specific Class
            bInterfaceSubClass    255 Vendor Specific Subclass
            bInterfaceProtocol    255 Vendor Specific Protocol
            iInterface              0
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x88  EP 8 IN
              bmAttributes            3
                Transfer Type            Interrupt
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0040  1x 64 bytes
              bInterval               5
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x89  EP 9 IN
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x06  EP 6 OUT
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
          Interface Descriptor:
            bLength                 9
            bDescriptorType         4
            bInterfaceNumber        6
            bAlternateSetting       0
            bNumEndpoints           3
            bInterfaceClass       255 Vendor Specific Class
            bInterfaceSubClass    255 Vendor Specific Subclass
            bInterfaceProtocol    255 Vendor Specific Protocol
            iInterface              0
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x8a  EP 10 IN
              bmAttributes            3
                Transfer Type            Interrupt
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0040  1x 64 bytes
              bInterval               5
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x8b  EP 11 IN
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
            Endpoint Descriptor:
              bLength                 7
              bDescriptorType         5
              bEndpointAddress     0x07  EP 7 OUT
              bmAttributes            2
                Transfer Type            Bulk
                Synch Type               None
                Usage Type               Data
              wMaxPacketSize     0x0200  1x 512 bytes
              bInterval              32
      Device Qualifier (for other device speed):
        bLength                10
        bDescriptorType         6
        bcdUSB               2.00
        bDeviceClass            0 (Defined at Interface level)
        bDeviceSubClass         0
        bDeviceProtocol         0
        bMaxPacketSize0        64
        bNumConfigurations      1
      Device Status:     0x0001
        Self Powered
      
      Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
      Cc: Lars Melin <larsm17@gmail.com>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e850ac8e
    • Mathias Nyman's avatar
      xhci: fix off by one error in TRB DMA address boundary check · c0f94181
      Mathias Nyman authored
      commit 7895086a upstream.
      
      We need to check that a TRB is part of the current segment
      before calculating its DMA address.
      
      Previously a ring segment didn't use a full memory page, and every
      new ring segment got a new memory page, so the off-by-one
      error in checking the upper bound was never seen.
      
      Now that a segment uses a full memory page, 256 TRBs (4096 bytes), the
      off-by-one check fails to catch the case where a TRB is the first element
      of the next segment.
      
      This is triggered when the virtual memory pages of two ring segments are
      next to each other in increasing order at the point where the ring buffer
      wraps around, and it causes errors like:
      
      [  106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
      [  106.398230] xhci_hcd 0000:00:14.0: Looking for event-dma fffd3000 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0
      
      The trb-end address (fffd5000) lies one TRB past the seg-end address (fffd4ff0).
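      
      The boundary check described above can be illustrated with a small
      userspace sketch. This is not the driver code: struct trb, struct segment,
      trb_to_dma_buggy() and trb_to_dma_fixed() are hypothetical names chosen
      for illustration, and the only values taken from this message are the
      segment size (256 TRBs, 4096 bytes) and the addresses in the log. It
      assumes the flaw is an upper-bound comparison that accepts an offset
      equal to the number of TRBs in a segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 256 TRBs of 16 bytes each fill one 4096-byte page, as stated above. */
#define TRBS_PER_SEGMENT 256

/* Hypothetical stand-in for a 16-byte TRB. */
struct trb {
    uint32_t field[4];
};

static_assert(sizeof(struct trb) == 16, "a TRB is assumed to be 16 bytes");

/* Hypothetical stand-in for a ring segment: CPU pointer plus DMA base. */
struct segment {
    struct trb *trbs;   /* first TRB of the segment */
    uint64_t dma;       /* DMA address of the first TRB */
};

/* Off-by-one variant: an offset of exactly TRBS_PER_SEGMENT slips through. */
uint64_t trb_to_dma_buggy(const struct segment *seg, const struct trb *trb)
{
    ptrdiff_t offset;

    if (!seg || !trb || trb < seg->trbs)
        return 0;
    offset = trb - seg->trbs;
    if (offset > TRBS_PER_SEGMENT)      /* wrong: accepts offset == 256 */
        return 0;
    return seg->dma + (uint64_t)offset * sizeof(*trb);
}

/* Corrected variant: valid offsets are 0 .. TRBS_PER_SEGMENT - 1. */
uint64_t trb_to_dma_fixed(const struct segment *seg, const struct trb *trb)
{
    ptrdiff_t offset;

    if (!seg || !trb || trb < seg->trbs)
        return 0;
    offset = trb - seg->trbs;
    if (offset >= TRBS_PER_SEGMENT)     /* rejects the next segment's first TRB */
        return 0;
    return seg->dma + (uint64_t)offset * sizeof(*trb);
}
```

      With a segment whose DMA base is fffd4000 and a TRB pointer that is the
      first element of an adjacent following segment (offset 256), the buggy
      variant returns fffd5000, one TRB past seg-end fffd4ff0, while the fixed
      variant returns 0 to signal that the TRB is not part of the segment.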
      
      Tested-by: Arkadiusz Miśkiewicz <arekm@maven.pl>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c0f94181