Skip to content
  1. Aug 25, 2015
    • Jiri Slaby's avatar
      Linux 3.12.47 · 5f6177d6
      Jiri Slaby authored
      v3.12.47
      5f6177d6
    • Ilya Dryomov's avatar
      rbd: fix copyup completion race · eca3a901
      Ilya Dryomov authored
      commit 2761713d
      
       upstream.
      
      For write/discard obj_requests that involved a copyup method call, the
      opcode of the first op is CEPH_OSD_OP_CALL and the ->callback is
      rbd_img_obj_copyup_callback().  The latter frees copyup pages, sets
      ->xferred and delegates to rbd_img_obj_callback(), the "normal" image
      object callback, for reporting to block layer and putting refs.
      
      rbd_osd_req_callback() however treats CEPH_OSD_OP_CALL as a trivial op,
      which means obj_request is marked done in rbd_osd_trivial_callback(),
      *before* ->callback is invoked and rbd_img_obj_copyup_callback() has
      a chance to run.  Marking obj_request done essentially means giving
      rbd_img_obj_callback() a license to end it at any moment, so if another
      obj_request from the same img_request is being completed concurrently,
      rbd_img_obj_end_request() may very well be called on such prematurally
      marked done request:
      
      <obj_request-1/2 reply>
      handle_reply()
        rbd_osd_req_callback()
          rbd_osd_trivial_callback()
          rbd_obj_request_complete()
          rbd_img_obj_copyup_callback()
          rbd_img_obj_callback()
                                          <obj_request-2/2 reply>
                                          handle_reply()
                                            rbd_osd_req_callback()
                                              rbd_osd_trivial_callback()
            for_each_obj_request(obj_request->img_request) {
              rbd_img_obj_end_request(obj_request-1/2)
              rbd_img_obj_end_request(obj_request-2/2) <--
            }
      
      Calling rbd_img_obj_end_request() on such a request leads to trouble,
      in particular because its ->xfferred is 0.  We report 0 to the block
      layer with blk_update_request(), get back 1 for "this request has more
      data in flight" and then trip on
      
          rbd_assert(more ^ (which == img_request->obj_request_count));
      
      with rhs (which == ...) being 1 because rbd_img_obj_end_request() has
      been called for both requests and lhs (more) being 1 because we haven't
      got a chance to set ->xfferred in rbd_img_obj_copyup_callback() yet.
      
      To fix this, leverage that rbd wants to call class methods in only two
      cases: one is a generic method call wrapper (obj_request is standalone)
      and the other is a copyup (obj_request is part of an img_request).  So
      make a dedicated handler for CEPH_OSD_OP_CALL and directly invoke
      rbd_img_obj_copyup_callback() from it if obj_request is part of an
      img_request, similar to how CEPH_OSD_OP_READ handler invokes
      rbd_img_obj_request_read_callback().
      
      Since rbd_img_obj_copyup_callback() is now being called from the OSD
      request callback (only), it is renamed to rbd_osd_copyup_callback().
      
      Cc: Alex Elder <elder@linaro.org>
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: default avatarAlex Elder <elder@linaro.org>
      [idryomov@gmail.com: backport to < 3.18: context]
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      eca3a901
    • Alex Deucher's avatar
      drm/radeon: add new OLAND pci id · f81a12e9
      Alex Deucher authored
      commit e037239e
      
       upstream.
      
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f81a12e9
    • Michael Walle's avatar
      EDAC, ppc4xx: Access mci->csrows array elements properly · 2a1fd7df
      Michael Walle authored
      commit 5c16179b upstream.
      
      The commit
      
        de3910eb
      
       ("edac: change the mem allocation scheme to
      		 make Documentation/kobject.txt happy")
      
      changed the memory allocation for the csrows member. But ppc4xx_edac was
      forgotten in the patch. Fix it.
      
      Signed-off-by: default avatarMichael Walle <michael@walle.cc>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc
      
      
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2a1fd7df
    • Richard Weinberger's avatar
      localmodconfig: Use Kbuild files too · 6713bcb7
      Richard Weinberger authored
      commit c0ddc8c7 upstream.
      
      In kbuild it is allowed to define objects in files named "Makefile"
      and "Kbuild".
      Currently localmodconfig reads objects only from "Makefile"s and misses
      modules like nouveau.
      
      Link: http://lkml.kernel.org/r/1437948415-16290-1-git-send-email-richard@nod.at
      
      
      
      Reported-and-tested-by: default avatarLeonidas Spyropoulos <artafinde@gmail.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6713bcb7
    • Joe Thornber's avatar
      dm thin metadata: delete btrees when releasing metadata snapshot · 1d124d81
      Joe Thornber authored
      commit 7f518ad0
      
       upstream.
      
      The device details and mapping trees were just being decremented
      before.  Now btree_del() is called to do a deep delete.
      
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1d124d81
    • Peter Zijlstra's avatar
      perf: Fix fasync handling on inherited events · 78010f59
      Peter Zijlstra authored
      commit fed66e2c
      
       upstream.
      
      Vince reported that the fasync signal stuff doesn't work proper for
      inherited events. So fix that.
      
      Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
      which upon the demise of the file descriptor ensures the allocation is
      freed and state is updated.
      
      Now for perf, we can have the events stick around for a while after the
      original FD is dead because of references from child events. So we
      cannot copy the fasync pointer around. We can however consistently use
      the parent's fasync, as that will be updated.
      
      Reported-and-Tested-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho deMelo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1434011521.1495.71.camel@twins
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      78010f59
    • Bob Liu's avatar
      xen-blkfront: don't add indirect pages to list when !feature_persistent · 78e034d3
      Bob Liu authored
      commit 7b076750
      
       upstream.
      
      We should consider info->feature_persistent when adding indirect page to list
      info->indirect_pages, else the BUG_ON() in blkif_free() would be triggered.
      
      When we are using persistent grants the indirect_pages list
      should always be empty because blkfront has pre-allocated enough
      persistent pages to fill all requests on the ring.
      
      Acked-by: default avatarRoger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      78e034d3
    • Wanpeng Li's avatar
      mm/hwpoison: fix page refcount of unknown non LRU page · 5b7ceda1
      Wanpeng Li authored
      commit 4f32be67
      
       upstream.
      
      After trying to drain pages from pagevec/pageset, we try to get reference
      count of the page again, however, the reference count of the page is not
      reduced if the page is still not on LRU list.
      
      Fix it by adding the put_page() to drop the page reference which is from
      __get_any_page().
      
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5b7ceda1
    • Herton R. Krzesinski's avatar
      ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits · 9ae388fb
      Herton R. Krzesinski authored
      commit 602b8593
      
       upstream.
      
      The current semaphore code allows a potential use after free: in
      exit_sem we may free the task's sem_undo_list while there is still
      another task looping through the same semaphore set and cleaning the
      sem_undo list at freeary function (the task called IPC_RMID for the same
      semaphore set).
      
      For example, with a test program [1] running which keeps forking a lot
      of processes (which then do a semop call with SEM_UNDO flag), and with
      the parent right after removing the semaphore set with IPC_RMID, and a
      kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
      CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
      in the kernel log:
      
         Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
         000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
         010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
         Prev obj: start=ffff88003b45c180, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
         Next obj: start=ffff88003b45c200, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
         BUG: spinlock wrong CPU on CPU#2, test/18028
         general protection fault: 0000 [#1] SMP
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: spin_dump+0x53/0xc0
         Call Trace:
           spin_bug+0x30/0x40
           do_raw_spin_unlock+0x71/0xa0
           _raw_spin_unlock+0xe/0x10
           freeary+0x82/0x2a0
           ? _raw_spin_lock+0xe/0x10
           semctl_down.clone.0+0xce/0x160
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           SyS_semctl+0x236/0x2c0
           ? syscall_trace_leave+0xde/0x130
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
         RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
          RSP <ffff88003750fd68>
         ---[ end trace 783ebb76612867a0 ]---
         NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: native_read_tsc+0x0/0x20
         Call Trace:
           ? delay_tsc+0x40/0x70
           __delay+0xf/0x20
           do_raw_spin_lock+0x96/0x140
           _raw_spin_lock+0xe/0x10
           sem_lock_and_putref+0x11/0x70
           SYSC_semtimedop+0x7bf/0x960
           ? handle_mm_fault+0xbf6/0x1880
           ? dequeue_task_fair+0x79/0x4a0
           ? __do_page_fault+0x19a/0x430
           ? kfree_debugcheck+0x16/0x40
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           ? do_audit_syscall_entry+0x66/0x70
           ? syscall_trace_enter_phase1+0x139/0x160
           SyS_semtimedop+0xe/0x10
           SyS_semop+0x10/0x20
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
         Kernel panic - not syncing: softlockup: hung tasks
      
      I wasn't able to trigger any badness on a recent kernel without the
      proper config debugs enabled, however I have softlockup reports on some
      kernel versions, in the semaphore code, which are similar as above (the
      scenario is seen on some servers running IBM DB2 which uses semaphore
      syscalls).
      
      The patch here fixes the race against freeary, by acquiring or waiting
      on the sem_undo_list lock as necessary (exit_sem can race with freeary,
      while freeary sets un->semid to -1 and removes the same sem_undo from
      list_proc or when it removes the last sem_undo).
      
      After the patch I'm unable to reproduce the problem using the test case
      [1].
      
      [1] Test case used below:
      
          #include <stdio.h>
          #include <sys/types.h>
          #include <sys/ipc.h>
          #include <sys/sem.h>
          #include <sys/wait.h>
          #include <stdlib.h>
          #include <time.h>
          #include <unistd.h>
          #include <errno.h>
      
          #define NSEM 1
          #define NSET 5
      
          int sid[NSET];
      
          void thread()
          {
                  struct sembuf op;
                  int s;
                  uid_t pid = getuid();
      
                  s = rand() % NSET;
                  op.sem_num = pid % NSEM;
                  op.sem_op = 1;
                  op.sem_flg = SEM_UNDO;
      
                  semop(sid[s], &op, 1);
                  exit(EXIT_SUCCESS);
          }
      
          void create_set()
          {
                  int i, j;
                  pid_t p;
                  union {
                          int val;
                          struct semid_ds *buf;
                          unsigned short int *array;
                          struct seminfo *__buf;
                  } un;
      
                  /* Create and initialize semaphore set */
                  for (i = 0; i < NSET; i++) {
                          sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                          if (sid[i] < 0) {
                                  perror("semget");
                                  exit(EXIT_FAILURE);
                          }
                  }
                  un.val = 0;
                  for (i = 0; i < NSET; i++) {
                          for (j = 0; j < NSEM; j++) {
                                  if (semctl(sid[i], j, SETVAL, un) < 0)
                                          perror("semctl");
                          }
                  }
      
                  /* Launch threads that operate on semaphore set */
                  for (i = 0; i < NSEM * NSET * NSET; i++) {
                          p = fork();
                          if (p < 0)
                                  perror("fork");
                          if (p == 0)
                                  thread();
                  }
      
                  /* Free semaphore set */
                  for (i = 0; i < NSET; i++) {
                          if (semctl(sid[i], NSEM, IPC_RMID))
                                  perror("IPC_RMID");
                  }
      
                  /* Wait for forked processes to exit */
                  while (wait(NULL)) {
                          if (errno == ECHILD)
                                  break;
                  };
          }
      
          int main(int argc, char **argv)
          {
                  pid_t p;
      
                  srand(time(NULL));
      
                  while (1) {
                          p = fork();
                          if (p < 0) {
                                  perror("fork");
                                  exit(EXIT_FAILURE);
                          }
                          if (p == 0) {
                                  create_set();
                                  goto end;
                          }
      
                          /* Wait for forked processes to exit */
                          while (wait(NULL)) {
                                  if (errno == ECHILD)
                                          break;
                          };
                  }
          end:
                  return 0;
          }
      
      [akpm@linux-foundation.org: use normal comment layout]
      Signed-off-by: default avatarHerton R. Krzesinski <herton@redhat.com>
      Acked-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Rafael Aquini <aquini@redhat.com>
      CC: Aristeu Rozanski <aris@redhat.com>
      Cc: David Jeffery <djeffery@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9ae388fb
    • Manfred Spraul's avatar
      ipc/sem.c: update/correct memory barriers · e50758a0
      Manfred Spraul authored
      commit 3ed1f8a9
      
       upstream.
      
      sem_lock() did not properly pair memory barriers:
      
      !spin_is_locked() and spin_unlock_wait() are both only control barriers.
      The code needs an acquire barrier, otherwise the cpu might perform read
      operations before the lock test.
      
      As no primitive exists inside <include/spinlock.h> and since it seems
      noone wants another primitive, the code creates a local primitive within
      ipc/sem.c.
      
      With regards to -stable:
      
      The change of sem_wait_array() is a bugfix, the change to sem_lock() is a
      nop (just a preprocessor redefinition to improve the readability).  The
      bugfix is necessary for all kernels that use sem_wait_array() (i.e.:
      starting from 3.10).
      
      Signed-off-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Reported-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Kirill Tkhai <ktkhai@parallels.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e50758a0
    • Michal Hocko's avatar
      mm, vmscan: Do not wait for page writeback for GFP_NOFS allocations · 56c65036
      Michal Hocko authored
      commit ecf5fc6e upstream.
      
      Nikolay has reported a hang when a memcg reclaim got stuck with the
      following backtrace:
      
      PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
        #0 __schedule at ffffffff815ab152
        #1 schedule at ffffffff815ab76e
        #2 schedule_timeout at ffffffff815ae5e5
        #3 io_schedule_timeout at ffffffff815aad6a
        #4 bit_wait_io at ffffffff815abfc6
        #5 __wait_on_bit at ffffffff815abda5
        #6 wait_on_page_bit at ffffffff8111fd4f
        #7 shrink_page_list at ffffffff81135445
        #8 shrink_inactive_list at ffffffff81135845
        #9 shrink_lruvec at ffffffff81135ead
       #10 shrink_zone at ffffffff811360c3
       #11 shrink_zones at ffffffff81136eff
       #12 do_try_to_free_pages at ffffffff8113712f
       #13 try_to_free_mem_cgroup_pages at ffffffff811372be
       #14 try_charge at ffffffff81189423
       #15 mem_cgroup_try_charge at ffffffff8118c6f5
       #16 __add_to_page_cache_locked at ffffffff8112137d
       #17 add_to_page_cache_lru at ffffffff81121618
       #18 pagecache_get_page at ffffffff8112170b
       #19 grow_dev_page at ffffffff811c8297
       #20 __getblk_slow at ffffffff811c91d6
       #21 __getblk_gfp at ffffffff811c92c1
       #22 ext4_ext_grow_indepth at ffffffff8124565c
       #23 ext4_ext_create_new_leaf at ffffffff81246ca8
       #24 ext4_ext_insert_extent at ffffffff81246f09
       #25 ext4_ext_map_blocks at ffffffff8124a848
       #26 ext4_map_blocks at ffffffff8121a5b7
       #27 mpage_map_one_extent at ffffffff8121b1fa
       #28 mpage_map_and_submit_extent at ffffffff8121f07b
       #29 ext4_writepages at ffffffff8121f6d5
       #30 do_writepages at ffffffff8112c490
       #31 __filemap_fdatawrite_range at ffffffff81120199
       #32 filemap_flush at ffffffff8112041c
       #33 ext4_alloc_da_blocks at ffffffff81219da1
       #34 ext4_rename at ffffffff81229b91
       #35 ext4_rename2 at ffffffff81229e32
       #36 vfs_rename at ffffffff811a08a5
       #37 SYSC_renameat2 at ffffffff811a3ffc
       #38 sys_renameat2 at ffffffff811a408e
       #39 sys_rename at ffffffff8119e51e
       #40 system_call_fastpath at ffffffff815afa89
      
      Dave Chinner has properly pointed out that this is a deadlock in the
      reclaim code because ext4 doesn't submit pages which are marked by
      PG_writeback right away.
      
      The heuristic was introduced by commit e62e384e ("memcg: prevent OOM
      with too many dirty pages") and it was applied only when may_enter_fs
      was specified.  The code has been changed by c3b94f44 ("memcg:
      further prevent OOM with too many dirty pages") which has removed the
      __GFP_FS restriction with a reasoning that we do not get into the fs
      code.  But this is not sufficient apparently because the fs doesn't
      necessarily submit pages marked PG_writeback for IO right away.
      
      ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
      submit the bio.  Instead it tries to map more pages into the bio and
      mpage_map_one_extent might trigger memcg charge which might end up
      waiting on a page which is marked PG_writeback but hasn't been submitted
      yet so we would end up waiting for something that never finishes.
      
      Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
      before we go to wait on the writeback.  The page fault path, which is
      the only path that triggers memcg oom killer since 3.12, shouldn't
      require GFP_NOFS and so we shouldn't reintroduce the premature OOM
      killer issue which was originally addressed by the heuristic.
      
      As per David Chinner the xfs is doing similar thing since 2.6.15 already
      so ext4 is not the only affected filesystem.  Moreover he notes:
      
      : For example: IO completion might require unwritten extent conversion
      : which executes filesystem transactions and GFP_NOFS allocations. The
      : writeback flag on the pages can not be cleared until unwritten
      : extent conversion completes. Hence memory reclaim cannot wait on
      : page writeback to complete in GFP_NOFS context because it is not
      : safe to do so, memcg reclaim or otherwise.
      
      [tytso@mit.edu: corrected the control flow]
      Fixes: c3b94f44
      
       ("memcg: further prevent OOM with too many dirty pages")
      Reported-by: default avatarNikolay Borisov <kernel@kyup.com>
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      
      56c65036
    • NeilBrown's avatar
      md/bitmap: return an error when bitmap superblock is corrupt. · eb037379
      NeilBrown authored
      commit b97e9257 upstream
          Use separate bitmaps for each nodes in the cluster
      
      bitmap_read_sb() validates the bitmap superblock that it reads in.
      If it finds an inconsistency like a bad magic number or out-of-range
      version number, it prints an error and returns, but it incorrectly
      returns zero, so the array is still assembled with the (invalid) bitmap.
      
      This means it could try to use a bitmap with a new version number which
      it therefore does not understand.
      
      This bug was introduced in 3.5 and fix as part of a larger patch in 4.1.
      So the patch is suitable for any -stable kernel in that range.
      
      Fixes: 27581e5a
      
       ("md/bitmap: centralise allocation of bitmap file pages.")
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Reported-by: default avatarGuoQing Jiang <gqjiang@suse.com>
      eb037379
    • Al Viro's avatar
      path_openat(): fix double fput() · da59de4c
      Al Viro authored
      commit f15133df
      
       upstream.
      
      path_openat() jumps to the wrong place after do_tmpfile() - it has
      already done path_cleanup() (as part of path_lookupat() called by
      do_tmpfile()), so doing that again can lead to double fput().
      
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      da59de4c
    • Amanieu d'Antras's avatar
      signal: fix information leak in copy_siginfo_from_user32 · 42570358
      Amanieu d'Antras authored
      commit 3c00cb5e
      
       upstream.
      
      This function can leak kernel stack data when the user siginfo_t has a
      positive si_code value.  The top 16 bits of si_code descibe which fields
      in the siginfo_t union are active, but they are treated inconsistently
      between copy_siginfo_from_user32, copy_siginfo_to_user32 and
      copy_siginfo_to_user.
      
      copy_siginfo_from_user32 is called from rt_sigqueueinfo and
      rt_tgsigqueueinfo in which the user has full control overthe top 16 bits
      of si_code.
      
      This fixes the following information leaks:
      x86:   8 bytes leaked when sending a signal from a 32-bit process to
             itself. This leak grows to 16 bytes if the process uses x32.
             (si_code = __SI_CHLD)
      x86:   100 bytes leaked when sending a signal from a 32-bit process to
             a 64-bit process. (si_code = -1)
      sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
             64-bit process. (si_code = any)
      
      parsic and s390 have similar bugs, but they are not vulnerable because
      rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
      to a different process.  These bugs are also fixed for consistency.
      
      Signed-off-by: default avatarAmanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      42570358
    • Amanieu d'Antras's avatar
      signal: fix information leak in copy_siginfo_to_user · ed5dbb4e
      Amanieu d'Antras authored
      commit 26135022
      
       upstream.
      
      This function may copy the si_addr_lsb, si_lower and si_upper fields to
      user mode when they haven't been initialized, which can leak kernel
      stack data to user mode.
      
      Just checking the value of si_code is insufficient because the same
      si_code value is shared between multiple signals.  This is solved by
      checking the value of si_signo in addition to si_code.
      
      Signed-off-by: default avatarAmanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ed5dbb4e
    • Amanieu d'Antras's avatar
      signalfd: fix information leak in signalfd_copyinfo · ce624e9c
      Amanieu d'Antras authored
      commit 3ead7c52
      
       upstream.
      
      This function may copy the si_addr_lsb field to user mode when it hasn't
      been initialized, which can leak kernel stack data to user mode.
      
      Just checking the value of si_code is insufficient because the same
      si_code value is shared between multiple signals.  This is solved by
      checking the value of si_signo in addition to si_code.
      
      Signed-off-by: default avatarAmanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ce624e9c
    • Andy Lutomirski's avatar
      x86/ldt: Further fix FPU emulation · 6afc9ab2
      Andy Lutomirski authored
      commit 12e244f4
      
       upstream.
      
      The previous fix confused a selector with a segment prefix.  Fix it.
      
      Compile-tested only.
      
      Cc: stable@vger.kernel.org
      Cc: Juergen Gross <jgross@suse.com>
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Fixes: 4809146b
      
       ("x86/ldt: Correct FPU emulation access to LDT")
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6afc9ab2
    • Juergen Gross's avatar
      x86/ldt: Correct FPU emulation access to LDT · 542402ab
      Juergen Gross authored
      commit 4809146b upstream.
      
      Commit 37868fe1
      
       ("x86/ldt: Make modify_ldt synchronous")
      introduced a new struct ldt_struct anchored at mm->context.ldt.
      
      Adapt the x86 fpu emulation code to use that new structure.
      
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: billm@melbpc.org.au
      Link: http://lkml.kernel.org/r/1438883674-1240-1-git-send-email-jgross@suse.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      542402ab
    • Juergen Gross's avatar
      x86/ldt: Correct LDT access in single stepping logic · 8993fe75
      Juergen Gross authored
      commit 136d9d83 upstream.
      
      Commit 37868fe1
      
       ("x86/ldt: Make modify_ldt synchronous")
      introduced a new struct ldt_struct anchored at mm->context.ldt.
      
      convert_ip_to_linear() was changed to reflect this, but indexing
      into the ldt has to be changed as the pointer is no longer void *.
      
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@suse.de
      Link: http://lkml.kernel.org/r/1438848278-12906-1-git-send-email-jgross@suse.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      8993fe75
    • Andy Lutomirski's avatar
      x86/ldt: Make modify_ldt synchronous · 62fc7228
      Andy Lutomirski authored
      commit 37868fe1
      
       upstream.
      
      modify_ldt() has questionable locking and does not synchronize
      threads.  Improve it: redesign the locking and synchronize all
      threads' LDTs using an IPI on all modifications.
      
      This will dramatically slow down modify_ldt in multithreaded
      programs, but there shouldn't be any multithreaded programs that
      care about modify_ldt's performance in the first place.
      
      This fixes some fallout from the CVE-2015-5157 fixes.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      62fc7228
    • Peter Zijlstra's avatar
      rcu: Move lockless_dereference() out of rcupdate.h · 554b079b
      Peter Zijlstra authored
      commit 0a04b016
      
       upstream.
      
      I want to use lockless_dereference() from seqlock.h, which would mean
      including rcupdate.h from it, however rcupdate.h already includes
      seqlock.h.
      
      Avoid this by moving lockless_dereference() into compiler.h. This is
      somewhat tricky since it uses smp_read_barrier_depends() which isn't
      available there, but its a CPP macro so we can get away with it.
      
      The alternative would be moving it into asm/barrier.h, but that would
      be updating each arch (I can do if people feel that is more
      appropriate).
      
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      554b079b
    • Paul E. McKenney's avatar
      rcu: Provide counterpart to rcu_dereference() for non-RCU situations · afadae32
      Paul E. McKenney authored
      commit 54ef6df3
      
       upstream.
      
      Although rcu_dereference() and friends can be used in situations where
      object lifetimes are being managed by something other than RCU, the
      resulting sparse and lockdep-RCU noise can be annoying.  This commit
      therefore supplies a lockless_dereference(), which provides the
      protection for dereferences without the RCU-related debugging noise.
      
      Reported-by: default avatarAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      afadae32
    • Peter Zijlstra's avatar
      arch: Introduce smp_load_acquire(), smp_store_release() · ad3b8fca
      Peter Zijlstra authored
      commit 47933ad4
      
       upstream.
      
      A number of situations currently require the heavyweight smp_mb(),
      even though there is no need to order prior stores against later
      loads.  Many architectures have much cheaper ways to handle these
      situations, but the Linux kernel currently has no portable way
      to make use of them.
      
      This commit therefore supplies smp_load_acquire() and
      smp_store_release() to remedy this situation.  The new
      smp_load_acquire() primitive orders the specified load against
      any subsequent reads or writes, while the new smp_store_release()
      primitive orders the specifed store against any prior reads or
      writes.  These primitives allow array-based circular FIFOs to be
      implemented without an smp_mb(), and also allow a theoretical
      hole in rcu_assign_pointer() to be closed at no additional
      expense on most architectures.
      
      In addition, the RCU experience transitioning from explicit
      smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
      and rcu_assign_pointer(), respectively resulted in substantial
      improvements in readability.  It therefore seems likely that
      replacing other explicit barriers with smp_load_acquire() and
      smp_store_release() will provide similar benefits.  It appears
      that roughly half of the explicit barriers in core kernel code
      might be so replaced.
      
      [Changelog by PaulMck]
      
      Reviewed-by: default avatar"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ad3b8fca
    • Andy Lutomirski's avatar
      x86/nmi/64: Switch stacks on userspace NMI entry · e0de15fc
      Andy Lutomirski authored
      commit 9b6e6a83
      
       upstream.
      
      Returning to userspace is tricky: IRET can fail, and ESPFIX can
      rearrange the stack prior to IRET.
      
      The NMI nesting fixup relies on a precise stack layout and
      atomic IRET.  Rather than trying to teach the NMI nesting fixup
      to handle ESPFIX and failed IRET, punt: run NMIs that came from
      user mode on the normal kernel stack.
      
      This will make some nested NMIs visible to C code, but the C
      code is okay with that.
      
      As a side effect, this should speed up perf: it eliminates an
      RDMSR when NMIs come from user mode.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e0de15fc
    • Andy Lutomirski's avatar
      x86/nmi/64: Remove asm code that saves CR2 · 3682103f
      Andy Lutomirski authored
      commit 0e181bb5
      
       upstream.
      
      Now that do_nmi saves CR2, we don't need to save it in asm.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3682103f
    • Andy Lutomirski's avatar
      x86/nmi: Enable nested do_nmi() handling for 64-bit kernels · 012ace61
      Andy Lutomirski authored
      commit 9d050416
      
       upstream.
      
      32-bit kernels handle nested NMIs in C.  Enable the exact same
      handling on 64-bit kernels as well.  This isn't currently
      necessary, but it will become necessary once the asm code starts
      allowing limited nesting.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      012ace61
    • NeilBrown's avatar
      md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies · 44174364
      NeilBrown authored
      commit 423f04d6 upstream.
      
      raid1_end_read_request() assumes that the In_sync bits are consistent
      with the ->degaded count.
      raid1_spare_active updates the In_sync bit before the ->degraded count
      and so exposes an inconsistency, as does error()
      So extend the spinlock in raid1_spare_active() and error() to hide those
      inconsistencies.
      
      This should probably be part of
        Commit: 34cab6f4 ("md/raid1: fix test for 'was read error from
        last working device'.")
      as it addresses the same issue.  It fixes the same bug and should go
      to -stable for same reasons.
      
      Fixes: 76073054
      
       ("md/raid1: clean up read_balance.")
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      44174364
    • Joseph Qi's avatar
      ocfs2: fix BUG in ocfs2_downconvert_thread_do_work() · df691629
      Joseph Qi authored
      commit 209f7512
      
       upstream.
      
      The "BUG_ON(list_empty(&osb->blocked_lock_list))" in
      ocfs2_downconvert_thread_do_work can be triggered in the following case:
      
      ocfs2dc has firstly saved osb->blocked_lock_count to local varibale
      processed, and then processes the dentry lockres.  During the dentry
      put, it calls iput and then deletes rw, inode and open lockres from
      blocked list in ocfs2_mark_lockres_freeing.  And this causes the
      variable `processed' to not reflect the number of blocked lockres to be
      processed, which triggers the BUG.
      
      Signed-off-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      df691629
    • Marcus Gelderie's avatar
      ipc: modify message queue accounting to not take kernel data structures into account · be797479
      Marcus Gelderie authored
      commit de54b9ac upstream.
      
      A while back, the message queue implementation in the kernel was
      improved to use btrees to speed up retrieval of messages, in commit
      d6629859 ("ipc/mqueue: improve performance of send/recv").
      
      That patch introducing the improved kernel handling of message queues
      (using btrees) has, as a by-product, changed the meaning of the QSIZE
      field in the pseudo-file created for the queue.  Before, this field
      reflected the size of the user-data in the queue.  Since, it also takes
      kernel data structures into account.  For example, if 13 bytes of user
      data are in the queue, on my machine the file reports a size of 61
      bytes.
      
      There was some discussion on this topic before (for example
      https://lkml.org/lkml/2014/10/1/115).  Commenting on a th lkml, Michael
      Kerrisk gave the following background
      (https://lkml.org/lkml/2015/6/16/74
      
      ):
      
          The pseudofiles in the mqueue filesystem (usually mounted at
          /dev/mqueue) expose fields with metadata describing a message
          queue. One of these fields, QSIZE, as originally implemented,
          showed the total number of bytes of user data in all messages in
          the message queue, and this feature was documented from the
          beginning in the mq_overview(7) page. In 3.5, some other (useful)
          work happened to break the user-space API in a couple of places,
          including the value exposed via QSIZE, which now includes a measure
          of kernel overhead bytes for the queue, a figure that renders QSIZE
          useless for its original purpose, since there's no way to deduce
          the number of overhead bytes consumed by the implementation.
          (The other user-space breakage was subsequently fixed.)
      
      This patch removes the accounting of kernel data structures in the
      queue.  Reporting the size of these data-structures in the QSIZE field
      was a breaking change (see Michael's comment above).  Without the QSIZE
      field reporting the total size of user-data in the queue, there is no
      way to deduce this number.
      
      It should be noted that the resource limit RLIMIT_MSGQUEUE is counted
      against the worst-case size of the queue (in both the old and the new
      implementation).  Therefore, the kernel overhead accounting in QSIZE is
      not necessary to help the user understand the limitations RLIMIT imposes
      on the processes.
      
      Signed-off-by: default avatarMarcus Gelderie <redmnic@gmail.com>
      Acked-by: default avatarDoug Ledford <dledford@redhat.com>
      Acked-by: default avatarMichael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: John Duffy <jb_duffy@btinternet.com>
      Cc: Arto Bendiken <arto@bendiken.net>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      be797479
    • Dan Carpenter's avatar
      ALSA: hda - fix cs4210_spdif_automute() · 525bd2e4
      Dan Carpenter authored
      commit 44008f08 upstream.
      
      Smatch complains that we have nested checks for "spdif_present".  It
      turns out the current behavior isn't correct, we should remove the first
      check and keep the second.
      
      Fixes: 1077a024
      
       ('ALSA: hda - Use generic parser for Cirrus codec driver')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      525bd2e4
    • Nicholas Bellinger's avatar
      iscsi-target: Fix iscsit_start_kthreads failure OOPs · 7a2c1a8e
      Nicholas Bellinger authored
      commit e5419865 upstream.
      
      This patch fixes a regression introduced with the following commit
      in v4.0-rc1 code, where a iscsit_start_kthreads() failure triggers
      a NULL pointer dereference OOPs:
      
          commit 88dcd2da
      
      
          Author: Nicholas Bellinger <nab@linux-iscsi.org>
          Date:   Thu Feb 26 22:19:15 2015 -0800
      
              iscsi-target: Convert iscsi_thread_set usage to kthread.h
      
      To address this bug, move iscsit_start_kthreads() immediately
      preceeding the transmit of last login response, before signaling
      a successful transition into full-feature-phase within existing
      iscsi_target_do_tx_login_io() logic.
      
      This ensures that no target-side resource allocation failures can
      occur after the final login response has been successfully sent.
      
      Also, it adds a iscsi_conn->rx_login_comp to allow the RX thread
      to sleep to prevent other socket related failures until the final
      iscsi_post_login_handler() call is able to complete.
      
      Cc: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarNicholas Bellinger <nab@daterainc.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7a2c1a8e
    • Roger Quadros's avatar
      ARM: OMAP2+: hwmod: Fix _wait_target_ready() for hwmods without sysc · 6c1bb29f
      Roger Quadros authored
      commit 9a258afa upstream.
      
      For hwmods without sysc, _init_mpu_rt_base(oh) won't be called and so
      _find_mpu_rt_port(oh) will return NULL thus preventing ready state check
      on those modules after the module is enabled.
      
      This can potentially cause a bus access error if the module is accessed
      before the module is ready.
      
      Fix this by unconditionally calling _init_mpu_rt_base() during hwmod
      _init(). Do ioremap only if we need SYSC access.
      
      Eventhough _wait_target_ready() check doesn't really need MPU RT port but
      just the PRCM registers, we still mandate that the hwmod must have an
      MPU RT port if ready state check needs to be done. Else it would mean that
      the module is not accessible by MPU so there is no point in waiting
      for target to be ready.
      
      e.g. this fixes the below DCAN bus access error on AM437x-gp-evm.
      
      [   16.672978] ------------[ cut here ]------------
      [   16.677885] WARNING: CPU: 0 PID: 1580 at drivers/bus/omap_l3_noc.c:147 l3_interrupt_handler+0x234/0x35c()
      [   16.687946] 44000000.ocp:L3 Custom Error: MASTER M2 (64-bit) TARGET L4_PER_0 (Read): Data Access in User mode during Functional access
      [   16.700654] Modules linked in: xhci_hcd btwilink ti_vpfe dwc3 videobuf2_core ov2659 bluetooth v4l2_common videodev ti_am335x_adc kfifo_buf industrialio c_can_platform videobuf2_dma_contig media snd_soc_tlv320aic3x pixcir_i2c_ts c_can dc
      [   16.731144] CPU: 0 PID: 1580 Comm: rpc.statd Not tainted 3.14.26-02561-gf733aa036398 #180
      [   16.739747] Backtrace:
      [   16.742336] [<c0011108>] (dump_backtrace) from [<c00112a4>] (show_stack+0x18/0x1c)
      [   16.750285]  r6:00000093 r5:00000009 r4:eab5b8a8 r3:00000000
      [   16.756252] [<c001128c>] (show_stack) from [<c05a4418>] (dump_stack+0x20/0x28)
      [   16.763870] [<c05a43f8>] (dump_stack) from [<c0037120>] (warn_slowpath_common+0x6c/0x8c)
      [   16.772408] [<c00370b4>] (warn_slowpath_common) from [<c00371e4>] (warn_slowpath_fmt+0x38/0x40)
      [   16.781550]  r8:c05d1f90 r7:c0730844 r6:c0730448 r5:80080003 r4:ed0cd210
      [   16.788626] [<c00371b0>] (warn_slowpath_fmt) from [<c027fa94>] (l3_interrupt_handler+0x234/0x35c)
      [   16.797968]  r3:ed0cd480 r2:c0730508
      [   16.801747] [<c027f860>] (l3_interrupt_handler) from [<c0063758>] (handle_irq_event_percpu+0x54/0x1bc)
      [   16.811533]  r10:ed005600 r9:c084855b r8:0000002a r7:00000000 r6:00000000 r5:0000002a
      [   16.819780]  r4:ed0e6d80
      [   16.822453] [<c0063704>] (handle_irq_event_percpu) from [<c00638f0>] (handle_irq_event+0x30/0x40)
      [   16.831789]  r10:eb2b6938 r9:eb2b6960 r8:bf011420 r7:fa240100 r6:00000000 r5:0000002a
      [   16.840052]  r4:ed005600
      [   16.842744] [<c00638c0>] (handle_irq_event) from [<c00661d8>] (handle_fasteoi_irq+0x74/0x128)
      [   16.851702]  r4:ed005600 r3:00000000
      [   16.855479] [<c0066164>] (handle_fasteoi_irq) from [<c0063068>] (generic_handle_irq+0x28/0x38)
      [   16.864523]  r4:0000002a r3:c0066164
      [   16.868294] [<c0063040>] (generic_handle_irq) from [<c000ef60>] (handle_IRQ+0x38/0x8c)
      [   16.876612]  r4:c081c640 r3:00000202
      [   16.880380] [<c000ef28>] (handle_IRQ) from [<c00084f0>] (gic_handle_irq+0x30/0x5c)
      [   16.888328]  r6:eab5ba38 r5:c0804460 r4:fa24010c r3:00000100
      [   16.894303] [<c00084c0>] (gic_handle_irq) from [<c05a8d80>] (__irq_svc+0x40/0x50)
      [   16.902193] Exception stack(0xeab5ba38 to 0xeab5ba80)
      [   16.907499] ba20:                                                       00000000 00000006
      [   16.916108] ba40: fa1d0000 fa1d0008 ed3d3000 eab5bab4 ed3d3460 c0842af4 bf011420 eb2b6960
      [   16.924716] ba60: eb2b6938 eab5ba8c eab5ba90 eab5ba80 bf035220 bf07702c 600f0013 ffffffff
      [   16.933317]  r7:eab5ba6c r6:ffffffff r5:600f0013 r4:bf07702c
      [   16.939317] [<bf077000>] (c_can_plat_read_reg_aligned_to_16bit [c_can_platform]) from [<bf035220>] (c_can_get_berr_counter+0x38/0x64 [c_can])
      [   16.952696] [<bf0351e8>] (c_can_get_berr_counter [c_can]) from [<bf010294>] (can_fill_info+0x124/0x15c [can_dev])
      [   16.963480]  r5:ec8c9740 r4:ed3d3000
      [   16.967253] [<bf010170>] (can_fill_info [can_dev]) from [<c0502fa8>] (rtnl_fill_ifinfo+0x58c/0x8fc)
      [   16.976749]  r6:ec8c9740 r5:ed3d3000 r4:eb2b6780
      [   16.981613] [<c0502a1c>] (rtnl_fill_ifinfo) from [<c0503408>] (rtnl_dump_ifinfo+0xf0/0x1dc)
      [   16.990401]  r10:ec8c9740 r9:00000000 r8:00000000 r7:00000000 r6:ebd4d1b4 r5:ed3d3000
      [   16.998671]  r4:00000000
      [   17.001342] [<c0503318>] (rtnl_dump_ifinfo) from [<c050e6e4>] (netlink_dump+0xa8/0x1e0)
      [   17.009772]  r10:00000000 r9:00000000 r8:c0503318 r7:ebf3e6c0 r6:ebd4d1b4 r5:ec8c9740
      [   17.018050]  r4:ebd4d000
      [   17.020714] [<c050e63c>] (netlink_dump) from [<c050ec10>] (__netlink_dump_start+0x104/0x154)
      [   17.029591]  r6:eab5bd34 r5:ec8c9980 r4:ebd4d000
      [   17.034454] [<c050eb0c>] (__netlink_dump_start) from [<c0505604>] (rtnetlink_rcv_msg+0x110/0x1f4)
      [   17.043778]  r7:00000000 r6:ec8c9980 r5:00000f40 r4:ebf3e6c0
      [   17.049743] [<c05054f4>] (rtnetlink_rcv_msg) from [<c05108e8>] (netlink_rcv_skb+0xb4/0xc8)
      [   17.058449]  r8:eab5bdac r7:ec8c9980 r6:c05054f4 r5:ec8c9980 r4:ebf3e6c0
      [   17.065534] [<c0510834>] (netlink_rcv_skb) from [<c0504134>] (rtnetlink_rcv+0x24/0x2c)
      [   17.073854]  r6:ebd4d000 r5:00000014 r4:ec8c9980 r3:c0504110
      [   17.079846] [<c0504110>] (rtnetlink_rcv) from [<c05102ac>] (netlink_unicast+0x180/0x1ec)
      [   17.088363]  r4:ed0c6800 r3:c0504110
      [   17.092113] [<c051012c>] (netlink_unicast) from [<c0510670>] (netlink_sendmsg+0x2ac/0x380)
      [   17.100813]  r10:00000000 r8:00000008 r7:ec8c9980 r6:ebd4d000 r5:eab5be70 r4:eab5bee4
      [   17.109083] [<c05103c4>] (netlink_sendmsg) from [<c04dfdb4>] (sock_sendmsg+0x90/0xb0)
      [   17.117305]  r10:00000000 r9:eab5a000 r8:becdda3c r7:0000000c r6:ea978400 r5:eab5be70
      [   17.125563]  r4:c05103c4
      [   17.128225] [<c04dfd24>] (sock_sendmsg) from [<c04e1c28>] (SyS_sendto+0xb8/0xdc)
      [   17.136001]  r6:becdda5c r5:00000014 r4:ecd37040
      [   17.140876] [<c04e1b70>] (SyS_sendto) from [<c000e680>] (ret_fast_syscall+0x0/0x30)
      [   17.148923]  r10:00000000 r8:c000e804 r7:00000122 r6:becdda5c r5:0000000c r4:becdda5c
      [   17.157169] ---[ end trace 2b71e15b38f58bad ]---
      
      Fixes: 6423d6df
      
       ("ARM: OMAP2+: hwmod: check for module address space during init")
      Signed-off-by: default avatarRoger Quadros <rogerq@ti.com>
      Signed-off-by: default avatarPaul Walmsley <paul@pwsan.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6c1bb29f
    • Herbert Xu's avatar
      crypto: ixp4xx - Remove bogus BUG_ON on scattered dst buffer · 2204205e
      Herbert Xu authored
      commit f898c522
      
       upstream.
      
      This patch removes a bogus BUG_ON in the ablkcipher path that
      triggers when the destination buffer is different from the source
      buffer and is scattered.
      
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2204205e
  2. Aug 19, 2015
    • Andy Lutomirski's avatar
      x86/xen: Probe target addresses in set_aliased_prot() before the hypercall · 21934b88
      Andy Lutomirski authored
      commit aa1acff3
      
       upstream.
      
      The update_va_mapping hypercall can fail if the VA isn't present
      in the guest's page tables.  Under certain loads, this can
      result in an OOPS when the target address is in unpopulated vmap
      space.
      
      While we're at it, add comments to help explain what's going on.
      
      This isn't a great long-term fix.  This code should probably be
      changed to use something like set_memory_ro.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <dvrabel@cantab.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      21934b88
    • Axel Lin's avatar
      ASoC: pcm1681: Fix setting de-emphasis sampling rate selection · 2befcb5c
      Axel Lin authored
      commit fa8173a3
      
       upstream.
      
      The de-emphasis sampling rate selection is controlled by BIT[3:4] of
      PCM1681_DEEMPH_CONTROL register. Do proper left shift to set it.
      
      Signed-off-by: default avatarAxel Lin <axel.lin@ingics.com>
      Acked-by: default avatarMarek Belisko <marek.belisko@streamunlimited.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2befcb5c
    • Benjamin Randazzo's avatar
      md: use kzalloc() when bitmap is disabled · 6578b22c
      Benjamin Randazzo authored
      commit b6878d9e
      
       upstream.
      
      In drivers/md/md.c get_bitmap_file() uses kmalloc() for creating a
      mdu_bitmap_file_t called "file".
      
      5769         file = kmalloc(sizeof(*file), GFP_NOIO);
      5770         if (!file)
      5771                 return -ENOMEM;
      
      This structure is copied to user space at the end of the function.
      
      5786         if (err == 0 &&
      5787             copy_to_user(arg, file, sizeof(*file)))
      5788                 err = -EFAULT
      
      But if bitmap is disabled only the first byte of "file" is initialized
      with zero, so it's possible to read some bytes (up to 4095) of kernel
      space memory from user space. This is an information leak.
      
      5775         /* bitmap disabled, zero the first byte and copy out */
      5776         if (!mddev->bitmap_info.file)
      5777                 file->pathname[0] = '\0';
      
      Signed-off-by: default avatarBenjamin Randazzo <benjamin@randazzo.fr>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6578b22c
    • David S. Miller's avatar
      sparc64: Fix userspace FPU register corruptions. · 94649903
      David S. Miller authored
      [ Upstream commit 44922150
      
       ]
      
      If we have a series of events from userpsace, with %fprs=FPRS_FEF,
      like follows:
      
      ETRAP
      	ETRAP
      		VIS_ENTRY(fprs=0x4)
      		VIS_EXIT
      		RTRAP (kernel FPU restore with fpu_saved=0x4)
      	RTRAP
      
      We will not restore the user registers that were clobbered by the FPU
      using kernel code in the inner-most trap.
      
      Traps allocate FPU save slots in the thread struct, and FPU using
      sequences save the "dirty" FPU registers only.
      
      This works at the initial trap level because all of the registers
      get recorded into the top-level FPU save area, and we'll return
      to userspace with the FPU disabled so that any FPU use by the user
      will take an FPU disabled trap wherein we'll load the registers
      back up properly.
      
      But this is not how trap returns from kernel to kernel operate.
      
      The simplest fix for this bug is to always save all FPU register state
      for anything other than the top-most FPU save area.
      
      Getting rid of the optimized inner-slot FPU saving code ends up
      making VISEntryHalf degenerate into plain VISEntry.
      
      Longer term we need to do something smarter to reinstate the partial
      save optimizations.  Perhaps the fundament error is having trap entry
      and exit allocate FPU save slots and restore register state.  Instead,
      the VISEntry et al. calls should be doing that work.
      
      This bug is about two decades old.
      
      Reported-by: default avatarJames Y Knight <jyknight@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      94649903
    • Xie XiuQi's avatar
      ipmi: fix timeout calculation when bmc is disconnected · 1602b97c
      Xie XiuQi authored
      commit e21404dc
      
       upstream.
      
      Loading ipmi_si module while bmc is disconnected, we found the timeout
      is longer than 5 secs.  Actually it takes about 3 mins and 20
      secs.(HZ=250)
      
      error message as below:
        Dec 12 19:08:59 linux kernel: IPMI BT: timeout in RD_WAIT [ ] 1 retries left
        Dec 12 19:08:59 linux kernel: BT: write 4 bytes seq=0x01 03 18 00 01
        [...]
        Dec 12 19:12:19 linux kernel: IPMI BT: timeout in RD_WAIT [ ]
        Dec 12 19:12:19 linux kernel: failed 2 retries, sending error response
        Dec 12 19:12:19 linux kernel: IPMI: BT reset (takes 5 secs)
        Dec 12 19:12:19 linux kernel: IPMI BT: flag reset [ ]
      
      Function wait_for_msg_done() use schedule_timeout_uninterruptible(1) to
      sleep 1 tick, so we should subtract jiffies_to_usecs(1) instead of 100
      usecs from timeout.
      
      Reported-by: default avatarHu Shiyuan <hushiyuan@huawei.com>
      Signed-off-by: default avatarXie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: default avatarCorey Minyard <cminyard@mvista.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1602b97c
    • Mimi Zohar's avatar
      ima: extend "mask" policy matching support · a5423d5e
      Mimi Zohar authored
      commit 4351c294
      
       upstream.
      
      The current "mask" policy option matches files opened as MAY_READ,
      MAY_WRITE, MAY_APPEND or MAY_EXEC.  This patch extends the "mask"
      option to match files opened containing one of these modes.  For
      example, "mask=^MAY_READ" would match files opened read-write.
      
      Signed-off-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarDr. Greg Wettstein <gw@idfusion.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a5423d5e