
[Gcl-devel] NX and exec-shield kernel summary


From: root
Subject: [Gcl-devel] NX and exec-shield kernel summary
Date: Thu, 1 Jul 2004 00:42:33 -0400

Camm,

I follow the Linux Kernel mailing list and this was a summary
posting of the NX (no execute) and exec-shield status:
 

4. 'NX' Security Features Coming To 2.6

2 Jun - 8 Jun (66 posts) Archive Link: "[announce] [patch] NX (No
eXecute) support for x86, 2.6.7-rc2-bk2"

Topics: Executable File Format, Microsoft, Security, Spam, Virtual
Memory

People: Ingo Molnar, Linus Torvalds, Doug McNaught, Jakub Jelinek,
Brian Gerst, Christoph Hellwig, William Lee Irwin III, Andi Kleen,
Andy Lutomirski, Arjan van de Ven, Gerhard Mack, Jun Nakajima, Rusty
Russell

Ingo Molnar said on behalf of Red Hat:

we'd like to announce the availability of the following kernel patch:

http://redhat.com/~mingo/nx-patches/nx-2.6.7-rc2-bk2-AE

which makes use of the 'NX' x86 feature pioneered in AMD64 CPUs and
for which support has also been announced by Intel. (other x86 CPU
vendors, Transmeta and VIA announced support as well. Windows support
for NX has also been announced by Microsoft, for their next service
pack.) The NX feature is also being marketed as 'Enhanced Virus
Protection'. This patch makes sure Linux has full support for this
hardware feature on x86 too.

What does this patch do? The pagetable format of current x86 CPUs does
not have an 'execute' bit. This means that even if an application maps
a memory area without PROT_EXEC, the CPU will still allow code to be
executed in this memory. This property is often abused by exploits
when they manage to inject hostile code into this memory, for example
via a buffer overflow.

The NX feature changes this and adds a 'dont execute' bit to the PAE
pagetable format. But since the flag defaults to zero (for
compatibility reasons), all pages are executable by default and the
kernel has to be taught to make use of this bit.

If the NX feature is supported by the CPU then the patched kernel
turns on NX and it will enforce userspace executability constraints
such as a no-exec stack and no-exec mmap and data areas. This means
less chance for stack overflows and buffer-overflows to cause
exploits.
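
An illustrative aside, not part of Ingo's announcement: on Linux the
permissions the kernel has recorded for each mapping of a process can
be read from /proc/<pid>/maps, which is one way to see which areas
carry the execute permission that NX then enforces in hardware. The
sketch below parses that format. The function names and the sample
text are made up for illustration.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, name)."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    perms = fields[1]                      # e.g. "r-xp" or "rw-p"
    name = fields[5].strip() if len(fields) > 5 else ""
    return start, end, perms, name


def executable_regions(maps_text):
    """Return only the mappings whose permission string contains 'x'."""
    regions = [parse_maps_line(l) for l in maps_text.splitlines() if l.strip()]
    return [r for r in regions if "x" in r[2]]


# Made-up sample in the /proc/<pid>/maps format, for illustration only.
SAMPLE = """\
08048000-08056000 r-xp 00000000 03:01 12345      /bin/example
08056000-08058000 rw-p 0000d000 03:01 12345      /bin/example
08058000-08079000 rw-p 00000000 00:00 0          [heap]
bffdf000-c0000000 rw-p 00000000 00:00 0          [stack]
"""

for start, end, perms, name in executable_regions(SAMPLE):
    print(f"{start:08x}-{end:08x} {perms} {name}")
```

On a live system the same function can be fed the contents of
/proc/self/maps. A stack or heap line showing "rwxp" would indicate a
mapping that is still executable.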

furthermore, the patch also implements 'NX protection' for kernelspace
code: only the kernel code and modules are executable - so even
kernel-space overflows are harder (in some cases, impossible) to
exploit. Here is how kernel code that tries to execute off the stack
is stopped:

 kernel tried to access NX-protected page - exploit attempt? (uid: 500)
 Unable to handle kernel paging request at virtual address f78d0f40
  printing eip:
 ...

The patch is based on a prototype NX patch written for 2.4 by Intel -
special thanks go to Suresh Siddha and Jun Nakajima @ Intel. The
existing NX support in the 64-bit x86_64 kernels has been written by
Andi Kleen and this patch is modeled after his code.

Arjan van de Ven has also provided lots of feedback and he has
integrated the patch into the Fedora Core 2 kernel. Test rpms are
available for download at:

http://redhat.com/~arjanv/2.6/RPMS.kernel/

the kernel-2.6.6-1.411 rpms have the NX patch applied.

here's a quickstart to recompile the vanilla kernel from source with
the NX patch:

http://redhat.com/~mingo/nx-patches/QuickStart-NX.txt

There were a lot of technical suggestions and comments from folks like
Christoph Hellwig, Andi Kleen, Rusty Russell, and Gerhard Mack. Also,
Linus Torvalds asked:

Just out of interest - how many legacy apps are broken by this? I
assume it's a non-zero number, but wouldn't mind to be happily
surprised.

And do we have some way of on a per-process basis say "avoid NX
because this old version of Oracle/flash/whatever-binary-thing doesn't
run with it"?

In answer to the first question, Ingo and Arjan van de Ven (also from
Red Hat) confirmed that the amount of legacy breakage was in fact
zero. Ingo also explained that just in case, any breakage from this
would be less than other breakage already introduced by Red Hat. He
put it, "in the full install of FC1 and FC2 the number is zero - and
Fedora has exec-shield which does a couple of things more: it makes
the heap non-executable as well [this broke X], it randomizes the
address-space layout and has a 4:4 VM [which broke the Sun JVM]." Doug
McNaught added, close by, "Lisp systems like CMUCL and SBCL (plus
commercial Lisps) had problems with FC1 due to execshield. They tend
to do things like compile code on the fly to heap memory and expect to
be able to run it." And Jakub Jelinek (also from Red Hat) replied,
"They will still work, as long as you don't recompile them with recent
toolchain. When you recompile them, they either need to be taught to
DTRT (i.e. use mmap with PROT_EXEC for executable stuff), or can be
linked with -Wl,-z,execstack to mark them as needing executable
stack. prelink package also contains execstack(8) utility which can be
used on already linked binaries/shared libraries."
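
As a rough illustration of the "use mmap with PROT_EXEC" advice, the
sketch below asks for an anonymous mapping with the execute permission
stated explicitly, instead of assuming that heap memory happens to be
executable. A real Lisp runtime would do this in C. Python's mmap
module exposes the same flags on Unix, and the helper name
alloc_code_buffer is hypothetical.

```python
import mmap


def alloc_code_buffer(size):
    """Return an anonymous private mapping that is readable, writable
    and explicitly executable (Unix only)."""
    return mmap.mmap(
        -1,                                   # no backing file
        size,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC,
    )
```

A system with a stricter policy may refuse such a writable and
executable mapping, so callers should be prepared for an OSError.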

To Linus' second question, about the possibility of per-process
avoidance of NX for compatibility reasons, Ingo explained:

we have three mechanisms for this in Fedora:

   1. the PT_GNU_STACK flag itself - you can turn executability on/off
      compile-time or even after the fact via the execstack(8) utility
      Jakub wrote. This only affects the stack's executability - if an
      application assumes a non-PROT_EXEC mmap() can be executed it
      might still break with NX - but based on experience with Fedora
      Core i'd say there's almost no such application.

      this method works in 2.6 too, since it supports
      PT_GNU_STACK. gcc's PT_GNU_STACK mechanism is very conservative
      - e.g. if an application does an asm() then gcc assumes that it
      might rely on stack executability and emits the X
      flag. [applications can then turn this off in the source if
      stack executability is not required.] Likewise, if gcc emits
      trampolines then the X flag is emitted too. (glibc knows about
      PT_GNU_STACK all across - so e.g. if a nonexec stack application
      dlopen()s a library that needs stack executability then glibc
      makes the stack executable on the fly via
      PROT_GROWSDOWN/GROWSUP.)
   2. via a runtime method: via the i386 personality. So an
      application can trigger the 'legacy' Linux VM layout by e.g
      doing 'i386 java ./test.class'.

      this is a hack in Fedora - we wanted to have a finegrained
      runtime mechanism just in case. But it would be nice to have
      this upstream too - e.g. via a PERSONALITY_3G?

   3. via a kernel boot parameter (exec-shield=0)

      with the NX patch this becomes noexec=off [the same flag works
      on x86_64 too]. This method is the most inflexible one, and is a
      last-resort thing. (Fedora also has a runtime global switch to
      turn off the VM layout changes.)

here's a list of applications that we had to fix/work around in Fedora
when the VM layout changed:

    * emacs _rebuild_. (it coredumps itself during build ... xemacs is OK.)

    * some JDKs. Since they generate code and try to be as fast as
      possible they tend to rely more on VM details than normal
      applications.

    * X's module loader assumed that brk was executable. (fixed)

    * Wine. (it implements another OS so it's by definition very
      sensitive to layout changes.)

most of the breakages were unclean x86-only code that would have
broken if ported over to 64-bit anyway.

old, legacy applications dont have the PT_GNU_STACK flag so they all
work fine.
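
To make the PT_GNU_STACK marking concrete, here is a minimal sketch
that looks for that program header in an ELF image and reports whether
its flags include the execute bit. It is not the execstack(8)
implementation. It reads only the handful of ELF header fields it
needs, and the function names are hypothetical. The constants
PT_GNU_STACK (0x6474e551) and PF_X (0x1) are the standard ELF values.

```python
import struct

PT_GNU_STACK = 0x6474E551   # program header type carrying the stack marking
PF_X = 0x1                  # "executable" bit in p_flags


def gnu_stack_flags(data):
    """Return p_flags of the PT_GNU_STACK header, or None if it is absent."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is64 = data[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"      # EI_DATA: 1 = little-endian
    if is64:
        (e_phoff,) = struct.unpack_from(end + "Q", data, 32)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 54)
    else:
        (e_phoff,) = struct.unpack_from(end + "I", data, 28)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 42)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(end + "I", data, off)
        if p_type == PT_GNU_STACK:
            # p_flags sits at offset 4 in a 64-bit header, 24 in a 32-bit one.
            (p_flags,) = struct.unpack_from(end + "I", data, off + (4 if is64 else 24))
            return p_flags
    return None


def stack_marking(data):
    """Classify an ELF image as 'unmarked', 'executable' or 'non-executable'."""
    flags = gnu_stack_flags(data)
    if flags is None:
        return "unmarked"                   # legacy binary, no PT_GNU_STACK
    return "executable" if flags & PF_X else "non-executable"
```

For example, stack_marking(open("/bin/ls", "rb").read()) classifies an
installed binary. An "unmarked" result corresponds to the legacy case
Ingo describes, where the flag is missing and the old behaviour is kept.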

Regarding Wine's breakage when the Virtual Memory Subsystem changed,
Brian Gerst disagreed with Ingo's explanation, and remarked, "Wine
breaks because of the part of exec-shield that relocates shared libs
to low addresses, where the (stripped) Windows binaries expect to be
loaded at. NX stack doesn't affect it." Ingo accepted this, adding, "I
think Wine could get around this by creating a dummy ELF section in
the Wine binary that covers the first 1GB or so. Wine could still use
ordinary dynamic libraries - those would go above that 1GB. Then once
Wine has loaded up it can munmap() that first 1GB. (this would not
work if Wine has to dlopen() new libraries after this phase - does
that happen?)" But Christoph Hellwig suggested, "Why can't wine just
implement its own binfmt_pecoff? Sounds like the much simpler
solution." And William Lee Irwin III said, "I'd be in favor of this
also. An executable format with wide enough usage is worth adding
kernel support for loading it."

Ingo replied also to his own long post, dealing with his item 2 above,
the runtime method of triggering the legacy Linux VM subsystem. He
said:

i've attached a patch that provides a cleaner solution. It does 3 changes:

    * it adds a ADDR_SPACE_EXECUTABLE bit to the personality 'bug
      bits' section. This bit if set will make the stack
      executable. (if in the future we decide to make the malloc()
      heap non-exec [which i definitely think we should], that
      property will also listen to this bit.)

    * in elf.h, it changes the x86 personality inheritance code to
      match that of x86_64 - which is a much saner method. This means
      that the programs a complex app exec()s will all run with the
      personality of the parent(s).

    * in exec.c, since address-space executability is a
      security-relevant item, we must clear the personality when we
      exec a setuid binary. I believe this is also a (small) security
      robustness fix for current 64-bit architectures.

(the patch also adds a break to the elf_ex.e_phnum loop - there can
only be one STACK header in the binary and once we found it we should
not iterate through the remaining program headers (if any).)

we didnt want to add a non-standard personality flag to Fedora so we
abused PER_LINUX32 as the compatibility flag - but this only works on
x86. With the ADDR_SPACE_EXECUTABLE flag there would be a standard
method to fall back to 'legacy' executability assumptions Linux
applications might make.
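
The three changes can be modelled as simple bit logic. This is only a
sketch of the behaviour described above, not the patch itself: the bit
value chosen for ADDR_SPACE_EXECUTABLE is hypothetical (the summary
does not give it), "clearing" is assumed to mean resetting the
personality to zero, and both function names are invented.

```python
# Hypothetical bit position, chosen only for illustration.
ADDR_SPACE_EXECUTABLE = 1 << 22


def wants_executable_stack(personality):
    """True if the personality asks for the legacy executable stack."""
    return bool(personality & ADDR_SPACE_EXECUTABLE)


def personality_after_exec(parent_personality, target_is_setuid):
    """Model of the described inheritance rule: a child inherits the
    parent's personality, except that exec of a setuid binary starts
    from a cleared personality (assumed here to be 0)."""
    return 0 if target_is_setuid else parent_personality


legacy = ADDR_SPACE_EXECUTABLE
print(wants_executable_stack(personality_after_exec(legacy, False)))  # inherited
print(wants_executable_stack(personality_after_exec(legacy, True)))   # cleared
```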

Andi replied to the third item in the list above, regarding clearing
the 'personality' when executing a setuid binary. He said, "This means
I cannot easily force an i386 uname or 3GB address space on suid
programs anymore on x86-64. While in theory it could be a small
security problem I think the utility is much greater. It's hard to see
how setting NX could cause a security hole. The program may crash, but
it is unlikely to be exploitable." Andy Lutomirski replied:

The whole point of NX, though, is that it prevents certain classes of
exploits. If a setuid binary is vulnerable to one of these, then
Ingo's patch "fixes" it. Your approach breaks that.

I don't like Ingo's fix either, though. At least it should check
CAP_PTRACE or some such. A better fix would be for LSM to pass down a
flag indicating a change of security context. I'll throw that in to my
caps/apply_creds cleanup, in case that ever gets applied.

Andi thought it would be overkill to require an LSM module, but he
agreed that Andy had a good point, although Andi also objected, "that
only applies to the NX personality bit. For the uname emulation it is
not an issue. So maybe the dropping on exec should only zero a few
selected personality bits, but not all." This made sense to Andy; and
close by, Ingo said, "ok, how about the attached patch then? There's a
PERS_DROP_ON_SUID mask that we drop upon setuid - all the other
personality bits get inherited." Andy replied, "This is wrong on
SELinux (and presumably with other LSMs). It also does unexpected
things if you fail to exec a setuid executable." He posted his own
patch, and Linus Torvalds came in with:

Let's not do this at all.

Anything that changes subtle behaviour at suid-execute time is just
wrong. Imagine an app that has been tested in normal use, and then has
a subtle bug when executed set-uid, simply because the address space
layout changes. Or something that mysteriously works when you're root,
but not when you're anything else. Ouch.

I think we should just look at the executable itself, not whether it
is suid. If the executable says it is "NX-approved", then it's
NX-approved. End of story - just try to make sure that as many
executables as possible get compiled with the newer compiler suite
that enables it.

Add a tool to let people turn on/off the NX bit on an executable if it
turns out the executable can't work with it (let's say it was compiled
and tested on a CPU without NX support), and everybody should be
happy. You can have a trivial script that turns on the NX bit on all
the legacy apps too, and then if testing shows it wasn't a good idea,
you can turn it off again on a per-executable basis.

Ingo did a bunch more work, posting patches; and Arjan also remarked,
"the prelink rpm on Fedora has such a tool" [to flip the NX bit on an
executable] "already fwiw. (it's part of prelink because the elf
manipulations needed are quite similar to the ones prelink does so
infrastructure is shared)" Linus replied:

Just for fun, can somebody that has the required hardware just test
old apps with NX turned on?

I know we used to put the signal handler trampoline on the stack, but
these days that should all be handled with the magic executable
syscall page, so _normally_ I don't think an old application should
even really care.

In fact, it would be interesting to just hear somebody running an
older distribution with a new CPU and a new kernel, and see just how
many programs need to be marked non-NX in "normal running".

Arjan replied, "I know that in a FC1 full install there are less than
5 binaries that don't run with NX. (one uses nested functions in C and
passes function pointers to the inner function around which causes gcc
to emit a stack trampoline, and gcc then marks the binary as non-NX,
the others have asm in them that we didn't fix in time to be properly
marked)." And Linus said:

If things are really that good, why are we even worrying about this?

It sounds like we should just have NX on by default even for
executables that don't have any NX info records, and have some way of
marking the (very few) executables that don't want it. Maybe have the
NX fault print a warning when it happens for an executable that
defaulted to NX on.

I think most people have seen the security disaster that causes most
of the emails on the net to be spam. So this should be _trivial_ to
explain to people when they complain about default behaviour breaking
their strange legacy app. Especially if there's a trivial tool to add
an elf section to make it work again.

So instead of having complex things to try to turn NX on for suid, we
should aim to turn it on as widely as possible, _even_if_ that means
that people who upgrade hardware might have to do some trivial MIS
stuff.

Make a kernel bootup option to default to legacy mode if somebody
literally has trouble booting and fixing their thing due to "init" or
similar being one of the problematic cases. Together with a printk()
that says which executable triggered, it should be trivial to clean up
a system.

 

5. exec-shield Patch Updated For 2.6.7-rc2-bk2

2 Jun - 4 Jun (4 posts) Archive Link: "[patch] exec-shield patch for
2.6.7-rc2-bk2, integrated with NX"

Topics: Big Memory Support

People: Ingo Molnar, Christoph Hellwig, Joe Korty

Ingo Molnar said:

Here's the latest exec-shield patch for 2.6.7-rc2-bk2, integrated with
the 'NX' feature (see the announcement from earlier today):

http://redhat.com/~mingo/exec-shield/exec-shield-on-nx-2.6.7-rc2-bk2-A7

you first have to apply the NX patch, which can be found at:

http://redhat.com/~mingo/nx-patches/nx-2.6.7-rc2-bk2-AE

prebuilt kernel RPMs for Fedora Core 2, with this latest version of
exec-shield, are available at:

http://redhat.com/~arjanv/2.6/RPMS.kernel/

(kernel-2.6.6-1.411 has this latest, NX-aware exec-shield.)

if the CPU supports NX (and the kernel has been compiled with
CONFIG_HIGHMEM64G) then exec-shield will use NX to provide page-level
finegrained control over execution. On legacy CPUs that dont support
NX the segment-limit method is used to control execution (in a coarser
way). In the NX case the segment-limit is turned off altogether.

e.g. on an Athlon64 box the boot message looks like:

NX (Execute Disable) protection: active

on a CPU without NX the boot message is:

NX (Execute Disable) protection: not present!
Using x86 segment limits to approximate NX protection

note: the NX patch will also protect against kernel-space code injection.

all the other components of exec-shield are identical between NX and
non-NX: the brk area is non-executable, libraries and PIE binaries are
moved into the ascii-shield as much as possible, and all aspects of
the address space are randomized.

Christoph Hellwig thought the patch was too big and included more
stuff than some folks would want. He asked, "Any chance to split this
up a bit? Having the pure non-exec stack (and maybe heap) would be
really nice while the randomization features are a little bit too much
security by obscurity for my taste." Joe Korty disagreed, "It's no
more security by obscurity than keeping your key secret is security by
obscurity. (One can think of the randomization as a white-noise key)."
Nevertheless, Ingo posted a new patch with the randomization code
removed. But he added, "but i still think randomization is useful as a
last-resort barrier, against worm-alike mass attacks. There's a huge
difference between a 1-packet infection and a 2-hour brute-force
search over broadband, in terms of the 'economy' of worms."
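
The "economy" argument can be put into rough numbers. The sketch below
assumes an attacker who must guess one of 2**bits equally likely
layouts and never repeats a guess, so the expected number of attempts
is (N + 1) / 2 for N possibilities. The entropy and guess-rate values
are placeholders, not figures from the thread.

```python
def expected_guesses(entropy_bits):
    """Expected attempts to hit one of 2**entropy_bits equally likely
    layouts when guessing without repetition."""
    n = 2 ** entropy_bits
    return (n + 1) / 2


def expected_seconds(entropy_bits, guesses_per_second):
    """Expected search time at a fixed guess rate."""
    return expected_guesses(entropy_bits) / guesses_per_second


# Placeholder numbers: 16 bits of randomization, 10 attempts per second.
print(expected_guesses(0))           # no randomization: one attempt suffices
print(expected_seconds(16, 10))      # seconds needed on average
```

With no randomization a single attempt suffices, which is the
"1-packet infection" case. Every added bit of entropy doubles the
expected search time.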





