Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document that the x86 port has known vulnerabilities #1108

Open
DemiMarie opened this issue Sep 27, 2023 · 14 comments
Open

Document that the x86 port has known vulnerabilities #1108

DemiMarie opened this issue Sep 27, 2023 · 14 comments

Comments

@DemiMarie
Copy link

DemiMarie commented Sep 27, 2023

The x86 port has at least at one point been formally verified at the C source code level. However, this means very little in practice, because nearly every x86 CPU is vulnerable to transient execution attacks that require mitigation in software. These mitigations must be implemented in assembler due to critical sections in which all control flow instructions are not safe.

Unfortunately, seL4 currently implements its mitigations in C. Worse, seL4 has not been mitigated for recent vulnerabilities such as Zenbleed, Inception, and Downfall. seL4 also lacks interrupt remapping, which means that it is possible for device drivers to at least crash the system and possibly escalate their privileges.

Mitigating these problems in user-space (via e.g. retpoline) is not feasible without enabling the KernelX86DangerousMSR option, and may be infeasible even then. For instance, the mitigation for Zenbleed required writing a “chicken bit” to an MSR.
Additionally, seL4 has no support for loading microcode.

Rather than continue to provide a kernel that is secure in theory but not in practice, I recommend that the x86 port be disabled by default. Enabling it would require an argument (such as -DBUILD_INSECURE_KERNEL_THAT_LEAKS_SECRETS:TRUE) that promises that the user knows the resulting kernel build will be insecure. Since seL4 is currently used almost exclusively in embedded systems, where the use of x86 is rare, I doubt this will impact any users.

I do not recommend trying to fix the x86 port, unless someone is willing to pay the seL4 Foundation to do it properly. Proper support means that the seL4 Foundation is included in the hardware vulnerability pre-disclosure channels that Intel and AMD maintain, so that it can have the necessary fixes when the embargo breaks. Proper support also means implementing the various speculation barriers in assembly. On processors vulnerable to Branch Type Confusion, it is also necessary to perform an IBPB at syscall entry for security, unless the kernel address space has no secrets at all.

@DemiMarie
Copy link
Author

Pinging @andyhhp who deals with these problems for the Xen Project hypervisor.

@lsf37
Copy link
Member

lsf37 commented Sep 27, 2023

The kernel does not actually promise information flow guarantees on x86, but I agree that the limitations of the x86 port should be made a lot clearer. There are no current plans to fix Intel's information flow problems on the software side.

@DemiMarie
Copy link
Author

The kernel does not actually promise information flow guarantees on x86, but I agree that the limitations of the x86 port should be made a lot clearer.

Does that include the risk of privilege escalation due to missing interrupt remapping, or is that a bug that will be fixed?

There are no current plans to fix Intel's information flow problems on the software side.

I don’t fault you for this at all — it would require a huge amount of effort that could not be used on other projects. That said, I do question if the current state of the port is actually useful in practice.

Also, what about speculative execution vulnerabilities on Arm and RISC-V? Some Arm CPUs are vulnerable to Meltdown and Spectre.

@DemiMarie
Copy link
Author

To elaborate: It is very, very hard to think of a use-case for the x86 port as it stands today. There are almost no workloads I can think of where confidentiality is not needed, and I doubt those workloads will be running on x86. Therefore, I suspect that the x86 port has (or at least should have) no users, and the value of a port with no users is not just zero but negative — it takes time away from the maintenance of actually-used ports.

@DemiMarie DemiMarie changed the title Remove the x86 port or document it as unsupported Document that the x86 port is unsupported and has known vulnerabilities Sep 27, 2023
@GWeck
Copy link

GWeck commented Sep 27, 2023

Does this mean that the Makatea project is now dead, too? If seL4 has these problems with the x86 architecture, it may not be worthwhile to spend more effort on building a secure system on top of this kernel on x86 processors.

Just my personal opinion: It's a pity that the Itanium processors, which do not seem to be affected by the speculative execution problems, were no commercial success and so did not replace a far inferior architecture which has its roots at a time long ago.

@lsf37
Copy link
Member

lsf37 commented Sep 27, 2023

This issue conflates too many things. The x86 port never had information flow proofs or guarantees. It is not unsupported, it just doesn't pretend to provide things that no other kernel provides either. Playing the catch-up game of trying to provide software fixes for underlying hardware vulnerabilities as they are discovered does not make anything actually secure. It may make it temporarily harder to exploit, but the fundamental problems tend to remain as we can see by new variants coming out every few months.

Patching may be good enough for specific applications, and with commercial interest and funding it would probably also make sense to add those known mitigations to seL4, but it is a budget/effort/pay-off question that currently comes out negative for the foundation.

The current state might not fit Qubes, but that doesn't mean the x86 port is useless or unsupported. There is ongoing work (e.g. #1098) on MSI issues for instance. Those may or may not be real problems for a specific system (afaics, they are real for using seL4 for Qubes), e.g. there are plenty of embedded systems with a way more controlled system architecture where none of these lead to security concerns, and that do run x86.

There is also work to show that speculative execution and other micro-architectural state does not fundamentally need to lead to information flow problems. There is some hope that the underlying mechanisms for that can make it into future processors, esp if the current untenable trend continues. That's more of a medium to long term perspective, though.

I do think that documentation of what the current seL4 port is good for and what not should be improved.

@DemiMarie DemiMarie changed the title Document that the x86 port is unsupported and has known vulnerabilities Document that the x86 port has known vulnerabilities Sep 27, 2023
@DemiMarie
Copy link
Author

This issue conflates too many things. The x86 port never had information flow proofs or guarantees. It is not unsupported, it just doesn't pretend to provide things that no other kernel provides either. Playing the catch-up game of trying to provide software fixes for underlying hardware vulnerabilities as they are discovered does not make anything actually secure. It may make it temporarily harder to exploit, but the fundamental problems tend to remain as we can see by new variants coming out every few months.

In my world (desktops and servers), people really do expect that a kernel with x86 support will play this came, and will consider a kernel that doesn’t to be vulnerable. I changed the title to reflect this. I dislike the current situation as much as anyone, but it’s the best that can be done on the hardware my users have.

Patching may be good enough for specific applications, and with commercial interest and funding it would probably also make sense to add those known mitigations to seL4, but it is a budget/effort/pay-off question that currently comes out negative for the foundation.

This is a valid decision, but it should be clearly documented so that people know to not use the x86 port for situations where this matters.

The current state might not fit Qubes, but that doesn't mean the x86 port is useless or unsupported. There is ongoing work (e.g. #1098) on MSI issues for instance. Those may or may not be real problems for a specific system (afaics, they are real for using seL4 for Qubes), e.g. there are plenty of embedded systems with a way more controlled system architecture where none of these lead to security concerns, and that do run x86.

Can you provide an example? Some of these vulnerabilities are equivalent to “any component can read arbitrary physical memory,” which is a security vulnerability in almost any system. The vulnerabilities that require IBPB-on-syscall-entry to fix can’t be mitigated by e.g. using retpolines and other mitigations in userland; they require kernel support.

The only case I can think of where these vulnerabilities do not matter are systems that does not deal with confidential information at all, but where the integrity and availability of a certain process (such as a real-time control loop) are safety-critical. An attacker being able to obtain all information about what a vehicle is doing is vaslty less severe than the same attacker being able to make the vehicle stop working or seize control of it.

There is also work to show that speculative execution and other micro-architectural state does not fundamentally need to lead to information flow problems. There is some hope that the underlying mechanisms for that can make it into future processors, esp if the current untenable trend continues. That's more of a medium to long term perspective, though.

As mentioned on various calls, I am highly skeptical about time protection being able to support fully dynamic, general-purpose systems such as Qubes OS. The reasons are that a general-purpose system must not let the CPU go idle for very long when there is work to be done, and can’t statically partition L2 and higher caches because it doesn’t have enough information. Therefore, a covert channel (in the form of overall system load) is guaranteed to be present and cannot be eliminated, and cache partitioning would need to be done at runtime in a way that adapts to workloads as they are created and assigned.

To be clear: I would LOVE if time protection could work, but the systems I work on represent the worst case for it where nothing is known at compile-time and leaving CPUs idle isn’t an option. If time protection can work for Qubes, it can probably work for anything. Please do try to prove me wrong here.

I do think that documentation of what the current seL4 port is good for and what not should be improved.

As mentioned above, a specific example where the current port is appropriate would be appreciated. Feel free to use Qubes OS as an example of where the current port is not appropriate.

@Indanz
Copy link
Contributor

Indanz commented Oct 4, 2023

unless the kernel address space has no secrets at all.

I would like to point out that it's easy to create a system with seL4 that has no secrets in kernel address space. seL4 itself only has bookkeeping which is not very interesting for attackers, all the interesting information is in user space. The only reason seL4 maps all memory is because any normal memory may be used for kernel objects.

All that is needed to guarantee that memory with sensitive information is not mapped in kernel space is to mark it as "device memory" instead of normal memory. On ARM and RISC-V this can be done via device tree with "no-map", x86 may need some other way to configure that. If memory is marked as device memory, it only means seL4 won't use it for kernel objects and won't map it. If you use seL4 as a secure hypervisor you just want a few hundred MBs for kernel objects and use the rest for the virtual machines.

Of course this doesn't stop speculative attacks that work between user space processes on the same core, or that give user space processes access to all physical memory even if it's unmapped. For those other mitigations are needed.

As for your other points:

  • No way to update microcode at runtime with seL4: True, but with the lack of a standard seL4 system, it is easier to just update the microcode via a BIOS update than adding a special system call to do it on the fly. Intel recommends to do microcode updates via the BIOS anyway.
  • Zenbleed can be fixed with a microcode update.
  • Downfall is like Zenbleed, a very specific hardware bug that should be fixed by microcode.
  • Inception is similar to Spectrev2 and needs similar measures: Flushing the RSB, which is done by x86_flush_rsb(), which is implemented in assembly btw. There is no support for SBPB, but there is for IBPB. Maybe the flushing needs to be improved a bit to also handle Inception.
  • Interrupt remapping is being worked on.

Another thing I would like to point out is that most systems are vulnerable to either Rowhammer or ECCploit kind of attacks, and yet they are still used. If you care about systems that are both practically and theoretically secure today, only use very simple hardware and software without speculation or SDRAM.

But you are right that seL4 doesn't have "proper" x86 support, it is a fairly small-scale open-source project which doesn't have those kind of resources yet. As it is, it's not worth adding workarounds to seL4 for CPU bugs in specific chips that will be fixed in either new hardware or with microcode updates. It would help if companies interested in using seL4 as a secure hypervisor would join the seL4 foundation, or if people interested in that use case would help with their time. When seL4 is deployed widely as a x86 hypervisor, support for individual chip bugs will make much more sense, but seL4 isn't at that stage yet.

But all-in-all, after looking into it, seL4's x86 support is much better than I expected, mostly thanks to @AdrianDanis. The situation isn't as dire as you make it seem to be.

@DemiMarie
Copy link
Author

unless the kernel address space has no secrets at all.

I would like to point out that it's easy to create a system with seL4 that has no secrets in kernel address space. seL4 itself only has bookkeeping which is not very interesting for attackers, all the interesting information is in user space. The only reason seL4 maps all memory is because any normal memory may be used for kernel objects.

All that is needed to guarantee that memory with sensitive information is not mapped in kernel space is to mark it as "device memory" instead of normal memory. On ARM and RISC-V this can be done via device tree with "no-map", x86 may need some other way to configure that. If memory is marked as device memory, it only means seL4 won't use it for kernel objects and won't map it. If you use seL4 as a secure hypervisor you just want a few hundred MBs for kernel objects and use the rest for the virtual machines.

Of course this doesn't stop speculative attacks that work between user space processes on the same core, or that give user space processes access to all physical memory even if it's unmapped. For those other mitigations are needed.

As for your other points:

  • No way to update microcode at runtime with seL4: True, but with the lack of a standard seL4 system, it is easier to just update the microcode via a BIOS update than adding a special system call to do it on the fly. Intel recommends to do microcode updates via the BIOS anyway.

Only works in embedded systems — in the general PC case the HW vendor cannot be trusted to ship BIOS updates quickly. That said, I don’t think seL4 is anywhere near ready for such generic systems.

  • Zenbleed can be fixed with a microcode update.
  • Downfall is like Zenbleed, a very specific hardware bug that should be fixed by microcode.
  • Inception is similar to Spectrev2 and needs similar measures: Flushing the RSB, which is done by x86_flush_rsb(), which is implemented in assembly btw. There is no support for SBPB, but there is for IBPB. Maybe the flushing needs to be improved a bit to also handle Inception.

x86_flush_rsb() being in assembly isn’t enough — the entire kernel entry path must be in assembly until all mitigations have been finished, and the entire exit path must be in assembly once certain parts of the userspace/guest context have been established. If you look at the Xen source tree, you will find comments saying stuff like “WARNING! ret, call *, jmp * not safe before/beyond this point.” Those comments delimit areas in which only assembly code that does not contain control flow is safe to execute. C code is not safe to execute in these areas, because the compiler is free to insert such instructions.

  • Interrupt remapping is being worked on.

Another thing I would like to point out is that most systems are vulnerable to either Rowhammer or ECCploit kind of attacks, and yet they are still used. If you care about systems that are both practically and theoretically secure today, only use very simple hardware and software without speculation or SDRAM.

But you are right that seL4 doesn't have "proper" x86 support, it is a fairly small-scale open-source project which doesn't have those kind of resources yet. As it is, it's not worth adding workarounds to seL4 for CPU bugs in specific chips that will be fixed in either new hardware or with microcode updates.

Hardware replacement is not a practical security fix.

It would help if companies interested in using seL4 as a secure hypervisor would join the seL4 foundation, or if people interested in that use case would help with their time. When seL4 is deployed widely as a x86 hypervisor, support for individual chip bugs will make much more sense, but seL4 isn't at that stage yet.

But all-in-all, after looking into it, seL4's x86 support is much better than I expected, mostly thanks to @AdrianDanis. The situation isn't as dire as you make it seem to be.

It’s still pretty bad.

@Indanz
Copy link
Contributor

Indanz commented Oct 16, 2023

Only works in embedded systems — in the general PC case the HW vendor cannot be trusted to ship BIOS updates quickly.

Embedded systems don't usually run untrusted code and are not as vulnerable to these kind of speculation attacks. Usually when you can run arbitrary code you're done, no need for further, complicated speculation attacks.

And with embedded systems it's also easy to make sure that seL4 doesn't map sensitive user space memory, like I mentioned before. That combined with not running untrusted software on the same core as trusted software pretty much fixes all speculation attacks for embedded systems, without any weird and complicated work-arounds in the kernel.

Agreed that counting on BIOS vendors for anything is wishful thinking.

That said, I don’t think seL4 is anywhere near ready for such generic systems.

Agreed, so that's why I'm slightly puzzled by your strong opinion about seL4's suitability for x86 based on its current sub-optimal speculation attack protection. It's only an issue for systems that are dynamic and generic enough to run arbitrary code.

Currently the only use case for that is using seL4 as a hypervisor to run multiple Linux VMs. Perhaps an important use case, but still quite niche if you ask me. seL4 is open source, which means if people want a use case to be well supported, they have to put in the effort or hire someone to make the effort. But arguing that support for a whole platform should be disabled just because your use case is not handled as well as you want is ridiculous.

x86_flush_rsb() being in assembly isn’t enough — the entire kernel entry path must be in assembly until all mitigations have been finished, and the entire exit path must be in assembly once certain parts of the userspace/guest context have been established. If you look at the Xen source tree, you will find comments saying stuff like “WARNING! ret, call *, jmp * not safe before/beyond this point.” Those comments delimit areas in which only assembly code that does not contain control flow is safe to execute. C code is not safe to execute in these areas, because the compiler is free to insert such instructions.

There are no indirect jumps or calls in the kernel, so there was no need for such strict mitigations. It's only since Retbleed that any return instruction can be used for speculative attacks. Retbleed is mitigated with IBRS and that is done in C as the first thing after assembly calls the handle function. The compiler is not going to magically inject indirect jumps before a volatile assembly statement, so your argument that having the mitigation in C is not good enough is not convincing. If IBRS is not enough to mitigate Retbleed or other, newer attacks, please point them out. Same for x86_flush_rsb(), it is the first thing done with inline assembly in a C function on vmexit.

Hardware replacement is not a practical security fix.

No offence, but if your hardware is broken, you should replace it.

If you have so much hardware you can't afford to replace it, you still can't expect a small organisation like seL4 to spend a lot of resources on temporary work-arounds for broken hardware. seL4 is open source, all the speculation mitigations done by others are open source, nothing is stopping anyone from putting those two together if they're not happy with the current mitigations in seL4. The speculation fixes are very limited and have no greater impact on the seL4 code base (except making binary verification harder or impossible).

It’s still pretty bad.

Lack of proper AMD support seems like a way bigger problem to me currently.

@DemiMarie
Copy link
Author

Only works in embedded systems — in the general PC case the HW vendor cannot be trusted to ship BIOS updates quickly.

Embedded systems don't usually run untrusted code and are not as vulnerable to these kind of speculation attacks. Usually when you can run arbitrary code you're done, no need for further, complicated speculation attacks.

Embedded systems shouldn’t run untrusted code, but it is quite possible that (due to some vulnerability) a userspace process may become compromised. A major advantage of seL4 is its ability to limit the impact of such a compromise.

And with embedded systems it's also easy to make sure that seL4 doesn't map sensitive user space memory, like I mentioned before. That combined with not running untrusted software on the same core as trusted software pretty much fixes all speculation attacks for embedded systems, without any weird and complicated work-arounds in the kernel.

Not running untrusted and trusted software on the same core is a very non-obvious requirement that I doubt most embedded systems developers would know they needed to meet.

Agreed that counting on BIOS vendors for anything is wishful thinking.

That said, I don’t think seL4 is anywhere near ready for such generic systems.

Agreed, so that's why I'm slightly puzzled by your strong opinion about seL4's suitability for x86 based on its current sub-optimal speculation attack protection. It's only an issue for systems that are dynamic and generic enough to run arbitrary code.

Just because a system should not run arbitrary code does not mean that it will not.

Currently the only use case for that is using seL4 as a hypervisor to run multiple Linux VMs. Perhaps an important use case, but still quite niche if you ask me.

It’s niche to some people. For others, it is the entire reason they were considering seL4.

seL4 is open source, which means if people want a use case to be well supported, they have to put in the effort or hire someone to make the effort. But arguing that support for a whole platform should be disabled just because your use case is not handled as well as you want is ridiculous.

In retrospect, yes. That’s why I changed the title of this issue.

x86_flush_rsb() being in assembly isn’t enough — the entire kernel entry path must be in assembly until all mitigations have been finished, and the entire exit path must be in assembly once certain parts of the userspace/guest context have been established. If you look at the Xen source tree, you will find comments saying stuff like “WARNING! ret, call *, jmp * not safe before/beyond this point.” Those comments delimit areas in which only assembly code that does not contain control flow is safe to execute. C code is not safe to execute in these areas, because the compiler is free to insert such instructions.

There are no indirect jumps or calls in the kernel, so there was no need for such strict mitigations. It's only since Retbleed that any return instruction can be used for speculative attacks. Retbleed is mitigated with IBRS and that is done in C as the first thing after assembly calls the handle function. The compiler is not going to magically inject indirect jumps before a volatile assembly statement, so your argument that having the mitigation in C is not good enough is not convincing. If IBRS is not enough to mitigate Retbleed or other, newer attacks, please point them out. Same for x86_flush_rsb(), it is the first thing done with inline assembly in a C function on vmexit.

Hardware replacement is not a practical security fix.

No offence, but if your hardware is broken, you should replace it.

None taken, but until hardware vulnerabilities result in product recalls, this is not going to happen in practice.

If you have so much hardware you can't afford to replace it, you still can't expect a small organisation like seL4 to spend a lot of resources on temporary work-arounds for broken hardware. seL4 is open source, all the speculation mitigations done by others are open source, nothing is stopping anyone from putting those two together if they're not happy with the current mitigations in seL4. The speculation fixes are very limited and have no greater impact on the seL4 code base (except making binary verification harder or impossible).

I don’t expect seL4 to fix this problem itself. Doing so would require a huge amount of resources that I expect would be better spent elsewhere, such as preventing integrity and availability attacks (far more important than confidentiality for e.g. cars or airplanes) and (as you mentioned below) AMD support.

What I do expect is that the current situation be documented, or at least that a PR to add such documentation would be accepted. I suspect most people who are not familiar with seL4 will expect seL4 to provide more security than (on x86) it actually does, just like I did. Xen really does try to do this, and there are users who really do depend on it. That seL4 does not do this came as a serious surprise to me. I even suggested that Qubes OS consider using it as a Xen replacement, and I now know that would have been a disaster.

@Indanz
Copy link
Contributor

Indanz commented Oct 19, 2023

Not running untrusted and trusted software on the same core is a very non-obvious requirement that I doubt most embedded systems developers would know they needed to meet.

Then they don't care about security and haven't paid any attention to the endless stream of speculative attacks.

I would think most embedded devs would keep untrusted or non-critical software as far away from the critical software, if not for security, then at least for performance or reliability reasons.

I would hope that cloud providers find it obvious that they should run VMs on different cores if they care about security, or are you saying that that is non-obvious to them too?

Just because a system should not run arbitrary code does not mean that it will not.

It can only run arbitrary code if one of the following is true:

  • The process is running an interpreter or doing JIT.
  • The process has write access to executable memory.
  • The process has filesystem access and can overwrite or add applications that will be run either immediately or at next reboot.

I hope you agree that avoiding the above is trivial for an embedded system.

You can manipulate the stack to get a fair degree of arbitrary code execution, but doing Spectre kind of attacks that way is almost impossible. There are also stack protection compiler options to protect against such attacks, making it even harder.

I don’t expect seL4 to fix this problem itself. Doing so would require a huge amount of resources that I expect would be better spent elsewhere, such as preventing integrity and availability attacks (far more important than confidentiality for e.g. cars or airplanes) and (as you mentioned below) AMD support.

It takes a huge amount of resources if you are willing to put in a lot of time and effort to having runtime, dynamic selection between different mitigations and to be slightly faster than when just using IBRS. But if you go for the simpler, compile-time configured fixes then it doesn't take a huge amount of work, maybe a few weeks of work to bring things up-to-date to the newest attacks and checking if the current mitigations are adequate or not.

What I do expect is that the current situation be documented, or at least that a PR to add such documentation would be accepted. I suspect most people who are not familiar with seL4 will expect seL4 to provide more security than (on x86) it actually does, just like I did. Xen really does try to do this, and there are users who really do depend on it. That seL4 does not do this came as a serious surprise to me. I even suggested that Qubes OS consider using it as a Xen replacement, and I now know that would have been a disaster.

Most people who are not familiar with seL4 will expect seL4 to provide more functionality than it actually does, and most will underestimate how far seL4 is from being ready-to-use for their use case.

I agree that documentation should be good enough that people can see what's missing, but for a lot of things, that's "everything that's not explicitly mentioned", and people assume things too easily.

But to come back to your original point, you should at least specify why seL4 is vulnerable against which attack. The vulnerabilities are not known at this point and your hand waving isn't convincing so far. Don't get me wrong, I agree that seL4 isn't fully up-to-date with all speculative attacks and that the chance is high that seL4 doesn't mitigate all of them properly (assuming up-to-date microcode), but "suspected vulnerabilities" is a big step from "known vulnerabilities".

@DemiMarie
Copy link
Author

DemiMarie commented Oct 19, 2023

Not running untrusted and trusted software on the same core is a very non-obvious requirement that I doubt most embedded systems developers would know they needed to meet.

Then they don't care about security and haven't paid any attention to the endless stream of speculative attacks.

This assumes that there are enough cores for each security domain to get its own core. In Qubes OS, for instance, this is simply not true: one can easily have 12 or more non-dom0 VMs on a 4 core CPU, all of which are considered mutually distrusting as far as the hypervisor (Xen) is concerned. When applied to a web browser, it would mean that one could not have google.com, duckduckgo.com, wikipedia.org, github.com, and gitlab.com open at the same time on a 4-core machine. Such a browser would not be usable.

In an ideal world, CPU vendors would use speculative taint tracking to stop the stream of speculative attacks. Unfortunately, they have not.

I would think most embedded devs would keep untrusted or non-critical software as far away from the critical software, if not for security, then at least for performance or reliability reasons.

The entire purpose of the MCS work is to allow multiplexing software of different criticalities on the same core.

I would hope that cloud providers find it obvious that they should run VMs on different cores if they care about security, or are you saying that that is non-obvious to them too?

This isn’t a fair comparison. Cloud providers sell discrete chunks of compute and it is up to the user to use that compute. These chunks of compute are generally an integral number of cores. Embedded systems designers have a number of security domains that is determined by the needs of the product, and which may be much larger than the number of cores.

Just because a system should not run arbitrary code does not mean that it will not.

It can only run arbitrary code if one of the following is true:

  • The process is running an interpreter or doing JIT.
  • The process has write access to executable memory.
  • The process has filesystem access and can overwrite or add applications that will be run either immediately or at next reboot.

I hope you agree that avoiding the above is trivial for an embedded system.

If you are going to be running Linux in a VM (which many uses of seL4 will need to, for compatibility reasons), then no, because Linux uses self-modifying code internally. It might be possible to avoid that, but I am not sure.

More generally, “will only ever run trusted code” is a policy decision and is not something seL4 should be making.

You can manipulate the stack to get a fair degree of arbitrary code execution, but doing Spectre kind of attacks that way is almost impossible.

Is it? ROP chains are Turing-complete.

There are also stack protection compiler options to protect against such attacks, making it even harder.

Stack protection has been repeatedly bypassed.

I don’t expect seL4 to fix this problem itself. Doing so would require a huge amount of resources that I expect would be better spent elsewhere, such as preventing integrity and availability attacks (far more important than confidentiality for e.g. cars or airplanes) and (as you mentioned below) AMD support.

It takes a huge amount of resources if you are willing to put in a lot of time and effort to having runtime, dynamic selection between different mitigations and to be slightly faster than when just using IBRS. But if you go for the simpler, compile-time configured fixes then it doesn't take a huge amount of work, maybe a few weeks of work to bring things up-to-date to the newest attacks and checking if the current mitigations are adequate or not.

By “just using IBRS”, do you mean disabling all prediction of indirect branches, except for the return stack buffer, for the kernel, userspace, and guest VMs?

What I do expect is that the current situation be documented, or at least that a PR to add such documentation would be accepted. I suspect most people who are not familiar with seL4 will expect seL4 to provide more security than (on x86) it actually does, just like I did. Xen really does try to do this, and there are users who really do depend on it. That seL4 does not do this came as a serious surprise to me. I even suggested that Qubes OS consider using it as a Xen replacement, and I now know that would have been a disaster.

Most people who are not familiar with seL4 will expect seL4 to provide more functionality than it actually does, and most will underestimate how far seL4 is from being ready-to-use for their use case.

Until I learned about the speculation problems, I assumed that the seL4 kernel already provided enough features for security-critical dynamic systems like Qubes OS, and that it was userspace that was not ready yet. Now I know that this is false. Even if I was willing and able to write the needed userspace code, this would still be insufficient.

I agree that documentation should be good enough that people can see what's missing, but for a lot of things, that's "everything that's not explicitly mentioned", and people assume things too easily.

What about something like this?

Currently, the seL4 developers do not have access to information about hardware vulnerabilities that are under embargo. This means that workarounds for these vulnerabilities may be delayed by weeks or even months. Furthermore, seL4 does not have the code needed to select optimal mitigations at runtime, nor does it provide an interface by which userspace can do so. Instead, seL4 expects the mitigations to be chosen at compile-time.

For this reason, seL4 is not suitable for dynamic systems on processors vulnerable to speculative execution vulnerabilities (such as x86), unless confidentiality is not a concern. The only exception is if workloads are statically partitioned such that speculative execution cannot be used to leak information. This means XYZ. If such partitioning is not possible, and if confidentiality is needed, an OS or hypervisor that does mitigate speculative execution attacks should be used instead.

In particular, this means that seL4 is not currently usable for general-purpose operating systems or virtualization solutions, even one is willing and able to write the needed userspace tooling. This also means that seL4 is not a good fit for systems like Qubes OS or OpenXT.

This should warn people from using seL4 in areas where it is not suitable.

But to come back to your original point, you should at least specify why seL4 is vulnerable against which attack. The vulnerabilities are not known at this point and your hand waving isn't convincing so far. Don't get me wrong, I agree that seL4 isn't fully up-to-date with all speculative attacks and that the chance is high that seL4 doesn't mitigate all of them properly (assuming up-to-date microcode), but "suspected vulnerabilities" is a big step from "known vulnerabilities".

The specific known vulnerabilities I am aware of are:

  • Not using IOMMU interrupt remapping.
  • Not using the Zenbleed chicken bit on platforms that need it. Waiting for updated microcode is not an option because the relevant AGESA binary that contains it will not be released until December 2023. AMD and users expect that OSs will use the chicken bit in the meantime.
  • Missing flushing for Inception.
  • Disabling Spectre mitigations by default and not providing a clear recommendation on how to handle them.

The general problem is that seL4 does not have patches ready to release when the embargo expires, whereas Linux and Xen do. And at least Qubes OS users really do depend on that.

@andyhhp
Copy link

andyhhp commented Oct 21, 2023

I'm not going to get involved in a debate over what sel4 should or shouldn't do WRT documentation. I'm not qualified to comment.

But I do have some technical corrections about speculative mitigations.

There are no indirect jumps or calls in the kernel, so there was no need for such strict mitigations. It's only since Retbleed that any return instruction can be used for speculative attacks.

I'm afraid that's simply not true. Note that the Author Date on xen-project/xen@5e79629 is Fri Nov 3 16:17:00 2017.

The easiest way to hijack rets is with Spectre-v1.2, a.k.a. Branch Target Bypass Store, but you can with most of the Spectre-v2 techniques too. This was known prior to things breaking publicly, just as we knew that retpoline was not safe on Skylake systems with the RSB-underflow-falls-back-to-BTB behaviour. Retbleed (the Intel half) was the researchers demonstrating that the attack discussed in length on LKML in Jan 2018 was entirely doable, and not a theoretical concern.

Retbleed is mitigated with IBRS and that is done in C as the first thing after assembly calls the handle function.

Retbleed (the Intel half) is not fully mitigated with IBRS. The mode-tagging in hardware still leaves kernel code vulnerable to confused-deputy induced RSB-underflow attacks, as demonstrated in the Spectre-BHI paper. It took until AlderLake CPUs for Intel re-introduce a knob to fully disable the RSB-underflow behaviour leading to (the Intel half) of Retbleed.

Retbleed (the AMD half) is totally different and I begged the discoverers to name it differently, but alas.

The compiler is not going to magically inject indirect jumps before a volatile assembly statement,

The compiler is free to insert whatever it likes, and routinely does for UBSAN, stack protector, etc. More importantly, the compiler can and does insert calls to memcpy()/memset() even for simple assignments.

And it's not indirect calls that you want to worry about (they will be rare, admittedly), it's the ret from the function call you didn't realise was there that you need to worry about.

ret instructions really are fiendish when it comes to speculation. There's a good reason why the CET-ShadowStack spec had emergency speculation changes in v3, intentionally limiting the behaviour of rets when they could be held to a safer bar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants