Thursday, December 27, 2018

Securing The Hardware Supply Chain

This is the third part of a series about trust in digital content that might be called:
Is this the real life?
Is this just fantasy?
We are moving down the stack:
  • The first part was Certificate Transparency, about how we know we are getting content from the Web site we intended to.
  • The second part was Securing The Software Supply Chain, about how we know we're running the software we intended to, such as the browser that got the content whose certificate was transparent.
  • This part is about how we can know that the hardware on which our secured software runs is doing what we expect it to.
Below the fold, some rather long and scary commentary.

Attacks on the hardware supply chain have been in the news recently, with the firestorm of publicity sparked by Bloomberg's probably erroneous reports of a Chinese attack on Supermicro motherboards that added "rice-grain" sized malign chips. Efforts to secure the software supply chain will be for naught if the hardware the software runs on is compromised. What can be done to reduce risks from the hardware?

Hardware Implants

As regards Bloomberg's story, the experts agree on three main points.

First, the details cannot be correct. Patrick Kennedy's Investigating Implausible Bloomberg Supermicro Stories is the most detailed critique, but aspects of his analysis are supported by, among others, Riverloop Security's A Tale of Two Supply Chains, and Joe Fitzpatrick's Hardware Implants.

Second, attacks using hardware implants are feasible. A year before the Bloomberg story, Joe Fitzpatrick listed four scenarios:
  1. Modify the ASPEED flash chip [3] to give a backdoor that can drop a payload into the host CPU’s memory sometime after boot.
  2. Modify the PC Bios flash chip [2] to drop a bootkit backdoor into the OS sometime after boot.
  3. Solder a device onto the board to intercept/monitor/modify the values read from the flash chip as they are accessed to inject malicious code somewhere
  4. Find debug connections on the testpoints [5] to allow debugger control of the ASPEED BMC [1], allowing you to direct it to drop a payload into memory
I think that #4 would probably be the coolest illustration of the point. you could glue the microcontroller you’ve got upside-down to the top of the ASPEED chip, and then solder its legs to some nearby testpoints (AKA dead-bug-style soldering).
Note that both 1 and 2 are in effect firmware attacks and, unlike 3 and 4, are not visible.

Third, hardware implants are not the best way to attack via the hardware supply chain. Joe Fitzpatrick writes:
There are plenty of software vectors for exploiting a system. None of them require silicon design, hardware prototyping, or manufacturing processes, and none of them leave behind a physical item once they’re implanted.
In contrast to Bloomberg's report, reports of the NSA's hardware supply chain attacks are well documented. Among the Snowden revelations documented by Glenn Greenwald was this from an NSA manager in 2010:
Here’s how it works: shipments of computer network devices (servers, routers, etc,) being delivered to our targets throughout the world are intercepted. Next, they are redirected to a secret location where Tailored Access Operations/Access Operations (AO-S326) employees, with the support of the Remote Operations Center (S321), enable the installation of beacon implants directly into our targets’ electronic devices. These devices are then re-packaged and placed back into transit to the original destination. All of this happens with the support of Intelligence Community partners and the technical wizards in TAO.
The picture supports the belief that in this case the NSA was injecting malicious firmware rather than hardware implants. The "beacon implant" can be very small, its only purpose being to locate the compromised device and provide a toehold for further compromise. Earlier, Der Spiegel had reported that it wasn't just routers and firmware:
If a target person, agency or company orders a new computer or related accessories, for example, TAO can divert the shipping delivery to its own secret workshops. The NSA calls this method interdiction. At these so-called "load stations," agents carefully open the package in order to load malware onto the electronics, or even install hardware components that can provide backdoor access for the intelligence agencies. All subsequent steps can then be conducted from the comfort of a remote computer.

These minor disruptions in the parcel shipping business rank among the "most productive operations" conducted by the NSA hackers, one top secret document relates in enthusiastic terms. This method, the presentation continues, allows TAO to obtain access to networks "around the world."
Obviously, many of TAO's operations only involve malicious firmware, but note the wording "install hardware components". Even Cisco found it hard to detect whether this was happening:
Cisco has poked around its routers for possible spy chips, but to date has not found anything because it necessarily does not know what NSA taps may look like, according to Stewart.
At Hacker News, lmilcin's post and the subsequent discussion show that sophisticated supply chain attacks on supposedly secure hardware really do happen:
I have worked in card payment industry. We would be getting products from China with added boards to beam credit card information. This wasn't state-sponsored attack. Devices were modified while on production line (most likely by bribed employees) as once they were closed they would have anti-tampering mechanism activated so that later it would not be possible to open the device without setting the tamper flag.

Once this was noticed we started weighing the terminals because we could not open the devices (once opened they become useless).

They have learned of this so they started scraping non-essential plastic from inside the device to offset the weight of the added board.

We have ended up measuring angular momentum on a special fixture. There are very expensive laboratory tables to measure angular momentum. I have created a fixture where the device could be placed in two separate positions. The theory is that if the weight and all possible angular momentums match then the devices have to be identical. We could not measure all possible angular momentums but it was possible to measure one or two that would not be known to the attacker.
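lmilcin's fixture idea can be illustrated with a toy model (strictly, what the fixture compares is moment of inertia; the masses and positions below are invented purely for illustration). Two devices with identical total weight but different mass distributions have different moments of inertia about the same axis:

```python
# Toy illustration (not lmilcin's actual fixture): model each device as point
# masses at known positions, then compare both total mass and moment of
# inertia about an axis. Matching the weight does not match the moment of
# inertia unless the attacker also matches the mass distribution.

def moment_of_inertia(masses, axis_point):
    """Sum of m * r^2 about a vertical axis through axis_point (x, y)."""
    ax, ay = axis_point
    return sum(m * ((x - ax) ** 2 + (y - ay) ** 2) for m, (x, y) in masses)

# A clean terminal: components as (mass_grams, (x_cm, y_cm)); values invented.
clean = [(120, (0.0, 0.0)), (30, (4.0, 1.0)), (30, (-4.0, -1.0))]

# Tampered: a 5 g board added near the edge, 5 g of plastic scraped from the
# centre so the scales read the same total weight.
tampered = [(115, (0.0, 0.0)), (30, (4.0, 1.0)), (30, (-4.0, -1.0)),
            (5, (6.0, 2.0))]

def total(device):
    return sum(m for m, _ in device)

assert total(clean) == total(tampered)       # weighing cannot tell them apart

i_clean = moment_of_inertia(clean, (0.0, 0.0))
i_tampered = moment_of_inertia(tampered, (0.0, 0.0))
assert i_clean != i_tampered                 # the fixture can
```

Measuring about two different axes, as the post describes, makes it harder still for the attacker to fake both values at once.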
In contrast to TAO's, this is a broadcast attack. The bribed employees probably don't know where the devices are to be shipped, and knowing the location isn't necessary for the attack to be profitable. Bribing employees has the added advantage of increasing the difficulty of correctly attributing the attack.

Firmware

Much of what we think of as "hardware" contains software to which what we think of as "software" has no access or visibility. Examples include Intel's Management Engine, the baseband processor in mobile devices, and complex I/O devices such as NICs or GPUs. Even if this "firmware" is visible to the system CPU, it is likely supplied as a "binary blob" whose source code is inaccessible. For example, a friend reports that updating his BIOS also updated his USB Type C interface, his Intel Management Engine, and his Embedded Controller. None of this software, nor the firmware in his WiFi chip and other I/O devices, is open source and thus cannot be secured via reproducible builds and a transparency overlay.

Bloomberg's reporting implies that the putative Supermicro attack was targeted, though fairly broadly. Dan Goodin points out the lack of security in Supermicro's Board Management Controllers (BMCs):
several researchers ... unearthed a variety of serious vulnerabilities and weaknesses in Supermicro motherboard firmware (PDF) in 2013 and 2014. This time frame closely aligns with the 2014 to 2015 hardware attacks Bloomberg reported. Chief among the Supermicro weaknesses, the firmware update process didn’t use digital signing to ensure only authorized versions were installed. ... Also in 2013, a team of academic researchers published a scathing critique of Supermicro security (PDF). ... The critical flaws included a buffer overflow in the boards’ Web interface that gave attackers unfettered root access to the server and a binary file that stored administrator passwords in plaintext. ... for the past five years, it was trivial for people with physical access to the boards to flash them with custom firmware that has the same capabilities as the hardware implants reported by Bloomberg.
He then asks If Supermicro boards were so bug-ridden, why would hackers ever need implants?:
Besides requiring considerably less engineering muscle than hardware implants, backdoored firmware would arguably be easier to seed into the supply chain. The manipulations could happen in the factory, either by compromising the plants’ computers or gaining the cooperation of one or more employees or by intercepting boards during shipping the way the NSA did with the Cisco gear they backdoored.

Either way, attackers wouldn’t need the help of factory managers, and if the firmware was changed during shipping, that would make it easier to ensure the modified hardware reached only intended targets, rather than risking collateral damage on other companies.
It isn't just server motherboards, designed for remote access via the BMC, that have remotely exploitable vulnerabilities; so do regular PCs' BIOSes:
Though there's been long suspicion that spy agencies have exotic means of remotely compromising computer BIOS, these remote exploits were considered rare and difficult to attain.

Legbacore founders Corey Kallenberg and Xeno Kovah's Cansecwest presentation ... automates the process of discovering these vulnerabilities. Kallenberg and Kovah are confident that they can find many more BIOS vulnerabilities; they will also demonstrate many new BIOS attacks that require physical access.
But in the context of a targeted attack on sophisticated customers such as Apple, Amazon and telcos it is important to note that the customer's defenses would likely make exploiting the weak security of the motherboards impossible once they were installed.

Patrick Kennedy's analysis starts from this point:
Even smaller organizations with a handful of servers generally have segregated BMC networks. That basic starting point, from where large companies take further steps, looks something like this.
The key here is that the companies named are all sophisticated, and will have better protections than your average small to medium enterprise. Bloomberg’s report describes an attack that is not possible at the companies listed in the article.
Compromising the systems in transit to a known destination, or selectively at the factory would be necessary, and well within the capability of a nation state. Selectivity is important; as Joe Fitzpatrick points out, a broadcast attack is noisy:
Every board has it, but we probably only care about one targeted customer of the board. This is where it gets complicated. If 10 million backdoored motherboards all ping the same home server, everyone will notice.
An attacker need only get a few compromised systems into the target; he does not want to compromise them all. For major targets with many systems this poses two problems. For the attacker, intercepting a truck-load of systems all destined for the same customer without being detected, in order to compromise a few of them, is harder than intercepting a single box in transit. And for the defense, sampled audits of the supply chain are unlikely to detect the needle in the haystack. Supermicro publicized the result of such an audit:
Reuters' Joseph Menn reported that the audit was apparently undertaken by Nardello & Co, a global investigative firm founded by former US federal prosecutor Daniel Nardello. According to Reuters' source, the firm examined sample motherboards that Supermicro had sold to Apple and Amazon, as well as software and design files for products. No malicious hardware was found in the audit, and no beacons or other network transmissions that would be indicative of a backdoor were detected in testing.
But with customers the size of Apple and Amazon the audit would be very unlikely to find a targeted attack.

Given the target's likely defenses, it is the server software not just the BMC that needs to be compromised. The BMC could be the start of such an attack, but the result would be detectable at the server level. I/O interfaces are other potential routes for a compromise:
Bloomberg had reported that in addition to targeting Apple and Amazon Web Services, Chinese intelligence had managed to get implanted hardware inside an unnamed major telecommunications provider. The alleged victim was never named, with Bloomberg's report citing a non-disclosure agreement signed by the company Bloomberg used as its source for the story, Sepio Systems. Sepio's co-CEO, Yossi Appleboum, claimed that a scan had revealed the implant and that it had been added to an Ethernet adapter when the computer was manufactured.
Other routes would have included disk drive firmware as used by the "Equation Group":
One of the Equation Group's malware platforms, for instance, rewrote the hard-drive firmware of infected computers—a never-before-seen engineering marvel that worked on 12 drive categories from manufacturers including Western Digital, Maxtor, Samsung, IBM, Micron, Toshiba, and Seagate.

The malicious firmware created a secret storage vault that survived military-grade disk wiping and reformatting, making sensitive data stolen from victims available even after reformatting the drive and reinstalling the operating system. The firmware also provided programming interfaces that other code in Equation Group's sprawling malware library could access. Once a hard drive was compromised, the infection was impossible to detect or remove.
The revelation of this compromise three and a half years ago led drive manufacturers to secure their firmware update mechanisms. Two years earlier the amazing Bunnie Huang and his colleague xobs had demonstrated essentially the same vulnerability for smaller devices in their Chaos Communication Congress talk called "On Hacking MicroSD Cards".

Cooper Quintin at the EFF's DeepLinks blog weighed in at the time with a typically clear overview of the issue entitled Are Your Devices Hardwired For Betrayal?. The three principles:
  • Firmware must be properly audited.
  • Firmware updates must be signed.
  • We need a mechanism for verifying installed firmware.
Adhering to these principles would help, but each of them is problematic in its own way:
  • Auditing requires third-party access to proprietary source code, and is expensive. While Supermicro's business model would likely be able to afford it, this isn't the case for cheaper devices that also attach to the network, including the Internet of Things.
  • Signing depends upon the vendor's ability to keep their private key secret, and to revoke their keys promptly in the event of a compromise. The vendor's private key is a very high-value target for attackers, as illustrated in 2015 when it was revealed that NSA and GCHQ had compromised Gemalto's network to obtain their private key and thus the ability to compromise SIM cards.
  • Verifying requires extracting the binary firmware from the device for analysis. Physically removing the flash chip containing the firmware is expensive, but the only way to be sure of obtaining the actual contents. Otherwise, access is via the firmware being extracted, which can be programmed to lie.
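The signing principle reduces to a simple check at install time. Here is a toy sketch using a symmetric HMAC for brevity; real firmware signing uses asymmetric signatures so the device holds only a public key, and `VENDOR_KEY` is a made-up placeholder, not any vendor's actual scheme:

```python
# Minimal sketch of "firmware updates must be signed". A real implementation
# would use an asymmetric signature (e.g. RSA or Ed25519) verified against a
# public key burned into the device; the HMAC here just shows the shape.
import hashlib
import hmac

VENDOR_KEY = b"vendor-signing-key"          # hypothetical; held by the vendor

def sign_firmware(image: bytes) -> bytes:
    """Vendor side: produce a tag over the exact image bytes."""
    return hmac.new(VENDOR_KEY, image, hashlib.sha256).digest()

def install(image: bytes, signature: bytes) -> bool:
    """Device side: refuse any image whose signature does not verify."""
    expected = hmac.new(VENDOR_KEY, image, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

firmware = b"\x7fELF...bmc-firmware-v1.2"   # stand-in for a real image
sig = sign_firmware(firmware)

assert install(firmware, sig)               # genuine update accepted
assert not install(firmware + b"\x90", sig) # tampered image rejected
```

Note that the check protects the update path only; it does nothing about firmware that shipped malicious from the factory, which is why auditing and verification are also on the list.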
The more firmware is open source, the more the techniques of Securing The Software Supply Chain can be used to defend it. Microsoft's recent announcement of Project Mu, open sourcing the UEFI implementation used in Surface devices and Hyper-V, is encouraging. The goal is to provide:
a code structure and development process for efficiently building scalable and serviceable firmware. These enhancements allow Project Mu devices to support Firmware as a Service (FaaS). Similar to Windows as a Service, Firmware as a Service optimizes UEFI and other system firmware for timely quality patches that keep firmware up to date and enables efficient development of post-launch features.
...
we learned that the open source UEFI implementation TianoCore was not optimized for rapid servicing across multiple product lines. We spent several product cycles iterating on FaaS, and have now published the result as free, open source Project Mu! We are hopeful that the ecosystem will incorporate these ideas and code, as well as provide us with ongoing feedback to continue improvements.
Project Mu features:
  • A code structure & development process optimized for Firmware as a Service
  • An on-screen keyboard
  • Secure management of UEFI settings
  • Improved security by removing unnecessary legacy code, a practice known as attack surface reduction
  • High-performance boot
  • Modern BIOS menu examples
  • Numerous tests & tools to analyze and optimize UEFI quality.
Designing a firmware development, maintenance and distribution channel holistically, rather than bolting maintenance and update on as afterthoughts, is a critical advance.

Chip-Level Attacks

In A2: Analog Malicious Hardware (also here) Kaiyuan Yang et al describe the potential for chip-level attacks:
While the move to smaller transistors has been a boon for performance it has dramatically increased the cost to fabricate chips using those smaller transistors. This forces the vast majority of chip design companies to trust a third party — often overseas — to fabricate their design. To guard against shipping chips with errors (intentional or otherwise) chip design companies rely on post-fabrication testing. Unfortunately, this type of testing leaves the door open to malicious modifications since attackers can craft attack triggers requiring a sequence of unlikely events, which will never be encountered by even the most diligent tester.
The paper describes previous chip-level attacks and the techniques for detecting them. Then they:
show how a fabrication-time attacker can leverage analog circuits to create a hardware attack that is small (i.e., requires as little as one gate) and stealthy (i.e., requires an unlikely trigger sequence before effecting a chip’s functionality). In the open spaces of an already placed and routed design, we construct a circuit that uses capacitors to siphon charge from nearby wires as they transition between digital values. When the capacitors fully charge, they deploy an attack that forces a victim flip-flop to a desired value. We weaponize this attack into a remotely-controllable privilege escalation by attaching the capacitor to a wire controllable and by selecting a victim flip-flop that holds the privilege bit for our processor. We implement this attack in an OR1200 processor and fabricate a chip. Experimental results show that our attacks work, show that our attacks elude activation by a diverse set of benchmarks, and suggest that our attacks evade known defenses.
This is an extremely dangerous attack, since it involves only intercepting a finalized chip design on its way from the back-end house to the fab and injecting an almost undetectable change to the design. The result is a chip that passes all the necessary tests but can be compromised by an attacker who can run user-level code on it.
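The trigger mechanism can be modelled abstractly. The numbers and names below are invented for illustration, not the paper's circuit parameters: a "capacitor" gains charge each time a victim wire toggles and leaks it when idle, so only a deliberately unlikely burst of toggles flips the privilege bit:

```python
# Toy model of the A2-style analog trigger; thresholds are made up.
THRESHOLD = 1000          # charge needed to fire the attack
CHARGE_PER_TOGGLE = 1.0   # charge siphoned per victim-wire transition
LEAK_PER_IDLE = 0.5       # charge lost per idle cycle

class AnalogTrigger:
    def __init__(self):
        self.charge = 0.0
        self.privilege_bit = 0

    def wire_toggle(self):
        self.charge += CHARGE_PER_TOGGLE
        if self.charge >= THRESHOLD:
            self.privilege_bit = 1      # force the victim flip-flop

    def idle_cycle(self):
        self.charge = max(0.0, self.charge - LEAK_PER_IDLE)

chip = AnalogTrigger()

# Normal workloads toggle the wire, but never fast enough for long enough:
# two idle cycles leak more than one toggle adds, so charge never builds.
for _ in range(100_000):
    chip.wire_toggle()
    chip.idle_cycle()
    chip.idle_cycle()
assert chip.privilege_bit == 0

# The attacker's user-level code toggles the wire in a tight, unlikely burst.
for _ in range(1000):
    chip.wire_toggle()
assert chip.privilege_bit == 1
```

This is why post-fabrication testing misses it: no benchmark or diligent tester ever produces the sustained, specific toggle pattern the attacker chose.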

Open-Source Hardware

The chip design and fabrication process can be analogized to the software development and deployment process. It consists of developing source code (in a Register Transfer Language or a Hardware Description Language), compiling it into binary (typically polygons in GDS II), and writing the result to a write-once medium (silicon). To what extent could the techniques of Securing The Software Supply Chain be used to secure it?

My list of what it would take to secure CPUs in this way is:
  • Open source CPU designs: Several such designs exist, perhaps the most prominent being RISC-V which is now used, for example, by Western Digital for the CPUs in their disk drives. It has gained enough momentum to force MIPS to open source its instruction set and R6 core. But these designs are typically for small system-on-chip CPUs suitable for the Internet of Things. Western Digital's design is somewhat slower than a low-end Intel Xeon. It has taken ARM three decades to evolve up from IoT-level CPUs to server CPUs; it isn't clear when, if ever, there would be a competitive open source server CPU.
  • Open source tooling: Again, at least one complete open source toolchain exists, but as Chinmay Tongale reports they aren't competitive with commercial tools:
    You can design fairly complex chips in these tools (if not industry standard). I have designed (RTL to GDS2) 16 bit RISC Processor Chip using these.
  • Reproducible tooling: all tools in the chain would have to generate reproducible outputs.
  • Bootstrapped tooling: all tools in the chain would have to be built with bootstrapped compilers.
  • A transparency overlay: the hashes of all tools in the chain, and all their inputs and outputs would have to be secured by a transparency overlay analogous to Certificate Transparency.
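As a sketch of what such an overlay might record, here is a toy hash-chained log of build steps, a much-simplified stand-in for a Merkle-tree log such as Certificate Transparency's (the tool names and blobs are invented):

```python
# Toy append-only log of build steps, hash-chained so any later rewrite of
# an entry changes every subsequent link and is detected on verification.
import hashlib
import json

def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

GENESIS = h(b"genesis")
log = []            # entries: {"tool", "input", "output", "prev"}
head = GENESIS

def append(tool: str, input_blob: bytes, output_blob: bytes) -> str:
    """Record a build step: hashes of the tool's input and output."""
    global head
    entry = {"tool": tool, "input": h(input_blob), "output": h(output_blob),
             "prev": head}
    head = h(json.dumps(entry, sort_keys=True).encode())
    log.append(entry)
    return head

# Two (invented) steps of a chip build: RTL -> netlist -> layout.
append("synthesis", b"cpu.v", b"netlist")
append("place-and-route", b"netlist", b"layout.gds")

def verify(entries) -> bool:
    """Recompute the chain; any tampered entry breaks every later link."""
    prev = GENESIS
    for e in entries:
        if e["prev"] != prev:
            return False
        prev = h(json.dumps(e, sort_keys=True).encode())
    return prev == head

assert verify(log)
log[0]["output"] = h(b"backdoored-netlist")   # attacker rewrites history
assert not verify(log)
```

A real overlay would additionally publish the head hash widely, so the log operator cannot quietly rewrite the whole chain.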
Of course, system-level security would require this process not just for the system CPU, for which at least some open source designs are available, but also for the BMC and the I/O controllers, for which open source designs are hard to find, and whose business models would likely not support the additional cost.

While a design and fabrication process secured in this way is conceivable, it is hard to see such a fundamental transformation of the chip business, which jealously guards its intellectual property, being feasible.

Side-Channels

With the many variants of Spectre and Meltdown, 2018 was the year of the side-channel attack. Fundamentally, these attacks are enabled by two things:
  • Multiple processes (an attacker and a target) sharing the same underlying hardware resources.
  • Performance optimizations using hardware resources that may or may not be available.
The attacks manipulate the availability of the hardware resources and use timing to detect whether or not the optimization occurred. From these timings the attacker can infer information about the target.
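The mechanism can be sketched with a toy shared cache. To keep the sketch deterministic it counts work steps instead of measuring wall-clock time, and everything here (costs, the "fetch", the secret) is invented for illustration; it shows only the shape of the leak, not any specific Spectre variant:

```python
# Toy cache side channel: a victim and an attacker share one cache (the
# shared hardware resource); caching is the performance optimization whose
# presence or absence the attacker detects via "timing".
cache = {}          # shared resource (think: a CPU cache line)
WORK_SLOW = 100     # cost of an access that misses the cache
WORK_FAST = 1       # cost when the optimization (caching) kicks in

def lookup(key):
    """Returns (value, cost). The cost difference is the side channel."""
    if key in cache:
        return cache[key], WORK_FAST
    cache[key] = key * key      # stand-in for an expensive fetch
    return cache[key], WORK_SLOW

# The victim touches a secret-dependent location...
secret = 7
lookup(secret)

# ...then the attacker probes every candidate and "times" each probe.
timings = {k: lookup(k)[1] for k in range(10)}
recovered = min(timings, key=timings.get)
assert recovered == secret      # the one fast probe betrays the secret
```

Note the attacker never reads the secret directly; it only observes how long its own accesses take, which is what makes these leaks so hard to close.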

Presumably, this could allow a chip-level attack to be disguised as a fully functional performance optimization, just one that enabled information leakage via a side-channel. Such an attack would be hard to detect in an audit, since it would have a genuine justification.

The Hardware Supply Chain

Riverloop Security describes the general hardware supply chain with these stages:
  1. Design Time First, a specification is developed – this can be checked for backdoors or weaknesses prior to manufacture. Subverting this stage to introduce a backdoor provides the greatest access.
  2. Hardware Manufacturing Manufacturing is often subcontracted to a third party and is not easy to check. Manufacturers frequently substitute parts due to availability and cost constraints. Small malicious changes are possible at this stage. The ease of doing so depends on the device and format of the plans the attacker can access and modify.
  3. Third Party Hardware & Firmware Integration Manufacturers frequently act as integrators and subcontract manufacture of subcomponents. This third-party integration leaves room for a malicious actor to introduce backdoors or exploitable flaws into the system.
  4. Supply Distribution Time By the time the manufactured device reaches distribution, the company and consumers have little ability to verify the device matches the specification as originally designed. Devices can be replaced wholesale with counterfeit devices or modified to include additional malicious components.
  5. Post Deployment In the final stage, defense depends largely on the end customer’s physical security and processes.
And Supermicro's supply chain in more detail thus:
Super Micro contracts manufacturing and supply chain logistics. They use Ablecom Technology: a company which manufactures and provides warehousing before international shipments (US, EU, Asia). Ablecom is a private Taiwanese company run by the brother of Super Micro’s CEO, largely owned by the CEO’s family and is critical to Super Micro’s business processes. They rely on them to accurately forecast and warehouse parts from various contract manufacturers to be able to create their products.

If we attempt to simplify the above supply chain, Super Micro’s public disclosures would suggest that the steps applied to their case are, at a high level, as follows:
  1. Super Micro designs product
  2. Ablecom coordinates manufacturing with contract manufacturers; contract manufacturer produces and ships to Ablecom for warehousing
  3. Ablecom ships to Super Micro facility (San Jose, Netherlands, or Taiwan) for final assembly
  4. Super Micro ships to distributor, OEM, or customer
  5. Customer utilizes product
They sum up the problem thus:
Although it is extremely difficult to be assured that something you purchase from a source you do not fully and totally control is trustworthy, there are a number of steps companies can take to make them more difficult to target. These include implementing a supply chain security program, which can involve obfuscating end users of purchases from manufacturers, buying directly from authorized vendors, verifying parts, doing randomized in-depth inspections, and more.
In 2015, Darren Pauli reported for The Register that:
Cisco will ship boxes to vacant addresses in a bid to foil the NSA, security chief John Stewart says.

The dead drop shipments help to foil a Snowden-revealed operation whereby the NSA would intercept networking kit and install backdoors before boxen reached customers. ... Stewart says the Borg will ship to fake identities for its most sensitive customers, in the hope that the NSA's interceptions are targeted.

"We ship [boxes] to an address that has nothing to do with the customer, and then you have no idea who ultimately it is going to," Stewart says.

"When customers are truly worried ... it causes other issues to make [interception] more difficult in that [agencies] don't quite know where that router is going so its very hard to target - you'd have to target all of them. There is always going to be inherent risk."
Clearly, TAO's operations are targeted. The NSA doesn't want to compromise all Cisco routers and then have to figure out which small proportion are of interest. They want to compromise just the ones being shipped to targets, which will then "phone home" and identify themselves.

Redesigning Hardware

Riverloop Security suggests that hardware supply chain security:
also can be aided by designing systems which can tolerate, contain, or detect compromise.
But how to do that?

In List of criteria for a secure computer architecture and A computer architecture with hardware-based malware detection, Igor Podebrad et al's goal is:
to derive a hardware architecture which provides much more security features in comparison to current architectures
The architecture is intended to:
  • support antivirus agents
  • disable typical malware properties (infection, stealth mechanism etc.)
  • support sensing of attacks
  • support forensic analysis to analyse successful attacks
They proposed, and prototyped in an FPGA, an interesting if impractical system architecture. It was a Harvard architecture (separate code and data address spaces) machine with a "Security Core" that loaded code and detected bad behavior. It isn't clear how the "Security Core" was to be maintained securely, given that its security was based on being also a Harvard architecture machine with code in ROM.

In effect, Podebrad et al take a similar, but in my view much less practical, approach to hardware as the Bootstrappable Builds project described in Securing The Software Supply Chain takes to software: starting from a kernel "secure by inspection" and building up to a useful system. The reason I think that Bootstrappable Builds is practical is that it is conservative, not trying to change the way an entire industry works. Podebrad et al's approach is radical, throwing away a half-century of experience and investment in optimizing CPU design and imposing a severe performance penalty in the name of security. History, even as recent as Spectre and Meltdown, shows that this is a very difficult sell.

Dover Microsystems takes a less impractical approach, applying the concept of a security core to a conventional CPU architecture. Their CoreGuard Policy Enforcer monitors writes from the CPU cache to memory against a set of policies based on metadata extracted from the compilation process, and traps on violations. Their technique costs area on the die, which could have been used to enhance performance, and it isn't clear how the metadata on which the mechanism depends can be protected from an attacker.

Co-evolution Of Attack & Defense

It is important to view the security of IT systems not as a single, solvable problem, but as a system in which attacks and defenses co-evolve, just as diseases evolve drug resistance and research creates new drugs. Let's take return-oriented programming (ROP) as an example cycle in this co-evolution. The attack was published in 2007 and is described in detail by Erik Buchanan et al's Black Hat talk from 2008, Return-oriented Programming: Exploitation without Code Injection (53-slide PDF).

As Chris Williams explained in 2016's RIP ROP: Intel's cunning plot to kill stack-hopping exploits at CPU level, ROP was the attack's response to improved defenses against earlier attacks:
Once upon a time, you could – for example – find a memory buffer in some software and inject more data into it than the array could hold, thus spilling your extra bytes over other variables and pointers. Eventually you could smash the return address on the stack and make it point to a payload of malicious code you smuggled into the gatecrashing data. When the running function returns, the processor wouldn't jump back to somewhere legitimate in the software, instead it will jump to wherever you've defined in the overwritten stack – ie: your malicious payload.

Voila, deliver this over a network, and you've gained arbitrary code execution in someone else's system. Their box is now your box.

Then operating systems and processors began implementing mechanisms to prevent this. The stack is stored in memory marked in the page tables as data, not executable code. It is therefore easy to trap these sorts of attack before any damage can be done: if the processor starts trying to execute code stored in the non-executable, data-only stack, an exception will be raised. That's the NX – no-execute – bit in the page tables; Intel, AMD, ARM etc have slightly different official names for the bit.
Williams explains ROP in simple terms:
Now, here comes the fun part: return-orientated programming (ROP). Essentially, you still overwrite the stack and populate it with values of your choosing, but you do so to build up a sequence of addresses all pointing to blocks of useful instructions within the running program, effectively stitching together scraps of the software to form your own malicious program. As far as the processor is concerned, it's still executing code as per normal and no exception is raised. It's just dancing to your tune rather than the software's developer.

Think of it as this: rather than read a book the way the author intended – sentence by sentence, page by page – you decide to skip to the third sentence on page 43, then the eighth sentence on page 3, then the twelfth sentence on page 122, and so on, effectively writing your own novel from someone else's work.

That's how ROP works: you fill the stack with locations of gadgets – useful code in the program; each gadget must end with a RET instruction or similar. When the processor jumps to a gadget, executes its instructions, and then hits RET, it pulls the next return address off the stack and jumps to it – jumps to another gadget, that is, because you control the chain now.
Note that since ROP doesn't involve executing data, the Harvard-architecture CPUs envisaged by Podebrad et al. would be vulnerable, illustrating the difficulty of radical changes to address these security issues.

In 2016 Baiju Patel's Intel Releases New Technology Specifications to Protect Against ROP attacks described how Intel, working with Microsoft, introduced new hardware defenses against ROP called Control-flow Enforcement Technology (CET):
CET defines a second stack (shadow stack) exclusively used for control transfer operations, in addition to the traditional stack used for control transfer and data. When CET is enabled, CALL instruction pushes the return address into a shadow stack in addition to its normal behavior of pushing return address into the normal stack (no changes to traditional stack operation). The return instructions (e.g. RET) pops return address from both shadow and traditional stacks, and only transfers control to popped address if return addresses from both stacks match. There are restrictions to write operations to shadow stack to make it harder for adversary to modify return address on both copies of stack implemented by changes to page tables. Thus limiting shadow stack usage to call and return operations for purpose of storing return address only. The page table protections for shadow stack are also designed to protect integrity of shadow stack by preventing unintended or malicious switching of shadow stack and/or overflow and underflow of shadow stack.
CET also protects against a variant of ROP, Jump Oriented Programming, by ensuring via Indirect Branch Tracking (IBT) that all valid targets of jumps or indirect branch instructions are labeled as such:
The ENDBRANCH instruction is a new instruction added to ISA to mark legal target for an indirect branch or jump. Thus if ENDBRANCH is not target of indirect branch or jump, the CPU generates an exception indicating unintended or malicious operation. This specific instruction has been implemented as NOP on current Intel processors for backwards compatibility (similar to several MPX instructions) and pre-enabling of software.
In Williams' words:
What CET does here is ensure that, when returning from a subroutine, the stack hasn't been tampered with to hijack the flow of the software. No ROP, no working exploit, no malware infection.
He suggests a possible direction for the next stage of co-evolution:
The shadow stack can't be modified by normal program code. Of course, if you can somehow trick the kernel into unlocking the shadow stack, meddle with it so that it matches your ROP chain, and then reenable protection, you can sidestep CET. And if you can do that, I hope you're working for the Good Guys.
Despite the care with which Intel maintained compatibility, it took two more years before, as Jonathan Corbet explained in Kernel support for control-flow enforcement, the Linux kernel developers figured out how to use it effectively:
Yu-cheng Yu recently posted a set of patches showing how this technology is to be used to defend Linux systems.

The patches adding CET support were broken up into four separate groups: CPUID support and documentation, some memory-management work, shadow stacks, and indirect-branch tracking (IBT). The current patches support 64-bit systems only, and they only support CET for user-space code. Future versions are supposed to lift both of those restrictions.
But support for CET is in its early stages:
there appear to be no concerns ... about the CET features overall. They should make the system far more resistant to some common attack techniques with, seemingly, little in the way of performance or convenience costs. Chances are, though, that this technology won't be accepted until it is able to cover kernel code as well, since that is where a lot of attacks are focused. So CET support in Linux won't happen in the immediate future — but neither will the availability of CET-enabled processors.
Note the long timescales involved. It took nine years from publication of the attack to publication of the defense, and another two years before operating systems were able to start supporting it. It will take many more years before even a majority of the installed base of CPUs implements CET.

Conclusion

In the foreseeable future it doesn't seem likely that it will be possible to build generally useful systems with chips produced via a secure design and fabrication process, and firmware secured via reproducible builds. The foundations of our IT systems will continue to be shaky.

3 comments:

David. said...

Trammell Hudson is co-lead of the LinuxBoot project. His talk Modchips of the State was uploaded to YouTube yesterday. It is a must-watch, fascinating overview of the issue of hardware implants, including a live demo of a 2-pin implant successfully compromising a motherboard's BMC. His conclusions stress the importance of openness and transparency for hardware security. The questions at the end are good, too.

David. said...

First-Ever UEFI Rootkit Tied to Sednit APT by Tom Spring reports on motherboard firmware implants in the wild:

"Researchers hunting cyber-espionage group Sednit (an APT also known as Sofacy, Fancy Bear and APT28) say they have discovered the first-ever instance of a rootkit targeting the Windows Unified Extensible Firmware Interface (UEFI) in successful attacks.

The discussion of Sednit was part of the 35C3 conference, and a session given by Frédéric Vachon, a malware researcher at ESET who published a technical write-up on his findings earlier this fall (PDF). During his session, Vachon said that finding a rootkit targeting a system’s UEFI is significant, given that rootkit malware programs can survive on the motherboard’s flash memory, giving it both persistence and stealth."

David. said...

Side channel attacks aren't just for hardware. Thomas Claburn's New side-channel leak: Boffins bash operating system page caches until they spill secrets reports on a side channel attack that uses the operating system page cache as its shared resource:

"We present a set of local attacks that work entirely without any timers, utilizing operating system calls (mincore on Linux and QueryWorkingSetEx on Windows) to elicit page cache information," wrote the researchers. "We also show that page cache metadata can leak to a remote attacker over a network channel, producing a stealthy covert channel between a malicious local sender process and an external attacker."