CS3235 Computer Security

Pinned resources

Textbooks:
- OSTEP on operating systems
- Goodrich, "Introduction to Computer Security"

Honestly a pretty good lecture on first principles.

Also the lecturer opened the floor to suggest topics for security stuff. Infrastructure security module come AY2023/24 Sem 2.

Week 13

Virtualization

Main motivation is to subvert almost all forms of malicious threats (even up to kernel rootkits) by isolating into a sandbox.

The concept of virtual machines actually came out from 1960s to share machines between users (e.g. IBM). The late 1990s focused on RISC/CISC virtualization to mitigate under-utilization, ease-of-maintenance, security (e.g. VMWare). Today the virtualization model is heavily utilized in cloud computing.

Two-fold security goals for virtualization, specifically the isolation of code, data and resources between:

Guest VM and Host VMM
Between guest VMs

assuming that the TCB (i.e. Host OS and VMM) are bug-free, and that guest VMs may contain arbitrary malware. These translate to the following enforcement goals for a VMM:

Security goals:
- Complete mediation
- Trap on all MMU, DMA, I/O accesses
- Transparency (to avoid VM detection by malware)
Commercial goals:
- Performance
- Compatibility to run on commodity OSes

Although virtualization was historically designed for performance rather than for security, there are some main security uses of virtualization, including:

VM isolation: Red-Green systems, dynamic analysis/malware containment
VM introspection: running an anti-virus in VMM and introducing malware in VM

Virtualization techniques

The guest OS, from its perspective, runs in (privileged) ring 0, but actually runs in ring 1 on the host OS. The VMM, which actually runs in ring 0, traps the instructions called by the guest and emulates these instructions (e.g. accesses to page table). This is known as trap-and-emulate.

There is a problem: x86 architecture has a few instructions that are privileged but do not generate traps, e.g. cli (clear interrupt) traps, but popf (pop flag) does not trap for performance reasons.

This can be resolved using direct binary translation (DBT), which has the VMM inspecting all privileged x86 instructions to be executed and replacing non-trapping instructions with explicit traps. This was first implemented in Dynamo that led to the foundation of VMWare. See link. Transparency is not violated, since the memory accesses itself can also be trapped.

To get rid of the significant overhead from instruction inspection, other techniques include:

Paravirtualization
- Introduces direct hooks to the VMM, but requires pretranslation into paravirtualization-aware OS
- e.g. DIAG instruction that calls hypervisor
- Implementations: Xen, HyperV
Hardware-assisted virtualization
- Intel VT-x
  - MMU virtualization using EPT
  - Nested virtualization - VMCS
  - I/O virtualization - IOMMU
- Intel VT-d: DMA remapping

VMM attack surface

Commercial VMMs are not fully transparent, which makes it possible to detect (e.g. for malware to use anti-VM techniques, or copy protection via VM-duplication). Known as the red-pill attack, and there are many vectors:

Detection of emulation of old chipsets, e.g. VMWare emulating i440bx
Measurement of virtualization timing latencies
Other measurement channels: HotOS07

Despite inter-VM containment, because shared states are utilized, convert channels will exist. One example is via the use of shared cache latencies (e.g. sender deterministically performs a random memory access, and the receiver records bit 1 if long read time for fixed memory location, and 0 otherwise - CCS'09). Other channels via disk, I/O, virtualization latencies, exist as well.

Trusted Execution Environments

To thwart the possibility of malicious OS layers loading before user OS layers, we can implement a hardware root-of-trust via the use of Trusted Execution Environments (TEEs), to allow assumptions of malicious software.

Examples of TEE implementations: TPM 1.2 (2004), TrustZone (2005), Intel TXT (2006), TPM 2.0 (2014), Intel SGX (2016).

Remote attestation is the first primitive implemented in TEE. During loading of each OS layer, a measurement is triggered on them and hashes are derived from them and stored in the TPM registers. The CPU cannot tamper with the TPM directly. All these hashes are digitally-signed by the TPM using a hardware AIK signing key, and then sent to a remote server for hash validation.

Loading of hardware key

There are methods where the manufacturer cannot tell what the signing key is, using physically-uncloneable functions (PUFs), e.g. via randomness in CMOS manufacturing process.

Trusted boot is enforced by implementing this remote attestation with a local secure storage.

Data sealing can also be performed by computing the hash of the software requesting the seal, and storing both the hashes and the disk encryption key is both stored in the secure storage. Unseals only if hash matches.

With these primitives, we can construct root-of-trust applications, e.g. Windows BitLocker performs block-level encryption of volumes, Linux TrustedGrub, eXtensible Modular Hypervisor Framework.

As opposed to static RTM which performs verification at load time, dynamic RTM is also possible using Intel TXT SKINT and SENTER instructions. See also the use of enclaves in iOS.

Week 12

Access control

Some definitions:

Resource Objects: targets for protection
Authorities / Principals: subjects accessing the resources
Permissions: "access rights"
Isolation Environment: "domain in which program runs, including other programs run under the same privileges"

"Protection and access control in operating systems," Lampson72, defines the use of an explicit access control matrix to assign rights between users and resources. But this is not the only way to think about access control.

Android Platform Security Model:
- Each application is a unique authority, which has a unique signature for each binary generated by the developer.
- Applications are isolated via OS processes, vis a vis Apple Safari exploit by Miller in 2008 leading to a compromise of the entire system
  - (somehow in contrast, Windows allows injection of DLL into other processes, through well-crafted exploits)

Problems with this OS model is with the presence of programs with ambient authority, e.g. cp program that is used by almost every user and has the authority to write to any file on the Android system, which then violates the principle of least privilege.

Capabilities

A different way of implementing access rights is to use a capability model, which is a (pointer, metadata) tuple embedding access rights. These are sufficient and necessary for access to objects, and can be created or modified only by querying the security monitor. Such capabilities however must be unforgeable.

cp < foo.txt > bar.txt
     |         +--- capability 2
     +--- capability 1

Note that while the security monitor is effectively an ambient authority, the security monitor can be embedded in hardware or enclaves for stronger protection guarantees.

A direct comparison:

	Access control	Capabilities
Pre-specified policies	Needed, implemented centrally by security monitor	None, follows natural flow of access rights (e.g. programs can pass on capabilities)
Revocation	Yes, and easy to implement	Possible by changing the capability on the object, but requires some careful design
Ambient authority	Yes	No
Assumptions	Complete mediation (syscalls must make access checks)	Unforgeability

The bulk of syscalls introduced into later versions of Android seem to come from Java semantics, e.g. for stack inspection, isolation. The base number of syscalls originates from UNIX (BSD-variant) syscalls. In contrast, the ioctl library seems to introduce less syscalls, likely because of the use of the Java higher level language for programs.

Types of access control

Checks for enforcement (e.g. process sandboxing, IRMs, capabilities, virtualization, TEE) come in the form of policy mechanisms. Three main types of access controls:

Discretionary Access Control: Each owner decides access rules
- UNIX filesystems
- Android permissions during application install time
Mandatory Access Control:
- SELinux OS
- For Chrome, the primitives are cookies, etc. (resource objects), web origin (authorities), tabs (isolation environment)
  - No APIs for Chrome to selectively disable CORS
- Security labels, e.g. for military:
  - Bell-LaPadula policy (no-read-up, no-write-down) ensures confidentiality of higher-security objects
  - Biba policy (no-read-down, no-write-up) ensures integrity of higher-security objects
Role-Based Access Control
- Linux access mechanism

Difficulties in defining security policies

Confused deputy problem

Week 11

Reference monitors

Let's talk about the second line of defense, which is to minimize the impact of attacks that we know will happen. Two distinct methods:

Reference monitors: A piece of code that checks all references to an object (e.g. memory space). These are typically inline, i.e. compiled/instrumented into the program, and thus share the same privileges as the program.
System call sandboxing: A reference monitor with kernel privileges for protecting OS resource objects from an app.

Some checks that an inline reference monitor can check for includes:

Complete memory safety - "access memory objects in intended way"
Fault isolation - "each module only accesses specified data/code"
No foreign code - "execute only specified code"
Control flow integrity - "control transfers to legitimate points only"
System call sandboxing
(Code) pointers / data integrity
Data flow integrity

Software Fault Isolation (SFI)

We illustrate one such inline reference monitor, whose goal is fault isolation (also known as "address sandboxing").

Threat model: Attacker controls all memory values in M.

Main idea is to insert some inline instrument inside the program (e.g. via recompiling), such that all memory accesses are forced to point to the memory region M only.

; Original code
mov	(ebp),eax
 
; Modified (instrumented) code
call	(InRange(ebp))	; checks if in defined memory range
jz	error_label	; jumps if Z flag set
mov	(ebp),eax	; move memory pointed by ebp

The checking function call can be made faster, by constraining the possible values of the memory address (i.e. coerce rather than check). This is known as Fast-SFI.

; Original code
mov	(ebp),eax
 
; Faster instrumented code
and	ebp,0x00ff
or	ebp,0xbe00
mov	(ebp),eax

This is still problematic, since one can force instructions to jump directly to the mov (ebp),eax. This can be prevented by using dedicated registers, e.g. reg1, that is outside the control of the attacker - the argument is that regardless of which program instruction you jump to, the memory addressed by reg1 will *always* be safe.

mov	ebp,reg1
and	reg1,0x00ff
or	reg1,0xbe00
mov	(reg1),eax

This simplifies the verification for Fast-SFI, which only involves checking three points:

IRM instructions (after the coercion) exist before memory access
All memory accesses uses the dedicated register
Dedicated registers are only used in IRM instructions

Rule of thumb: We need to minimize the size of the TCB (trusted computing base) that we need to trust in order to verify the correctness of the program, and/or the security properties of the program.

Security principle 1: Minimize TCB

Reduce what one needs to trust, e.g., by separating verifier from the enforcement.

and a collorary:

Security principle 2: Separation of concerns

Separate the policy from its enforcement.

syscall sandboxing

Involves using the higher privileges of the OS/kernel to define syscall policies, e.g. "no exec() system calls", or even an exploit pattern "no exec()-after-read() system calls". This also has standard enumerations in Linux:

Linux seccomp: no syscalls, except exit(), sigreturn(), as well as read() and write() on already-open file descriptors
Linux seccomp-bpf: configurable policies, including syscall
Linux security modules (frameworks for defining policies have been becoming richer, including policies on syscall arguments)

Security principle 3: Principle of Least Privilege

Each component should only have smallest set of necessary privileges.

A collorary implies that allowlist policies (e.g. "seccomp") are preferred over denylist policies (e.g. "no exec-after-read") which are much broader.

Privilege separation

The problem in many software involves the philosophy of bundling functionality. We want to avoid cascading failures when one component fails:

SSH:
- Old SSH servers which has network logic (such as SSL tunneling) which when compromised, will yield filesystem access logic to the attacker).
- OpenSSH separates the two functionalities into two separate processes, such that one cannot directly funnel read files back through the pipe between network-files.
Drive-by download

The solution is the use of privilege separation, i.e. compartmentalization and assignment of least privilege. This principle is used in the Google Chrome browser:

Firefox uses a single-process
- One vulnerability leads to accessing all origins
Google Chrome separates the filesystem from the web code
- One process for the browser kernel (that handles the UI, filesystem access, etc.), and one process for each rendering engine per webpage (that handles HTML, CSS parsing, etc.)
- Security Architecture of Chromium browser

Week 10

Recall the goals of memory safety:

Create memory pointers via permitted operations (as per usual)
Only access memory allocated to pointer, with spatial (allocated memory range) and temporal (memory in scope) bounds

Implementation of spatial memory safety

We can achieve this either by compilation, binary rewriting, or more simply, insertion of memory tracking metadata and inline monitors (assuming such metadata can only be accessed by monitors, and cannot be corrupted).

Referent objects

Use of referent objects implementation for spatial memory safety involves using shadow memory, e.g. JK-Tree, Baggy BC.

This is achieved by creating a metadata entry on each object allocation - marking a valid bounds for each address, then checking each pointer arithmetic operation. Pointers are marked as unsafe if the pointer falls outside this bound (out-of-scope).
- Note that unsafe pointers can be made safe later (by shifting it back to within bounds).
- Implemented by checking during every pointer arithmetic operation, and assigned a bit to designate whether unsafe.
- This makes pointer dereferencing cheap.
Since the bounds is associated directly with an object address, pointers to different objects at the same address will share bounds
- This method does not fully ensure memory safety in aggregate language structures (e.g. classes, objects)

struct { char str[8]; void (*func)(); } node;
char* ptr = node.str;
strcpy(ptr, "overflow...");  // bound for 'node' used for 'ptr'

Fat pointers

The problem of overlapping objects in assigning bounds is avoided by using fat pointers, i.e. pointers that directly store the bounds metadata. One such implementation is Softbound.

Bounds are tracked on allocation, and bounds are checked only during dereference.
- No checks during pointer arithmetic (remember out-of-scope pointers are allowed under C spec)
Typecasts are allowed, bounds preserved by copying pointer metadata to new pointer.
Spatial safety also preserved for arbitrary type casting (e.g. badcasts)

Basic implementation for fat pointer

// allocation
ptr = malloc(size);
ptr_base = ptr;
ptr_bound = ptr + size;
if (ptr == NULL) ptr_bound = NULL;
 
// dereferencing
void _check(ptr, base, bound, size) {
    if ((ptr < base) || (ptr+size > bound)) abort();
}
_check(ptr, ptr_base, ptr_bound, sizeof(*ptr));
value = *ptr;

Unsafe typecasting under fat pointers

class Base { int x; }
class Derived: public Base { int y; }
 
int main() {
    Base* b = new Base();
    Derived* d = static_cast<Derived*>(b);  // badcast
    d->y = 0;  // bounds checked on 'd' are of 'b', so abort!
}
 
/*    Memory
    +---------+  <-- b_base
    |    x    |
    +---------+  <-- b_bound, d_base
    |    y    |  // y is not allocated to 'b'
    +---------+
*/

In both approaches, runtime overhead is rather high, up to twice as long.

Implementation of temporal safety

We also need to track the creation and destruction of pointers, to ensure that deallocated pointers are not accessed. Simple mechanisms like a quick and dirty pointer nullification does not achieve temporal safety (there can be other objects that reference the same memory space, e.g. char* p = (char*) malloc(SIZE); char* q = p;).

Lock-and-key

An alternative implementation is to have a matching lock and key values iff the memory object is temporally safe to access. An overview implementation is shown below. Note that this mechanism provides complete temporal safety, so long as program has spatial safety (to ensure data does not overflow).

Basic implementation of lock-and-key

// Allocation
ptr = malloc(size);
   ptr_key = next_key++;
   ptr_lock_addr = allocate_lock();
   *(ptr_lock_addr) = ptr_key;
   freeable_ptrs_map.insert(ptr_key, ptr);
 
// Assignment
newptr = ptr + offset;
   newptr_key = ptr_key;
   newptr_lock_addr = ptr_lock_addr;
 
// Dereferencing
   if (ptr_key != *ptr_lock_addr) abort();
value = *ptr;
 
// Deallocation
   if (freeable_ptrs_map.lookup(ptr_key) != ptr) abort();
   freeable_ptrs_map.remove(ptr_key);
free(ptr);
   *(ptr_lock_addr) = INVALID_KEY;
   deallocate_lock(ptr_lock_addr);

Performance overhead is still close to 200%, due dynamic checking.

Implementation of full memory safety

We can incorporate the concepts from both lock+key and fat pointers for full memory safety, but this incurs high performance overheads.

Smart pointers

This is the basis for C++17 smart pointers, and in Rust. Such smart pointers wrap pointer objects, with automatic allocation and deallocation, as well as preserve metadata about the resource object. Smart pointers own the resource object (and so can be solely responsible for freeing objects when the pointer goes out of scope), performs bounds-checking.

In C++17, this is implemented as:

unique_ptr: pointers that can be moved but cannot be copied.
- This prevents issues when two pointers point to the same memory, and one deallocates, results in a dangling pointer => temporal error.
- Having only one owner (unique accessor) at any time avoids this issue
shared_ptr: includes reference count metadata, and frees only when number of pointers pointing to resource goes to zero.

In Rust, smart pointers are used by default. This allows for static rule baked into the language to avoid runtime checking.

Ownership and move semantics of smart pointers
For concurrent memory safety:
- Owner can pass immutable reference to borrower function/block
  - Borrower only has read-access to data, to avoid data race
  - Borrower returns the reference back to the owner once done
- Owner can also pass mutable reference to borrower function/block
  - Borrower only has write-access (no read nor deallocate rights)
  - Rust dictates that the borrower cannot have any immutable references (borrower is the only writer of a single reference)

Week 9

Temporal memory safety

Week 8

Spatial memory safety

Memory safety, with some useful references:

Buffer overflows

Format string vulnerabilities

In the code below, because printf first argument is treated as a format string, the user can inject format strings to read directly from the stack, i.e. AAAA %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x.

#include <stdio.h>
 
int main() {
    srand(time(NULL));
 
    char localStr[100];
    int magicNumber = rand() % 100;
    int userCode = 0xBBBBBBBB;
 
    printf("Username? ");
    fgets(localStr, sizeof(localStr), stdin);
 
    printf("Hello ");
    printf(localStr);
    printf("What is the access code? ");
 
    scanf("%d", &userCode);
    if (userCode == magicNumber) {
        printf("You win!\n");
    }
}

Other integer overflow bugs

void f(char *input) {
    char dest_buffer[32];
    char input_len = strlen(input);  // integer overflow possible
 
    if (input_len < 32) {
        strcpy(dest_buffer, input);
        ...
    }
}

as well as the following:

int compare(void) {
    long a = -1;
    unsigned int b = 1;
    return a > b;
}
 
/*
N.B. Comparisons are performed on the same types, which then require
implicit type casting. On x86-64, since long is wider than unsigned int,
the function converts to signed long and returns zero. On GCC x86, long
is defined as the same size as unsigned int, so long is promoted into
an unsigned int, and the function returns 1.
*/

Because C tries to be as generic as possible with data widths, implementation details are often left to the compilers, which can have different behavior depending on the platform. For integers, since the use of two's complement allows the use of the same operation,

Code injection

There are usually constraints to modifying the underlying binary, e.g. read-execute-only. Separate channels exist to modify writable sections of the running code at run time, e.g. by modifying the stack. Suppose an attacker wants to inject a payload that forks a new bash shell:

int main(int argc, char *argv[]) {
    char *sh;
    char *args[2];
 
    sh = "/bin/bash";
    args[0] = sh;
    args[1] = NULL;
    execve(sh, args, NULL);
}

which translates into the following shell code

During the buffer overflow earlier, we:

inject said shellcode to store as instructions, and
hijack the control flow by overriding the return address to the start of the shellcode.

Since it is difficult to determine the correct offset to reach the start of the shellcode, one can prepend a series of NOP instructions to the shellcode (execution will reach shellcode as long as return address points to somewhere within this NOP sled).

return-to-libc attack

Sometimes we don't even require the attack payload, such as in the return-to-libc attack that changes the return address to the execve line within the C library.

Week 6

How to attack system?

HTTP downgrade attacks

Mixed content
- Loading of HTTP sub-resources in a HTTPS page
- Loading of HTTPS sub-resources in a HTTP page (susceptible to MITM replacement https -> http)
  - iframes over HTTP is also a similar vulnerability
Web cookies need to sent with a Secure keyword so that it is sent only over HTTPS
- Only confidentiality is preserved (attackers cannot read), but integrity is not (cross-origin resources)
Cross-site scripting (XSS) by injecting Javascript into security

UI confusion attacks

Use of X509 certificate with example.com\0.evil.com as the common name (string comparison in vulnerable SSL library)
IDN homograph attacks: Replacement of ASCII characters with similar-looking Unicode character
Drawing of misleading elements over IE status bars
Clickjacking using iframes: CSS transparency with opacity: 0.1; pointer-events: none

CA / certificate attacks

Obtaining leaf certificates can be done for free using Let's Encrypt
Detection and rejection of compromised CAs
- Use of web of trust (e.g. PGP) and the concept of emergence to build trust
- Detection of malicious certificates using an append-only Certificate Transparency log
- Fingerprinting of certificates to domain
- List of root CAs trusted
- Revocation of certificates using OCSP (susceptible to privacy issues, timing window of certificate expiry), OCSP stapling

Note

Interesting exercise hosted by Prateek: Ask how compromised CAs can be detected and avoided, then reflect and expand on suggestions.

Slides are a little old...? Certificate pinning is not commonly supported today.

Side-channel leakage

Side-channels are information leakage which are not intended, e.g. consider the RSA exponentiation algorithm:

$$$$ y = g^k \text{mod}\;N $$$$

can be decomposed into squaring-and-multiplying operations.

r0 = 1; r1 = g
for j = len(s)

Another example is the "Lucky Thirteen".

Compromise of crypto primitives

Concurrent encryption and MAC
- Encrypt-and-MAC (SSH): Susceptible to leakage of replayed messages
- MAC-then-Encrypt (SSL): Encryption is malleable, which can be susceptible with padding oracle attacks (e.g. AES-CBC)
- Encrypt-then-MAC (IPsec): Provably secure

This is a pretty good resource to start getting into some legacy attacks:

Week 5

Computational hardness

Perfect secrecy is impractical due to the large key size requirement. We reduce the threat model into an attacker that is computationally bounded, i.e. the adversary can use all deterministic, efficient algorithms that run in polynomial time. This allows for constructions using PRGs that are computationally hard,

$$$$ \text{Enc}(k, m) = \text{PRG}(k) \oplus m $$$$

We can show that pseudorandom generators (stream ciphers) exist from the existence of one-way functions.

Side-note on the use of hard one-way functions:

Large number factoring: RSA
Discrete logarithm: ElGamal
- Requires appropriately chosen multiplicative cyclic group G, otherwise the discrete log can be efficiently solved using Pohlig-Hellman algorithm

Key exchange protocols

Two pioneers in key exchange:

Diffie-Hellman: $$ k = g^{ab}\;(\text{mod}\;p) = g^{ba}\;(\text{mod}\;p) $$
Merkle

Perfect forward secrecy achieved by discarding messages that performs the key generation (what about an on-path attacker that can passively log all transactions?).

Note that Diffie-Hellman (DH) is not a secure key exchange protocol, since it is susceptible to an MitM attack. Adding of signatures in DH Station-to-Station (STS) to sign the messages being sent, we can achieve an authenticated key exchange.

With computational hardness, we can distill more primitives, i.e. digital signatures, and authenticated key DHKE + signatures.

SSL/TLS and HTTPS

A high level overview of SSL/TLS goes roughly as follows:

Negotiation phase (determine ciphersuite to use, making this protocol rather 'meta')
Key exchange phase (distill share symmetric key)
Symmetric key session generation
(re-negotiation)

This is a nice diagrammatic overview:

Some notes:

For the web itself, the root CAs contain roughly 38% GeoTrust, 20% Comodo, 13% GoDaddy, 10% GlobalSign.
Let's Encrypt has trust via better governance.
Append-only log via certificate transparency, to deter misbehaving CAs from issuing fake certificates.

When performing security analysis, we state the assumptions in the threat model (which should be likely to be true), and then formulate security arguments. In HTTPS, the assumptions of the threat model are:

User is using a secure channel

Week 4

How to define a threat model?

Instead of having an attack-centric viewpoint, we focus on the capabilities of the attacker instead.

Strong threat model of network attacker: All traffic between parties can be intercepted. Additionally, eavesdropper can listen in, while malicious actor can modify as well.

Basic cryptographic primitives are elements like encryption, MAC and digital signatures. These can, in turn, provide security goals such as confidentiality, integrity, and authenticity.

To discuss security, we can adopt the following procedure:

Define the (1) setup, (2) adversary capability, (3) security goal
Construct a situation that satisfy the definition

To ensure a secure channel, we need to distill some sort of advantage over any network attacker, either:

by assuming the eavesdropper has a poorer network
by sharing some pre-shared information (and use probabilistic mechanisms)

Symmetric key encryption

Correctness is straightforward to define -> decryption of encrypted message returns the original message. Security on the other hand needs to be less dependent on attacker capabilities, e.g. "attacker cannot guess key from ciphertext" is dependent on attacker prior knowledge (such as cryptanalysis methods).

Threat model:

Adversary knows all algorithms required to establish the secure channel
Adversary knows any distribution of the message set M

In the context of a chosen plaintext attacker (CPA), the security goal is then defined as perfect secrecy, where probability of guessing message is independent of the given ciphertext:

$$$$ Pr[m_{guess}|c] = Pr[m_{guess}] $$$$

Please read the work of Shannon's.

We first consider a minimal one-bit encryption scheme, where $$ M, K, C \in \{0,1\} $$, to derive a scheme that exhibits both correctness and perfect secrecy. The space of possible functions is $$ 2^4 $$, but not all share correctness nor perfect secrecy:

Function	Correctness	Perfect secrecy
AND	No: given c=0, m=?	No: given c=1, we know m=1
c=m	Yes	No: given c=x, we know m=x
c=k	No: given c=x, m=?	Yes
random	No	Yes
XOR	Yes: bijective function	Yes: with uniform k, each value of m equally probable with each value of c

Main limitation is that the key space must be larger than message space.

Key expansion

In practical deployment, we run key expansion over a much smaller pre-shared key. This no longer fulfills the perfect secrecy condition (why?).

In a weaker threat model where the attacker does not have infinite computation power (i.e. bounded by computation in polynomial time), a weaker form of secrecy can still be guaranteed.

Message Authentication Code (MAC)

Integrity is provided by MAC, which has the following definition over the message, key and tag space $$ (K,M,T) $$:

A signing algorithm $$ S(k,m) \rightarrow t \in T $$
A verification algorithm $$ V(k,m,t) \rightarrow \{"yes", "no"\} $$

The goal of the attacker is existential forgery under a chosen message attack (CMA), i.e. guess tag t from message m. The security goal is tend to

Consider again the one-bit MAC, where the tag space $$ |T| = 2^n = 1 $$. The signing algorithm cannot be XOR (as in OTP), because the key can be identified under a CMA (supply m = 0).

Note that the publicly available information now spans the space $$ (M,T) $$, which is two bits. The secret required to have a perfectly secure MAC is then to also have a two-bit secret. We can define the following signing algorithm with two one-bit secrets $$a$$ and $$b$$, and this falls under the family of universal hashes:

$$$$ t = (a\cdot{}m + b) \text{mod} 2 $$$$

Note again that if two combinations of (m,t) are generated with the same secrets a and b, then the secret can be extracted -> randomness over choice of (a,b).

Proof for \

This is the definit

Need to read through this again.

Week 2

Attacks on TCP

To elaborate on the securities afforded by the TCP protocol. Overview of TCP handshake - to authenticate the client to the server.

Some vulnerabilities:
1. Open multiple TCP connections to flood server memory (which needs to store sequence numbers)
2. Predictable / Intercepted sequence number, e.g. by spoofing the source IP and predicting the server's sequence number, the server is misled to think it's communicating with another client
Cannot rely on TCP for IP-authentication, but commonly used, e.g. in /etc/hosts

Need to distinguish what the threat model is when looking at vulnerabilities of the protocol, for example the TCP protocol is not designed to provide authentication on the OSI layer.

Attacks on DNS

Some vulnerabilities:

Modification of in-transit packets
DNS cache poisoning: Possible to also send DNS response packets for caching on local DNS
- Implementation of DNS uses predictable sequence number QID, so can send spoofed DNS response packets
- Attack flow: Allow user to send a DNS request to malicious website to retrieve the QID, then upon second request to another DNS server, reply with a spoofed NS/A IP record. This results in a race condition between malicious actor and authoritative server. Consider flushing the cache by making multiple DNS requests to the DNS resolver.
- Can be mitigated by using a cryptographically secure PRNG

Here's a nice resource on security of DNS.

Overview of firewalls

A stateless firewall inspects packets and checks if they match rules. These devices need to have high overall throughput. Note that the device on which the firewall sits must also be able to support the TCP reconstruction process (why?). Other types of firewalls include stateful firewalls and application-layer firewalls.

On Linux, firewall is implemented within the netfiler framework (iptables), with different hooks for packets, e.g. INPUT, FORWARD, POST-ROUTING.

Some vulnerabilities:

DDoS by routing many requests to the firewall
Does not protect against internal network requests

Some mitigations:

Fingerprinting of packet payloads - counterattack by splitting packets (and use of subsignatures, etc.)

Firewall threat model

Adversary capabilties:

Adversary can send malicious network packets
Adversary is outside the network perimeter

Assumptions:

The network perimeter is correctly defined, which is not necessarily true in the context of "Bring-Your-Own-Device"
Firewall is not compromised
Firewall sees the same data as the endpoint, so for deeper packet inspection, the firewall itself needs to perform packet reconstruction as well (there is still always a semantic gap between firewall and end application)
Defender's policy can distinguish good from bad traffic

Weakness is this threat model: defender needs to know every specific attack pattern for setting policy, and adversary can easily evade these assumptions.

The point of this section is to emphasize that thinking of the threat model conceptually is important for defending against attacks.

Week 1

Module admin

Lecturer: Prateek Saxena (consider the following publications: HTML5 privilege separation (Dropbox), Finding and preventing script injection vulnerabilities).

Content focuses on exploring the attack surface across multiple OSI layers, examples of attacks on each, and more importantly, the defenses that can be deployed. Different threat models can be explored¹⁾.

Assumed knowledge:

Memory safety exploits
Web security exploits
Programming in C
Access control in OS

We need to address not the perceived security risks (e.g. infant abduction vs infant mortality), but address the real security risks instead. Q: What security measure is worth deploying?

Identify the weakest links (i.e. attack surface)
Perform cost-benefit analysis (e.g. ease of deployment vs ease of bypassing security)

Other important information:

Lecture slides are noisy facts - there are some errors
Readings: Goodrich, Computer Security
Project: Implement a multi-threaded program using Rust

Introduction

The internet consisting of autonomous systems (AS) which function like a cloud of routers, that can intercommunicate using the BGP protocol (seems like an internal routing table within routers, with inter-router communication over 179/tcp). This is related to the concept of gateways.

Attacks on BGP

A good high-level overview of BGP. Some vulnerabilities:

BGP hijacking either by advertising smaller more-specific subnets or faster routes. This additionally allows traffic sniffing on that gateway router.
Swamping a BGP link and force traffic onto a different AS
Re-advertise withdrawn routes

There are mitigations, from Cloudflare advertises its nodes as a single ASN globally, then

Attacks on IP Procotol

No integrity: Data and headers can be polluted
- Compromising the source IP address can cause denial-of-service
  - Anonymous infections can be performed (e.g. Slammer worm) by source IP randomization
  - Smurf attack can also be performed by modifying the source IP in ICMP ping request
- The checksum is used for error correction, i.e. the threat model is not adversarial
No confidentiality (packet sniffing can be performed)

¹⁾

Threat model defines the desired security property / goal, the capabilities of the attacker, as well as assumptions of the setup