Week 6.2
This might be a recap: The goal now is to map from $$\rho_{XB}$$ to $$\rho_{ZB} \equiv \rho_{f(X)B}$$, where $$B$$ is a (possibly correlated) side-information subsystem, so as to extract randomness that is independent of $$B$$.
Explicitly, $$\rho_{XB} = \sum_x p_x |x\rangle{}\langle{}x|\otimes{} \rho_B^x$$
We have the initial guarantee that $$H_{\text{min}}(X|B) \ge{} \ldots{}$$
In order to make the process independent of $$\rho_{XB}$$, we use random functions. Given a seed $$S$$ with some pmf $$p_S$$ (usually taken to be uniform), we apply the function $$f_S$$ to $$X$$.
The goal is to have $$\Delta{}(\rho_{ZBS}, \Pi_Z \otimes{}\rho_B \otimes{} \rho_S) = E_S [\Delta{}(\rho_{f_S(X)B}, \Pi_Z\otimes{} \rho_B)]\le{} \epsilon{}$$.
$$\rho_{XBS} = \rho_{XB} \otimes{} \sum_s p_s |s\rangle{}\langle{}s|$$
$$\rho_{ZBS} = \rho_{f_S(X)BS} = \sum_{x,s} p_x p_s |f_s(x)\rangle\langle{}f_s(x)|_Z \otimes{} \rho_B^x \otimes{} |s\rangle\langle{}s|$$
Directly outputting $$S$$ as the new randomness would be cheating, so we instead account for it explicitly as the seed of the random function.
Definition: a family of functions $$f_S: X \rightarrow Z$$ together with a seed distribution $$p_S$$ is two-universal if the collision probability is small, $$Pr_S[f_S(x) = f_S(x')] \le{} d_Z^{-1} \quad{}\forall{} x \ne x'$$.
Using this, we can choose from a much smaller family of random functions (so that the constraint on the size of the seed can be relaxed a little).
Consider the finite field $$\mathbf{F}_{2^n}$$ and non-zero $$s$$ from the field. We can construct $$f_s: \{0,1\}^n \rightarrow \{0,1\}^l, \quad{} l\le{}n$$.
In other words, $$x \mapsto{}$$ the first $$l$$ bits of $$x\cdot{}s$$ (multiplication in $$\mathbf{F}_{2^n}$$), so a collision $$f_s(x) = f_s(x')$$ means $$(x-x')s$$ is one of the $$2^{n-l}-1$$ non-zero field elements whose first $$l$$ bits vanish.
As $$s$$ ranges over all non-zero field elements, so does $$(x-x')s$$ (since $$x \ne x'$$), so the collision condition is satisfied exactly $$(2^{n-l} - 1)$$ times.
Therefore, $$Pr[f_S(x)=f_S(x')] = \frac{2^{n-l} - 1}{2^n - 1} \le{} 2^{-l}$$
For comparison, the family of all functions from $$\{0,1\}^n$$ to $$\{0,1\}^l$$ has size $$(2^l)^{2^n}$$, so a uniform seed over it would be exponential in $$n$$; this construction needs only $$|S| = 2^n - 1$$, i.e. $$\text{log}|S| \approx{} n$$.
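A toy numerical check of this construction (a sketch; the parameters $$n=4$$, $$l=2$$ and the irreducible polynomial $$x^4+x+1$$ are illustrative choices, not from the lecture):

```python
# Two-universal hashing f_s(x) = first l bits of x*s in GF(2^n).
# Illustrative parameters: n = 4, l = 2, modulus x^4 + x + 1 (0b10011).

def gf_mult(a, b, n=4, poly=0b10011):
    """Carry-less multiplication of a, b modulo an irreducible polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << n):   # degree reached n: reduce
            a ^= poly
    return result

def f(s, x, n=4, l=2):
    """Hash output: keep the top l bits of the field product x*s."""
    return gf_mult(x, s, n) >> (n - l)

n, l = 4, 2
seeds = list(range(1, 2 ** n))   # all non-zero field elements
collision = max(
    sum(f(s, x) == f(s, xp) for s in seeds) / len(seeds)
    for x in range(2 ** n) for xp in range(2 ** n) if x != xp
)
print(collision)   # (2^{n-l}-1)/(2^n-1) = 0.2 <= 2^{-l} = 0.25
```

Every pair $$x \ne x'$$ attains exactly the same collision probability $$(2^{n-l}-1)/(2^n-1)$$, comfortably below $$2^{-l}$$.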
Using the Hölder inequality, we have another inequality $$||ABC||_1 \le{} ||A||_4 ||B||_2 ||C||_4$$. This allows us to bound the trace distance:
\begin{align*}
2\Delta{}(\rho_{f_S(X)B}, \Pi_Z\otimes{} \rho_B)
&= || (\rho_B^{1/4}\rho_B^{-1/4}) (\rho_{f_S(X)B} - \Pi_Z \otimes{} \rho_B)
(\rho_B^{-1/4}\rho_B^{1/4})||_1 \\
&\le{} ||I_Z \otimes{} \rho_B^{1/4} ||_4 ||\rho_B^{-1/4} (\rho_{f_S(X)B} - \Pi_Z \otimes{} \rho_B) \rho_B^{-1/4} ||_2 \cdots{} \\
&= \sqrt{d_Z tr(\rho_B^{-1/2} (\rho_{f_S(X)B} - \Pi_Z \otimes{} \rho_B)^2 \rho_B^{-1/2} )} \\
&= \sqrt{d_Z tr(\rho_B^{-1/2} \rho_{f_S(X)B} \rho_B^{-1/2} \rho_{f_S(X)B} ) - 1}
\end{align*}
Some notes:
Expanding the square gives four terms: the $$\rho_{f_S(X)B}^2$$ term stays, the two cross terms each contribute $$-1/d_Z$$ and the $$(\Pi_Z\otimes{}\rho_B)^2$$ term contributes $$+1/d_Z$$ (using $$tr_Z \rho_{f_S(X)B} = \rho_B$$ and $$\Pi_Z = I_Z/d_Z$$); the net $$-1/d_Z$$ inside the trace becomes the $$-1$$ outside once multiplied by $$d_Z$$.
Again, $$\rho_{f_S(X)B} = \sum_z \sum_x \mathbf{1}\{f_S(x)=z\}\,p_x |z\rangle\langle{}z| \otimes{} \rho_B^x$$
Taking the trace forces the two $$z$$ indices to coincide.
\begin{align*}
&E_S[tr(\rho_B^{-1/2} \rho_{f_S(X)B} \rho_B^{-1/2} \rho_{f_S(X)B} ) ] \\
&= E_S[\sum_z \sum_{x,x'} p_x p_{x'} \mathbf{1}\{ f_S(x) = z \}\mathbf{1}\{ f_S(x') = z \} tr(\rho_B^{-1/2}\rho_B^x \rho_B^{-1/2}\rho_B^{x'})] \\
&= E_S[\sum_{x,x'} p_x p_{x'} \mathbf{1}\{ f_S(x) = f_S(x') \}tr(\rho_B^{-1/2}\rho_B^x \rho_B^{-1/2}\rho_B^{x'})] \\
&= \sum_{x} p_x^2 tr(\rho_B^{-1/2}\rho_B^x \rho_B^{-1/2}\rho_B^{x}) + \sum_{x\ne{}x'} p_x p_{x'} E_S[\mathbf{1}\{f_S(x) = f_S(x')\}] tr(\rho_B^{-1/2}\rho_B^x \rho_B^{-1/2}\rho_B^{x'}) \\
&\le{} \sum_{x} p_x^2 tr(\rho_B^{-1/2}\rho_B^x \rho_B^{-1/2}\rho_B^{x}) + \frac{1}{d_Z} \sum_{x\ne{}x'} p_x p_{x'} tr(\rho_B^{-1/2}\rho_B^x \rho_B^{-1/2}\rho_B^{x'}) \\
&\le{} tr(\rho_B^{-1/2}\rho_{XB}\rho_B^{-1/2}\rho_{XB}) + \frac{1}{d_Z} \\
&\le{} 2^{-H_\text{min}(X|B)_\rho} + \frac{1}{d_Z}
\end{align*}
We initially split into the diagonal ($$x=x'$$) and off-diagonal terms. The off-diagonal sum can be completed to the full sum because every term $$tr(\rho_B^{-1/2}\rho_B^x \rho_B^{-1/2}\rho_B^{x'})$$ is non-negative (trace of a product of psd operators), and by definition of the marginal $$\rho_B = \sum_x p_x \rho_B^x$$ the full sum evaluates to $$\sum_{x,x'} p_x p_{x'} tr(\rho_B^{-1/2}\rho_B^x \rho_B^{-1/2}\rho_B^{x'}) = tr(\rho_B^{-1/2}\rho_B \rho_B^{-1/2}\rho_B) = 1$$.
\begin{align*}
E_S[2\Delta{}(\rho_{f_S(X)B}, \Pi_Z\otimes{} \rho_B)]
&\le{} \sqrt{d_Z E_S[tr(\rho_B^{-1/2} \rho_{f_S(X)B} \rho_B^{-1/2} \rho_{f_S(X)B} )] - 1} \\
&\le{} \sqrt{d_Z 2^{-H_\text{min}(X|B)_\rho}}
\end{align*}
Interpretation: we want $$\frac{1}{2}\sqrt{d_Z 2^{-H_\text{min}(X|B)}} \le{} \epsilon{}$$.
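Rearranging this condition gives the extractable output length (a quick sketch; the numbers are illustrative):

```python
# From (1/2) * sqrt(2^l * 2^(-Hmin)) <= eps with d_Z = 2^l, solve for l:
# l <= Hmin - 2*log2(1/eps) + 2.
import math

def extractable_bits(h_min, eps):
    """Largest integer output length satisfying the error bound."""
    return math.floor(h_min - 2 * math.log2(1 / eps) + 2)

l = extractable_bits(100, 1e-6)
print(l)                                          # roughly Hmin - 2*log2(1/eps)
print(0.5 * math.sqrt(2 ** l * 2 ** -100))        # error, indeed below 1e-6
```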
We use the smooth-min entropy $$H_\text{min}^\epsilon{}(A|B)_\rho = \max_{\tilde{\rho} \approx{}_\epsilon{} \rho} H_\text{min} (A|B)_{\tilde{\rho}}$$
Here $$\tilde{\rho} \approx{}_\epsilon{} \rho$$ means $$P(\tilde{\rho},\rho)\le{} \epsilon{}$$ (purified distance).
In the context of QKD, the term on the right is usually not calculated exactly, because we need to find a bound without knowing what state Eve could have.
\begin{align*}
E_S[2\Delta{}(\rho_{f_S(X)B}, \Pi_Z\otimes{} \rho_B)]
&\le{} \sqrt{d_Z 2^{-H_\text{min}(X|B)_\rho}} \\
&\le{} 2\epsilon{} + \sqrt{d_Z 2^{-H^\epsilon{}_\text{min}(X|B)_\rho}}
\end{align*}
Week 6.1
Week 5.2
Week 5.1
Surprisal
We define surprisal $$s(p_x)$$ with the following desirable properties, given a random variable $$X \in \mathcal{X}$$ and pmf $$p_x = P_X(x)$$.
$$s(1) = 0$$, monotonically decreasing with $$p_x$$ and additivity $$s(p_xp_y) = s(p_x) + s(p_y)$$
This naturally leads to a formulation of Shannon entropy as $$H(X) = E[s(p_X(X))] = \sum_x p_x \log \frac{1}{p_x}$$
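A minimal numerical companion (the pmfs are my own examples):

```python
# Shannon entropy as expected surprisal: H(X) = sum_x p_x log2(1/p_x).
import math

def surprisal(p):
    return math.log2(1 / p)

def shannon_entropy(pmf):
    # 0 log(1/0) = 0 by convention, hence the p > 0 filter.
    return sum(p * surprisal(p) for p in pmf if p > 0)

print(shannon_entropy([0.5, 0.5]))   # 1.0: one fair coin flip
print(shannon_entropy([0.25] * 4))   # 2.0: two independent fair flips
# Additivity of surprisal: s(p*q) = s(p) + s(q).
print(math.isclose(surprisal(0.5 * 0.25), surprisal(0.5) + surprisal(0.25)))
```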
Entropy
The quantum entropy generalizes Shannon entropy for classical states, and should hold unitary invariance $$H(\rho) = H(U\rho{}U^\dagger{})$$.
We can work in the eigenbasis (where the eigenvalues play the role of probabilities) $$\rho = \sum_i \lambda_i |i\rangle\langle{}i|$$ (and if not, rotate into it). It then follows that $$H(\rho) = \sum_i \lambda_i \log \frac{1}{\lambda_i}$$ as well.
Monotone under (unital) noise channels $$\epsilon(I) = I$$, e.g. depolarizing channels, i.e. $$H(\rho) \le{} H(\epsilon(\rho))$$
If one adds noise to a state, the uncertainty about it increases.
Amplitude-damping channels do not fall under this, since they push the state towards $$|0\rangle{}$$ instead of the identity.
It is convenient to represent the entropy of the (sub-)system rather than specific states itself: $$H(A)_\rho \equiv H(\rho_A)$$
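A numpy sketch of these properties (the states and the Hadamard rotation are my own examples):

```python
# Von Neumann entropy H(rho) = -sum_i lambda_i log2(lambda_i), computed in
# the eigenbasis, plus a check of unitary invariance.
import numpy as np

def von_neumann_entropy(rho):
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-12]          # 0 log 0 = 0 by convention
    return float(-np.sum(eigs * np.log2(eigs)))

pure = np.array([[1.0, 0.0], [0.0, 0.0]])     # |0><0|
mixed = np.eye(2) / 2                         # maximally mixed qubit
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard (a unitary)

print(von_neumann_entropy(pure))              # 0.0
print(von_neumann_entropy(mixed))             # 1.0
print(von_neumann_entropy(H @ pure @ H.T))    # 0.0: unitary invariance
```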
There are von Neumann measurements, i.e. (Rank-1) projective measurements:
Interesting point: in classical systems, the entropy increases with the number of systems, e.g. $$H(XY) \ge{} H(X)$$, whereas in the case of purification, the larger system (of a singular pure state) has an entropy of 0, even though the smaller system can have non-zero entropy.
Given pure state $$|\rho\rangle{}_{AB}$$ with $$\rho_A,\rho_B$$ marginals, we have $$H(A) = H(B) \equiv H(\rho_B)$$ and $$H(AB) = 0$$.
Conditional entropy
We define the conditional entropy $$H(A|B) := H(AB) - H(B)$$,
$$H(A|B)$$ can be negative! So the conditional entropy is better described as the amount of entropy one must add to $$H(B)$$ to obtain the entropy of the joint system.
This can also be seen as a measure of uncertainty about $$A$$ from the perspective of an observer with access to $$B$$.
The data-processing inequality (DPI) is the statement: $$H(A'|B')_{(\epsilon\otimes{}\mathcal{F})(\rho)} \ge{} H(A|B)_\rho$$, for a unital quantum channel $$\epsilon_{A\rightarrow{}A'}$$ and quantum channel $$\mathcal{F}_{B\rightarrow{}B'}$$.
DPI represents monotonicity, with the former taking a more operational perspective.
Applying a quantum channel may gradually make the state more separable, pushing $$H(A|B)$$ towards non-negative values (it is negative only for entangled states).
We also have $$H(A|BC) \le{} H(A|B)$$, which is DPI on $$tr_C$$.
Another perspective: If we drop some side-information on $$A$$, our uncertainty on $$A$$ increases.
Can also be interpreted as strong sub-additivity (joint entropy is smaller than the sum of its parts) and is equivalent (although this was rather hard to prove formally):
\begin{align*}
H(A|BC) &\le{} H(A|B) \\
H(ABC) - H(BC) &\le{} H(AB) - H(B) \\
H(AC|B) &\le{} H(A|B) + H(C|B)
\end{align*}
If $$\rho_{ABC}$$ is pure, we get the duality $$H(A|B) = -H(A|C)$$. The proof uses the idea that $$H(AB) = H(C)$$ since both parts complete the joint state:
$$H(A|B) = H(AB) - H(B) = H(C) - H(AC) = -H(A|C)$$.
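The negativity of conditional entropy can be checked numerically: for a Bell state, $$H(AB)=0$$ while $$H(B)=1$$, so $$H(A|B)=-1$$ (a numpy sketch with helper names of my choosing):

```python
# H(A|B) = H(AB) - H(B) is negative for a Bell state.
import numpy as np

def entropy(rho):
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-12]
    return float(-np.sum(eigs * np.log2(eigs)))

def trace_out_first(rho_ab, dA, dB):
    """Partial trace over the first subsystem of a (dA*dB)-dim state."""
    return np.trace(rho_ab.reshape(dA, dB, dA, dB), axis1=0, axis2=2)

bell = np.zeros(4)
bell[0] = bell[3] = 1 / np.sqrt(2)           # (|00> + |11>) / sqrt(2)
rho_ab = np.outer(bell, bell)
rho_b = trace_out_first(rho_ab, 2, 2)

print(entropy(rho_ab) - entropy(rho_b))      # H(A|B) = -1.0
```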
We also define mutual information $$I(A:B) = H(A) - H(A|B)$$, which is a measure of correlation (applies to both classical and quantum).
Bound: $$0\le{} I(A:B) \le{} 2\log d$$
We also have another DPI, where correlations decrease after applying local operations $$\epsilon_{A\rightarrow{}A'}, \mathcal{F}_{B\rightarrow{}B'}$$: $$I(A':B')_{\epsilon\otimes{}\mathcal{F}(\rho)} \le{} I(A:B)_\rho$$
Relative entropy
For two states $$\rho$$ and $$\sigma$$, we define the relative entropy $$D(\rho||\sigma) = tr(\rho(\log \rho - \log \sigma))$$.
More generally, only $$\sigma \ge{}0$$ needs to hold.
This allows us to derive all the data-processing inequalities: $$D(\rho||\sigma) \ge{} D(\epsilon(\rho)||\epsilon(\sigma))$$ for any quantum channel $$\epsilon$$
Thoughts
Most of the difficulty grappling with these concepts are:
Week 4.2
CHSH game
CHSH quantum strategy: $$\rho_{A'B'} = |\Psi_0\rangle{}\langle{}\Psi_0|$$, $$M^0 = \{|0\rangle{}\langle{}0|, |1\rangle{}\langle{}1|\}, M^1 = \{|+\rangle{}\langle{}+|,|-\rangle{}\langle{}-|\}$$, $$N^0 = \{|\pi/8\rangle{}\langle{}\pi/8|, |5\pi/8\rangle{}\langle{}5\pi/8|\}, N^1 = \{|-\pi/8\rangle{}\langle{}-\pi/8|,|3\pi/8\rangle{}\langle{}3\pi/8|\}$$
Given Alice $$\alpha$$ and Bob $$\beta$$, $$tr(|\alpha\rangle\langle{}\alpha|\otimes|\beta\rangle\langle{}\beta| \cdot{} |\Psi_0\rangle\langle{}\Psi_0|) = \frac{1}{2}|\langle{}\alpha|\beta\rangle|^2 = \frac{1}{2}\cos^2(\alpha-\beta)$$
This can be solved simply by parameterizing all of the terms in the computational basis, or alternatively by writing the Bell state in the $$\{|\alpha,\alpha\rangle{},|\alpha+\pi/2,\alpha+\pi/2\rangle{}\}$$ basis.
An even slicker way of solving this: move $$|\beta\rangle{}\langle{}\beta|$$ over to Alice's side via the transpose trick, so the trace becomes $$\frac{1}{2}tr(|\alpha\rangle{}\langle{}\alpha|\beta\rangle{}\langle{}\beta|) = \frac{1}{2}|\langle{}\alpha|\beta\rangle{}|^2$$.
This makes intuitive sense: if both angles are the same, then the two are perfectly correlated (and perfectly anti-correlated with a $$\pi/2$$ difference).
$$p_{\text{win}}^q(\text{CHSH}) = \cos^2(\pi/8) = \frac{1}{2}+\frac{1}{2\sqrt{2}} \approx{} 0.85\ldots{}$$
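Plugging the measurement angles above into $$P(a,b|x,y) = \frac{1}{2}\cos^2(\alpha-\beta)$$ can be verified numerically (a numpy sketch; the angle bookkeeping is my own):

```python
# CHSH quantum strategy: Alice's bases at angles {0, pi/4}, Bob's at
# {pi/8, -pi/8}, on the Bell state (|00> + |11>)/sqrt(2).
import numpy as np

def proj(theta):
    """Rank-1 projector onto the real qubit state at angle theta."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

bell = np.zeros(4)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho = np.outer(bell, bell)

alice = {0: 0.0, 1: np.pi / 4}            # basis angle for input x
bob = {0: np.pi / 8, 1: -np.pi / 8}       # basis angle for input y

p_win = 0.0
for x in (0, 1):
    for y in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                if (a + b) % 2 != (x * y):
                    continue              # losing outcome: a + b != xy (mod 2)
                Ma = proj(alice[x] + a * np.pi / 2)   # outcome 1: orthogonal state
                Nb = proj(bob[y] + b * np.pi / 2)
                p_win += 0.25 * np.trace(np.kron(Ma, Nb) @ rho)
print(p_win)   # cos^2(pi/8) = 1/2 + 1/(2 sqrt(2)) ~ 0.8536
```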
We can quickly show that projective measurements suffice using Naimark dilation: Given a POVM $$\{M_x\}_x$$ on A, there exists a projective measurement (PVM) $$\{P_x\}_x$$ and an isometry $$U:A\rightarrow{}AX$$ such that $$\langle{}M_x,\rho\rangle = \langle{}P_x,U\rho{}U^\dagger{}\rangle{}$$.
The isometry takes the state into a larger space.
One such construction: $$P_x = I_A\otimes{}|x\rangle{}\langle{}x|$$ and $$U= \sum_x \sqrt{M_x} \otimes{} |x\rangle{}$$.
We easily see $$P_x$$ is a projector. We also verify the isometry $$U^\dagger{}U = \sum_{x,x'} \sqrt{M_x}\sqrt{M_{x'}}\langle{}x'|x\rangle{} = \sum_x M_x = I$$.
Writing the inner product, $$\langle{}P_x,U\rho{}U^\dagger{}\rangle{} = \langle{}U^\dagger{}P_xU,\rho\rangle{} = \langle{}M_x,\rho\rangle{}$$.
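The construction can be verified numerically for a concrete POVM (a sketch; the qubit trine POVM and the helper `psd_sqrt` are my illustrative choices):

```python
# Naimark dilation check: P_x = I (x) |x><x|, U = sum_x sqrt(M_x) (x) |x>.
# Example POVM: the qubit "trine", M_k = (2/3)|t_k><t_k|.
import numpy as np

def psd_sqrt(M):
    """Square root of a Hermitian psd matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.conj().T

kets = [np.array([np.cos(k * np.pi / 3), np.sin(k * np.pi / 3)]) for k in range(3)]
povm = [2 / 3 * np.outer(v, v) for v in kets]   # three effects summing to I

U = sum(np.kron(psd_sqrt(M), np.eye(3)[:, [x]]) for x, M in enumerate(povm))
print(np.allclose(U.conj().T @ U, np.eye(2)))   # U is an isometry

rho = np.array([[0.7, 0.2], [0.2, 0.3]])        # an arbitrary test state
ok = all(
    np.isclose(np.trace(M @ rho),
               np.trace(np.kron(np.eye(2), np.outer(np.eye(3)[x], np.eye(3)[x]))
                        @ U @ rho @ U.conj().T))
    for x, M in enumerate(povm)
)
print(ok)   # <M_x, rho> and <P_x, U rho U+> agree for every outcome
```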
This implies that the CHSH game has additional structure: $$p_\text{win}^q$$ can be achieved with pure states and projective measurements alone, so in searching for the optimum we need not enumerate all possible strategies.
The upper bound can be derived and is given by Tsirelson's bound, using $$P(a,b|x,y) = \langle{}\phi|P_{a}^x\otimes{}Q_{b}^y|\phi\rangle{}$$:
In the case of $$x=y=0$$, the winning probability is $$\sum_{a,b}\mathbf{1}\{a=b\}P(a,b|0,0) = \frac{1}{2}+\frac{1}{2}\sum_{a,b} (-1)^{a+b} P(a,b|0,0) \equiv \frac{1}{2} + \frac{1}{2}\langle{}\phi|C^0 \otimes{}D^0|\phi\rangle{}$$, where $$C^x = \sum_a (-1)^a P_a^x$$ and $$D^y = \sum_b (-1)^b Q_b^y$$ are $$\pm{}1$$-valued observables.
The winning probability then becomes:
\begin{align*}
p_\text{win}^q
&= \frac{1}{2} + \frac{1}{8}\langle{}\psi|C^0\otimes{}D^0 + C^0\otimes{}D^1 + C^1 \otimes{}D^0 - C^1 \otimes{}D^1|\psi\rangle \\
&\equiv{} \frac{1}{2} + \frac{1}{8}\langle{}\psi|B|\psi\rangle \\
&\le{} \frac{1}{2} + \frac{1}{8}||B||_\infty{} \\
&= \frac{1}{2} + \frac{1}{8}\sqrt{||B^2||_\infty{}}
\end{align*}
B is called the Bell observable.
$$B^2 = 4I - [C^0,C^1]\otimes{}[D^0,D^1]$$
Using the triangle inequality and submultiplicativity ($$||AB||_\infty{} \le{} ||A||_\infty{}||B||_\infty{}$$) gives us $$||[C^0,C^1]||_\infty{} \le{} 2||C^0||_\infty{}||C^1||_\infty{} = 2$$
Using the fact $$||A\otimes{}B||_\infty{} = ||A||_\infty{} ||B||_\infty$$, we finally get $$||B^2||_\infty{}\le{} 4 + 4 = 8$$
This also gives the probability $$p_\text{win}^q \le{} \frac{1}{2} + \frac{1}{8}\sqrt{||B^2||_\infty} \le{} \frac{1}{2} + \frac{1}{2\sqrt{2}}$$.
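The bound is saturated by the quantum strategy; a small numpy check (the choice $$C^0=Z$$, $$C^1=X$$, $$D^{0,1}=(Z\pm{}X)/\sqrt{2}$$ is a standard optimal instantiation, my concrete example):

```python
# Tsirelson's bound is tight: with optimal +-1 observables, ||B||_inf = 2*sqrt(2).
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=float)
Z = np.array([[1, 0], [0, -1]], dtype=float)
C0, C1 = Z, X
D0, D1 = (Z + X) / np.sqrt(2), (Z - X) / np.sqrt(2)

# Bell observable B = C0 D0 + C0 D1 + C1 D0 - C1 D1 (tensor products).
B = (np.kron(C0, D0) + np.kron(C0, D1)
     + np.kron(C1, D0) - np.kron(C1, D1))

norm_B = np.linalg.norm(B, 2)      # spectral (infinity-Schatten) norm
print(norm_B)                      # 2*sqrt(2): the operator-norm bound is met
print(0.5 + norm_B / 8)            # p_win = 1/2 + 1/(2*sqrt(2))
```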
Mermin magic square game
The game follows $$X=Y=\{1,2,3\}, A=B = \{-1,1\}^{\times{}3}$$ and $$V(a,b,x,y) = \mathbf{1}\{a_1a_2a_3=1, b_1b_2b_3=-1, a_y = b_x\}$$. This is better visualized as a 3x3 square, where Alice fills in a row and Bob a column such that their entries agree at the intersection.
The optimal classical strategy is to fill the square:
x \ y | 1 | 2 | 3 |
1 | 1 | 1 | 1 |
2 | -1 | 1 | -1 |
3 | 1 | -1 | ? |
where the last ? cannot be agreed on between Alice and Bob. This thus gives $$p_\text{win}^{\text{cl.}} = \frac{8}{9}$$.
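The classical value $$8/9$$ can be confirmed by brute force over deterministic strategies (a sketch):

```python
# Brute-force the classical value of the magic square game: Alice answers
# with rows of parity +1, Bob with columns of parity -1; they win when the
# entries agree at the intersection (x, y).
from itertools import product

rows = [r for r in product([1, -1], repeat=3) if r[0] * r[1] * r[2] == 1]
cols = [c for c in product([1, -1], repeat=3) if c[0] * c[1] * c[2] == -1]

best = 0
for A in product(rows, repeat=3):        # A[x] = row Alice answers on input x
    for B in product(cols, repeat=3):    # B[y] = column Bob answers on input y
        wins = sum(A[x][y] == B[y][x] for x in range(3) for y in range(3))
        best = max(best, wins)
print(best / 9)   # 8/9 ~ 0.888...
```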
For the quantum game, we rely on:
For commuting (Hermitian) observables $$H,K$$ with $$[H,K]=0$$, there exists an ONB $$\{|\psi_i\rangle{}\}_i$$ with $$H|\psi_i\rangle = \lambda_i|\psi_i\rangle{}$$ and $$K|\psi_i\rangle = \mu_i|\psi_i\rangle{}$$.
The idea is that the entries within each row should commute with each other. The operators $$I\otimes{}Z$$, $$Z\otimes{}I$$ and $$Z\otimes{}Z$$ all commute, and their product is $$I$$.
This means whichever state we measure, the product of the observables along the row will always give $$I$$ (the parity is always 1).
Filling the rest of the square gives the following, where the parity of each row is $$I$$ and column is $$-I$$:
x \ y | 1 | 2 | 3 |
1 | $$I\otimes{}Z$$ | $$Z\otimes{}I$$ | $$Z\otimes{}Z$$ |
2 | $$X\otimes{}I$$ | $$I\otimes{}X$$ | $$X\otimes{}X$$ |
3 | $$-X\otimes{}Z$$ | $$-Z\otimes{}X$$ | $$Y\otimes{}Y$$ |
The joint eigenbasis consists of computational-basis states, e.g. $$|\theta_0\rangle{} = |00\rangle{}$$. To win the game, Alice and Bob need to share two maximally-entangled pairs. Let $$O_{xy}$$ represent the observable in cell (x,y), and the shared state be $$|\psi\rangle{} = |\Psi_0\rangle{}\otimes{}|\Psi_0\rangle{}$$; the winning probability is thus,
\begin{align*}
p_\text{win}^q
&= \frac{1}{2}(1 + \langle{}O_{xy}\otimes{}O_{xy},|\psi\rangle{}\langle{}\psi|\rangle{}) \\
&= \frac{1}{2}(1 + \langle{}O_{xy}O_{xy}^T\otimes{}I,|\psi\rangle{}\langle{}\psi|\rangle{}) \\
&= \frac{1}{2}(1 + \frac{1}{4}tr(O_{xy}O_{xy}^T)) \\
&= 1
\end{align*}
Experimental realization of this has not been performed.
Week 4.1
POVM: $$\{M_x\}_x$$
Instrument: $$\mathcal{M}_{A\rightarrow{}AX}(\cdot{}) = \sum_x |x\rangle{}\langle{}x| \otimes \sqrt{M_x}(\cdot{})\sqrt{M_x} $$
Measurement: $$tr_A \circ{} \mathcal{M}_{A\rightarrow{}AX} \equiv \mathcal{M}_{A\rightarrow{}X}$$
cq-state and post-measurement state
Quantum correlations and games
A two-party game $$G=(X,Y,A,B,P_{XY},V)$$, where $$X,Y,A,B$$ are discrete and finite alphabets (input/output), $$P_{XY}$$ is the pmf and $$V:A\times{}B\times{}X\times{}Y \rightarrow \{0,1\}$$.
Alice gets input $$x\in{}X$$ and outputs $$a\in{}A$$, and correspondingly for Bob. They win if $$V(a,b,x,y) = 1$$.
Example: CHSH game, where $$X=Y=A=B=\mathbf{F}_2$$ and $$P_{XY}(x,y)=\frac{1}{4} \forall{} x,y$$, and $$V(a,b,x,y) = 1 \,\text{if}\,a+b=xy \,\text{else}\,0$$.
A classical strategy $$S$$ for Alice and Bob is given by $$Q_{X'Y'}$$ and $$f_A:X\times{}X' \rightarrow A$$ and $$f_B:Y\times{}Y'\rightarrow{} B$$.
\begin{align*}
P_{\text{win}}^{\text{cl.}}(G,S)
&= Pr[V(f_A, f_B, X, Y) = 1] \\
&= \sum_{x,x',y,y'} P_{XY}(x,y)Q_{X'Y'}(x',y') \mathbf{1}\{V(\cdots{})=1\} \\
&= \sum_{x,y,a,b} P_{XY}(x,y)S(a,b|x,y) \mathbf{1}\{V(a,b,x,y)=1\}
\end{align*}
$$\mathbf{1}$$ is the indicator function, which is 1 only if the condition specified is satisfied. Similar to the delta function.
$$S(a,b|x,y) = \sum_{x',y'} Q_{X'Y'}(x',y') \cdot{} \mathbf{1}\{f_A(x,x')=a\} \cdot{} \mathbf{1}\{f_B(y,y')=b\}$$
A strategy is deterministic if $$f_A(x,x') = f_A(x)\ \forall{} x,x'$$ (and similarly for $$f_B$$), i.e. it makes no use of the shared randomness.
The classical winning probability of a game is $$p_\text{win}^\text{cl.} (G) = \sup_S p_\text{win}^\text{cl.} (G,S)$$
Lemma: The maximum is achieved by a deterministic strategy. We rely on the fact that an expectation is always upper-bounded by the maximum.
Proof: $$p_\text{win}(G,S) = E_{x',y'}[\sum_{x,y} P_{XY}(x,y) \mathbf{1}\{V(f_A(x,x'), f_B(y,y'),x,y)=1\}] \le{} \max_{x',y'} \sum_{x,y} P_{XY}(x,y) \mathbf{1}\{V(f_A(x,x'), f_B(y,y'),x,y)=1\}$$; fixing the maximizing pair $$(x^*,y^*)$$ yields a deterministic strategy that does at least as well.
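The lemma can be sanity-checked on CHSH by brute force over deterministic strategies (a small sketch; the classical value $$3/4$$ is the well-known result):

```python
# Brute-force the classical CHSH value: enumerate all deterministic
# strategies f_A, f_B : {0,1} -> {0,1} and count wins (a + b = xy mod 2)
# over the four uniformly chosen input pairs.
from itertools import product

best = 0
for fa in product([0, 1], repeat=2):       # fa[x] = Alice's answer on input x
    for fb in product([0, 1], repeat=2):   # fb[y] = Bob's answer on input y
        wins = sum((fa[x] + fb[y]) % 2 == x * y
                   for x in (0, 1) for y in (0, 1))
        best = max(best, wins)
print(best / 4)   # 0.75
```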
A quantum strategy $$S_q$$ is given by a state $$\rho_{A',B'}$$ and a collection of POVMs $$\{M^x_a\}_a$$ on A' and $$\{N_b^y\}_b$$ on B' for all x,y.
Recall: $$\sum_{a\in{}A} M_a^x = I_A\;\forall{}x$$.
The strategy is now given as $$S_q(a,b|x,y) = tr((M_a^x \otimes{} N_b^y) \rho_{A'B'})$$, with the winning probability $$p_\text{win}^q(G) = \sup_{S_q} p_\text{win}(G,S_q)$$.
This is the most general case.
Even if we locally apply channels on the state, we see that $$tr[(M_a^x \otimes{} N_b^y)(\epsilon_{A'\rightarrow{}A'}^x\otimes{} \mathcal{I})(\rho_{A'B'})] = tr[(((\epsilon_{A'\rightarrow{}A'}^x)^\dagger{}(M_a^x)) \otimes{} N_b^y) \rho_{A'B'}]$$.
We can show that $$\{(\epsilon_{A'\rightarrow{}A'}^x)^\dagger{}(M_a^x)\}$$ is also a POVM, by noting that (although cptp maps are not necessarily unital) the adjoint of a cptp map is unital (i.e. preserves the identity, $$\epsilon^\dagger{}(I_A) = I_A$$), so we see that $$\sum_a (\epsilon_{A'\rightarrow{}A'}^x)^\dagger{} (M_a^x) = (\epsilon_{A'\rightarrow{}A'}^x)^\dagger{} (I_A) = I_A$$.
Proof for unital adjoint: $$\langle{}\epsilon^\dagger{}I, \rho{}\rangle{} = \langle{}I,\epsilon(\rho)\rangle{} = tr(\epsilon(\rho)) = tr(\rho) = \langle{}I,\rho{}\rangle{}$$
We note that $$p_\text{win}^q(G) \ge{} p_\text{win}^{\text{cl.}}(G)$$.
Current thoughts
Intuitively things seem to make sense, but the theoretical treatment requires considering many different small lemmas that actually need proving...
Week 3.1
Composite systems
A state $$\rho_{AB}$$ is separable if it can be written as $$\rho_{AB} = \sum_i p_i \rho_A^i \otimes{} \rho_B^i$$, otherwise we say $$\rho_{AB}$$ is entangled.
Note that $$p_i$$ needs to be positive.
Examples of entangled states include Bell states, e.g. $$|\psi_i\rangle{} = (P_i \otimes{} \mathbf{1})(\frac{1}{\sqrt{2}} (|0\rangle{} \otimes{}|0\rangle{} + |1\rangle{}\otimes{}|1\rangle{}))$$
Classical-quantum (cq) states are of the form $$\rho_{XB} = \sum_x p_x | x\rangle{}\langle{}x|_X \otimes{} \rho_B^x$$.
Taking the partial trace over B gives us the classical state.
Convention to use X, Y, Z... for classical states, and these are called registers.
Channels
We look at the isomorphism from $$L(V,W)$$ to $$V\otimes{}W$$, or in more explicit terms, $$L(L(V),L(W))$$ to $$L(V)\otimes{}L(W) \sim L(V\otimes{}W)$$. This is the Choi-Jamiołkowski isomorphism.
We can vectorize channels: $$\text{vec}(\epsilon_{V\rightarrow{}W}) = (\epsilon_{V\rightarrow{}W} \otimes{} \mathcal{I}) (|\Omega{}\rangle{}\langle{}\Omega{}|) = \sum_{i,j} \epsilon(|i\rangle{}\langle{}j|) \otimes{} |i\rangle{}\langle{}j|$$. This is called a Choi state.
Reminder that $$|i\rangle{}\langle{}j|$$ can be seen as a matrix with the (i,j)-entry as 1, and the rest 0.
Idea is to transform a channel into a linear operator that can be properly characterized.
$$|\Omega{}\rangle{}$$ is an (unnormalized) maximally-entangled state, similar to a Bell state: $$|\Omega{}\rangle{} = \sum_{i=1}^{d_V} |i\rangle{}_V\otimes{}|i\rangle{}_{V'}$$
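Following this definition, the Choi state of a concrete channel can be built block by block (a numpy sketch; the depolarizing channel with $$p=0.5$$ is my example):

```python
# Choi state vec(E) = (E (x) I)(|Omega><Omega|) = sum_ij E(|i><j|) (x) |i><j|.
import numpy as np

def depolarize(rho, p=0.5):
    """Qubit depolarizing channel (illustrative parameter p = 0.5)."""
    return (1 - p) * rho + p * np.trace(rho) * np.eye(2) / 2

def choi(channel, d=2):
    """Apply the channel to each matrix unit |i><j| and assemble the blocks."""
    C = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            Eij = np.zeros((d, d))
            Eij[i, j] = 1
            C += np.kron(channel(Eij), Eij)
    return C

C = choi(depolarize)
print(np.allclose(C, C.conj().T))                  # Hermitian
print(bool(np.all(np.linalg.eigvalsh(C) >= -1e-9)))  # psd <=> channel is CP
print(np.isclose(np.trace(C), 2))                  # trace-preserving: tr C = d
```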
Theorems:
Examples of other channels:
Norms and metrics on states
Canonical norm: $$||X|| = \sqrt{\langle{}X,X\rangle{}} = \sqrt{\sum_j \langle{}Xv_j,Xv_j\rangle{}} = \sqrt{\sum_j s_j^2} \equiv ||X||_2$$. This is called the 2-norm, or Frobenius norm, or Schatten-2 norm.
We write $$X = \sum_i s_i |w_i\rangle{}\langle{}v_i|$$.
The 2-norm is unitarily invariant (i.e. $$||X||_2 = ||UXV^\dagger{}||_2$$ for any isometries $$U,V$$).
The p-norm has the same type of invariance properties and is defined $$||X||_p = (\sum_i s_i^p)^{1/p}$$.
Recall that a metric has the properties $$\Delta{}(\rho,\sigma) \ge{} 0$$ with equality iff $$\rho=\sigma$$, satisfies the triangle inequality, and is symmetric. Some metrics that we can define:
Trace distance $$\Delta{}(\rho,\sigma{}) \equiv{} \frac{1}{2}||\rho-\sigma||_1$$
Fidelity $$F(\rho,\sigma) = ||\sqrt{\rho}\sqrt{\sigma}||^2_1 = (tr|\sqrt{\rho{}}\sqrt{\sigma{}}|)^2 = (tr \sqrt{\sqrt{\rho}\sigma\sqrt{\rho}})^2$$.
With a pure state, we see simply that $$F(\rho,|\psi\rangle{}\langle{}\psi|) = \langle{}\psi|\rho|\psi\rangle{}$$
$$F(\rho,\sigma)\in [0,1]$$
Can be seen as the maximal overlap between purifications of the states (Uhlmann's theorem).
Purified distance: $$P(\rho,\sigma) = \sqrt{1 - F(\rho,\sigma)}$$, the square root of the infidelity $$1-F$$.
These are related by the Fuchs-van de Graaf inequality $$1 - \sqrt{F} \le{} \Delta \le \sqrt{1-F}$$, with $$\Delta = \sqrt{1-F}$$ if $$\rho,\sigma$$ are pure.
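Both quantities and the inequality can be checked numerically (a numpy sketch; the two qubit states are my own examples):

```python
# Trace distance, fidelity, and the Fuchs-van de Graaf sandwich
# 1 - sqrt(F) <= Delta <= sqrt(1 - F).
import numpy as np

def psd_sqrt(M):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.conj().T

def trace_distance(rho, sigma):
    # (1/2) * sum of |eigenvalues| of the Hermitian difference.
    return 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

def fidelity(rho, sigma):
    # F = (tr sqrt( sqrt(rho) sigma sqrt(rho) ))^2
    s = psd_sqrt(rho)
    return np.trace(psd_sqrt(s @ sigma @ s)).real ** 2

rho = np.array([[0.8, 0.1], [0.1, 0.2]])
sigma = np.array([[0.4, 0.0], [0.0, 0.6]])
D, F = trace_distance(rho, sigma), fidelity(rho, sigma)
print(1 - np.sqrt(F) <= D <= np.sqrt(1 - F))   # True
```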
Some relations:
Given a cptp map $$\epsilon$$, applying it to both states can only make them closer (e.g. adding noise makes them more difficult to distinguish). This is embodied in these data-processing inequalities (DPI):
$$\Delta(\epsilon(\rho),\epsilon(\sigma)) \le{} \Delta(\rho,\sigma)$$
$$F(\epsilon(\rho),\epsilon(\sigma)) \ge{} F(\rho,\sigma)$$
Uhlmann's theorem: for purifications $$|\rho\rangle, |\sigma\rangle$$, we have $$F(\rho,\sigma) = \max_{|\rho\rangle,|\sigma\rangle} F(|\rho\rangle\langle\rho|,|\sigma\rangle\langle\sigma|)$$.
Proof sketch:
Using DPI for the partial trace, we have $$F(\rho,\sigma) \ge{} F(|\rho\rangle\langle\rho|,|\sigma\rangle\langle\sigma|)$$.
For the other direction, we use the purification $$|\rho\rangle = \text{vec}(\sqrt{\rho}) = \sum_i \sqrt{\rho}|i\rangle\otimes{}|i\rangle{}$$ and $$|\sigma{}_u\rangle{} = (I\otimes{}U)\text{vec}(\sqrt{\sigma}) = \text{vec}(\sqrt{\sigma}U^T)$$. This gives the inner product after the isometry, expanded below.
This relies on $$|\Omega{}\rangle{} = \sum_i |ii\rangle{}$$ and $$X\otimes{}I|\Omega{}\rangle = I\otimes{}X^T|\Omega{}\rangle$$
\begin{align*}
\max_U |\langle{}\rho|\sigma_U\rangle{}|
&= \max_U |\langle{}\sqrt{\rho},\sqrt{\sigma}U^T\rangle{}| \\
&= \max_U | tr(\sqrt{\rho}\sqrt{\sigma} U^T) |\\
&= \max_U |tr(|\sqrt{\rho}\sqrt{\sigma}| U^T V)| \\
&\ge{} tr|\sqrt{\rho}\sqrt{\sigma}| \\
&= \sqrt{F(\rho,\sigma)}
\end{align*}
Here $$V$$ comes from the polar decomposition $$\sqrt{\rho}\sqrt{\sigma} = V|\sqrt{\rho}\sqrt{\sigma}|$$, and the inequality follows by choosing $$U^T = V^\dagger{}$$.
Week 2.2
Tensor spaces
Given V, W Hilbert spaces on F with inner products and ONBs, we can define a tensor space $$V\otimes{} W = \text{span}\{v_i\otimes{} w_j\}_{i,j}$$.
Their corresponding inner product between each basis vector is as expected: $$\langle{}v_i\otimes{} w_j , v_{i'} \otimes{} w_{j'} \rangle{} = \langle{} v_i,v_{i'}\rangle{}_v \langle{} w_j,w_{j'}\rangle{}_w = \delta_{ii'} \delta_{jj'} $$.
We can also define general vectors in this new basis: $$u = \sum_{i,j} \alpha_{ij} v_i \otimes{} w_j $$, as well as the inner product $$\langle{}u,u'\rangle{} = \sum_{i,j} \overline{\alpha_{ij}} \alpha_{ij}' $$
Tensor spaces retain additive distributivity and multiplicative associativity.
Tensors of linear operators: $$ (L\otimes{} K) (v\otimes{} w) = (Lv)\otimes{} (Kw) $$, so we can define $$ L(V) \otimes{} L(W) \sim{} L(V\otimes{} W)$$
Composite systems
Notation:
We denote $$A\otimes{}B\otimes{}C$$ as $$ABC$$.
$$L(AB)$$ are linear operators on $$AB$$, and $$\text{Herm}(AB)$$ are self-adjoint elements of $$L(AB)$$.
$$\rho_{AB}$$ is a state in $$\text{Herm}(AB)$$ (as usual, $$\rho_{AB}\ge{}0,\,tr[\rho_{AB}] = 1 $$)
$$A'\sim{} A$$ implies isomorphism, i.e. they share the same dimensions, and their basis vectors can be mapped between each other.
Marginal state is given by $$\rho_A = \sum_i (I_A \otimes{} \langle{}i|_B) \rho_{AB} (I_A \otimes{} |i\rangle{}_B) \equiv{} (I_A\otimes{}\text{tr}_B)(\rho_{AB}) \equiv{} \text{tr}_B(\rho_{AB})$$, given any ONB. $$\text{tr}_B$$ is called the partial trace, and can be written as $$tr_B(X) = \sum_i \langle{}i|X|i\rangle{}$$.
Given a state $$\rho_A = \sum_i \lambda{}_i |\psi_i\rangle{}\langle{}\psi_i| $$.
We define $$|\rho\rangle{}_{AA'} = \sum_i \sqrt{\lambda_i} |\psi_i\rangle{}_A \otimes{} |i\rangle{}_{A'}$$. Note that we can use any arbitrary ONB (which has to span $$A'$$ that is isomorphic to $$A$$) on the right side - we use the computational basis here for simplicity.
We compute the partial trace of the state over $$A'$$:
\begin{align*}
tr_{A'} (\rho_{AA'}) &\equiv tr_{A'}(|\rho\rangle{}\langle{}\rho|_{AA'}) \\
&= tr_{A'} (\sum_{i,j} \sqrt{\lambda{}_i\lambda_j} |\psi_i\rangle{}\langle{}\psi_j| \otimes | i\rangle{}\langle{}j|) \\
&= \sum_{i,j} \sqrt{\lambda_i\lambda_j} \sum_k |\psi_i\rangle{}\langle{}\psi_j| \cdot{} \langle{}k|i\rangle{}\langle{}j|k\rangle{} \\
&= \sum_k \lambda_k |\psi_k\rangle{}\langle{}\psi_k| \\
&= \rho_A
\end{align*}
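The same computation in numpy (the example state $$\rho_A$$ is my own choice):

```python
# Tracing out the purifying system recovers the original state:
# |rho> = sum_i sqrt(lambda_i) |psi_i> (x) |i>  =>  tr_{A'} |rho><rho| = rho_A.
import numpy as np

rho_A = np.array([[0.7, 0.2], [0.2, 0.3]])    # an example qubit state
lam, psi = np.linalg.eigh(rho_A)

# Purification on A (x) A', computational basis on A'.
purif = sum(np.sqrt(lam[i]) * np.kron(psi[:, i], np.eye(2)[i]) for i in range(2))
rho_AA = np.outer(purif, purif)

# Partial trace over A' (the second tensor factor).
rho_back = np.trace(rho_AA.reshape(2, 2, 2, 2), axis1=1, axis2=3)
print(np.allclose(rho_back, rho_A))   # True
```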
Channels
Channels are linear maps that map between states: $$\epsilon_{A\rightarrow{}B} \in L(\text{Herm}(A),\text{Herm}(B))$$.
Some properties we would like it to have:
$$\epsilon_{A\rightarrow{}B}$$ must preserve Hermiticity and trace, i.e. $$tr(\epsilon_{A\rightarrow{}B}(H_A)) = tr(H_A)$$ for all $$H_A \in \text{Herm}(A)$$.
$$\epsilon_{A\rightarrow{}B}$$ must preserve positivity, i.e. $$H_A\ge{}0 \Rightarrow{} \epsilon_{A\rightarrow{}B} (H_A) \ge{} 0 $$
$$\epsilon_{A\rightarrow{}B}$$ must be completely positive, i.e. $$\epsilon_{A\rightarrow{}B}\otimes{} \mathcal{I}_C$$ is positive for all $$C$$ (the "for all $$C$$" part is the important strengthening).
Channels are therefore called cptp maps (completely positive trace-preserving).
Examples:
Unitaries are actually unitary channels (without noise), e.g. $$\mathcal{U}_{A\rightarrow{}A}(H_A) = U_A H_A U_A^\dagger{} $$.
Pauli channels: $$ \mathcal{P}_{p_i} (H) = \sum_{i=0}^3 p_i P_i HP_i$$ where $$\sum_i p_i = 1$$
Trace (and partial trace) is also a channel: $$tr_A: H (\in \text{Herm}(A))\mapsto{} tr(H) (\in \mathbb{R}) $$
Channels have adjoints, e.g. the adjoint of trace is $$(tr_B)^\dagger{}: H_A \mapsto{} H_A\otimes{} I_B$$. This adjoint is actually pretty conceptually important.
We can derive the adjoint of the partial trace: $$\langle{}H_A, tr_B(M_{AB})\rangle = tr(H_A \cdot{} tr_B(M_{AB})) = tr((H_A \otimes{} I_B) M_{AB}) = \langle{} H_A \otimes{} I_B, M_{AB}\rangle{}$$
$$Pr[X=i] = \langle{}M^i_A\otimes{}I_B,\rho_{AB}\rangle{} = \langle{}M^i_A,tr_B(\rho_{AB})\rangle{} = \langle{}M^i_A,\rho_A\rangle{} $$
i.e. it is sufficient to look at the marginals since we only look at the outcome of Alice's measurement on state (Bob cannot affect Alice's outcome).
Usually we want to look at the correlations between Alice and Bob's outcomes, so the probability of Alice's outcome conditioned on Bob's outcome will instead be different (and entanglement can strengthen these correlations).
Week 2.1
Every $$ X\in{} L(V,W)$$ can be written in the form $$ X = \sum_{i=0}^{r-1} s_i |w_i\rangle{}\langle{}v_i| $$, where $$ \{|v_i\rangle{}\}_i$$ is orthonormal in V and $$\{|w_i\rangle{}\}_i$$ is orthonormal in W. $$s_i>0$$ are called singular values.
Does not need to span the entire space (i.e. form a basis)
Terminology:
Rank $$r \le{} \text{min}\{d_{V},d_{W}\}$$
$$\text{span}\{v_i\}_i$$ is called support (whose orthogonal complement is called kernel), while that for $$w$$ is the image.
We can order the singular values, e.g. $$s_0 \ge{} s_1 \ge{} \cdots{} \ge{} s_{r-1} > 0$$
This matrix picture might make more sense: $$X = U\Sigma{}V^\dagger{}$$
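The matrix picture and the dyad form agree; a numpy sketch (the 3x2 matrix is my own example):

```python
# SVD in numpy matches X = sum_i s_i |w_i><v_i|, with ordered singular values.
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # an example map V -> W
U, s, Vh = np.linalg.svd(X, full_matrices=False)

# Rebuild X one rank-1 dyad s_i |w_i><v_i| at a time.
recon = sum(s[i] * np.outer(U[:, i], Vh[i, :]) for i in range(len(s)))
print(np.allclose(recon, X))       # True
print(bool(np.all(s[:-1] >= s[1:])))  # singular values come ordered, s_0 >= s_1
```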
For Hermitian operators, we have the spectral decomposition (eigenvalue decomposition), which boils down to the even more familiar $$H=\sum_i \lambda_i |v_i\rangle{}\langle{}v_i|$$ for $$H\in{} L(V)$$.
As usual, $$\lambda_i \in R$$
This arises from the SVD by assigning $$s_i = |\lambda{}_i|$$ (eigenvalues may be zero, so the number of non-zero singular values — the rank — can be less than the dimension).
We can write in the SVD form by moving any negative signs from the eigenvalue to the corresponding image vector, e.g. $$|v_i\rangle{} \rightarrow{} |-v_i\rangle{} \equiv{} |w_i\rangle{}$$, to satisfy the positive singular value condition.
A Hermitian $$H\in{}L(V)$$ is positive semidefinite (psd) matrix if $$\langle{}v,Hv\rangle{}\ge{} 0 \forall{} v\in{}V$$.
We denote $$H\ge{} 0$$ if H is psd (but this does not mean the entries of H are necessarily non-negative!)
Lemma: $$M\ge{}0$$ implies $$X^\dagger{}MX\ge{}0$$ (since $$\langle{}v,X^\dagger{}MXv\rangle{} = \langle{}Xv,M(Xv)\rangle{} \ge{} 0$$).
Projectors are psd operators $$P$$ with $$P^2 = P$$.
For a function $$f:(0,\infty{}) \rightarrow{} R$$, we define $$f(H)$$ as $$\sum_{i=0,\lambda_i>0}^{r-1} f(\lambda_i)|v_i\rangle{}\langle{}v_i|$$ for any psd $$H$$.
Note the non-zero eigenvalues is a convenience thing.
Example using $$f:t\mapsto{}1/t$$ gives the Moore-Penrose generalized inverse (the same one used in least-squares fitting), which then gives us $$H^{-1}H = \sum_{i=0,\lambda_i>0}^{r-1} |v_i\rangle{}\langle{}v_i|$$ as the projector onto the support of H.
We can define a partial order on Hermitian operators: $$ A\ge{}B$$ iff $$A-B\ge{}0$$, e.g. $$1\ge{} M \ge{} 0$$ if the eigenvalues of M lie in [0,1].
Quantum states (or density operators) are psd operators $$\rho,\sigma,\omega,\pi$$ with unit trace.
These can be seen as a generalization of probability mass functions (since $$\sum_i \lambda_i = 1$$).
Pure states are states with rank 1, i.e. $$\Psi = |\psi\rangle{}\langle{}\psi{}|$$ with $$\lambda\in{}\{0,1\}$$, so these are projectors as well.
The Hadamard $$H$$ and Pauli operators $$\sigma_i$$ / $$P_i$$ are Hermitian (and unitary), but not psd — their eigenvalues are $$\pm{}1$$.
A positive operator-valued measure (POVM) is a set $$\{M_x\}_{x\in{}X}$$ such that $$M_x\ge{}0$$ and $$\sum_x M_x = \bf{1}$$.
Terminology:
We say it is projective if the effects are also projective, i.e. $$M_x = M_x^2$$.
We use Born's rule to associate probabilities to states/effects:
Using the formalism (this involves multiple perspectives!),
\begin{align*}
p_x &= \langle{} \Psi{}, M_x \rangle{} \\
&= \text{tr} (\Psi{}M_x) = \text{tr} (|\psi{}\rangle{}\langle{}\psi| M_x)\\
&= \langle{}\psi|M_x|\psi{}\rangle{} \\
&= \langle{} \psi{}, M_x\psi{} \rangle{}
\end{align*}
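These equivalent forms agree numerically (a numpy sketch; the $$|+\rangle{}$$ state and computational-basis effect are my own example):

```python
# Born's rule in its equivalent forms, for |+> measured in the computational basis.
import numpy as np

plus = np.array([1.0, 1.0]) / np.sqrt(2)
Psi = np.outer(plus, plus)                 # pure-state projector |psi><psi|
M0 = np.array([[1.0, 0.0], [0.0, 0.0]])    # effect |0><0|

p_hs = np.trace(Psi.conj().T @ M0).real    # <Psi, M0> (Hilbert-Schmidt)
p_tr = np.trace(Psi @ M0).real             # tr(Psi M0)
p_br = (plus.conj() @ M0 @ plus).real      # <psi|M0|psi>
print(p_hs, p_tr, p_br)                    # all equal 0.5
```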
Week 1.2 Linear algebra recap
Generally a recap of linear algebra basics.
Vector spaces
We denote a field $$F$$ (either the real field $$R$$ or complex field $$C$$). A vector space on F is a set V that is closed under addition $$V\times V \rightarrow V$$ and scalar multiplication $$F \times V \rightarrow V$$. This has the zero element 0.
A Hilbert space is a (finite dimensional) vector space with inner product $$\langle{}\cdot{},\cdot{}\rangle{}: V \times V \rightarrow F$$.
This inner product is linear in the second argument (Physics convention), i.e. $$\langle{}w,\alpha{}v\rangle{} = \alpha{}\langle{}w,v\rangle{} $$
Has conjugate $$ \langle{}w,v\rangle{} = \overline{\langle{}v,w\rangle{}} $$
Positive definite: $$\langle{}v,v\rangle{} \ge{} 0$$ with equality iff v=0
Canonical norm: $$||v|| = \sqrt{\langle{}v,v\rangle{}} $$
Cauchy-Schwarz inequality: $$|\langle{}v,w\rangle{}| \le ||v||\cdot{}||w||$$
Triangle inequality: $$||v+w|| \le ||v|| + ||w||$$
$$v$$ and $$w$$ are orthogonal if $$\langle{}v,w\rangle{} = 0$$
$$\{v_i\}_i$$ is orthonormal if $$\langle{}v_i,v_j\rangle{} = \delta{}_{ij}$$
Orthonormal basis (ONB) of V if $$V = \text{span}\{v_i\} = \{v: v=\sum_i a_i v_i, a_i \in F\}$$
Dimension of V is number of elements in ONB.
One example are pure states $$|\psi{}\rangle{} \in H$$, with the computational basis defined as $$\{|i\rangle{}\}_{i=0}^{d-1}$$. The inner product is defined $$\langle{}\theta{}|\phi{}\rangle{} \equiv{} \langle{}|\theta{}\rangle{},|\phi{}\rangle{}\rangle{} = \sum_i \overline{\theta}_i \cdot{} \phi_i $$. This is analogous to the matrix representation of dot product between column and row vectors.
Linear operators
Linear operators $$L(V,W)$$ contains all linear maps from V to W.
$$L(V) := L(V,V)$$
$$L(V,W)$$ is a vector space over same field (for convenience) as V and W, e.g. $$v\in V: (a_1X_1 + a_2X_2)v = a_1X_1v + a_2X_2v \in W$$.
Adjoint of $$ X\in L(V,W)$$ is $$ X^\dagger{} \in L(W,V) $$ satisfying $$\langle{}w,Xv\rangle{} = \langle{}X^\dagger{}w, v\rangle{} $$, where $$ (X^\dagger{})_{ij} = \overline{X_{ji}}$$.
Identity operator on V is $$1_V \in L(V)$$, where $$1_V: v\mapsto v$$, or for any general ONB $$1_V: v \mapsto \sum_{i=0}^{d-1} v_i \langle{} v_i,v\rangle{}$$.
An isometry $$U\in L(V,W) $$ satisfies $$U^\dagger{}U = 1_V$$.
A unitary $$U\in L(V)$$ is an isometry whose adjoint is also its inverse, i.e. $$U^\dagger{}U = UU^\dagger{} = 1_V$$.
Isometries preserve inner product: $$\langle{}Uv,Uw\rangle = \langle{} v,U^\dagger{}Uw\rangle{} = \langle{} v,w\rangle{} $$.
Unitaries are maps that map from one ONB to another, i.e. $$ U: v\mapsto{} \sum_i w_i \langle{} v_i,v\rangle{} $$, for a map from ONB $$\{v_i\}_i$$ to ONB $$\{w_i\}_i$$, or in other words $$Uv_i = w_i \;\forall{} i$$
An operator H is self-adjoint or Hermitian if $$H^\dagger{} = H$$, e.g. density operators, states, effects
We can also define an inner product on $$L(V,W)$$, where $$\langle{} X,Y\rangle{} = \sum_i \langle{} Xv_i, Yv_i\rangle{}$$ taking any ONB of V.
This is also defined as the Hilbert-Schmidt inner product $$\langle{}X,Y\rangle{} \sim \sum_i \langle{} X|i\rangle{} ,Y|i\rangle{} \rangle{} = \sum_i \langle{} i|X^\dagger{}Y|i\rangle{} = \text{tr}(X^\dagger{}Y)$$, in the matrix picture.
The trace can be evaluated in any ONB, not just the computational basis; one useful consequence is the lemma $$\langle{} X,Y\rangle{} = \langle{}Y^\dagger{},X^\dagger{}\rangle{}$$ (proof in handwritten lecture notes).
Can be shown to be positive definite by considering $$\langle{}X,X\rangle{} \ge{} 0$$ where equality iff $$X=0$$.
Using a unitary $$U$$, we can verify this basis-independence explicitly:
\begin{align*}
\langle{}X,Y\rangle{} &= \sum_i \langle{}Xw_i,Yw_i\rangle{} \\
&= \sum_i \langle{}XUw_i,YUw_i\rangle{} \\
&= \langle{}XU,YU\rangle{} \\
&= \langle{}U^\dagger{}Y^\dagger{},U^\dagger{}X^\dagger{}\rangle{} \\
&= \langle{}Y^\dagger{},UU^\dagger{}X^\dagger{}\rangle{} \\
&= \langle{}Y^\dagger{},X^\dagger{}\rangle{}
\end{align*}
A small note: vectors can be seen also as maps from the field to the vector space, e.g. $$ V\sim{} L(C,V)$$. We can generalize even further to channels, which map from some Hilbert space / set of linear operators to itself or another, e.g. $$ \langle{}\epsilon^\dagger{}(X),Y\rangle = \langle{} X,\epsilon{}(Y)\rangle{}$$, which can be seen as analogous to the Heisenberg vs Schrödinger picture.
Additional notes: to be pedantic, in $$|\psi{}\rangle{}\alpha = \alpha{}|\psi\rangle{}$$, the left-hand $$|\psi\rangle{} \in L(\bf{C},V)$$ is composed with $$\alpha \in{} L(\bf{C})$$, while the right-hand side is the vector $$\alpha|\psi{}\rangle{} \in V$$ scaled by $$\alpha \in{} \bf{C}$$.
Week 1.1 Administrative
A list of names of students in the class. Scribing for lecture notes and a final exam.