<h1>Hamiltonian Simulation</h1>
<p>These are my notes on Childs (n.d.).</p>
<h1 id="introduction">Introduction</h1>
<p>The only possible way to start a chapter on Hamiltonian simulation is
from the work of Feynman, who had the first intuition about the
power of quantum mechanics for simulating physics with computers. We
know that the Hamiltonian dynamics of a closed quantum system, whether
its Hamiltonian changes with time or not, is given by the
Schr<span>ö</span>dinger equation:</p>
<script type="math/tex; mode=display">i\hbar \frac{d}{dt}\ket{\psi(t)} = H(t)\ket{\psi(t)}</script>
<p>Given the initial condition of the system (i.e. $\ket{\psi(0)}$), it
is possible to know the state of the system at time
$t$: $\ket{\psi(t)} = e^{-iHt}\ket{\psi(0)}$ (for a time-independent $H$).</p>
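<p>For small systems we can evaluate this evolution directly by exponentiating the Hamiltonian. A minimal numpy/scipy sketch (the choice $H = X + Z$ and the initial state are mine, just for illustration):</p>

```python
import numpy as np
from scipy.linalg import expm

# Toy Hamiltonian H = X + Z on a single qubit (hbar = 1)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = X + Z

psi0 = np.array([1, 0], dtype=complex)   # |psi(0)> = |0>
t = 1.0
psi_t = expm(-1j * H * t) @ psi0         # |psi(t)> = e^{-iHt} |psi(0)>
```

<p>Of course this classical computation costs time exponential in the number of qubits, which is exactly why we would like a quantum computer to do it for us.</p>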
<p>As you can imagine, classical computers are expected to struggle
simulating the system to get $ \ket{\psi(t)}$, since this equation
describes the dynamics of any quantum system, and we don’t think (hope
:D ) classical computers can simulate that efficiently. But we know that
quantum computers can help “copy” the dynamics of another quantum
system. Why should you care?</p>
<p>Imagine you are a quantum machine learning scientist, and you have just
found a new mapping between an optimization problem and a Hamiltonian
dynamics, and you want to use a quantum computer to perform the
optimization, as in Otterbach et al. (2017). You expect a quantum computer to
run the Hamiltonian simulation for you, and then sample useful
information from the resulting quantum state. This result might be fed
back into your classical algorithm to perform an ML-related task, in a
virtuous cycle of hybrid quantum-classical computation.</p>
<p>Or imagine that you are a chemist, and you have developed a
hypothesis for the Hamiltonian dynamics of a chemical compound. Now you
want to run some experiments to see if the formula behaves according to
the experiments. Or maybe you are testing properties of complex
compounds you don’t want to synthesize. We can formulate the problem of
Hamiltonian simulation (HS) in this way:</p>
<p><span>Hamiltonian simulation problem</span>: Given a state
$\ket{\psi(0)}$ and a Hamiltonian $H$, obtain a state $\ket{\tilde{\psi}(t)}$
such that, with $\ket{\psi(t)}:=e^{-iHt}\ket{\psi(0)}$,
$\| \ket{\psi(t)} - \ket{\tilde{\psi}(t)} \| < \varepsilon$ for some norm
(usually the trace norm).</p>
<p>Which leads us to the definition of efficiently simulable Hamiltonian:</p>
<p><span>Efficient Hamiltonian simulation</span>: Given a state
$\ket{\psi(0)}$ and a Hamiltonian $H$ acting on $n$ qubits, we say $H$
can be efficiently simulated if,
$\forall t \geq 0, \forall \varepsilon > 0$, there is a quantum
circuit $U$ such that $||U - e^{-iHt} || < \varepsilon$ using a number
of gates that is polynomial in $n, t, 1/\varepsilon$.</p>
<p>In the following, we suppose to have a quantum computer and quantum
access to the Hamiltonian $H$. The importance of this problem might not
be immediately clear to a computer scientist. But if we consider that every
quantum circuit is described by a Hamiltonian dynamics, being able to
simulate a Hamiltonian is like being able to have virtual machines in
our computer. (This example actually came from a talk at IHP by Toby
Cubitt!) Remember that there’s a theorem saying that for the
Hamiltonian simulation problem the number of gates is $\Omega(t)$; this
theorem goes under the name of the No fast-forwarding theorem. <br>
But concretely, what does it mean to simulate the Hamiltonian of a
physical system? Let’s take the Hamiltonian of a particle in a
potential: <script type="math/tex">H = \frac{p^2}{2m} + V(x)</script> We want to know the position of
the particle at time $t$ and therefore we have to compute
$e^{-iHt}\ket{\psi(0)}$.</p>
<h2 id="some-hamiltonians-we-know-to-simulate-efficiently">Some Hamiltonians we know how to simulate efficiently</h2>
<ul>
<li>
<p>Hamiltonians that represent the dynamics of quantum circuits (more
formally, Hamiltonians with local interactions between a constant
number of qubits). This result is due to the famous
Solovay-Kitaev theorem, which says that there exists an efficient
compiler between an architecture that uses a set of universal gates $\mathbb{S_1}$
and another quantum computer that uses a set of universal gates
$\mathbb{S_2}$.</p>
</li>
<li>
<p>If the Hamiltonian $H$ can be efficiently simulated, then so can
$UHU^\dagger$ for any efficiently implementable unitary $U$. Proof:
$e^{-iUHU^\dagger t} = Ue^{-iH t}U^\dagger $.</p>
</li>
<li>
<p>If $H$ is diagonal in the computational basis and we can compute
efficiently $d(a) = \braket{a|H|a}$ for every basis element $a$. By linearity:
<script type="math/tex">\ket{a,0} \to \ket{a, d(a)} \to e^{-itd(a)}\ket{a,d(a)} \to e^{-itd(a)}\ket{a,0} = e^{-itH}\ket{a,0}</script></p>
<p>(In general: if we know how to calculate the eigenvalues, we can
apply a Hamiltonian efficiently.)</p>
</li>
<li>
<p>The sum of two efficiently simulable Hamiltonians is efficiently
simulable using the Lie product formula:
<script type="math/tex">e^{-i (H_1 + H_2) t} = \lim_{m \to \infty} \left( e^{-i H_1 t/m} e^{-i H_2 t/m} \right)^m</script>
We choose $m$ such that
<script type="math/tex">|| e^{-i (H_1 + H_2) t} - ( e^{-i H_1 t/m} e^{-i H_2 t/m} )^m || \leq \varepsilon</script>
and this gives $m=O(\nu^2t^2/\varepsilon)$ with
$\nu=\max\{ ||H_1||, ||H_2||\}$. Using higher-order approximations it is
possible to reduce the dependency on $t$ to $O(t^{1+\delta})$ for any
chosen $\delta > 0$.</p>
</li>
<li>
<p>These facts can be used to show that the sum of polynomially many
efficiently simulable Hamiltonians is efficiently simulable.</p>
</li>
<li>
<p>The commutator $[H_1, H_2]$ of two efficiently simulable Hamiltonians
can be simulated efficiently because:
<script type="math/tex">e^{-i[H_1, H_2]t} = \lim_{m\to \infty} \left(e^{-iH_1\sqrt{t/m}}e^{-iH_2\sqrt{t/m}}e^{iH_1\sqrt{t/m}}e^{iH_2\sqrt{t/m}}\right)^m</script>
which we believe without quite knowing how to check it. :/</p>
</li>
<li>
<p>If the Hamiltonian is sparse, it can be efficiently simulated. The
idea is to pre-compute an edge-coloring of the graph represented by
the adjacency matrix of the sparse Hamiltonian. (For each $H$ you
can consider a graph $G=(V, E)$ whose adjacency matrix $A$
has $a_{ij}=1$ if $H_{ij} \neq 0$.)</p>
</li>
</ul>
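<p>The Lie product formula above can be checked numerically. Here is a minimal numpy/scipy sketch with two non-commuting single-qubit terms $H_1 = X$, $H_2 = Z$ (a toy choice of mine, not from the notes), showing the first-order Trotter error shrinking as $m$ grows:</p>

```python
import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
t = 1.0

exact = expm(-1j * (X + Z) * t)

def trotter(m):
    # First-order Lie-Trotter step: (e^{-iXt/m} e^{-iZt/m})^m
    step = expm(-1j * X * t / m) @ expm(-1j * Z * t / m)
    return np.linalg.matrix_power(step, m)

# The spectral-norm error decreases roughly as O(t^2/m)
errs = [np.linalg.norm(exact - trotter(m), 2) for m in (1, 10, 100)]
```

<p>Running this, the three errors decrease roughly by a factor of ten each time, as the $O(t^2/m)$ bound predicts.</p>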
<p>Recalling the example of a particle in a potential: its kinetic term
<script type="math/tex">\frac{p^2}{2m}</script> is diagonal in the Fourier basis (and we know how to
do a QFT), and the potential $V(x)$ is diagonal in the computational
basis, thus this Hamiltonian is easy to simulate.</p>
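<p>This alternation between the Fourier and position bases is exactly the classical split-operator method. A sketch on a discretized line (the grid size, harmonic potential, and time step are arbitrary choices of mine):</p>

```python
import numpy as np

# Split-operator (Trotter) evolution for H = p^2/2m + V(x) on a grid:
# the kinetic term is diagonal in the Fourier basis, V(x) in the position basis.
n = 256
x = np.linspace(-10, 10, n, endpoint=False)
dx = x[1] - x[0]
p = 2 * np.pi * np.fft.fftfreq(n, d=dx)
V = 0.5 * x**2                       # harmonic potential, m = hbar = 1
dt = 0.01

psi = np.exp(-(x - 1.0)**2)          # displaced Gaussian wave packet
psi = psi.astype(complex) / np.linalg.norm(psi)

for _ in range(1000):
    psi = np.exp(-1j * V * dt / 2) * psi                     # half step in x
    psi = np.fft.ifft(np.exp(-1j * p**2 / 2 * dt) * np.fft.fft(psi))
    psi = np.exp(-1j * V * dt / 2) * psi                     # half step in x
```

<p>Each factor is a diagonal phase in its own basis, so the evolution stays unitary and the norm of the state is preserved.</p>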
<p>Exercise/open problem: do we know any algorithm that might benefit from
the efficient simulation of $[H_1, H_2]$? Childs in Childs (n.d.) claims he
is not aware of any algorithm that uses it.</p>
<div id="refs" class="references">
<div id="ref-childs">
Childs, Andrew. n.d. “Lecture Notes in Quantum Algorithmics.”
</div>
<div id="ref-otterbach2017unsupervised">
Otterbach, JS, R Manenti, N Alidoust, A Bestwick, M Block, B Bloom, S
Caldwell, et al. 2017. “Unsupervised Machine Learning on a Hybrid
Quantum Computer.” *ArXiv Preprint ArXiv:1712.05771*.
</div>
</div>

<h1>Storing Data In A Quantum Computer</h1>

<p>We are going to see what it means to store and represent data on a
quantum computer. This is very important to know, since familiarity with
the most common ways of encoding data in a quantum computer might pave
the way for the intuition needed to solve new problems. Let me quote an
article of 2015, Schuld, Sinayskiy, and Petruccione (2015): <em>In order to
use the strengths of quantum mechanics without being confined by
classical ideas of data encoding, finding “genuinely quantum” ways of
representing and extracting information could become vital for the
future of quantum machine learning</em>. Usually we store information in a
classical data structure, and then assume to have quantum access to it.
In general, this quantum access consists of a query: an operation
$U\ket{i}\ket{0}\to \ket{i}\ket{\psi_i}$, where the first register is called the
index register, and the second is a target register that holds the
information you requested. To get an intuition of what the previous
sentence means, I borrow an intuitive example that I stole from a
YouTube video of Seth Lloyd. Imagine that you have a source of photons
(which represents your query register) and you send one towards a CD. Due
to wave-particle duality, you are actually hitting your CD with a
“thing” that is no longer located deterministically as a single
particle in space, but behaves as a wave. When the wave hits the
surface of the CD, it picks up all the information stored in the little
pits of $0$s and $1$s, and gets reflected carrying this information.
This wave represents the output of your query. (Sure, we assume the
interaction between the wave and the CD does not make the wave function
collapse.)\
Let’s start. As good computer scientists, let’s organize what we know how
to do by data types.</p>
<h1 id="scalars">Scalars</h1>
<h2 id="integer-mathbbz">Integer: $\mathbb{Z}$</h2>
<p>Let’s start with the simplest “type” of data: the integers. Let
$m \in \mathbb{N}$. We take the binary expansion of $m$, and set the
qubits of our computer to the binary digits of the number. For example,
if your number’s binary expansion is $0100\cdots0111$ we can create the
state
$\ket{x} = \ket{0}\otimes \ket{1} \ket{0} \ket{0} \cdots \ket{0} \ket{1} \ket{1} \ket{1}$.
Formally, given $m$:</p>
<script type="math/tex; mode=display">\ket{m} = \bigotimes_{i=0}^{n-1} \ket{m_i}</script>
<p>Using superpositions of states like these we might create things like
$\frac{1}{\sqrt{2}} (\ket{5}+\ket{9})$ or more involved
combinations of states.\
The time needed to create this state is linear in the number of
bits/qubits. It might be used to get a speedup in the number of queries to
an oracle, as in (<span class="citeproc-not-found" data-reference-id="Wiebe0QuantumModels"><strong>???</strong></span>), or in general
where you aim at getting a speedup in oracle complexity using amplitude
amplification and similar techniques. For negative integers, we might just use
one more qubit for the sign. (Don’t be tempted into saying that
$\ket{3}+\ket{3}=\ket{6}$. It’s not!)\</p>
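<p>On a classical simulator, this encoding is just a one-hot vector indexed by $m$. A small sketch (the helper name <code>basis_state</code> is mine), which also shows why the superposition $\frac{1}{\sqrt{2}}(\ket{5}+\ket{9})$ is not $\ket{14}$:</p>

```python
import numpy as np

def basis_state(m, n_qubits):
    """Encode the integer m as the computational-basis state |m>."""
    state = np.zeros(2**n_qubits, dtype=complex)
    state[m] = 1.0
    return state

five, nine = basis_state(5, 4), basis_state(9, 4)
superpos = (five + nine) / np.sqrt(2)    # (|5> + |9>)/sqrt(2)

# Adding state vectors is NOT integer addition: this is not |14>
different_from_14 = not np.allclose(superpos, basis_state(14, 4))
```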
<h2 id="rational-mathbbq">Rational: $\mathbb{Q}$</h2>
<p>As far as I know, in quantum computation and quantum machine learning,
some registers hold rational numbers, usually as $n$-bit
approximations of reals between $0$ and $1$. In that case, just take
the binary expansion and use the previous encoding.</p>
<h2 id="reals-mathbbr">Reals: $\mathbb{R}$</h2>
<p>As before, if the number is between $0$ and $1$, use the previous
encoding. It’s pretty rare to store just a single number in
$\mathbb{R}$, and usually real numbers are encoded into amplitudes and
used when dealing with vectors in $\mathbb{R}^n$.</p>
<h1 id="vectors">Vectors</h1>
<h2 id="binary-vectors-01n">Binary vectors: ${0,1}^n$</h2>
<p>Let $\vec{b} \in \{0,1\}^n$. As for the encoding used for the integers:</p>

<script type="math/tex; mode=display">\ket{b} = \bigotimes_{i=0}^{n-1} \ket{b_i}</script>
<p>As an example, suppose you want to encode the vector
$[1,0,1,0,1,0] \in \{0,1\}^6$, which is $42$ in decimal. This will
correspond to the $42$nd basis vector of the Hilbert space where our qubits
evolve. In some sense, we are not fully using the $\mathbb{C}^{2^{n}}$
Hilbert space: we are only mapping a binary vector to a (canonical)
basis vector. As a consequence, distances between points in the new space are
different.\
We can imagine some other encodings. For instance we can map a $0$ to
$1$ and a $1$ to $-1$ (even if I don’t know how it might be used nor how
to build it):
<script type="math/tex">\ket{v} = \frac{1}{\sqrt{2^n}} \sum_{i \in \{0,1\}^n} (-1)^{b_i} \ket{i}</script>\</p>
<h2 id="real-vectors-mathbbrn">Real vectors: $\mathbb{R}^n$</h2>
<p>Maybe you are used to seeing Greek letters inside a ket to represent
generic quantum states, and Latin letters to represent quantum states that
use a binary expansion to hold classical data. The following is a very
common encoding in quantum machine learning. For a vector
$\vec{x} \in \mathbb{R}^{2^n}$, we can build:</p>

<script type="math/tex; mode=display">\ket{x} = \frac{1}{\lVert\vec{x}\rVert}\sum_{i=0}^{N-1}\vec{x}_i\ket{i} = \lVert\vec{x}\rVert^{-1}\vec{x}</script>
<p>Note that to span a space of dimension $N=2^n$, you just need $n = \log_2(N)$
qubits: we encode each component of the classical vector in the
amplitudes of a state vector. Ideally, we know from Grover and Rudolph
(2002) how to create quantum states that correspond to vectors of data
(i.e. “efficiently integrable probability distributions”). But we miss an
important ingredient: this encoding might not be enough if you have to
manipulate “many” vectors, as in some sense what you are creating is a
vector with unit norm. What if we want to build a superposition of
two vectors? Well, you might expect to be able to create a state
$\frac{1}{\sqrt{N}} \sum_{i} \ket{x_i}$, but there’s a problem. Imagine
doing it with just two vectors: $x_1 = [-1, -1, -1]$ and
$x_2 = [1,1,1]$. Their (uniform) linear combination is the vector
$[0,0,0]$. What does this mean? That to make a unit vector out of
it, we need an exceptionally small normalizing factor. Usually this kind
of superposition is obtained as the result of a measurement on an
ancilla qubit, with success probability proportional to
the norm of the combined vector. Therefore, building this state would
require an intolerable number of trials and errors. This problem can be amended by adjoining an ancilla register, as
we see now.</p>
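<p>The cancellation problem is easy to see numerically. A sketch (the helper <code>amplitude_encode</code> is my name for the normalization step), padding the two example vectors to a power-of-two dimension:</p>

```python
import numpy as np

def amplitude_encode(x):
    """Encode a classical vector into amplitudes: |x> = x / ||x||."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)

x1 = np.array([-1.0, -1.0, -1.0, 0.0])   # padded to dimension 2^2
x2 = np.array([1.0, 1.0, 1.0, 0.0])

# The unnormalized sum of the two encoded states cancels completely:
# its tiny norm is exactly the post-selection probability that hurts us.
s = amplitude_encode(x1) + amplitude_encode(x2)
```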
<h1 id="matrices">Matrices</h1>
<p>Imagine storing your vectors in the rows of a matrix. Let
$X \in \mathbb{R}^{n \times d}$ be a matrix of $n$ vectors of $d$
components. We will encode them using $\log(n)+\log(d)$ qubits as the
state:</p>
<script type="math/tex; mode=display">\frac{1}{\sqrt{\sum_{i=0}^n {\left \lVert x(i) \right \rVert}^2 }} \sum_{i=0}^n {\left \lVert x(i) \right \rVert}\ket{i}\ket{x(i)}</script>
<p>Or, put it another way:</p>
<script type="math/tex; mode=display">\frac{1}{\sqrt{\sum_{i=0}^n {\left \lVert x(i) \right \rVert}^2}} \sum_{i,j} X_{ij}\ket{i}\ket{j}</script>
<p>The problem is how to build this state. We are going to need a very
specific oracle (which we call QRAM, even if there is ambiguity in the
literature about this term). A QRAM gives us access to two things: the norms of
the rows of a matrix and the rows themselves. Calling the two oracles
combined, we can do the following mapping:</p>

<script type="math/tex; mode=display">\sum_{i=0}^{n} \ket{i} \ket{0} \to \sum_{i=0}^n {\left \lVert x(i) \right \rVert}\ket{i}\ket{x(i)}</script>

<p>Basically, we use the superposition in the first register to select the
rows of the matrix that we want, and after the query we have them in the
second register. A QRAM is a tree-like classical data structure that
offers quantum access, in an oracular way, to data stored in this fashion.
You can think of a QRAM as a circuit that encodes your matrix. Note that
with this encoding, the ratios between the distances between
vectors are preserved in the Hilbert space. Also note that once the
state is created, the only way to recover $x$ from $\ket{x}$ is to do
quantum tomography (i.e. destroying the state with measurements). The
cost (in terms of time and space) of creating this data structure is a
little more than linear, $O(nd \log (nd))$, but it pays off by giving an
access time per query of $O(\log(nd))$. (An example of a QRAM can be
found in Kerenidis and Prakash (2017), and will obviously be covered on
this blog in future posts.) Yes, I know, the physical implementation of
a QRAM might be difficult, but I have faith in the experimental
physicists. :)</p>
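<p>Classically, the state the QRAM prepares is just the matrix flattened and divided by its Frobenius norm. A small sketch (the helper <code>matrix_state</code> and the example matrix are mine):</p>

```python
import numpy as np

def matrix_state(X):
    """Amplitude-encode a matrix X as sum_ij X_ij |i>|j>, normalized by
    the Frobenius norm -- the state a QRAM-style oracle would prepare."""
    X = np.asarray(X, dtype=float)
    return X.flatten() / np.linalg.norm(X)

X = np.array([[3.0, 4.0],
              [0.0, 5.0]])
state = matrix_state(X)          # lives on log(n) + log(d) qubits

# The amplitude of |i>|j> is X_ij / ||X||_F; here ||X||_F = sqrt(50)
```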
<h1 id="graphs">Graphs</h1>
<p>For specific problems we can even change the computational model (i.e.
no more gates on wires used to describe computation). For instance,
given a graph $G=(V,E)$ we can encode it as a state $\ket{G}$ such that:
<script type="math/tex">K_G^v\ket{G} = \ket{G} \quad \forall v \in V</script> where
$K_G^v = X_v\prod_{u \in N(v)}Z_u $, and $X_u$ and $Z_u$ are the Pauli
operators on $u$. The way to picture this encoding is the following. Take as many
qubits in the state $\ket{+}$ as there are nodes in the graph, and apply a
controlled-$Z$ gate between qubits representing adjacent nodes. There are some
algorithms that use this state as input, for instance in Zhao,
Pérez-Delgado, and Fitzsimons (2016), where they even extended this
definition.\</p>
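<p>The stabilizer condition can be verified by brute force for a small graph. A numpy sketch for the path graph $0-1-2$ (my example; qubit $0$ is the leftmost tensor factor):</p>

```python
import numpy as np

I2, X, Z = np.eye(2), np.array([[0, 1], [1, 0]]), np.diag([1.0, -1.0])

def kron_all(ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

def cz(n, a, b):
    # Diagonal controlled-Z between qubits a and b (qubit 0 = leftmost)
    d = np.ones(2**n)
    for idx in range(2**n):
        bits = [(idx >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[a] and bits[b]:
            d[idx] = -1.0
    return np.diag(d)

# Graph state of the path graph 0-1-2: CZ on every edge, applied to |+>^n
n, edges = 3, [(0, 1), (1, 2)]
G = np.ones(2**n) / np.sqrt(2**n)
for (a, b) in edges:
    G = cz(n, a, b) @ G

# Check K_G^v |G> = |G> with K_G^v = X_v prod_{u in N(v)} Z_u
neighbors = {0: [1], 1: [0, 2], 2: [1]}
stabilized = all(
    np.allclose(kron_all([X if q == v else (Z if q in neighbors[v] else I2)
                          for q in range(n)]) @ G, G)
    for v in range(n)
)
```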
<h1 id="conclusions">Conclusions</h1>
<p>The precision that we can use for specifying the amplitudes of a quantum
state might be limited in practice by the precision of our quantum
computer in manipulating quantum states (i.e. by developments in
quantum metrology and sensing). Techniques that rely on a certain
precision in the amplitudes of a state might suffer from initial technical
limitations of the hardware. As a parallel, think of what happened
with CPUs, where we went from 16, to 32, and now 64 bits of precision.\</p>
<div id="refs" class="references">
<div id="ref-Grover2002">
Grover, Lov, and Terry Rudolph. 2002. “Creating superpositions that
correspond to efficiently integrable probability distributions.”
</div>
<div id="ref-kerenidis2017quantum">
Kerenidis, Iordanis, and Anupam Prakash. 2017. “Quantum Gradient Descent
for Linear Systems and Least Squares.” *ArXiv Preprint
ArXiv:1704.04992*.
</div>
<div id="ref-schuld2015introduction">
Schuld, Maria, Ilya Sinayskiy, and Francesco Petruccione. 2015. “An
Introduction to Quantum Machine Learning.” *Contemporary Physics* 56
(2). Taylor & Francis: 172–85.
</div>
<div id="ref-zhao2016fast">
Zhao, Liming, Carlos A Pérez-Delgado, and Joseph F Fitzsimons. 2016.
“Fast Graph Operations in Quantum Computation.” *Physical Review A* 93
(3). APS: 032314.
</div>
</div>

<h1>Swap Test For Distances</h1>

<h1 id="intro-to-swap-test">Intro to swap test</h1>
<p>What is known as the <em>swap test</em> is a simple but powerful circuit used to
measure the “proximity” of two quantum states (the cosine distance in
machine learning). It consists of a controlled swap operation surrounded
by two Hadamard gates on the controlling qubit. Repeated measurements of
the ancilla qubit allow us to estimate the probability of reading $0$
or $1$, which in turn allows us to estimate $\braket{\psi|\phi}$.
Let’s see the circuit:</p>
<p><img src="/assets/swap_distances/swap_test.png" alt="image" /></p>
<p>It is simple to check that the state at the end of the execution of the
circuit is the following:</p>
<script type="math/tex; mode=display">\frac{1}{2}\Big[\ket{\psi} \ket{\phi} + \ket{\phi} \ket{\psi} \Big]\ket{0} +\frac{1}{2}\Big[\ket{\psi} \ket{\phi} - \ket{\phi} \ket{\psi} \Big] \ket{1}</script>
<p>Thus, the probability of reading a $0$ in the ancilla qubit is:
<script type="math/tex">P (\ket{0}) = \left( \frac{1+|\braket{\psi|\phi}|^2}{2} \right)</script> And
the probability of reading a $1$ in the ancilla qubit is:
<script type="math/tex">P (\ket{1}) = \left( \frac{1-|\braket{\psi|\phi}|^2}{2} \right)</script></p>
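<p>The whole circuit is small enough to simulate directly with matrices. A numpy sketch for single-qubit $\ket{\psi}, \ket{\phi}$ (the function name and test states are mine; the ancilla is the leftmost qubit):</p>

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I4 = np.eye(4)

# SWAP on two qubits, and controlled-SWAP with the ancilla as control
SWAP = np.eye(4)[[0, 2, 1, 3]]
CSWAP = np.block([[I4, np.zeros((4, 4))], [np.zeros((4, 4)), SWAP]])

def swap_test_p0(psi, phi):
    """Probability of reading 0 on the ancilla of the swap test."""
    state = np.kron(np.array([1, 0]), np.kron(psi, phi))
    state = np.kron(H, I4) @ state      # Hadamard on the ancilla
    state = CSWAP @ state               # controlled swap
    state = np.kron(H, I4) @ state      # Hadamard again
    return np.linalg.norm(state[:4])**2 # ancilla = |0> block

psi = np.array([1, 0], dtype=complex)                  # |0>
phi = np.array([1, 1], dtype=complex) / np.sqrt(2)     # |+>
p0 = swap_test_p0(psi, phi)            # should be (1 + 1/2)/2 = 3/4
```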
<p>This means that if the two states are completely orthogonal, we will
measure an equal number of zeros and ones. On the other hand, if
$\ket{\psi} = \ket{\phi}$, then the probability of reading
$\ket{1}$ in the ancilla qubit is $0$. Repeating this operation a
certain number of times allows us to estimate the inner product between
$\ket{\psi}$ and $\ket{\phi}$. Unfortunately, each measurement
irrevocably destroys the states, and we need to recreate them in order to
perform the swap test again. This is not much of a problem if we have
an efficient way of creating $\ket{\psi}$ and $\ket{\phi}$. We can
informally state what the swap test achieves with the following theorem.</p>
<p>[Swap test for inner products] Suppose you have access to unitaries
$U_\psi$ and $U_\phi$ that allow you to create $\ket{\psi}$ and
$\ket{\phi}$, requiring time $T(U_\psi)$ and $T(U_\phi)$ respectively.
Then, there is a circuit that estimates the inner product between
$\ket{\psi}$ and $\ket{\phi}$ to precision $\varepsilon$ in
$O((T(U_\psi)+T(U_\phi))\varepsilon^{-2})$ operations.</p>
<p>The correctness of the circuit was shown before; this is the analysis of
the running time. We recognize in the measurement of the ancilla qubit a
random variable $X$ with Bernoulli distribution with
$p=(1+|\braket{\psi|\phi}|^2)/2$ and variance $p(1-p)$. The number of
repetitions necessary to estimate the expected value $\bar{p}$
of $X$ with relative error $\epsilon$ is bounded by the Chernoff bound.</p>
<h1 id="swap-test-for-distance-between-vector-and-center-of-a-cluster">Swap test for distance between vector and center of a cluster</h1>
<p>Now we are going to see how to use the swap test to calculate the
distance between two vectors. This section is entirely based on the work
of Lloyd, Mohseni, and Rebentrost (2013). There, they explain how to use
this subroutine to do cluster assignment and many other interesting
things in quantum machine learning. This was one of the first papers I
read in quantum machine learning, and I really wanted to understand
everything, so I tried to do the calculations myself. I think I have
found some typos in the original paper, so here you will find what I
think is the correct version. At the bottom of this post you will find
the calculations. In the following section we will assume that we are
given access to two unitaries $U : \ket{i}\ket{0} \to \ket{i}\ket{v_i}$
and $V : \ket{i}\ket{0} \to \ket{i}\ket{|v_i|} $.\
Let’s recall the relation between the inner product and the distance of
$\vec{u}, \vec{v} \in \mathbb{R}^n$. The inner product between two
vectors is $\braket{ v, u } = \sum_{i} v_i u_i $, and the norm of a
vector is $ \lVert v \rVert = \sqrt{\langle v, v \rangle} $. Therefore, the distance
can be rewritten as:</p>
<script type="math/tex; mode=display">\lVert u-v \rVert = \sqrt{ \langle u-v, u-v \rangle } = \sqrt{\sum_{i} (u_i-v_i)^2 } = \sqrt{ \lVert u \rVert^2 + \lVert v \rVert^2 - 2 \langle u, v \rangle }</script>
<p>By setting $ Z = \lVert u \rVert^2 + \lVert v \rVert^2 $ it follows that
$ \lVert u-v \rVert^2 = Z \left( 1 - \frac{ 2 \langle u, v \rangle }{Z} \right) $.</p>
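<p>This identity is purely classical and easy to check numerically (the two example vectors are mine):</p>

```python
import numpy as np

u = np.array([3.0, 1.0, 2.0])
v = np.array([1.0, -1.0, 0.5])

# Z = ||u||^2 + ||v||^2, then ||u - v||^2 = Z (1 - 2<u,v>/Z)
Z = np.dot(u, u) + np.dot(v, v)
lhs = np.linalg.norm(u - v)**2
rhs = Z * (1 - 2 * np.dot(u, v) / Z)
```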
<p>As you may have guessed, to find the distance $\lVert v-u \rVert$ we will repeat the
swap circuit the necessary number of times. The problem now is to prepare
the right states.\
We first start by creating
$\ket{\psi} = \frac{1}{\sqrt{2}} \Big( \ket{0}\ket{u} + \ket{1}\ket{v} \Big)$,
querying the QRAM in $O(\log(N))$ time, where $N$ is the dimension of the
Hilbert space (the length of the data vectors).\
Then we proceed by creating
$\ket{\phi} = \frac{1}{\sqrt{Z}} \Big( |\vec{u}|\ket{0} + |\vec{v}|\ket{1} \Big) $
and estimating $Z=|\vec{u}|^2 + |\vec{v}|^2$. Remember that for two
vectors $Z$ is easy to calculate, while in the case of the distance
between a vector and the center of a cluster
$Z=|\vec{u}|^2+\sum_{i \in V} |\vec{v_i}|^2$. In this case, calculating
$Z$ scales linearly with the number of elements in the cluster, and we
don’t want that.</p>
<p>To create $\ket{\phi}$ and estimate $Z$, we have to start with another,
simpler-to-build state $\ket{\phi^-}$ and make it evolve into $\ket{\phi}$. To
do so, we apply the following Hamiltonian for an amount of
time $t$ such that $t|\vec{v}|, t|\vec{u}| \ll 1 $:
<script type="math/tex">H = \left( |\vec{u}|\ket{0}\bra{0}+|\vec{v}|\ket{1}\bra{1} \right) \otimes \sigma_x</script>
<script type="math/tex">\ket{\phi^-} = \ket{-}\ket{0}</script></p>
<p>The evolution $e^{-iHt} \ket{\phi^-}$ for small $t$ will give us the
following state:
<script type="math/tex">\Big( \frac{cos(|\vec{u}|t)}{\sqrt{2}}\ket{0} - \frac{cos(|\vec{v}|t)}{\sqrt{2}}\ket{1} \Big) \ket{0} - \Big( \frac{i sin(|\vec{u}|t)}{\sqrt{2}}\ket{0} - \frac{i sin(|\vec{v}|t)}{\sqrt{2}}\ket{1} \Big) \ket{1}</script></p>
<p>Reading the ancilla qubit in the second register, we read $1$
with the following probability, given by the small-angle approximation of
the $\sin$ function:</p>

<script type="math/tex; mode=display">P(1) = \left\lvert - \frac{i \sin(|\vec{u}|t)}{\sqrt{2}} \right\rvert^2 + \left\lvert \frac{i \sin(|\vec{v}|t)}{\sqrt{2}} \right\rvert^2 \approx \left\lvert\frac{|\vec{u}|t}{\sqrt{2}}\right\rvert^2 + \left\lvert \frac{|\vec{v}|t}{\sqrt{2}} \right\rvert^2 = \frac{1}{2} \Big( |\vec{u}|^2t^2 + |\vec{v}|^2t^2 \Big) = Zt^2/2</script>
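<p>The small-angle behaviour is worth a quick numerical sanity check: the exact probability $\frac{1}{2}(\sin^2(|\vec{u}|t) + \sin^2(|\vec{v}|t))$ should match $Zt^2/2$ for small $t$ (the values of $|\vec{u}|, |\vec{v}|, t$ below are mine):</p>

```python
import numpy as np

u_norm, v_norm = 0.8, 0.6          # |u| and |v|
Z = u_norm**2 + v_norm**2
t = 1e-3                           # regime t|u|, t|v| << 1

# Exact P(1) from the evolved state, vs. the small-angle approximation
p1_exact = (np.sin(u_norm * t)**2 + np.sin(v_norm * t)**2) / 2
p1_approx = Z * t**2 / 2
```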
<p>Now we are almost ready to use the swap circuit. Note that our two
quantum registers have different dimensions, so we cannot swap them
directly. What we can do instead is to swap the state $\ket{\phi}$
with the index register of $\ket{\psi}$. The probability of reading $1$ is:</p>

<script type="math/tex; mode=display">\begin{split}
p(1) = \frac{2|\vec{u}|^2 + 2|\vec{v}|^2 - 4\langle u, v \rangle}{8Z}
\end{split}</script>
<h1 id="conclusion">Conclusion</h1>
<p>We saw how to use a simple circuit to estimate things like inner product
and distance between two quantum vectors. We have assumed that we have
an efficient way of creating the states we are using, and we didn’t went
deep into explaining how. Given a $\epsilon > 0$, you can repeat the
previous circuit $O(\epsilon^{-2})$ times to have the desired precision.
Note the following thing: while calculating the value of $Z$ for two
vectors is easy, estimating it for calculating the distance between a
vector and the center of a cluster takes time linear in the number of
element in the superposition. Note that we can use amplitude estimation
in order to reduce the dependency on error to $O(\epsilon^{-1})$.</p>
<p>For the records:</p>
<p><span>Chernoff Bounds</span> Let $X = \sum_{i=0}^n X_i$, where $X_i = 1$
with probability $p_i$ and $X_i=0$ with probability $1-p_i$. All $X_i$
are independent. Let $\mu = E[X] = \sum_{i=0}^n p_i$. Then:</p>
<ul>
<li>
<p>$P(X \geq (1+\delta)\mu) \leq e^{-\frac{\delta^2}{2+\delta}\mu} $
for all $\delta > 0$</p>
</li>
<li>
<p>$P(X \leq (1-\delta)\mu) \leq e^{-\frac{\delta^2}{2}\mu}$
for all $0 < \delta < 1$</p>
</li>
</ul>
<p><span>Chebyshev</span> Let $X$ be a random variable with $E[X] = \mu$ and
$Var[X]=\sigma^2$. For all $t > 0$:</p>

<script type="math/tex; mode=display">P(|X - \mu| > t\sigma) \leq 1/t^2</script>
<p>If we substitute $k/\sigma$ for $t$, we get the equivalent version that
we use to bound the error:
<script type="math/tex">P(|X - \mu| \geq k) \leq \frac{\sigma^2}{k^2}</script></p>
<h2 id="calculations">Calculations</h2>
<p>It’s time now to prove that our claim is true and to show the
calculations. After all the previous passages, this is the initial state:</p>
<script type="math/tex; mode=display">\ket{0}\Big( \frac{1}{\sqrt{Z}} \left( |\vec{u}|\ket{0} + |\vec{v}|\ket{1} \right) \otimes \frac{1}{\sqrt{2}} (\ket{0}\ket{u} + \ket{1}\ket{v} ) \Big)</script>
<p>We apply an Hadamard on the leftmost ancilla register:</p>
<script type="math/tex; mode=display">\frac{1}{2\sqrt{Z}} \left[ \ket{0} \Big( \left( |\vec{u}|\ket{0} + |\vec{v}|\ket{1} \right) \otimes (\ket{0}\ket{u} + \ket{1}\ket{v} ) \Big) +
\ket{1} \Big( \left( |\vec{u}|\ket{0} + |\vec{v}|\ket{1} \right) \otimes (\ket{0}\ket{u} + \ket{1}\ket{v} ) \Big) \right] =</script>
<script type="math/tex; mode=display">\begin{split}
= \frac{1}{2\sqrt{Z}} \Big[ \ket{0} \Big( |u|\ket{00u} + |u|\ket{01v} + |v|\ket{10u} + |v|\ket{11v} \Big) \\
+ \ket{1} \Big( |u|\ket{00u} + |u|\ket{01v} + |v|\ket{10u} + |v| \ket{11v}
\Big) \Big]
\end{split}</script>
<p>Controlled on the ancilla being $1$, we swap the second and the third
register:</p>
<script type="math/tex; mode=display">\begin{split}
= \frac{1}{2\sqrt{Z}} \Big[ \ket{0} \Big( |u|\ket{00u} + |u|\ket{01v} + |v|\ket{10u} + |v|\ket{11v} \Big) \\
+ \ket{1} \Big( |u|\ket{00u} + |u|\ket{10v} + |v|\ket{01u} + |v| \ket{11v} \Big) \Big]
\end{split}</script>
<p>Now we apply the Hadamard on the ancilla qubit again:</p>
<script type="math/tex; mode=display">\begin{split}
= \frac{1}{2^{3/2}\sqrt{Z}} \Big[
|u|\ket{000u} + |u|\ket{001v} + |v|\ket{010u} + |v|\ket{011v} \\
+|u|\ket{100u} + |u|\ket{101v} + |v|\ket{110u} + |v|\ket{111v} \\
+|u|\ket{000u} + |u|\ket{010v} + |v|\ket{001u} + |v|\ket{011v} \\
-|u|\ket{100u} - |u|\ket{110v} - |v|\ket{101u} - |v|\ket{111v}
\Big]
\end{split}</script>
<p>And now we check the probability of reading $\ket{1}$.</p>
<script type="math/tex; mode=display">\begin{split}
p(1) = \frac{1}{2^{3}Z} \Big( |u|\bra{01v} + |v|\bra{10u} - |u|\bra{10v} - |v|\bra{01u} \Big)\\
\Big( |u|\ket{01v} + |v|\ket{10u} - |u|\ket{10v} - |v|\ket{01u} \Big)
\end{split}</script>
<script type="math/tex; mode=display">\begin{split}
p(1) = \frac{2|u|^2 + 2|v|^2 - 4\langle u, v \rangle}{8Z}
\end{split}</script>
<p>Thanks to IK and AG who checked :)</p>
<div id="refs" class="references">
<div id="ref-Lloyd2013QuantumLearning">
Lloyd, Seth, Masoud Mohseni, and Patrick Rebentrost. 2013. “Quantum
algorithms for supervised and unsupervised machine learning.” *ArXiv*
1307.0411 (July): 1–11. <http://arxiv.org/abs/1307.0411>.
</div>
</div>

<h1>Space Estimation Of HHL</h1>

<p>Let’s imagine that we are given a quantum computer with 100 logical
qubits, and let’s also assume that we have high gate fidelity (i.e.
applying a gate won’t introduce major errors in our computation). This
means that we can run any algorithm we want. An idyllic
situation like this probably won’t happen in the near future (let’s say
5 years). Even though we now have the first prototypes of quantum computers
with a first dozen of qubits, those qubits are not stable, and
therefore the computation we can do is pretty limited: in fact, these
prototypes aren’t able to perform error-free computations (there’s no
error correction yet), and the computation can’t be as “long” as we
want: we can apply only a limited number of gates before the
system decoheres.</p>
<p>The question is the following: can such a machine compete with a
classical computer in solving linear systems of equations? Can we use it
to run the HHL algorithm? Recall that for HHL we need a 1-qubit register
for the ancilla, a register for the output of phase estimation in the
Hamiltonian simulation (which will store the superposition of the
eigenvalues), and the remaining qubits can store the input register. We
assume to have logical qubits in our comparison.</p>
<p>We’ll see what happens when we change the precision of floating point
operations: 32 and 64 bits. The sparsity of the matrix is supposed to be
small. Since we want to stay as close as possible to real cases, let’s
take a famous example of a matrix considered sparse: the product-user
matrix that websites like Amazon or Netflix use to run recommendation
algorithms. Rows represent the users of the service, while columns are
the products. The rows are empty except where a user purchased a
specific product or watched a particular movie. Let’s say that an
educated guess for the sparsity of the matrix is $100$.</p>
<p><img src="/assets/HHL_resource_estimation/space_resource_estimation.png" alt="image" /></p>
<p>The upper horizontal line is an estimate of the space in TB of the
hard-disks for the whole Google (Cirrusinsight, n.d.) (13 EB), while the
lower one is an estimate for the storage need to store the images of
Google Maps (Mesarina, n.d.) (43 PB).</p>
<p>Let’s do an example to show what the software is plotting for $100$
qubits. In HHL we need an ancilla qubit, so we are left with 99 qubits. To get
$64$ bit precision, we need to allocate 64 qubits for the phase
estimation of the Hamiltonian simulation step. So now we are left with
just $35$ qubits. With $35$ qubits we can span a Hilbert space of
dimension $2^{35}$: this allows us to encode a vector of data with the
same number of components. Suppose our vector of known values holds $64$
bit floating point numbers: classically, storing this amount of
data costs $2^{35} \times 64$ bits, which is roughly $0.27$ TB (terabytes).
Adding the cost of storing a $2^{35} \times 2^{35}$ matrix with
sparsity $100$, we get about $27$ TB.</p>
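<p>The arithmetic above can be reproduced in a few lines. The qubit split (1 ancilla, 64 for phase estimation, 35 for the input) and the sparsity are the assumptions of this post, not universal constants, and 8 bytes per nonzero entry is assumed:</p>

```python
# Back-of-the-envelope storage costs from the example above:
# 100 qubits = 1 ancilla + 64 for phase estimation + 35 for the input.
n = 35                                   # qubits left for the input register
N = 2 ** n                               # number of vector components
bits = 64                                # floating-point precision
s = 100                                  # guessed sparsity (nonzeros per row)

vector_tb = N * bits / 8 / 1e12          # bytes -> terabytes
matrix_tb = N * s * bits / 8 / 1e12      # sparse matrix: s entries per row
print(round(vector_tb, 3), round(matrix_tb, 1))
```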
<p>Remember that each component of the vector will be encoded as a
probability amplitude in our quantum register. This implies that our
precision in manipulating a qubit needs to grow, along with the number of
qubits and the fidelity of the gates. Here we just focused on the
computational capabilities of a small quantum computer with respect to the
HHL algorithm. Don’t forget that for HHL we will need to store the
matrix to invert as in the classical case (in the form of QRAM or another
oracle). The code for generating the plot is
<a href="https://github.com/Scinawa/space_estimation_hhl">here</a>.</p>
<div id="refs" class="references">
<div id="ref-exagoogle">
Cirrusinsight. n.d. “How Much Data Does Google Store?”
</div>
<div id="ref-gmaps">
Mesarina, Malena. n.d. “How Much Storage Space Do You Need for Google
Maps?”
</div>
</div>scinawaRewriting Swap Test2017-12-04T00:00:00+01:002017-12-04T00:00:00+01:00https://luongo.pro/2017/12/04/Rewriting-Swap-Test<h1 id="rewriting-the-swap-test">Rewriting the swap test</h1>
<p>Some weeks ago I stumbled upon a nice paper by Garcia-Escartin and
Chamorro-Posada (2013). There, they show the equivalence between a
widely used circuit in quantum information, called the swap test, and a
phenomenon that goes under the name of Hong-Ou-Mandel effect. In their
work, they rewrote the circuit of the swap test using fewer gates and
no ancilla qubit. I find this fact pretty interesting, and I think
it’s worth sharing with you. It gives us a new interpretation of the
swap test, which is possible to prove with very simple gate
manipulations. More specifically, here we show the equivalence of the
swap test and the circuit we use to measure in the Bell basis (a
Hadamard on the first qubit and a CNOT) - the same used to create an EPR
pair. Here we will work with single-qubit registers, but the result can
be extended to registers with multiple qubits. Assuming we work with
one-qubit registers, let’s recall the original circuit of the swap test:
<img src="/assets/rewriting_swap_test/Fig1.PNG" alt="image" /></p>
<p>The probability of reading $1$ is $ \frac{1 - |\braket{a | b}|^2 }{2} $ and
the probability of reading $0$ is $ \frac{1 + |\braket{a | b}|^2 }{2} $.</p>
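<p>These statistics can be checked directly by simulating the circuit with numpy. Here is a minimal sketch for single-qubit states; the function name and the qubit ordering (ancilla first) are my own choices:</p>

```python
import numpy as np

def swap_test_p1(a, b):
    """Probability of measuring 1 on the ancilla of a swap test between
    two single-qubit states a and b (length-2 complex unit vectors)."""
    # Qubit order: ancilla, a, b. Initial state |0>|a>|b>.
    psi = np.kron(np.array([1.0, 0.0]), np.kron(a, b)).astype(complex)
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    H_anc = np.kron(H, np.eye(4))            # Hadamard on the ancilla only
    cswap = np.eye(8, dtype=complex)         # controlled-SWAP (Fredkin) gate
    cswap[4:, 4:] = np.eye(4)[[0, 2, 1, 3]]  # swap a,b when ancilla is |1>
    psi = H_anc @ cswap @ H_anc @ psi
    return float(np.linalg.norm(psi[4:]) ** 2)  # weight of ancilla-|1> part

a = np.array([1, 0], dtype=complex)               # |0>
b = np.array([1, 1], dtype=complex) / np.sqrt(2)  # |+>
print(swap_test_p1(a, b))                         # equals (1 - |<a|b>|^2) / 2
```

For $\ket{0}$ and $\ket{+}$ the squared overlap is $1/2$, so the ancilla reads $1$ with probability $1/4$.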
<p>The probability of reading $1$ in the ancilla qubit of the original swap
test is equal to the probability of reading $11$ in the modified version
of the swap test in Figure 7.</p>
<p>Here is the proof. We start by rewriting the swap test as a series of
controlled not operations. Note that $x \oplus x = 0 $ and that
$x \oplus 0 = x$. It’s very simple to show that the swap gate can be
replaced with a series of CNOTs:
<script type="math/tex">\ket{x}\ket{y} \to \ket{x}\ket{x \oplus y} \to \ket{x \oplus (x \oplus y)}\ket{x \oplus y} \to \ket{y}\ket{x}</script></p>
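<p>This identity is easy to verify numerically. The sketch below builds the two CNOT orientations as $4\times 4$ matrices and checks that three alternating CNOTs compose to a SWAP:</p>

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
P0, P1 = np.diag([1, 0]), np.diag([0, 1])   # projectors on |0> and |1>

cnot_12 = np.kron(P0, I2) + np.kron(P1, X)  # control qubit 1, target qubit 2
cnot_21 = np.kron(I2, P0) + np.kron(X, P1)  # control qubit 2, target qubit 1
swap = np.eye(4)[[0, 2, 1, 3]]              # |xy> -> |yx>

# SWAP as three alternating CNOTs, as in the derivation above.
print(np.allclose(cnot_12 @ cnot_21 @ cnot_12, swap))
```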
<p><img src="/assets/rewriting_swap_test/Fig2.PNG" alt="image" /></p>
<p>We know that a NOT gate is just a $Z$ gate on another rotation axis. The
rotation axis can easily be changed by two surrounding Hadamard gates.</p>
<p><img src="/assets/rewriting_swap_test/Fig3.PNG" alt="image" /></p>
<p>The CCZ gate is symmetric with respect to target and control
qubits, so we can put the $Z$ rotation on any of the control qubits.</p>
<p><img src="/assets/rewriting_swap_test/Fig4.PNG" alt="image" /></p>
<p>In this circuit we note that some of the gates we are applying are
actually useless for the measurement on the ancilla qubit, and we can
remove them from the circuit.</p>
<p><img src="/assets/rewriting_swap_test/Fig5.PNG" alt="image" /></p>
<p>Again, we use the equivalence between the $X$ gate and the $HZH$ sequence.</p>
<p><img src="/assets/rewriting_swap_test/Fig6.PNG" alt="image" /></p>
<p>We note that we could remove the ancilla qubit, and measure the other
two qubits instead. This is possible thanks to the principle of deferred
measurement. The probability of reading $1$ in the ancilla qubit is
equivalent to the probability of reading $1$ in both the qubit $\ket{a}$
and $\ket{b}$. We don’t mind measuring the two qubits, since after
measuring the ancilla qubit we cannot use $\ket{a}$ and $\ket{b}$
anyway. So here we get the final circuit.</p>
<p><img src="/assets/rewriting_swap_test/Fig7.PNG" alt="image" /></p>
<p>This equivalence might be useful when we need to optimize a circuit and
we have to reduce both the number of gates and the number of ancilla
qubits. This result gives us a nice intuition on the behavior of the swap
test when the two qubits are entangled. But beware of remote hacking! Be
sure to run the swap test only on non-entangled data, otherwise you
might get unexpected results! Take for example the Bell states. One of
the four Bell basis states for two qubits will pass the test
($\frac{\ket{01} - \ket{10}}{\sqrt{2}}$), which is pretty
counterintuitive, since the first and the second qubit are always
different. Therefore, we should use the swap test only with
non-entangled data! Entanglement, along with its usefulness in quantum
protocols and computation, brings much trouble. Indeed it’s because of
entanglement that bit commitment is not possible using quantum
resources, and it’s because of entanglement that we can attack position
based encryption schemes. And that’s it. Hope you enjoyed it as much as
I did. To extend this equivalence to multi-qubit registers, look at the
paper!</p>
<div id="refs" class="references">
<div id="ref-garcia2013swap">
Garcia-Escartin, Juan Carlos, and Pedro Chamorro-Posada. 2013. “Swap
Test and Hong-Ou-Mandel Effect Are Equivalent.” *Physical Review A* 87
(5). APS: 052330.
</div>
</div>scinawaRewriting the swap testThe Hhl Algorithm2017-11-21T00:00:00+01:002017-11-21T00:00:00+01:00https://luongo.pro/2017/11/21/The-HHL-Algorithm<h1 id="the-hhl-algorithm">The HHL algorithm</h1>
<p>A linear system of $N$ equations can be represented in matrix form as
$A\vec{x}=\vec{b}$. Its solution is defined as $\vec{x}=A^{-1}\vec{b}$.
This tells us that if we want to get the solution vector $\vec{x}$, we
should be able to invert the matrix $A$. Classically inverting a matrix
can be done in polynomial time, usually with algorithms that scale
between the square and the cube of the dimension of the system. HHL is a
quantum algorithm that allows us to create a quantum state proportional to
the solution, $\ket{A^{-1}\vec{b}}$, in time $polylog(N)$. This gives
us an exponential speedup with respect to classical algorithms, but it
introduces a time dependency on other factors, such as the sparsity
of the matrix or the condition number.\
Let’s recall some notions from linear algebra. Given a Hermitian matrix,
we are also given a<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> set of eigenvectors ${ \vec{\varphi_i} }$
which happen to be a base for the space. We can thus express the vector
$b$ as a linear combination of eigenvectors:
$\vec{b}=\sum_{j}\beta_j\vec{\varphi_j}$. The solution of the system is
therefore $\vec{x} = \sum_j \beta_j \lambda_j^{-1} \vec{\varphi}_j$. The
idea of the quantum algorithm is to rely on these observations and get
the state:
<script type="math/tex">\ket{x} = \sum_{j} \beta_j \lambda_j^{-1} \ket{ \varphi_j} = \ket{A^{-1}\vec{b}}</script></p>
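<p>Classically, this spectral recipe looks as follows; the matrix below is a toy example with a made-up spectrum, just to fix ideas:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
# Build a well-conditioned symmetric A with a known spectrum.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.T
b = rng.standard_normal(4)

lam, phi = np.linalg.eigh(A)      # eigenvalues and orthonormal eigenvectors
beta = phi.T @ b                  # b = sum_j beta_j phi_j
x = phi @ (beta / lam)            # x = sum_j beta_j lambda_j^{-1} phi_j

print(np.allclose(A @ x, b))      # x indeed solves A x = b
```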
<p>If our matrix $A$ is not Hermitian, we can associate to it a Hermitian
matrix: <script type="math/tex">% <![CDATA[
A' =
\begin{bmatrix}
0 & A \\
A^T & 0
\end{bmatrix} %]]></script> In this form, multiplying a vector $x$ can be done
as follows: $(Ax, 0) = A’ (0, x)$.\
A few initial remarks now: creating a quantum state
$\ket{x} \in \mathbb{R}^{2^n}$ (where $n$ is the number of qubits) does
not mean that we are “solving” the linear system of equations, as we
don’t have classical access to the solution. Indeed, doing quantum
tomography to recover $\ket{x}$ would cost us $O(N)$, forcing us to
lose the exponential speedup. We are just getting as output a
normalized version of $\vec{x}$, that is $\ket{x}$. Indeed, there are a
few assumptions which are better stated explicitly: Aaronson (2015)</p>
<ol>
<li>
<p>On the input vector we have the following restrictions:</p>
<ol>
<li>
<p>there must be a fast way of getting $|b\rangle$ from the
classical input vector $b$. If $b$ is made of $N$ components,
then we will need $log_2(N)$ qubits to express
$|b\rangle = \sum_{i=0}^{N-1} b_i |i\rangle$. Practically, we
assume the existence of QRAM: an operator
$B: |0\rangle \to |b\rangle$. I hope to write more about
this soon.</p>
</li>
<li>
<p>The initial vector $b$ should be relatively uniform, otherwise
we would contradict the impossibility of an exponential speedup
for black-box quantum search Aaronson (2015).</p>
</li>
</ol>
</li>
<li>
<p>The matrix $A$ should be $s$-sparse on rows (or other
efficiently-simulable kind of Hamiltonians). This is needed because
Hamiltonians with a sparse matrix can be efficiently simulated, and
the amount of time needed to simulate them grows linearly with the
sparsity $s$.</p>
</li>
<li>
<p>The condition number
$\kappa = \frac{\lambda_{max}} { \lambda_{min}}$ should be
low, i.e. the matrix should be robustly invertible, because the
asymptotic complexity grows linearly with $\kappa$. Singular values
of $A$ should lie between $1/\kappa$ and $1$.</p>
</li>
<li>
<p>We are not interested in the values of $x$ itself (i.e. the
probability amplitudes of $\ket{x}$), but just in a measurement in a
basis of choice: $ \braket{x|M|x} $.</p>
</li>
</ol>
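<p>To make the first assumption concrete, here is what the amplitude encoding itself looks like. This only illustrates the mapping from a classical vector to amplitudes; it says nothing about preparing the state efficiently, which is exactly what the QRAM assumption is for:</p>

```python
import numpy as np

b = np.array([3.0, 1.0, 2.0, 1.0])   # classical input, N = 4 components
ket_b = b / np.linalg.norm(b)        # amplitudes of |b> = sum_i b_i |i> / ||b||

n_qubits = int(np.log2(len(b)))      # log2(N) qubits suffice
print(n_qubits, np.round(ket_b, 4))
```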
<p>There are many enhancements of HHL, where we can solve rectangular
matrices, over-determined and under-determined systems of equations,
dense matrices, and speedups obtained by applying amplitude
amplification techniques.</p>
<h2 id="first-step-hamiltonian-simulation-and-phase-estimation">First step: Hamiltonian simulation and phase estimation</h2>
<p>The first step of HHL is the eigenvalue estimation of the unitary matrix
associated to the linear transformation $U_A=e^{iA}$. Remember that
given a Hermitian matrix $A$, there is a correspondence between its
(real) eigenvalues and the eigenvalues of the unitary matrix $U=e^{iA}$:
for each eigenvalue $\lambda_i$ of $A$ there is an eigenvalue
$e^{i\lambda_i}$. Hamiltonian simulation is a procedure that, given a
time $t$ and a Hamiltonian $H$, allows us to apply the unitary time
evolution $U$ associated to $H$ to a given state $|\psi\rangle$ for a
given time $t$:</p>
<script type="math/tex; mode=display">U|\psi(0)\rangle = e^{-iHt}|\psi(0)\rangle = |\psi(t)\rangle</script>
<p>Controlled applications of $U$ allow us to write the eigenvalues of $U$
in the phase of our quantum state. As usual, we use an index register
$\sum_{k} |k\rangle$ in uniform superposition and use
this register to apply the controlled $U$. Let’s say
$K = O(1/\varepsilon)$ is the chosen precision for phase estimation.
That will allow us to build the following mapping:</p>
<script type="math/tex; mode=display">\sum_{k \in [K]} \sum_{j \in [N]} |k\rangle \beta_j|\varphi_j\rangle \to \sum_{k \in [K] } \sum_{j \in [N]} e^{2\pi i k \lambda_j/N}|k\rangle\beta_j|\varphi_j\rangle</script>
<p>Using QFT$^{-1}$ we perform a phase estimation as usual:
<script type="math/tex">\sum_{k \in [K]}\sum_{j \in [N]} e^{2\pi i k \lambda_j/N}|k\rangle\beta_j|\varphi_j\rangle \to \sum_{j \in [N]}\beta_{j}|\varphi_j\rangle|\tilde{\lambda_j}\rangle</script></p>
<p>The idea is to use quantum phase estimation to calculate the
eigenvalues and eigenvectors of the Hermitian operator associated to a
matrix $A$. We need Hamiltonian simulation in order to encode
the eigenvalues efficiently as a phase. Phase estimation is applied
next, writing an approximation of the eigenvalues in a register:
$|\tilde{\lambda_j}\rangle$.</p>
<h2 id="second-step-controlled-rotations">Second step: controlled rotations</h2>
<p>The second step is where the magic happens. Note that multiplying each
eigenvector by the inverse of its eigenvalue is not a unitary
transformation, so we have to find some trick. The problem is solved
by introducing the right non-linear operation: a measurement on an
ancilla qubit. We adjoin a single qubit register, and we perform an
operation controlled on the eigenvalues estimated in the previous step.
Here, $C=\lambda_{min}$:</p>
<script type="math/tex; mode=display">|\lambda_j\rangle|0\rangle \to |\lambda_j\rangle\otimes \left( \sqrt{1-(\frac{C}{\lambda_j})^2}|0\rangle + \frac{C}{\lambda_j}|1\rangle \right)^A</script>
<p>Measuring the $A$ register, we project the rest of the state onto the
subspace consistent with our observation. That’s a neat trick that allows
us to “move” a value inside a ket into the “outer world” in a
meaningful way. To get rid of the register with the eigenvalues
$|\lambda_j\rangle$, we run eigenvalue estimation in reverse, in order
to be left with the state
<script type="math/tex">\sum_{j} \beta_j C\lambda_j^{-1} |\varphi_j\rangle \ket{1} + \ket{G}\ket{0} = C\ket{x}\ket{1} + \ket{G}\ket{0}.</script></p>
<p>Now we measure the ancilla qubit.</p>
<ol>
<li>
<p>If we observe $|1\rangle$, the new state of the system is
<script type="math/tex">\sum_{j}\beta_jC\lambda_j^{-1}|\varphi_j\rangle \propto |x\rangle</script>
The probability of observing $|1\rangle$ is:
$ p(|1\rangle) =C^2 |||x\rangle||^2$. In this step we have
introduced a dependency on the condition number:
<script type="math/tex">p(|1\rangle) = \sum_{j} |\frac{\beta_jC}{\lambda_j}|^2 \geq |\frac{C}{\lambda_{max}}|^2= \frac{1}{\kappa^2}</script></p>
</li>
<li>
<p>If we observe $|0\rangle$ we start again from step 1.</p>
</li>
</ol>
<p>It is possible to use amplitude amplification on $|1\rangle$ to get a
quadratic improvement, reducing the expected number of repetitions from
$O(\kappa^{2})$ to $O(\kappa)$.</p>
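<p>The postselection step can be mimicked classically. The eigenvalues and coefficients below are made up, but the bound $p(1) \geq 1/\kappa^2$ holds for any choice:</p>

```python
import numpy as np

lam = np.array([0.25, 0.5, 1.0])   # hypothetical eigenvalues of A
beta = np.array([0.6, 0.8, 0.0])   # coefficients of |b> in the eigenbasis
C = lam.min()                      # the constant C = lambda_min

amp1 = beta * C / lam              # amplitude on the |1> ancilla branch
p1 = float(np.sum(amp1 ** 2))      # probability of reading |1>
x = amp1 / np.sqrt(p1)             # renormalized state, proportional to |x>

kappa = lam.max() / lam.min()
print(p1, p1 >= 1 / kappa ** 2)    # the bound from the text holds
```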
<h1 id="complexity-analysis">Complexity analysis</h1>
<p>The cost of eigenvalue estimation with error less than $\epsilon$ is
$O(log(N)s^{2} + log(1/\varepsilon))$. To read $|1\rangle$ in the second
step, we should repeat the algorithm $O(\kappa^2)$ times. I extend a
little bit <strong>???</strong>, which lists the complexity of all the versions of
quantum algorithms for solving linear systems of equations.</p>
<ul>
<li>
<p>In the original version, Aram W. Harrow, Hassidim, and Lloyd (2009), we
have: <script type="math/tex">O(\kappa^{2}log(N)s^{2} / \epsilon)</script>
<script type="math/tex">\tilde{O}(\kappa T_B + log (N) s^2 \kappa^2 T_A / \varepsilon)</script></p>
</li>
<li>
<p>In Ambainis (2010), a technique called variable time
amplitude amplification is used to decrease the running time, by
applying a particular flavor of the Grover algorithm in order to reduce
the dependency on $\kappa$ from $\kappa^3$ to $\kappa^2$.
<script type="math/tex">\tilde{O}(\kappa T_B + log(N)s^2 \kappa T_A / \varepsilon^3)</script>
This result is based upon a previous work of Childs, Kothari, and
Somma (2015) on the precision of simulating sparse Hamiltonians. I
think results might have changed since Hao Low and Chuang (2016):
the last result I am aware of on Hamiltonian simulation.</p>
</li>
<li>
<p>In Childs, Kothari, and Somma (2015) they improved the dependency on
the error, going from a polynomial dependency to a polylogarithmic one. The idea
is to avoid the QFT, whose dependency on the error cannot be improved.</p>
</li>
<li>
<p>In Wossnig, Zhao, and Prakash (2017), instead of using Hamiltonian
simulation, they use singular value estimation: a procedure that uses
QRAM to create a register proportional to the superposition of the
singular values of a matrix. Their complexity is
$O(\kappa^2\sqrt{N}\times polylog(N)/\varepsilon)$ for dense
matrices, and $O(\kappa^2||A||_F \times polylog(N)/\varepsilon)$ in general.
This idea is based on the result of Kerenidis and Prakash (2016).</p>
</li>
</ul>
<h1 id="conclusions">Conclusions</h1>
<p>This was a pretty significant result in quantum algorithmics, that made
possible many of the first results in quantum machine learning. For
instance, this algorithm can be used to perform least-squares
estimation of a model Aram W Harrow (2014). You are given a matrix
$A \in R^{n \times p}$ with $n \geq p$, and $b \in \mathbb{R}^n$. You
are asked to find the best solution $x$ such that
<script type="math/tex">\operatorname*{arg\,min}_{x \in \mathbb{R}^p} ||Ax -b||.</script> To relate
linear systems of equations and function minimization, consider the
following function: $f(x) = \frac{1}{2}x^TAx-x^Tb $. Then, the gradient of the
function (for symmetric $A$) is $\nabla f(x) = Ax-b $, therefore finding the minimum of the
function (with the additional hypotheses common in ML, such as convexity,
smoothness, etc..) reduces to solving a linear system of equations. This
is well-known in Machine Learning, and some qML algorithms started to
use this subroutine soon after it was created. For instance in data
fitting: Wiebe, Braun, and Lloyd (2012), Rebentrost, Mohseni, and Lloyd
(2014) and other things like differential equations Berry (2014). For a
useful answer on stack-overflow, refer to Vega (n.d.).\
It’s worth adding that in the original paper, they use a particular
initial state for the control register, in order to minimize some error
studied in the paper’s appendix. Much useful information can be found
in: Melkebeek (2010), Aram W. Harrow, Hassidim, and Lloyd (2009), and
Lloyd (n.d.).</p>
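<p>As a tiny illustration of the least-squares connection: setting the gradient of $\frac{1}{2}||Ax-b||^2$, namely $A^TAx - A^Tb$, to zero gives the normal equations, which a classical solver handles directly. The data below is random and purely illustrative:</p>

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3))   # n = 8 samples, p = 3 features
b = rng.standard_normal(8)

# Zero gradient of (1/2)||Ax - b||^2  <=>  A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))
```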
<div id="refs" class="references">
<div id="ref-Aaronson2015ReadPrint">
Aaronson, Scott. 2015. “Read the fine print.” *Nature Physics* 11 (4):
291–93. doi:[10.1038/nphys3272](https://doi.org/10.1038/nphys3272).
</div>
<div id="ref-Ambainis2010VariableEquations">
Ambainis, Andris. 2010. “Variable time amplitude amplification and a
faster quantum algorithm for solving systems of linear equations.”
</div>
<div id="ref-berry2014high">
Berry, Dominic W. 2014. “High-Order Quantum Algorithm for Solving Linear
Differential Equations.” *Journal of Physics A: Mathematical and
Theoretical* 47 (10). IOP Publishing: 105301.
</div>
<div id="ref-Childs2015QuantumPrecision">
Childs, Andrew M, Robin Kothari, and Rolando D Somma. 2015. “Quantum
linear systems algorithm with exponentially improved dependence on
precision.”
</div>
<div id="ref-HaoLow2016HamiltonianQubitization">
Hao Low, Guang, and Isaac L Chuang. 2016. “Hamiltonian Simulation by
Qubitization.”
</div>
<div id="ref-Harrow2014ReviewEquations">
Harrow, Aram W. 2014. “Review of Quantum Algorithms for Systems of
Linear Equations,” December, 2–4. <http://arxiv.org/abs/1501.00008>.
</div>
<div id="ref-Harrow2009QuantumEquations">
Harrow, Aram W., Avinatan Hassidim, and Seth Lloyd. 2009. “Quantum
Algorithm for Linear Systems of Equations.” *Physical Review Letters*
103 (15): 150502.
doi:[10.1103/PhysRevLett.103.150502](https://doi.org/10.1103/PhysRevLett.103.150502).
</div>
<div id="ref-Kerenidis2016QuantumSystems">
Kerenidis, Iordanis, and Anupam Prakash. 2016. “Quantum Recommendation
Systems.”
</div>
<div id="ref-lloydyoutube">
Lloyd, Seth. n.d. “Quantum Algorithm for Solving Linear Equations.”
</div>
<div id="ref-Melkebeek2010Lecture12Equations">
Melkebeek, Instructor Dieter Van. 2010. “Lecture12 : Order Finding
‘Solving’ Linear Equations,” 1–4.
[http://pages.cs.wisc.edu/{\\\~{}}dieter/Courses/2010f-CS880/Scribes/12/lecture12.pdf](http://pages.cs.wisc.edu/{\~{}}dieter/Courses/2010f-CS880/Scribes/12/lecture12.pdf).
</div>
<div id="ref-Rebentrost2014QuantumClassification">
Rebentrost, Patrick, Masoud Mohseni, and Seth Lloyd. 2014. “Quantum
support vector machine for big data classification.” *Physical Review
Letters*.
doi:[10.1103/PhysRevLett.113.130503](https://doi.org/10.1103/PhysRevLett.113.130503).
</div>
<div id="ref-otherHHLuse">
Vega, Juan Bermejo. n.d. “Applications of Hhl’s Algorithm for Solving
Linear Equations.”
</div>
<div id="ref-Wiebe2012QuantumFitting">
Wiebe, Nathan, Daniel Braun, and Seth Lloyd. 2012. “Quantum Algorithm
for Data Fitting.” *Physical Review Letters* 109 (5): 050505.
doi:[10.1103/PhysRevLett.109.050505](https://doi.org/10.1103/PhysRevLett.109.050505).
</div>
<div id="ref-wossnig2017quantum">
Wossnig, Leonard, Zhikuan Zhao, and Anupam Prakash. 2017. “A Quantum
Linear System Algorithm for Dense Matrices.” *ArXiv Preprint
ArXiv:1704.06174*.
</div>
</div>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>very convenient - since they are orthogonal. Generalized
eigenvectors are not orthogonal <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>scinawaThe HHL algorithmTransavia is not recommended for travelling musicians2017-01-06T16:16:00+01:002017-01-06T16:16:00+01:00https://luongo.pro/2017/01/06/transavia-not-recommended<p>A couple of days ago I traveled with my guitar from VCE to ORY.
For the safety of my dear guitar, <em>much care was taken in order to find a trustworthy airline that allows carrying guitars as hand luggage</em>.</p>
<p>I made sure that my case was largely <em>within</em> the allowed dimensions for musical instrument, as <a href="http://service-en.transavia.com/app/answers/detail/a_id/454/~/is-it-possible-to-take-a-musical-instrument-along-with-me-into-the-cabin%3F">stated in their website</a>.</p>
<p>While checking in my other luggage, I asked for confirmation that my guitar would be taken as my hand luggage. The lady at the check-in told me that since my guitar has a hard case, she wasn’t 100% sure I could bring it into the cabin.</p>
<p>Unfortunately, the website does <em>not</em> speak <em>at all</em> about the kind of case that your guitar should have.</p>
<p>Thanks to a delay of 30 minutes, I asked on the Twitter profile of Transavia for some clarification about the behaviour of the crew. I received only useless information back:</p>
<p><img src="/assets/twitter_transavia.png" alt="alt text" title="Useful conversation on twitter" /></p>
<p>After having stated clearly that I was not going to take any responsibility for any damage eventually caused, a gentleman of the ground crew - to whom I owe a lot of gratitude - spoke with the cabin crew and a person in charge of luggage, and I was allowed to place my guitar in a service closet.</p>
<p>Long story short, this time I was lucky. But if you are a musician, I wouldn’t recommend flying with Transavia: people cannot rely on the kindness of flight attendants. If you do, always ask at the check-in for confirmation that your instrument will be taken as hand luggage, and always be prepared for the worst (like double packing your guitar with bubble wrap).</p>
<p>It would be desirable if Transavia started to use social media in a more meaningful way, and had a clear policy on hand luggage (or better trained flight attendants).</p>
<p>Anyway, this was the condition of my other checked in luggage after the trip. Not bad for the first flight of my new suitcase.</p>
<p><img src="/assets/baggage.png" alt="alt text" title="My first trip with my new suitcase" /></p>scinawaMy i3 configuration for Qubes-OS2017-01-06T16:16:00+01:002017-01-06T16:16:00+01:00https://luongo.pro/2017/01/06/my-i3-config<p>Here you can find some useful tips on the configuration of i3 (a window manager) and its integration with Qubes.
This post is mostly a reference for random walkers on Google who happened to search the right keywords, and for me, since I tend to forget quickly how I achieved a certain configuration. The config below is edited for:
<ol>
<li>Toggle/untoggle keyboard backlight with $mod+Shift+n</li>
<li>Exec i3lock with $mod+Shift+b</li>
<li>Open a terminal in qubes appvm with $mod+t</li>
<li>Up & Down volume keys (keyboard dependent)</li>
<li>I can name my workspaces with $mod+y. If the name starts with a number, then I can treat them as if they were just numbered, and switch between workspaces and move containers as before. Last but not least, I have enabled a shortcut to move between previously used workspaces.</li>
</ol>
<p>Ready?</p>
<div class="highlighter-rouge"><pre class="highlight"><code>set $mod Mod1
# Font for window titles. Will also be used by the bar unless a different font
# is used in the bar {} block below.
# This font is widely installed, provides lots of unicode glyphs, right-to-left
# text rendering and scalability on retina/hidpi displays (thanks to pango).
font pango:DejaVu Sans Mono 10
# Before i3 v4.8, we used to recommend this one as the default:
# font -misc-fixed-medium-r-normal--13-120-75-75-C-70-iso10646-1
# The font above is very space-efficient, that is, it looks good, sharp and
# clear in small sizes. However, its unicode glyph coverage is limited, the old
# X core fonts rendering does not support right-to-left and this being a bitmap
# font, it doesn’t scale on retina/hidpi displays.
# Use Mouse+$mod to drag floating windows to their wanted position
floating_modifier $mod
# start a terminal in the domain of the currently active window
bindsym $mod+Return exec qubes-i3-sensible-terminal
# kill focused window
bindsym $mod+Shift+q kill
# start dmenu (a program launcher)
bindsym $mod+d exec --no-startup-id i3-dmenu-desktop --dmenu="dmenu -nb #d2d2d2 -nf #000000 -sb #63a0ff"
# change focus
bindsym $mod+j focus left
bindsym $mod+k focus down
bindsym $mod+l focus up
bindsym $mod+ograve focus right
bindsym $mod+t exec "qvm-run Home gnome-terminal"
# alternatively, you can use the cursor keys:
bindsym $mod+Left focus left
bindsym $mod+Down focus down
bindsym $mod+Up focus up
bindsym $mod+Right focus right
# move focused window
bindsym $mod+Shift+j move left
bindsym $mod+Shift+k move down
bindsym $mod+Shift+l move up
bindsym $mod+Shift+ograve move right
# alternatively, you can use the cursor keys:
bindsym $mod+Shift+Left move left
bindsym $mod+Shift+Down move down
bindsym $mod+Shift+Up move up
bindsym $mod+Shift+Right move right
# split in horizontal orientation
bindsym $mod+h split h
# split in vertical orientation
bindsym $mod+v split v
# enter fullscreen mode for the focused container
bindsym $mod+f fullscreen
# change container layout (stacked, tabbed, toggle split)
bindsym $mod+s layout stacking
bindsym $mod+w layout tabbed
bindsym $mod+e layout toggle split
# toggle tiling / floating
bindsym $mod+Shift+space floating toggle
# change focus between tiling / floating windows
bindsym $mod+space focus mode_toggle
# focus the parent container
bindsym $mod+a focus parent
# focus the child container
#bindsym $mod+d focus child
# switch to workspace
bindsym $mod+1 workspace number 1
bindsym $mod+2 workspace number 2
bindsym $mod+3 workspace number 3
bindsym $mod+4 workspace number 4
bindsym $mod+5 workspace number 5
bindsym $mod+6 workspace number 6
bindsym $mod+7 workspace number 7
bindsym $mod+8 workspace number 8
bindsym $mod+9 workspace number 9
bindsym $mod+0 workspace number 10
# move focused container to workspace
bindsym $mod+Shift+1 move container to workspace number 1
bindsym $mod+Shift+2 move container to workspace number 2
bindsym $mod+Shift+3 move container to workspace number 3
bindsym $mod+Shift+4 move container to workspace number 4
bindsym $mod+Shift+5 move container to workspace number 5
bindsym $mod+Shift+6 move container to workspace number 6
bindsym $mod+Shift+7 move container to workspace number 7
bindsym $mod+Shift+8 move container to workspace number 8
bindsym $mod+Shift+9 move container to workspace number 9
bindsym $mod+Shift+0 move container to workspace number 10
bindsym $mod+y exec i3-input -F 'rename workspace to "%s"' -P ' New name for this workspace'
# reload the configuration file
bindsym $mod+Shift+c reload
# restart i3 inplace (preserves your layout/session, can be used to upgrade i3)
bindsym $mod+Shift+r restart
# exit i3 (logs you out of your X session)
bindsym $mod+Shift+e exec "i3-nagbar -t warning -m 'You pressed the exit shortcut. Do you really want to exit i3? This will end your X session.' -b 'Yes, exit i3' 'i3-msg exit'"
# resize window (you can also use the mouse for that)
mode "resize" {
# These bindings trigger as soon as you enter the resize mode
# Pressing left will shrink the window’s width.
# Pressing right will grow the window’s width.
# Pressing up will shrink the window’s height.
# Pressing down will grow the window’s height.
bindsym j resize shrink width 10 px or 10 ppt
bindsym k resize grow height 10 px or 10 ppt
bindsym l resize shrink height 10 px or 10 ppt
bindsym ograve resize grow width 10 px or 10 ppt
# same bindings, but for the arrow keys
bindsym Left resize shrink width 10 px or 10 ppt
bindsym Down resize grow height 10 px or 10 ppt
bindsym Up resize shrink height 10 px or 10 ppt
bindsym Right resize grow width 10 px or 10 ppt
# back to normal: Enter or Escape
bindsym Return mode "default"
bindsym Escape mode "default"
}
bindsym $mod+r mode "resize"
# Start i3bar to display a workspace bar (plus the system information i3status
# finds out, if available)
bar {
status_command qubes-i3status
font -misc-fixed-medium-r-normal--13-120-75-75-C-70-iso10646-1
colors {
background #d2d2d2
statusline #000000
#class #border #backgr #text
focused_workspace #4c7899 #63a0ff #000000
active_workspace #333333 #5f676a #ffffff
inactive_workspace #222222 #333333 #888888
urgent_workspace #BD2727 #E79E27 #000000
}
}
# Use a screen locker
exec --no-startup-id "xautolock -detectsleep -time 6 -locker 'i3lock -d -c 008000' -notify 30 -notifier \"notify-send -t 2000 'Locking screen in 30 seconds'\""
# Make sure all xdg autostart entries are started, this is (among other things)
# necessary to make sure transient vm's come up
exec --no-startup-id qubes-i3-xdg-autostart
# bindsym XF86AudioRaiseVolume exec "amixer -q sset Master,0 1+ unmute"
# bindsym XF86AudioLowerVolume exec "amixer -q sset Master,0 1- unmute"
#bindsym $mod+p exec "amixer -q sset Master,0 1+ unmute"
bindsym $mod+m exec "amixer -q sset Master,0 1- unmute"
bindsym XF86AudioMute exec "amixer -q sset Master,0 toggle"
bindsym $mod+Shift+n exec "sudo bash /home/scinawa/.i3/keybacklight.sh"
bindsym $mod+Shift+b exec "i3lock -c 045347"
bindsym $mod+n workspace next
bindsym $mod+p workspace prev
</code></pre>
</div>
<h3 id="bash-script-for-backlight-of-keyboard">Bash script for backlight of keyboard.</h3>
<p>The bash script I’m calling with $mod+Shift+n is the following, which is copied from [1]:</p>
<div class="highlighter-rouge"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nv">STATUS</span><span class="o">=</span><span class="sb">`</span>xset -q | grep <span class="s2">"LED"</span> | awk <span class="s1">'{print $10}'</span><span class="sb">`</span>
<span class="k">if</span> <span class="o">[</span> <span class="s2">"</span><span class="k">${</span><span class="nv">STATUS</span><span class="k">}</span><span class="s2">"</span> <span class="o">=</span> <span class="s2">"00000000"</span> <span class="o">]</span>
<span class="k">then
</span>xset led 3
<span class="k">else
</span>xset -led 3
<span class="k">fi
</span><span class="nb">exit </span>0
</code></pre>
</div>
<h3 id="creating-shortcut-wise-calls-to-browser-profiles">Creating shortcut-wise calls to browser profiles</h3>
<p>I find it cozy to have browser-specific shortcuts in my i3 menu, from which I can launch a specific firefox profile.
In this way, I can call specific browser profiles (banking, social, cloud stuff, etc.) with a name that is faster to type than $appvm-firefox$ followed by switching profile.</p>
<p>This is what I did recently to get a fast shortcut for my web app of calendar I’m currently using:</p>
<p>In dom0:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>vim /home/`whoami`/.local/share/applications/$APPVM-name.desktop
</code></pre>
</div>
<p>Edit it and specify:</p>
<ol>
<li>A mnemonic name which is also fast to type, like “CAL”.</li>
<li>The proper link to your appvm menu file: something like <code class="highlighter-rouge">/usr/share/applications/calendar.desktop</code></li>
</ol>
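For reference, this is roughly what such a dom0 launcher can end up looking like. This is only a sketch: the AppVM name (<code class="highlighter-rouge">personal</code>) is hypothetical, and the Exec line assumes the usual qvm-run wrapper that Qubes generates for AppVM shortcuts; adapt it to your own setup.

```
[Desktop Entry]
Type=Application
# Mnemonic, fast-to-type name shown in the i3 menu
Name=CAL
# Hypothetical AppVM name "personal"; -q is quiet, -a starts the VM if it is not running
Exec=qvm-run -q -a personal 'firefox -P calendar'
```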
<p>In the AppVM where you want to launch the application:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>cd /usr/share/applications/
cp firefox.desktop calendar.desktop
vim calendar.desktop
</code></pre>
</div>
<p>Scroll down and modify the exec line like this:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>X-Desktop-File-Install-Version=0.22
[Desktop Action new-window]
Name=Open a New Window
Exec=firefox -P calendar %u
[Desktop Action new-private-window]
Name=Open a New Private Window
Exec=firefox --private-window %u
</code></pre>
</div>
<p>When I type “CAL” in the i3 menu, I open a specific firefox profile with the homepage that I have specified. Voilà!</p>
<h3 id="references">References</h3>
<p>[1] https://m.reddit.com/r/i3wm/comments/3sumks/cant_get_scrolllock_and_mod_key_to_work/</p>Migrations and functors (2016-08-21) https://luongo.pro/2016/08/21/functors-from-.pro-to-.cat<p>I have recently acquired a .cat domain and I’m a long-time owner of a .pro domain. If you wonder, .pro is designed to be used by professionals, and .cat is the domain for Catalonia. But for me (and me only, probably):</p>
<p>.PrO is also the category of preorders.
Objects in PrO are preorders, and arrows are order-preserving maps between preorders.</p>
<p>.Cat is the category of (small) categories.
Objects are categories, and arrows are functors between categories.</p>
<p>This post is to tell you that I have migrated my blog from a .pro domain to a .cat domain. What a good occasion to remember that there is a functor between these categories. (There is also a functor in the opposite direction, but that’s for another post. :p )</p>
<p>Theorem:
There is a functor $i : PrO \to Cat$ such that:</p>
<ol>
<li>each $(X, \leq) \in$ PrO is sent to an element of Ob(Cat)</li>
<li>For each $x,y \in Ob(\chi)$, $|Hom_\chi(x,y)| \leq 1$.</li>
</ol>
<p>Moreover, every category with property 2 is in the image of the functor $i$.</p>
<p>Proof:</p>
<ol>
<li>On objects: we have to send each preorder $(X,\leq_X)$ to a category $\chi=i(X, \leq_X)$ built in this way:
<ol>
<li>Objects: every element $x$ of the preorder $(X,\leq_X)$ becomes an object of the category $\chi$, so $Ob(\chi)=X$.</li>
<li>Morphisms: for every $x,y \in X$ such that $x \leq_X y$ there is a unique morphism in $Hom_\chi(x, y)$.
For each element $x \in (X,\leq_X)$, reflexivity ($x \leq_X x$) provides the identity morphism $id_x$.
Transitivity ($x \leq_X y$ and $y \leq_X z$ imply $x \leq_X z$) provides composition of morphisms.
The laws of morphism composition (associativity and the unit laws) hold automatically, because between any two objects of $\chi$ there is at most one morphism.</li>
</ol>
</li>
<li>On morphisms: we send each preorder morphism to a functor (aka a morphism in Cat).
The idea is to show that composition of preorder morphisms satisfies the laws of arrow composition.
Given a morphism between preorders $f : (X, \leq_X) \to (Y, \leq_Y)$, there is a morphism $i(f): \chi \to \mathcal{Y}$ in $Hom_{Cat}(\chi,\mathcal{Y})$. This morphism is a functor, such that:
<ol>
<li>On objects: each $x \in Ob(\chi) = X$ is sent to the element $y = f(x) \in Ob(\mathcal{Y}) = Y$.</li>
<li>On morphisms: if there is an arrow $x \to x'$ in $\chi$, then we know that $x \leq_X x'$ in $(X, \leq_X)$. By definition of preorder morphism we know that $f(x) \leq_Y f(x')$, and so there is a (unique) morphism $f(x) \to f(x')$ in $\mathcal{Y}$.</li>
</ol>
<p>Functoriality is then easy to check: since every hom-set of $\chi$ and $\mathcal{Y}$ has at most one element, $i(f)$ automatically preserves composition and identities. In particular, each identity morphism in PrO (which preserves the order relation of a preorder) is sent to the identity functor in Cat.</p>
</li>
</ol>
<p>I really like the book [1], but since I’m a noob in category theory and I want things to be as verbose and pedantic as possible, I rewrote the proof from there in a more schematic way.</p>
<h5 id="to-recap">To recap:</h5>
<p>As slogan 4.2.1.18 [1] says:
A preorder is a category in which every hom-set has either 0 elements or 1 element. A preorder morphism is just a functor between such categories.</p>
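The slogan can also be checked concretely. Here is a minimal Python sketch (the finite preorder and all names are my own, not from the book) that builds the category $i(X, \leq)$ out of a finite preorder and verifies property 2 of the theorem, together with the morphism part of the functor:

```python
from itertools import product

# A finite preorder: a carrier set X with a reflexive, transitive relation leq.
X = {1, 2, 3}
leq = {(x, y) for x, y in product(X, X) if x <= y}  # the usual order on numbers

# i on objects: the category i(X, leq) has Ob = X and a unique arrow x -> y iff x leq y.
hom = {(x, y): [(x, y)] if (x, y) in leq else [] for x, y in product(X, X)}

# Property 2: every hom-set has at most one element.
assert all(len(arrows) <= 1 for arrows in hom.values())

# Reflexivity gives identities; transitivity gives composition.
assert all(hom[(x, x)] for x in X)
for x, y, z in product(X, X, X):
    if hom[(x, y)] and hom[(y, z)]:
        assert hom[(x, z)], "composite arrow must exist"

# i on morphisms: a preorder morphism f is monotone, which is exactly the condition
# "an arrow x -> x' in chi yields an arrow f(x) -> f(x')", so i(f) is a functor.
f = {1: 1, 2: 1, 3: 2}  # a hand-picked monotone map (X, leq) -> (X, leq)
assert all((f[x], f[y]) in leq for (x, y) in leq)
print("i(X, leq) is a category with |Hom(x, y)| <= 1, and i(f) is a functor")
```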
<p>Daje</p>
<h4 id="references">References</h4>
<p>[1] Spivak, David I. Category theory for the sciences. MIT Press, 2014.</p>A primer on Projective Simulation: a (quantum) ML algorithm (2016-06-13) https://luongo.pro/2016/06/13/a-primer-on-projective-simulation<div class="separator" style="clear: both; text-align: center;"></div>These days, I had the chance to read about a recent algorithm for (quantum) machine learning. Its name is Projective Simulation (PS), proposed by Briegel and De las Cuevas [1] in 2012. PS blends together ideas from neural networks (from which it takes the idea of a graph-like network of objects used to model the memory used in PS) and reinforcement learning (from which it takes the idea of training the algorithm using rewards and punishments received from the environment). PS connects areas of research such as artificial intelligence, quantum information, and quantum computation.<br /><br /><br />Among the various features that make this algorithm interesting, one of the things I like most is that the steps of PS can be executed on a quantum physical system - such as a quantum computer - even gaining computational efficiency over classical computers using quantum interference [11], or exploiting the topological structure of certain kinds of graphs [12].<br /><br /><br />PS uses a graph-like representation of the memory of an agent. This representation is so general that it allows us to think of neural networks as a <i>physical</i> implementation of the same kind of memory - called episodic and compositional memory - used by PS. 
In fact, both algorithms share the concept that any input received is accompanied by a certain spatiotemporal excitation pattern within the nodes of a network, where similar inputs cause similar excitations. But PS is way more than this...<br /><br /><a name='more'></a><br /><h3>A digression on freedom and information-processing machines</h3><br />The PS algorithm was proposed with a broader scope than being a working quantum Machine Learning (ML) algorithm. In fact, PS plays a relevant role in the philosophical debate on the existence of free will. Briegel et al. [7], instead of claiming the presence or the absence of free will in mankind, claim that <i>programmable structures, if (a) sufficiently complex and organized, and (b) capable of developing a specific kind of memory, can exhibit behaviors of creativity and free will</i>. PS, besides competing with other Reinforcement Learning (RL) algorithms, is proposed as a tool: a fundamental <i>information-theoretic concept that can be used to define a notion of freedom (creativity, intelligence), compatible with the laws of physics</i>.<br /><br /><i>If we accept that free will is compatible with physical law, we also have to accept that it must be possible, in principle, to build a machine that would exhibit similar forms of freedom as the ones we usually ascribe to humans and certain animals. [7]</i><br /><br /><br />Think of this algorithm as the "proof" of the authors' claims regarding the existence of complex systems exhibiting free will. We will see how with PS we can create agents whose behavior can (hopefully, unarguably) be considered by common sense as creative or intelligent.<br /><br /><br />To better understand the design choices behind the algorithm, we will define Intelligence as the capability of an agent to perceive and act on its environment in a way that maximizes its chances of success. 
Creativity (a manifestation of intelligence) as the capability of an agent to deal with unprecedented situations, and relate a given situation to other conceivable situations [7][1]. Learning is described as a modification of the molecular details of a neural network of an agent due to experience. <br /><br /><br /><i>The definition of learning is purposely similar to what happens in the brain of some animals</i>, as discovered by some recent results in neuroscience [8]. For instance, the behavior of this poor Aplysia <i>can be largely described as a stimulus-reflex circuit, where the structure of the circuit change over time [9]</i>.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/e/ef/Aplysia_californica.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" height="231" src="https://upload.wikimedia.org/wikipedia/commons/e/ef/Aplysia_californica.jpg" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">ECM-like memory animal. LOL</td></tr></tbody></table><br />PS can be used to analyze emerging behavior of an agent under specific conditions, so to test its behavior in simple environments, such as very simple games.<br /><br /><br />Authors remark that PS does not try to:<br /><br /> explain how the brain works,<br /> explain the nature of consciousness,<br /> explain the nature of human freedom.<br /><br />Let's see what PS <i>is</i> about now.<br /><br /><h3>Key concepts in PS</h3><br />The model of memory used by the intelligent agents in PS is called <i>Episodic and Compositional Memory</i>. ECM is deliberately similar to other notion of memory in other fields of science. Based on the concept of episodic memory, (see this article by Tulving and Ingvar), ECM is basically a stochastic network of clips. 
We define:<br /><br /><ul><li> <b>percepts</b>: pieces of data that represent information from the outer world. These are the possible inputs of our algorithm. Formally, a percept is a tuple $s = (s_1, s_2, ..., s_N) \in S$, where $S$ is the percept space $S_1 \times ... \times S_N$. Each $s_i = 1, ..., |S_i|$.</li><li> <b>actions</b>: actions the agent can perform in the external world. These are represented by tuples in the action (or actuator) space: $a = (a_1, a_2, ..., a_K) \in A = A_1 \times ... \times A_K$ where $a_k = 1, ..., |A_k|$. Imagine $a_1$ as the state of "doing a jump", $a_2$ the state of "walking" and so on. These are the output of a single call to EC memory.</li><li> <b>clips</b>: these are nodes in the network of EC memory. A clip is meant to represent the fundamental unit of the episodic memory I told you about before. Percept clips are those clips $c$ that get stimulated by percepts $s$ according to a specific probability distribution $I(c|s)$, while action clips $a$ are clips that, if stimulated, trigger an action. One of the most important features in ECM is the possibility to have remembered or fictitious percepts ⓢ$ := \mu(s) \in \mu(S)$ or actions ⓐ. Fictitious or remembered percepts or actions are stored inside clips, as sequences. Each clip has a length $L \geq 1$, which means that $c$ is composed of $(c^1, c^2, ..., c^L)$ where each $c^i \in \mu(S) \cup \mu(A)$. (In this article we will only use clips with $L=1$.)</li><li> <b>edges</b>: objects representing directed arcs between clips (they have a starting clip and a receiving clip), and they contain data useful to the execution of the algorithm, such as weights, glowing tags, and emotion tags.</li><li> <b>emotions</b>: pieces of data (called emotion tags) attached to the edges. Tags are represented as tuples $e = (e_1, e_2, ..., e_K)$ in the emotional space $E \equiv E_1 \times ... \times E_K$, where $e_k = 1, ..., |E_k|$. 
</li></ul><br />The weight of an edge between clip $c_i$ and $c_j$ is at time $t$ is stored in the weight matrix $h^t(c_i, c_j)$. The transition probabilities between clips are directly proportional to the weight of the edges. At the beginning of the execution of the algorithm, every percept clip is directly connected to every action clip. For the first part of our journey into PS, there will be no further connections between clips, and there are no further layers of clips (this will be generalized in the further sections). The reward function is $\Lambda : S \times A \to \mathbb{R}$.<br /><br /><br />The gist of PS is that a percept excites a percept clips, this excitation will start a random walk in the episodic memory, going through a chain of clips, eventually triggering an action clip. The transition probability between states of this stochastic process is obtained as a function of the weight matrix.<br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-aw1ga6m0-pc/V2Zeo2jP_iI/AAAAAAAACOI/Ie-4S8cytj0j2g0zRfMUaMyY_HkY6d-hACLcB/s1600/Fig2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="185" src="https://1.bp.blogspot.com/-aw1ga6m0-pc/V2Zeo2jP_iI/AAAAAAAACOI/Ie-4S8cytj0j2g0zRfMUaMyY_HkY6d-hACLcB/s320/Fig2.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 2 from [1] - EC memory: a network of clips in PS</td></tr></tbody></table><br /><br /><b>Emotions and reflection:</b><br />The state of emotional tags can change during the execution of the algorithm, according to a feedback function. It may seems that this is a similar concept of the reward function being used to update of the transition probability between clips, but it is not. 
The reward function is defined externally: <i>it is dependent on the external environment</i>. This function instead, is totally independent of it, and it is a parameter of the algorithm. Emotional tags can be thought as remembered rewards for previous actions. Seen from the purpose they serve, they are more similar to clips than to transition probability.<br /><br />Emotions represent a "higher abstraction" over the "lower level" of transition probability. We will use them to introduce the idea of reflection. When the reflection parameter $R$ is $> 1$, the agent keeps computing the output, but in such a way that action clips are detached from their actual execution in the environment. This simulation is repeated until a certain condition on the emotion tags of the selected path has been satisfied, or it is stopped after $R -1 \in \mathbb{N}$ round of simulation into episodic memory. With a name that perfectly aid the intuition, $R$ is called reflection time for the agent. If the agent is unable to select a satisfying action, it executes in the environment one of the previously selected actions.<br /><br /><br />Emotion's space can be as complex as needed, but in the example, I found on literature it is limited to the funny space of $\{$ ☺ ,☹ $\}$.<br /><br /><br />I stress that during reflection, action clips are only virtually excited, and they do not trigger any real action. It is thanks to this capability of the system that during the period of reflection an agent can project itself into "conceivable future situations", before triggering the real actuator, so to "think" their possible outcome. 
We could say that the emotions attached to the actions represent the state of belief of the agent for the right action given a specific percept.<br /><h4> </h4><h4>Afterglowing</h4>If the rewards are delayed (which is often the case in real world application), one can use afterglowing (<a href="https://en.wikipedia.org/wiki/Afterglow_%28drug_culture%29">lol</a>) : a technique for distributing rewards on recently used clips or edges. This is achieved by tagging each edge with a glowing factor $g$, whose value is set to $1$ each time it is used, and to $g^{t+1}(c_i, c_j) = g^t(c_i, c_j)(1- \eta)$ otherwise. Clip glowing (assign values to clips instead of edges) gives slightly different results on complex clip networks. The assumption behind afterglowing is that action in the past contribute less than recent actions for rewards we get (i.e., compare the importance for the victory between the first and the last move in a chess game). For more information take a look at [2].<br /><br /><br /><h3>PS without clip composition</h3><br />This section is used just to describe how PS works in its basic configuration, using only the tools described so far. The advanced features of the ECM memory are explained in the next sections. Here we assume that percept clips are directly connected to action clips, without any middle layer of clips.<br /><br /><br />Ladies and gentlemen let me introduce you to Projective Simulation algorithm!<br /><br /><ol><li> The input of the algorithm is a percept. A percept stimulates a percept clip.</li><li> The excitation of the percept clip start a stochastic process: a random walk on the network of clips. The initial clip of the walk is the the percept clip excited by the pecepts. The hopping probability at time $t$ is $p^{t}(s,a)$ and is initially uniform among all the clips. The random walk terminates once an action clip is reached. 
If the reflection parameter $R$ is set ($>1$), and if the emotion attached to that action is negative, then it engages reflection, and starts computing further random walks (i.e. samples from the random variable of the stationary probability distribution). This process is repeated at most $R-1$ times: until a boolean function $f : E \to \{true, false \}$ is true (otherwise the last action is taken). Weights for the calculation of the hopping probability can be defined by the following policies:</li><ul><li> <b>standard function</b> $$p^{(t)}(c_i, c_j) = \frac{h^{(t)}(c_i, c_j)}{\sum_k h^{(t)}(c_i, c_k)} $$</li><li> <b>softmax function</b> $$p^{(t)}(c_i, c_j) = \frac{e^{h^{(t)}(c_i, c_j)}}{\sum_k e^{h^{(t)}(c_i, c_k)}} $$ (i.e. it gives higher probability to stronger edges - enhancing the exploitation side of the exploration/exploitation paradigm)</li><li> $\epsilon$-<b>Greedy algorithm</b>, as in <a href="https://junedmunshi.wordpress.com/2012/03/30/how-to-implement-epsilon-greedy-strategy-policy/">classical RL</a>.</li></ul><li> The selected action is executed in the real world and the eventual reward is collected and taken into account. As we do in ML, while updating our transition probability, we should model the act of forgetting (i.e. giving less weight to lessons learned in the past and letting our agent learn from newer experience). To do that, we use the forgetting factor $\gamma$. Forgetting and hopping probability update (for rewarded and penalized edges) can be compressed in a single formula: $$h^{t+1}(s,a) - h^{t}(s,a) = - \gamma[h^{t}(s,a) - 1] + \lambda \delta(s,s^t)\delta(a,a^t) $$ where $\delta$ is the Kronecker delta, and $\lambda = \Lambda^t(s^t,a^t)$. The emotion tag associated with that action is changed according to the reward received. If the reward is positive, the emotion is set to ☺, otherwise it is set to ☹. 
Note that this is just a toy case, and the emotion space can be way more complex.</li></ol><br />Note that with our formula the entries of the matrix $h(c_j,c_i)$ are always greater than 1. These steps are iterated on and on, mimicking the continuous interaction of an agent with its environment.<br /><br /><br />Basically, PS is a continuous interaction with the environment, where each step comprises a call to the memory of an agent. Each call is a sample over the stationary distribution of an irreducible, aperiodic, and irreversible Markov Chain over the clip space. The states of the Markov Chain may evolve over time, according to the feedback received from the environment.<br /><br /><h3>Invasion Game</h3><br />To have a taste of PS in action, we will focus on some variations of a game called Invasion Game. It's not the aim of this post to dig into the applications of PS, but the study of this game really helps intuition. <br /><br /><br />Imagine you have two robots facing each other across a fence with holes.<br /><br /><br /> v <----attacker<br /><br />--- --- --- --- ---<br /><br /> ^ <---- defender <br /><br /><br />The attacker has to cross the fence through one of the holes, while the defender has to block the attack by moving to the hole the attacker wants to cross, preventing him from crossing. At each round, before the attack, the attacker shows a symbol which is consistent with the direction he wants to go. <br /><br /><br />The defender, who has no prior information on the meaning of the symbol, has to learn, while playing, the meaning of the symbols and block the right hole. If the defender is able to block one attempt, it gets a reward of $1$; otherwise, it gets $0$. After the rewards have been collected by the agent, the battle starts again, with the two robots facing each other. That's exactly what has been done in [1], where the defender was programmed with a PS algorithm. 
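The basic PS loop described above can be sketched in a few lines of Python. This is my own minimal sketch of the two-layer case (standard hopping-probability function plus the forgetting update; reflection, emotions, and glow are omitted), using the Invasion Game defender as a toy environment:

```python
import random

random.seed(0)  # seed only for reproducibility of this sketch

class BasicPSAgent:
    """Two-layer PS: percept clips directly connected to action clips."""

    def __init__(self, percepts, actions, gamma=0.0):
        self.actions = list(actions)
        self.gamma = gamma  # forgetting factor
        # h-matrix: all weights start at 1, so hopping probabilities start uniform
        self.h = {(s, a): 1.0 for s in percepts for a in self.actions}

    def probabilities(self, s):
        # standard function: p(s, a) = h(s, a) / sum_k h(s, k)
        total = sum(self.h[(s, a)] for a in self.actions)
        return [self.h[(s, a)] / total for a in self.actions]

    def act(self, s):
        # one call to the memory = one sample from the hopping distribution
        return random.choices(self.actions, weights=self.probabilities(s))[0]

    def learn(self, s, a, reward):
        # forgetting damps every weight towards 1; the used edge gets the reward
        for edge in self.h:
            self.h[edge] -= self.gamma * (self.h[edge] - 1.0)
        self.h[(s, a)] += reward

# Invasion Game, sketched: the rewarded action simply mirrors the shown symbol.
agent = BasicPSAgent(percepts=["left", "right"], actions=["-", "+"], gamma=0.01)
rewarded = {"left": "-", "right": "+"}
for _ in range(1000):
    s = random.choice(["left", "right"])
    a = agent.act(s)
    agent.learn(s, a, reward=1.0 if a == rewarded[s] else 0.0)

# After training, the defender blocks the correct hole most of the time.
print(agent.probabilities("left"), agent.probabilities("right"))
```

With $\gamma > 0$ the weights are continuously damped towards 1, so if the meaning of the symbols is swapped mid-game the agent gradually un-learns and re-learns, which is the qualitative behavior discussed below.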
<br /><br /><br />Let's make the game formally fit our model:<br /><br /> percepts: symbols shown by attacker: $ S \equiv S_1 = \{ \rightarrow, \leftarrow \}$ (attack right, attack left)<br /> action space: $A \equiv A_1= \{ +, - \} $ (move right, move left)<br /> emotions space: $\{$ ☺ ,☹ $\}$<br /><br />To measure how good is the algorithm in playing this game, we use the blocking efficiency, it shows us what is the expected reward for our agent at time $t$. Since we are dealing with non-deterministic processes we expect to be something averaged, and we must take into account the non-determinism of (a) the percept we receive from the external world which may change over time and (b) the conditional probability of choosing the right action given a specific percept at a specific time. This whole concept is nicely subsumed into this formula:<br /><br /><br />$$r^t = \sum_{s \in S} P^t(s)P^t(a^*_s | s) $$<br /><br /><br />Where $P^t(s)$ is the probability of receiving the percept $s$ at time $t$ and $P^t(a^*_s | s)$ is the probability at time $t$ of selecting the rewarded action ($a^*$) given the percept $s$ has been received.<br /><br /><br />For the following graphs, blocking efficiency has been empirically calculated, averaging the results among 1000 matches (i.e. you will see fluctuations in the curves).<br /><br /><br /><h3>Basic version of PS in practice</h3><br />Imagine to train our agent to play Invasion Game, setting initially the reflection parameter at $R=1$, with percept clips directly connected to action clips. Needless to say, PS can play Invasion Game as many other RL algorithms. Here we are no interested in the learning capabilities of a PS agent, but instead, we will test the behavior of the defender in specific "exceptional cases". 
We will try to mess up the environment in two different ways, and see how our agent react.<br /><br /><br />The simplest mischief we can do to our robot is this: wait until it has learned to defend with high efficiency, and then change the meaning of the direction of the arrows (i.e.suddenly $\leftarrow$ means right and $\rightarrow$ means left). What happens to the blocking efficiency of our robot?<br /><br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-GTUp7X2eM8I/V02QEYwqUII/AAAAAAAACMU/U7hDSmgwiUQcIOVSElR5IlOc4bPqe5dTACLcB/s1600/blockingEfficiency.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="261" src="https://2.bp.blogspot.com/-GTUp7X2eM8I/V02QEYwqUII/AAAAAAAACMU/U7hDSmgwiUQcIOVSElR5IlOc4bPqe5dTACLcB/s320/blockingEfficiency.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 5 from [1] - Blocking efficiency of the agent. The meaning of the arrows changed at time $t=250$.</td></tr></tbody></table><br />Here we clearly see that, when we change ($t=250$), each second curve is less steep than the first. That's because the agent has to forget what he has learned first, and then re-learn everything from scratch. This happens for 3 different $\gamma$ (the forgetting factor). Given the Bayesian structure of the algorithm itself, this is something we should expect: also other RL algorithm behave similarly.<br /><br /><br />Given this configuration of the game, there is another mischief we can do to our defender robot: expanding its percept space with edges of another color ( keeping the same direction of the arrows ). This would update our percept space to $S= S_1 \times S_2 = \{ \rightarrow, \leftarrow \} \times \{red, blue \}$. 
How does the efficiency of our agent evolve?<br /><ul></ul><h3></h3><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-w1qz7x5DFQI/V02QcJB9jNI/AAAAAAAACMY/Vgk-qvOsQ-0UT5Fy2-DabRTxZfIAqhV7ACKgB/s1600/twosymbols.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="249" src="https://3.bp.blogspot.com/-w1qz7x5DFQI/V02QcJB9jNI/AAAAAAAACMY/Vgk-qvOsQ-0UT5Fy2-DabRTxZfIAqhV7ACKgB/s320/twosymbols.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 6 from [1] - Blocking efficiency with arrows introducing arrows of different color at time $t=200$.</td></tr></tbody></table><br /><br />Ah ah. For an agent totally unaware of the semantic of an arrow, we would expect to re-learn the correct behavior from scratch, and that's exactly what we see from the two curves. It is the "proof" that from the robot's perspective, we are using totally different symbols.<br /><br />And what now if we turn reflection on? 
Given the same measure of efficiency we can plot results for $R=1$ and $R=2$ (without symbol modification):<br /><h3><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-3JbY2Er1ahI/V02QtnGhc4I/AAAAAAAACMg/ro5mSOhXEpYvMX4l78Za9IudEHFDhDeOgCLcB/s1600/reflection.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="253" src="https://2.bp.blogspot.com/-3JbY2Er1ahI/V02QtnGhc4I/AAAAAAAACMg/ro5mSOhXEpYvMX4l78Za9IudEHFDhDeOgCLcB/s320/reflection.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 7 from [1] - Blocking efficiency with 2 different Reflection's value.</td></tr></tbody></table></h3>Reflection has a concrete impact on the learning phase of an agent. And this is by itself a nice result. The structure of EC memory installed in the agent is<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-f2LMJWra3g4/V12B7sM_z9I/AAAAAAAACNk/o1hM8JecVeop39rqDFaMFyU6_KNUsqNJgCLcB/s1600/Fig4.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="216" src="https://3.bp.blogspot.com/-f2LMJWra3g4/V12B7sM_z9I/AAAAAAAACNk/o1hM8JecVeop39rqDFaMFyU6_KNUsqNJgCLcB/s320/Fig4.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 4 from [1] - The EC memory of the defending agent in Invasion Game.</td></tr></tbody></table><br /><br />We have just started exploring the feature of PS. Wouldn't we love if an "intelligent" agent can find some sort of similarities between the two arrows pointing in the similar direction, but with different colors? Yes! 
After all, isn't this our definition of creativity?<br /><br />So be surprised then, because this behavior is exactly what we get using two advanced feature of the ECM. Starting to be impressed? You should... :v<br /><br /><h3>Learning how to learn<b><span style="font-weight: normal;"> </span></b></h3>Mimicking what happen in biological brains, we can enhance our model of memory by adding new percept clips during the lifetime of an agent. These new percept clips can be connected at the same time to action clip (as before) but also to other percept clips. Practically this means that the walk on the graph could start from percept with other incoming edges from other percept clips.<br /><br /><br />The steps of the algorithm are similar to the previous case, but now we have to update the transition probability of all the edges in the sequence of clips which lead from the percept to the action. Without digging into the details (that you can find in [1][12]), these are the modified steps of the algorithm:<br /><br /> The input of the algorithm is a percept. A percept stimulates a percept clip.<br /> A random walk from the percept clip $s$ to an action clip $a$ select a sequence of memory clips $\Gamma$. The length of the sequence is called deliberation length. The length $D$ of the chain $(s, s^D)$ is called "deliberation length", and it is roughly "how long does it take to think". The probability for the walk to go from clip $c$ to clip $c'$ is proportional to the ratio between the weight of the edge from clip $c$ to clip $c'$ and the sum of all the weights of the edges starting from $c$. If $(s,a)$ is rewarded sufficiently, and if its emotion is suitable (i.e. 😄), the action is executed. Otherwise, this step is iterated to most other $R-1$ times, and then a random action is taken. 
<br /> If the action is rewarded, the edges in the associative memory from $s$ to $a$ are also rewarded, by a factor $K$ called the growth rate of associative connections (direct connections of percept clips with action clips are increased by $1$, as usual). The forgetting factor works as usual, with the requirement that the weights of compositional sequences of clips (i.e. not direct sequences from percept clip to action clip) are damped towards $K$.<br /><br /><br />Let's play Invasion Game again. This time, we will repeat the experiment of changing the color of the arrows shown by the attacker. We will have a slightly different EC memory, depicted in the image below.<br /><br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-kZUWVWPiseY/V1iKhC3YFVI/AAAAAAAACNM/DTRHwv84r605bt2Ls9BCtI_3tTI4nYaWwCLcB/s1600/Fig12.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://4.bp.blogspot.com/-kZUWVWPiseY/V1iKhC3YFVI/AAAAAAAACNM/DTRHwv84r605bt2Ls9BCtI_3tTI4nYaWwCLcB/s320/Fig12.png" width="282" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 12 From [1] - Evolution of the ECM while learning</td></tr></tbody></table>In the picture, (a) is the initial state of the memory, (b) is the memory after it has been trained with red arrows (please note the slightly thicker arc from the blue arrows to the red ones: its weight is $K$), and (c) is the associative memory in action.<br /><br /><br />This is an example of how a newly excited percept clip (blue arrow) can excite another clip in episodic memory (red arrow), from which strong links to specific action clips had been built up by previous experience. This capability can be used to speed up the learning of an agent, as we see in the graph below.
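The walk through the clip network described in the steps above can be sketched in a few lines of Python. This is a minimal sketch under my own naming (`h`, `step`, `random_walk` are illustrative names, not taken from [1] or the Java code in [3]); the network is a dict mapping each clip to the weights of its outgoing edges:

```python
import random

def step(h, c):
    """Hop from clip c to a neighbor, with probability proportional to edge weight."""
    neighbors = list(h[c].keys())
    weights = [h[c][n] for n in neighbors]
    return random.choices(neighbors, weights=weights, k=1)[0]

def random_walk(h, percept_clip, action_clips, max_len=100):
    """Walk from a percept clip until an action clip is reached.
    Returns the traversed sequence of clips; its length is the deliberation length."""
    path = [percept_clip]
    while path[-1] not in action_clips:
        path.append(step(h, path[-1]))
        if len(path) > max_len:  # safety cap, not part of the PS model
            break
    return path
```

On a chain with a single outgoing edge per clip the walk is deterministic, which makes the deliberation length easy to see.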
The agent has been trained until $t=200$ to play with red arrows, and then the attacker switches to blue arrows.<span style="font-weight: normal;"><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-D_v-zebrNCg/V1hzkCmBF2I/AAAAAAAACM8/h9iAp3kc3rotPUNiODj8OfQMbNSoNM7PwCLcB/s1600/Fig11.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="252" src="https://4.bp.blogspot.com/-D_v-zebrNCg/V1hzkCmBF2I/AAAAAAAACM8/h9iAp3kc3rotPUNiODj8OfQMbNSoNM7PwCLcB/s320/Fig11.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 11 From [1] - The bigger the K, the steeper the learning curve.</td></tr></tbody></table></span><br /><br />This matches our previous definition of intelligence: we would expect an intelligent agent to learn how to play faster with the new symbols, since their "semantics" (in terms of rewards) is already known for similar symbols. Our agent, in fact, already knows how to play! That's exactly what happens using EC memory with a higher deliberation length.<br /><br />This structure resembles an associative <a href="https://en.wikipedia.org/wiki/Associative_memory_%28psychology%29">memory</a>, and this is an example of associative <a href="https://en.wikipedia.org/wiki/Learning#Associative_learning">learning</a>.<br /><i></i> <br /><h3>Combining actions </h3><br />We can generalize our model by creating new clips composed from pre-existing ones. For instance, we can compose together two action clips.
There are a few requirements for merging, though:<br /><br /><ol><li> Both action clips should be sufficiently rewarded for the same percept: there is a threshold level of reward for two actions to be considered sufficiently rewarded.</li><li> The actions are similar, e.g. the action vectors only differ in two components, and are semantically compatible (i.e. you cannot combine "jump" and "stand still" in your agent, or "go right" with "go left").</li><li> The newly composed action clip does not already exist.</li></ol><br />This feature is what allows us to do action clip composition, which we will see applied in a 3D version of Invasion Game.<br /><br /><br />Now the attacker has to cross an imaginary grid-like plane, and the defender can move over the plane, so as to block the attacker.<br /><br /><br />Our percept space is: $S \equiv S_1 = \{ \rightarrow, \leftarrow, \nearrow, \searrow, \nwarrow, \swarrow, \uparrow, \downarrow \}$<br /><br />Our action space is $A \equiv A_1 \times A_2 = \{ (+, 0), (-, 0) \} \times \{ (0, +), (0, -) \} $, where each component of the tuple fixes an axis, and the sign indicates the direction along it.<br /><br /><br />Beware, we will now give a partial reward to our agent if it matches half of the attacker's direction. For instance, if the attacker decides to move diagonally $\nearrow$, the defender gets a partial reward if it chooses either up or right.<br /><br /><br />This time [1], the agent has first been trained only with attacks along the axes. When it is presented with attacks in the $\nearrow$ direction, it will soon realize that there are two action clips equally rewarded for that percept, which are semantically compatible. So the agent might merge the two actions into one, by creating a new action clip which activates the two actions in the real world simultaneously, and see what happens.<br /><br /><br />This is what happens in its brain in terms of clips.
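The three merging requirements above can be encoded as a small check. A hedged Python sketch, assuming actions are represented as integer tuples (e.g. $(+,0) \mapsto (1,0)$, $(0,+) \mapsto (0,1)$); the names `try_merge`, `rewards`, and `compatible` are mine, not from [1] or [3]:

```python
def try_merge(a1, a2, rewards, threshold, existing_clips, compatible):
    """Return the composed action clip if a1 and a2 satisfy the three
    merging requirements for the current percept, else None.
    `rewards` maps action -> accumulated reward for that percept;
    `compatible` is a predicate encoding semantic compatibility."""
    # 1. both sufficiently rewarded for the same percept
    well_rewarded = rewards[a1] >= threshold and rewards[a2] >= threshold
    # 2. similar (differ in exactly two components) and semantically compatible
    similar = sum(x != y for x, y in zip(a1, a2)) == 2
    if not (well_rewarded and similar and compatible(a1, a2)):
        return None
    # compose: take the non-zero component from either action
    merged = tuple(x if x != 0 else y for x, y in zip(a1, a2))
    # 3. the composed clip must not already exist
    return None if merged in existing_clips else merged
```

With this encoding, merging "move right" $(1,0)$ and "move up" $(0,1)$ yields the diagonal move $(1,1)$.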
The thicker the edge, the higher the transition probability.<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="http://www.nature.com/article-assets/npg/srep/2012/120515/srep00400/images_hires/w926/srep00400-f16.jpg" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://www.nature.com/article-assets/npg/srep/2012/120515/srep00400/images_hires/w926/srep00400-f16.jpg" height="181" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 16 from [1] - Example of action clip composition where our agent learns to move diagonally.</td></tr></tbody></table><br />Needless to say, our new favorite ML algorithm does not let us down, showing how the defender can discover new "behaviors" that were not previously given.<br /><br /><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody><tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-E0uegmaUYrg/V12CsWeeveI/AAAAAAAACNw/VX-um2I6UNQj8nEQ40CtP9it-AykRDPMwCLcB/s1600/Fig17.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="220" src="https://3.bp.blogspot.com/-E0uegmaUYrg/V12CsWeeveI/AAAAAAAACNw/VX-um2I6UNQj8nEQ40CtP9it-AykRDPMwCLcB/s320/Fig17.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 17 from [1] - Agent's performance for various threshold levels of association.</td></tr></tbody></table><br />In the graph above, the agent has been trained for $t<0$ (not shown) with only attacks along the axes, while for $t>0$ the agent is faced with only attacks along the $\nearrow$ direction.
As you can see, the partial reward is set to $1$, while the total reward for a diagonal move is set to $4$.<br /><h3> </h3><h3>Comparison with Reinforcement Learning algorithms</h3><h3></h3>PS has been compared to another reinforcement learning algorithm in a recent thesis [2]. The comparison was done on real implementations of games and other tasks; Java code can be found in [3]. Moreover, since the interfaces of the PS and RL algorithms are very similar, a comparison of their respective computational complexities was possible [2]. Both algorithms, in fact, expose a function $getAction$ - the function that gives you an action given an input - and a function $getReward$ - which distributes rewards in the model.<br /><ul><li>The complexity of $getAction$ in both algorithms is $O(|A|)$, but goes to $O(|A|*R)$ if glowing or damping are enabled.</li><li>$getAction$ can be implemented using the same selection functions ($\epsilon$-Greedy, SoftMax, and the plain probability function).</li><li>For some games, the use of emotions has been useful.</li><li>Glowing can be compared to eligibility traces, the RL concept for distributing delayed rewards over previous actions.</li><li>$getReward$ is more efficient in the RL case.</li><li>In a well-defined world, RL algorithms are guaranteed to find an optimal policy [2], but PS seems to be more suitable in more complex environments, where the rules are unknown and subject to change.</li></ul><br />There are two algorithms in the literature that resemble PS: experience replay and dyna-style algorithms. These are "conceptually" similar, but with totally different structures, assumptions, complexities, and features.<br /><h3> </h3><h3>The quantum world</h3>The only way for an agent to exploit quantum mechanics to interact with a classical environment is to use quantum mechanics for its internal state representation. To do that, we translate clips into vectors in a Hilbert space.
Clips are composite structures of remembered or fictitious percepts or actions, and we will capture this composite structure by means of a tensor product of Hilbert spaces. The steps for running PS on a quantum machine are these:<br /><br /><ul><li>A classical percept $s \in S$ is translated into a state in the quantum memory of the agent. This state is described by the density operator $\rho(s)$.</li><li>The physical system will evolve according to a Hamiltonian (which we will specify later).</li><li>A quantum measurement on this system will lead us to a specific action.</li></ul>We define the probability of transition from one clip to another by Born's rule: nonorthogonal vectors/clips are connected by a probability $0 \leq p= \left| \langle c_i|c_j \rangle \right|^2 \leq 1$ that the jump of the excitation during the walk may occur. That's the natural choice for embedding probability in a quantum world, giving rise to our beloved quantum interference. As you may imagine, the initial state could be a superposition of multiple initial states, and this might lead to a speedup.<br /><br /><br />Now the tough part: how do we translate into the quantum physical world a conceptual thing such as a walk on a graph? The Hamiltonian for such an operation is pretty complex and has been found here.<br /><br /><br />$$ H = \sum_{ \{ j,k \} \in E} \lambda_{jk}(\hat{c}_k^{\dagger}\hat{c}_{j} + \hat{c}_k \hat{c}^{\dagger}_j ) + \sum_{j \in V} \epsilon \hat{c}_j^{\dagger}\hat{c}_j $$<br /><br /><br />If you are curious about the operators $\hat{c}_j$ and $\hat{c}_j^{\dagger}$, take a look at my previous blog post here: those are exactly the same operators used to probabilistically move an excitation along a chain of qubits.<br /><br /><br />However, there's a problem with this definition: our Hamiltonian describes the evolution of a reversible system, which means that the transition probability between two states is symmetric.
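Born's rule above is easy to check numerically. A minimal NumPy sketch (the function name is mine), assuming normalized state vectors for the clips:

```python
import numpy as np

def born_probability(ci, cj):
    """Transition probability between two clip states via Born's rule:
    p = |<c_i|c_j>|^2, for normalized state vectors ci, cj."""
    return abs(np.vdot(ci, cj)) ** 2  # vdot conjugates the first argument
```

Orthogonal clips get probability $0$, identical clips get $1$, and an equal superposition overlaps each basis clip with probability $1/2$, which is where interference between walk paths can enter.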
While this quantum environment is perfect for undirected graphs, we have a hard time modeling directed graphs [1]. We thus have to model a reversible, irreducible, and aperiodic Markov chain.<br /><br /><br />A quantum speedup has been shown in [11], where the reflection step is computed via Grover's algorithm. But this is a whole new story, worth a blog post of its own, which I will be glad to write.<br /><br /><br /><br />Moreover, recently a broader class of graphs for which there is an exponential speedup of the hitting time of the Markov chain has been found. That's cool, since the core of our algorithm is a random walk on a graph-like structure, and the hitting time (in our case) is the average time for the excitation to go from the first percept clip to an action clip. To date, we know that discrete quantum walks on hypercubes are exponentially faster on quantum computers, and this class of graphs has been extended to <a href="https://en.wikipedia.org/wiki/Graph_embedding">embedded</a> hypercubes on certain graphs [12].<br /><br /><br /><h3></h3><h3>Parameters</h3><span style="font-weight: normal;">Straight from [2], we have: </span><br /><br /><table border="1" class="table table-bordered table-hover table-condensed"><tbody><tr><td>Parameter</td><td>Range</td><td>Field</td><td>Default</td><td>Explanation</td></tr><tr><td>Damping</td><td>$0 \leq \gamma \leq 1$</td><td>$\mathbb{R}$</td><td>0 or $\frac{1}{10}$</td><td>Forgetting factor</td></tr><tr><td>Reflection</td><td>$ 1 \leq R $</td><td>$\mathbb{N}$</td><td>1</td><td>Number of reflection cycles</td></tr><tr><td>Glowing</td><td>$ \eta \leq 1$</td><td>$\mathbb{R}$</td><td>1</td><td>Glowing factor for weighting rewards</td></tr><tr><td>Associative growth rate</td><td>$K > 0$</td><td>$\mathbb{R}$</td><td>-</td><td>Growth rate of associative connections of composite paths</td></tr></tbody></table><h3> </h3><h3>Conclusions</h3>PS can be thought of as a generalization of RL.
Updating the transition probabilities between clips can be done by Bayesian updating, which is basically an RL algorithm.<br /><br /><br />The ECM memory is an important part of the PS model. It is what allows an agent to perform reflection, a kind of simulation the agent makes of its actions in the world. Emotions and the ECM are what allow the agent to detach from primary experience and to project itself into conceivable situations (fictitious memory), without taking any real action.<br /><br /><br />A further generalization of the PS scheme can be found in [10].<br /><br /><br />To sum everything up, learning in PS is achieved in three different ways:<br /><br /><ul><li>modifying via Bayesian updating the transition probabilities between the clips of the network (aided by an RL algorithm)</li><li>creating new clips when new percepts are received</li><li>creating new clips from existing ones according to a compositional principle.</li></ul><br /><br />Much of the "magic" of this algorithm is embedded in our definition of clips, but it is something that must be specified case by case. That is because PS agents are defined on a more conceptual level, as agents whose internal states represent episodic and compositional memory and whose deliberation comprises association-driven hops between memory sequences (the clips). [13]<br /><br /><br />I am not yet aware of any implementation of the algorithm in Quipper or LIQUi|>. We could, in fact, write the Hamiltonian of the physical system, give it as input to the GSE algorithm, and run our software in our favorite quantum programming language [5] [6]. GSE is an algorithm for efficiently simulating a physical system (given its Hamiltonian) on a gate-based model of quantum computer.
There is already an implementation of GSE, based on the work of [4].<br /><br /><br />Speaking about simulating physics with computers, I built a Hamiltonian with my own bare hands, and you can play with it here!<br /><br />I would like to close this article with a quote [7] which I believe subsumes the Zeitgeist among many scientists nowadays:<br /><i><br /></i><br /><div style="text-align: center;"><i> If we accept that free will is compatible with physical law, we also have to accept that it must be possible, in principle, to build a machine that would exhibit similar forms of freedom as the one we usually ascribe to humans and certain animals</i></div><br /><br />I do not claim copyright for any of the pictures in this post. They all belong to the authors of [1].<br /><br /><br />Heads up: I would appreciate it if you sent me an email whenever you find a mistake. Feedback is always welcome, I want to improve! This post may get updated over time as I learn new things or become unsatisfied with my previous explanations.<br /><br /><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/HDRXG125PmY/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/HDRXG125PmY?feature=player_embedded" width="320"></iframe><br /><br />Sybreed - God is an Automaton (piano cover by a random guy on YouTube)<br /><h3></h3><h3><b><span style="font-weight: normal;">References</span></b></h3><br /><span style="font-weight: normal;">[1] Briegel, Hans J., and Gemma De las Cuevas. "Projective simulation for artificial intelligence." Scientific reports 2 (2012). - <a href="http://www.nature.com/articles/srep00400" target="_blank">http://www.nature.com/articles/srep00400 </a></span><br /><span style="font-weight: normal;">[2] Bjerland, Øystein Førsund.
"Projective Simulation compared to reinforcement learning" - <a href="http://bora.uib.no/bitstream/handle/1956/10391/135269761.pdf?sequence=1&isAllowed=y" target="_blank">http://bora.uib.no/bitstream/handle/1956/10391/135269761.pdf?sequence=1&isAllowed=y </a></span><br />[3] Java Implementation of PS algorithm by Bjerland, Øystein Førsund - <a href="https://bitbucket.org/mroystein/projectivesimulation">https://bitbucket.org/mroystein/projectivesimulation</a><br />[4] Simulation of Electronic Structure Hamiltonians Using Quantum Computers<br />James D. Whitfield, Jacob Biamonte, Alán Aspuru-Guzik: <a href="http://arxiv.org/abs/1001.3855">http://arxiv.org/abs/1001.3855</a><br />[5] Green, Alexander S., et al. "An introduction to quantum programming in Quipper." Reversible Computation. Springer Berlin Heidelberg, 2013. 110-124. <a href="http://arxiv.org/pdf/1304.5485v1.pdf">http://arxiv.org/pdf/1304.5485v1.pdf</a> <br />[6] Liquid User Manual : <a href="https://msr-quarc.github.io/Liquid/LIQUiD.pdf">https://msr-quarc.github.io/Liquid/LIQUiD.pdf</a><br />[7] Briegel, Hans J. "On creative machines and the physical origins of freedom." Scientific reports 2 (2012). - <a href="http://www.nature.com/articles/srep00522" target="_blank">http://www.nature.com/articles/srep00522 </a><br />[8] Kandel, Eric R. "The molecular biology of memory storage: a dialogue between genes and synapses." Science 294.5544 (2001): 1030-1038. <a href="http://www.ncbi.nlm.nih.gov/pubmed/11691980" target="_blank">http://www.ncbi.nlm.nih.gov/pubmed/11691980</a><br /><span style="color: red;"><span style="color: #444444;">[9] Antonov, Igor, et al. "Activity-dependent presynaptic facilitation and Hebbian LTP are both required and interact during classical conditioning in Aplysia." Neuron 37.1 (2003): 135-147</span>. </span><br />[10] <a href="http://arxiv.org/pdf/1504.02247v1.pdf" target="_blank">Projective simulation with generalization</a>: Alexey A. Melnikov, Adi Makmal, Vedran Dunjko, and Hans J. 
Briegel.<br />[11] Paparo, Giuseppe Davide, et al. "Quantum speedup for active learning agents." Physical Review X 4.3 (2014): 031002. <a href="http://journals.aps.org/prx/abstract/10.1103/PhysRevX.4.031002" target="_blank">http://journals.aps.org/prx/abstract/10.1103/PhysRevX.4.031002</a><br />[12] Makmal, Adi, et al. "Quantum walks on embedded hypercubes." Physical Review A 90.2 (2014): 022314. <a href="https://arxiv.org/pdf/1309.5253.pdf">https://arxiv.org/pdf/1309.5253.pdf</a><br />[13] Hines, Andrew P., and P. C. E. Stamp. "Quantum walks, quantum gates, and quantum computers." Physical Review A 75.6 (2007): 062321. <a href="http://arxiv.org/abs/quant-ph/0701088">http://arxiv.org/abs/quant-ph/0701088</a><br /><br />In these days, I had the chance to read about a recent algorithm for (quantum) machine learning. Its name is Projective Simulation (PS), proposed by Briegel and De las Cuevas [1] in 2012. PS melts together ideas from neural networks (from which it takes the idea of a graph-like network of objects used to model memory) and reinforcement learning (from which it takes the idea of training the algorithm using rewards and punishments as a function of the environment). PS connects research areas such as artificial intelligence, quantum information, and quantum computation.<br /><br />Among the various features that make this algorithm interesting, one of the things I like most is that the steps of PS can be executed on a quantum physical system - such as a quantum computer - even gaining computational efficiency over classical computers by using quantum interference [11], or by exploiting the topological structure of certain kinds of graphs [12].<br /><br />PS uses a graph-like representation of the memory of an agent. This representation is so general that it allows us to think of neural networks as a physical implementation of the same kind of memory - called episodic and compositional memory - used by PS.
In fact, both algorithms share the concept that any input received is accompanied by a certain spatiotemporal excitation pattern within the nodes of a network, where similar inputs cause similar excitations. But PS is way more than this...<br /><br /><h3>A digression on freedom and information-processing machines</h3>The PS algorithm was proposed with a broader scope than being a working quantum Machine Learning (ML) algorithm. In fact, PS plays a relevant role in the philosophical debate on the existence of free will. Briegel et al. [7], instead of claiming the presence or the absence of free will in mankind, claim that programmable structures, if (a) sufficiently complex and organized, and (b) capable of developing a specific kind of memory, can exhibit behaviors of creativity and free will. PS, besides competing with other Reinforcement Learning (RL) algorithms, is proposed as a tool: a fundamental information-theoretic concept that can be used to define a notion of freedom (creativity, intelligence) compatible with the laws of physics.<br /><br /><div style="text-align: center;"><i>If we accept that free will is compatible with physical law, we also have to accept that it must be possible, in principle, to build a machine that would exhibit similar forms of freedom as the one we usually ascribe to humans and certain animals.</i> [7]</div><br /><br />Think of this algorithm as the "proof" of the authors' claims regarding the existence of complex systems exhibiting free will. We will see how with PS we can create agents whose behavior can (hopefully, unarguably) be considered by common sense as creative or intelligent.<br /><br />To better understand the design choices behind the algorithm, we will define Intelligence as the capability of an agent to perceive and act on its environment in a way that maximizes its chances of success, and Creativity (a manifestation of intelligence) as the capability of an agent to deal with unprecedented situations, relating a given situation to other conceivable situations [7][1].
Learning is described as a modification of the molecular details of the neural network of an agent due to experience. The definition of learning is purposely similar to what happens in the brains of some animals, as discovered by recent results in neuroscience [8]. For instance, the behavior of this poor Aplysia can be largely described as a stimulus-reflex circuit, where the structure of the circuit changes over time [9].<br /><br />ECM-like memory animal. LOL<br /><br />PS can be used to analyze the emerging behavior of an agent under specific conditions, so as to test its behavior in simple environments, such as very simple games.<br /><br />The authors remark that PS does not try to: explain how the brain works, explain the nature of consciousness, or explain the nature of human freedom.<br /><br />Let's see what PS is about now.<br /><br /><h3>Key concepts in PS</h3>The model of memory used by the intelligent agents in PS is called Episodic and Compositional Memory (ECM). ECM is deliberately similar to notions of memory in other fields of science. Based on the concept of episodic memory (see this article by Tulving and Ingvar), ECM is basically a stochastic network of clips. We define:<br /><br /> percepts: pieces of data that represent information from the outer world. These are the possible inputs of our algorithm. Formally, a percept is a tuple $s = (s_1, s_2, ..., s_N) \in S$, where $S$ is the percept space $S_1 \times ... \times S_N$ and each $s_i = 1, ..., |S_i|$.<br /> actions: actions the agent can perform in the external world. These are represented by tuples in the action (or actuator) space: $a = (a_1, a_2, ..., a_K) \in A = A_1 \times ... \times A_K$, where $a_k = 1, ..., |A_k|$. Imagine $a_1$ as the state of "doing a jump", $a_2$ as the state of "walking", and so on. These are the output of a single call to the EC memory.<br /> clips: the nodes in the network of the EC memory. A clip is meant to represent the fundamental unit of the episodic memory I told you about before.
Percept clips are those clips $c$ that get stimulated by percepts $s$ according to a specific probability distribution $I(c|s)$, while action clips $a$ are clips that, if stimulated, trigger an action. One of the most important features of ECM is the possibility of having remembered or fictitious percepts ⓢ$ := \mu(s) \in \mu(S)$ or actions ⓐ. Fictitious or remembered percepts or actions are stored inside clips, as sequences. Each clip has a length $L \geq 1$, which means that $c$ is composed of $(c^1, c^2, ..., c^L)$, where each $c^i \in \mu(S) \cup \mu(A)$. (In this article we will only use clips with $L=1$.)<br /> edges: objects representing directed arcs between clips (they have a starting clip and a receiving clip), and they contain data useful to the execution of the algorithm, such as weights, glowing tags, and emotion tags.<br /> emotions: pieces of data (called emotion tags) attached to the edges. Tags are represented as tuples $e = (e_1, e_2, ..., e_K)$ in the emotional space $E \equiv E_1 \times ... \times E_K$, with $e_k = 1, ..., |E_k|$.<br /><br />The weight of the edge between clips $c_i$ and $c_j$ at time $t$ is stored in the weight matrix $h^t(c_i, c_j)$. The transition probabilities between clips are directly proportional to the weights of the edges. At the beginning of the execution of the algorithm, every percept clip is directly connected to every action clip. For the first part of our journey into PS, there will be no further connections between clips, and there are no further layers of clips (this will be generalized in the following sections). The reward function is $\Lambda : S \times A \to \mathbb{R}$.<br /><br />The gist of PS is that a percept excites a percept clip; this excitation starts a random walk in the episodic memory, going through a chain of clips and eventually triggering an action clip.
The transition probability between states of this stochastic process is obtained as a function of the weight matrix.<br /><br />Fig 2 from [1] - EC memory: a network of clips in PS<br /><br /><h3>Emotions and reflection</h3>The state of the emotion tags can change during the execution of the algorithm, according to a feedback function. It may seem that this is similar to the reward function being used to update the transition probabilities between clips, but it is not. The reward function is defined externally: it depends on the external environment. This function, instead, is totally independent of it, and it is a parameter of the algorithm. Emotion tags can be thought of as remembered rewards for previous actions. Seen from the purpose they serve, they are more similar to clips than to transition probabilities.<br /><br />Emotions represent a "higher abstraction" over the "lower level" of transition probabilities. We will use them to introduce the idea of reflection. When the reflection parameter $R$ is $> 1$, the agent keeps computing the output, but in such a way that action clips are detached from their actual execution in the environment. This simulation is repeated until a certain condition on the emotion tags of the selected path has been satisfied, or it is stopped after $R - 1 \in \mathbb{N}$ rounds of simulation in episodic memory. With a name that perfectly aids the intuition, $R$ is called the reflection time of the agent. If the agent is unable to select a satisfying action, it executes in the environment one of the previously selected actions.<br /><br />The emotion space can be as complex as needed, but in the examples I found in the literature it is limited to the funny space $\{$ ☺ ,☹ $\}$.<br /><br />I stress that during reflection, action clips are only virtually excited, and they do not trigger any real action.
It is thanks to this capability of the system that, during the period of reflection, an agent can project itself into "conceivable future situations" before triggering the real actuators, so as to "think through" their possible outcomes. We could say that the emotions attached to the actions represent the state of belief of the agent about the right action given a specific percept.<br /><br /><h3>Afterglowing</h3>If the rewards are delayed (which is often the case in real-world applications), one can use afterglowing (lol): a technique for distributing rewards over recently used clips or edges. This is achieved by tagging each edge with a glowing factor $g$, whose value is set to $1$ each time the edge is used, and updated as $g^{t+1}(c_i, c_j) = g^t(c_i, c_j)(1- \eta)$ otherwise. Clip glowing (assigning values to clips instead of edges) gives slightly different results on complex clip networks. The assumption behind afterglowing is that actions in the past contribute less than recent actions to the rewards we get (compare, for instance, the importance for the victory of the first and the last move in a chess game). For more information take a look at [2].<br /><br /><h3>PS without clip composition</h3>This section describes how PS works in its basic configuration, using only the tools described so far. The advanced features of the ECM memory are explained in the next sections. Here we assume that percept clips are directly connected to action clips, without any middle layer of clips.<br /><br />Ladies and gentlemen, let me introduce you to the Projective Simulation algorithm! The input of the algorithm is a percept. A percept stimulates a percept clip. The excitation of the percept clip starts a stochastic process: a random walk on the network of clips. The initial clip of the walk is the percept clip excited by the percept. The hopping probability at time $t$ is $p^{t}(s,a)$ and is initially uniform among all the clips. The random walk terminates once an action clip is reached.
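The afterglow rule above (glow set to $1$ on use, decayed by a factor $(1-\eta)$ otherwise) can be sketched in a few lines of Python; the names `update_glow` and `used_edge` are mine, not from [1] or [2]:

```python
def update_glow(glow, used_edge, eta):
    """Afterglow update: the edge just used gets glow 1,
    every other edge decays as g <- g * (1 - eta)."""
    return {edge: 1.0 if edge == used_edge else g * (1.0 - eta)
            for edge, g in glow.items()}
```

When a delayed reward finally arrives, it can then be weighted by each edge's current glow, so recently used edges absorb more of it than edges used long ago.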
If the reflection parameter $R$ is set ($>1$), and if the emotion attached to that action is negative, then the agent engages reflection and starts computing further random walks (i.e. samples from the random variable of the stationary probability distribution). This process is repeated at most $R-1$ times, until a boolean function $f : E \to \{true, false \}$ is true (otherwise the last action is taken). The hopping probabilities can be computed from the weights according to the following policies:<br /><br /> standard function $$p^{(t)}(c_i, c_j) = \frac{h^{(t)}(c_i, c_j)}{\sum_k h^{(t)}(c_i, c_k)} $$ softmax function $$p^{(t)}(c_i, c_j) = \frac{e^{h^{(t)}(c_i, c_j)}}{\sum_k e^{h^{(t)}(c_i, c_k)}} $$ (i.e. it gives higher probability to stronger edges, enhancing the exploitation side of the exploration/exploitation paradigm) or $\epsilon$-greedy, as in classical RL.<br /><br />The selected action is executed in the real world, and the eventual reward is collected and taken into account. As we do in ML, while updating our transition probabilities we should model the act of forgetting (i.e. giving less weight to lessons learned in the past and letting our agent learn from newer experience). To do that, we use the forgetting factor $\gamma$. Forgetting and the update of the hopping probabilities (for rewarded and penalized edges) can be compressed into a single formula: $$h^{t+1}(s,a) - h^{t}(s,a) = - \gamma\,[h^{t}(s,a) - 1] + \lambda\, \delta(s,s^{t})\,\delta(a,a^{t})$$ where $\delta$ is the Kronecker delta and $\lambda = \Lambda^t(s^t,a^t)$. The emotion tag associated with that action is changed according to the reward received: if the reward is positive, the emotion is set to ☺, otherwise it is set to ☹. Note that this is just a toy case, and the emotion space can be way more complex. Note also that with this formula the entries of the matrix $h(c_i,c_j)$ never fall below $1$.
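The two hopping policies and the single-edge update formula above can be sketched in Python. This is an illustration under my own naming (`hopping_probs`, `update_weight`), not the implementation from [3]:

```python
import math

def hopping_probs(h, c, policy="standard"):
    """Hopping probabilities out of clip c: standard (weight-proportional)
    or softmax over the outgoing edge weights."""
    ws = h[c]
    if policy == "softmax":
        exps = {cj: math.exp(w) for cj, w in ws.items()}
        z = sum(exps.values())
        return {cj: e / z for cj, e in exps.items()}
    z = sum(ws.values())
    return {cj: w / z for cj, w in ws.items()}

def update_weight(h_sa, reward, gamma):
    """One update of a single edge weight h(s, a):
    damp towards 1 by the forgetting factor gamma, then add the reward
    (the reward term is non-zero only for the edge actually used)."""
    return h_sa - gamma * (h_sa - 1.0) + reward
```

Since the damping pulls $h$ towards $1$ and rewards are non-negative, a weight starting at $1$ never drops below $1$, matching the remark above.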
These steps are iterated over and over, mimicking the continuous interaction of an agent with its environment. Basically, PS is a continuous interaction with the environment, where each step comprises a call to the memory of the agent. Each call is a sample from the stationary distribution of an irreducible, aperiodic, and (in general) irreversible Markov chain over the clip space. The states of the Markov chain may evolve over time, according to the feedback received from the environment.

Invasion Game

To have a taste of PS in action, we will focus on some variations of a game called Invasion Game. It is not the aim of this post to dig into the applications of PS, but the study of this game really helps intuition. Imagine you have two robots facing each other across a fence with holes.

    v <---- attacker
    --- --- --- --- ---
    ^ <---- defender

The attacker has to cross the fence through one of the holes, while the defender has to block the attack by moving to the hole the attacker wants to use, preventing him from crossing. At each round, before the attack, the attacker shows a symbol which is consistent with the direction he wants to go. The defender, who has no prior information on the meaning of the symbols, has to learn, while playing, what the symbols mean and block the right hole. If the defender is able to block an attempt, it gets a reward of $1$; otherwise, it gets $0$. After the rewards have been collected by the agent, the battle starts again, with the two robots facing each other. That's exactly what has been done in [1], where the defender was programmed with a PS algorithm.
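Before looking at the game in detail, here is a minimal sketch of the basic deliberation loop just described: percept clip, random walk to an action clip, and (optionally) reflection. The toy clip network, the emotion tags, and all names are illustrative, not from [1].

```python
import random

def random_walk(p, start, action_clips):
    """Hop through the clip network until an action clip is reached."""
    clip = start
    while clip not in action_clips:
        neighbours = list(p[clip])
        weights = [p[clip][n] for n in neighbours]
        clip = random.choices(neighbours, weights=weights)[0]
    return clip

def get_action(p, percept_clip, action_clips, emotion, R=1):
    """Reflection: walk again (up to R-1 extra times) while the emotion
    attached to the sampled action is negative."""
    action = random_walk(p, percept_clip, action_clips)
    for _ in range(R - 1):
        if emotion.get(action) != "negative":
            break
        action = random_walk(p, percept_clip, action_clips)
    return action

# Two actions, uniform hopping probabilities at t=0.
p = {"s0": {"a0": 0.5, "a1": 0.5}}
emotion = {"a0": "negative", "a1": "positive"}
action = get_action(p, "s0", {"a0", "a1"}, emotion, R=3)
print(action)
```

With $R=1$ the first sampled action is always taken; with $R>1$ the negatively tagged action tends to be resampled away.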
Let's make the game formally fit our model:

- percepts: symbols shown by the attacker: $ S \equiv S_1 = \{ \rightarrow, \leftarrow \}$ (attack right, attack left)
- action space: $A \equiv A_1= \{ +, - \} $ (move right, move left)
- emotion space: $\{$ ☺ , ☹ $\}$

To measure how good the algorithm is at playing this game, we use the blocking efficiency: it tells us the expected reward for our agent at time $t$. Since we are dealing with non-deterministic processes, we expect it to be an averaged quantity, and we must take into account the non-determinism of (a) the percept we receive from the external world, which may change over time, and (b) the conditional probability of choosing the right action given a specific percept at a specific time. This whole concept is nicely subsumed into this formula: $$r^t = \sum_{s \in S} P^t(s)P^t(a^*_s | s) $$ where $P^t(s)$ is the probability of receiving the percept $s$ at time $t$, and $P^t(a^*_s | s)$ is the probability at time $t$ of selecting the rewarded action ($a^*_s$) given that the percept $s$ has been received. For the following graphs, the blocking efficiency has been estimated empirically, averaging the results over 1000 matches (which is why you will see fluctuations in the curves).

Basic version of PS in practice

Imagine training our agent to play Invasion Game, initially setting the reflection parameter to $R=1$, with percept clips directly connected to action clips. Needless to say, PS can play Invasion Game just like many other RL algorithms. Here we are not interested in the learning capabilities of a PS agent; instead, we will test the behavior of the defender in specific "exceptional cases". We will try to mess up the environment in two different ways and see how our agent reacts. The simplest mischief we can play on our robot is this: wait until it has learned to defend with high efficiency, and then change the meaning of the arrows (i.e. suddenly $\leftarrow$ means right and $\rightarrow$ means left).
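The blocking-efficiency formula above is just an expectation over percepts; a tiny numerical sketch (the probabilities are made-up numbers, not taken from the experiments in [1]):

```python
# P^t(s): distribution of percepts shown by the attacker at time t.
P_s = {"left": 0.5, "right": 0.5}
# P^t(a*_s | s): probability that the agent picks the rewarded action.
P_correct = {"left": 0.9, "right": 0.7}

# r^t = sum_s P^t(s) * P^t(a*_s | s)
r_t = sum(P_s[s] * P_correct[s] for s in P_s)
print(round(r_t, 3))   # 0.8
```

An agent that blocks the left attack 90% of the time and the right attack 70% of the time has a blocking efficiency of $0.8$.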
What happens to the blocking efficiency of our robot?

Fig 5 from [1] - Blocking efficiency of the agent. The meaning of the arrows changed at time $t=250$.

Here we clearly see that, after the change ($t=250$), each second curve is less steep than the first. That's because the agent first has to forget what it has learned, and then re-learn everything from scratch. This happens for 3 different values of $\gamma$ (the forgetting factor). Given the Bayesian structure of the algorithm itself, this is something we should expect: other RL algorithms behave similarly.

Given this configuration of the game, there is another mischief we can play on our defender robot: expanding its percept space with arrows of another color (keeping the same directions). This updates our percept space to $S= S_1 \times S_2 = \{ \rightarrow, \leftarrow \} \times \{red, blue \}$. How does the efficiency of our agent evolve?

Fig 6 from [1] - Blocking efficiency when arrows of a different color are introduced at time $t=200$.

Aha. For an agent totally unaware of the semantics of an arrow, we would expect it to re-learn the correct behavior from scratch, and that's exactly what we see from the two curves. It is the "proof" that, from the robot's perspective, we are using totally different symbols.

And what if we now turn reflection on? Given the same measure of efficiency, we can plot results for $R=1$ and $R=2$ (without symbol modification):

Fig 7 from [1] - Blocking efficiency for two different values of the reflection parameter.

Reflection has a concrete impact on the learning phase of an agent, and this is by itself a nice result. The structure of the EC memory installed in the agent is shown below.

Fig 4 from [1] - The EC memory of the defending agent in Invasion Game.

We have just started exploring the features of PS. Wouldn't we love it if an "intelligent" agent could find some sort of similarity between two arrows pointing in the same direction but with different colors? Yes!
After all, isn't this our definition of creativity? So be surprised, because this behavior is exactly what we get using two advanced features of the ECM. Starting to be impressed? You should... :v

Learning how to learn

Mimicking what happens in biological brains, we can enhance our model of memory by adding new percept clips during the lifetime of an agent. These new percept clips can be connected to action clips (as before) but also to other percept clips. Practically, this means that the walk on the graph could start from a percept clip with incoming edges from other percept clips. The steps of the algorithm are similar to the previous case, but now we have to update the transition probabilities of all the edges in the sequence of clips which leads from the percept to the action. Without digging into the details (which you can find in [1][12]), these are the modified steps of the algorithm:

- The input of the algorithm is a percept. A percept stimulates a percept clip.
- A random walk from the percept clip $s$ to an action clip $a$ selects a sequence of memory clips $\Gamma$. The length $D$ of the chain $(s, \dots, s^D)$ is called the "deliberation length", and it is roughly "how long it takes to think". The probability for the walk to go from clip $c$ to clip $c'$ is proportional to the ratio between the weight of the edge from $c$ to $c'$ and the sum of the weights of all the edges starting from $c$.
- If $(s,a)$ is rewarded sufficiently, and if its emotion is suitable (i.e. 😄), the action is executed. Otherwise, this step is iterated at most $R-1$ more times, and then a random action is taken.
- If the action is rewarded, the edges in the associative memory from $s$ to $a$ are also rewarded by a factor $K$, called the growth rate of associative connections (direct connections of percept clips to action clips are increased by $1$, as usual).
The forgetting factor works as usual, with the requirement that the weights of compositional sequences of clips (i.e. not direct sequences from percept clip to action clip) are damped towards $K$.

Let's play Invasion Game again. This time we will repeat the experiment of changing the color of the arrows shown by the attacker. We will have a slightly different EC memory, depicted in the image below.

Fig 12 from [1] - Evolution of the ECM while learning.

In the picture, (a) is the initial state of the memory, (b) is the memory after it has been trained with red arrows (note the slightly thicker arc from the blue arrows to the red ones: its weight is $K$), and (c) is the associative memory in action. This is an example of how a newly excited percept clip (blue arrow) can excite another clip in episodic memory (red arrow), from which strong links to specific action clips had been built up by previous experience. This capability can be used to speed up the learning of an agent, as we see in the graph below. The agent has been trained until $t=200$ to play with red arrows, and then the attacker switches to blue arrows.

Fig 11 from [1] - The bigger the $K$, the steeper the learning curve.

This matches our previous definition of intelligence: we would expect an intelligent agent to learn how to play faster with the new symbols, since their "semantics" (in terms of rewards) is already known for similar symbols. Our agent, in fact, already knows how to play! That's exactly what happens using an EC memory with higher deliberation length. This structure resembles an associative memory, and this is an example of associative learning.

Combining actions

We can generalize our model by creating new clips from compositions of pre-existing ones. For instance, we can compose two action clips together.
There are a few requirements for merging, though:

- Both action clips should be sufficiently rewarded for the same percept: there is a threshold level of reward for two actions to be considered sufficiently rewarded.
- The actions are similar (e.g. the action vectors differ in only two components) and semantically compatible (i.e. you cannot combine "jump" and "stand still" in your agent, or "go right" with "go left").
- The newly composed action clip does not already exist.

This feature is what allows us to do action clip composition, which we will see applied in a 2D version of Invasion Game. Now the attacker has to cross an imaginary grid-like plane, and the defender can move over the plane so as to block the attacker. Our percept space is $S \equiv S_1 = \{ \rightarrow, \leftarrow, \nearrow, \searrow, \nwarrow, \swarrow, \uparrow, \downarrow \}$. Our action space is $A \equiv A_1 \times A_2 = \{ (+, 0), (-, 0) \} \times \{ (0, +), (0, -) \} $, where each component of the tuple fixes an axis, and the sign indicates the direction along it. Beware: we will now give a partial reward to our agent if it matches half of the direction of the attacker. For instance, if the attacker decides to move along the diagonal $\nearrow$, the defender gets a partial reward if it chooses either up or right.

This time [1], the agent has first been trained only with attacks along the axes. When it is presented with attacks along the $\nearrow$ direction, it will soon realize that there are two semantically compatible action clips equally rewarded for that percept. So the agent might merge the two actions into one, by creating a new action clip which activates the two actions in the real world simultaneously, and see what happens. This is what happens in its brain in terms of clips.
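The three merge requirements can be sketched as a boolean test. The threshold value and the `compatible` flag are illustrative placeholders, since "semantic compatibility" has to be specified case by case for each agent.

```python
def can_merge(reward_a, reward_b, compatible, existing_clips, new_clip,
              threshold=1.0):
    """Toy sketch of the merge test for action clip composition."""
    return (reward_a >= threshold          # both actions sufficiently rewarded
            and reward_b >= threshold      # for the same percept
            and compatible                 # e.g. (+,0) and (0,+), not (+,0) and (-,0)
            and new_clip not in existing_clips)   # clip does not already exist

# "move right" (+,0) and "move up" (0,+) compose into the diagonal (+,+):
print(can_merge(1.0, 1.0, True, {("+", "0"), ("0", "+")}, ("+", "+")))  # True
```

Incompatible actions (such as "go right" and "go left") simply fail the test and are never composed.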
The thicker the edge, the larger the transition probability.

Fig 16 from [1] - Example of action clip composition, where our agent learns to move diagonally.

Needless to say, our new favorite ML algorithm does not let us down, showing how the defender can discover new "behaviors" which were not previously given.

Fig 17 from [1] - Agent's performance for various threshold levels of association.

In the graph above, the agent has been trained for $t<0$ (not shown) with only attacks along the axes, while for $t>0$ the agent faces only attacks along the $\nearrow$ direction. As you can see, the partial reward is set to $1$, while the total reward for a diagonal move is set to $4$.

Comparison with Reinforcement Learning algorithms

PS has been compared to another reinforcement learning algorithm in a recent thesis [2]. The comparison was carried out on real implementations of games and other tasks; the Java code can be found in [3]. Moreover, since the interfaces of the PS and RL algorithms are very similar, a comparison of their respective computational complexity was possible [2]. Both algorithms, in fact, expose a function $getAction$ - the function that gives you an action given an input - and a function $getReward$ - which distributes rewards in the model. The complexity of $getAction$ is $O(|A|)$ for both algorithms, but goes up to $O(|A| \cdot R)$ if glowing or damping are enabled. $getAction$ can be implemented using the same selection functions ($\epsilon$-greedy, softmax, and the plain probability function). For some games, the use of emotions has been useful. Glowing can be compared to eligibility traces, the analogous concept in RL for distributing delayed rewards over previous actions. $getReward$ is more efficient in the RL case.
In a well-defined world, RL algorithms are guaranteed to find an optimal policy [2], but PS seems more suitable in complex environments, where the rules are unknown and subject to change. There are two algorithms in the literature that resemble PS: experience replay and the Dyna algorithm. They are "conceptually" similar, but with totally different structures, assumptions, complexity, and features.

The quantum world

The only way for an agent to exploit quantum mechanics while interacting with a classical environment is to use quantum mechanics for its internal state representation. To do that, we translate clips into vectors in a Hilbert space. Clips are composite structures of remembered or fictitious percepts or actions, and we capture this composite structure by means of a tensor product of Hilbert spaces. The steps for running PS on a quantum machine are these:

- A classical percept $s \in S$ is translated into a state in the quantum memory of the agent. This state is described by the density operator $\rho(s)$.
- The physical system evolves according to a Hamiltonian (which we will specify later).
- A quantum measurement on this system leads us to a specific action.

We define the probability of transition from one clip to another by Born's rule: non-orthogonal vectors/clips are connected by a probability $0 \leq p= \left| \langle c_i|c_j \rangle \right|^2 \leq 1$ that the excitation jumps between them during the walk. That's the natural choice for embedding probability in a quantum world, so as to give rise to our beloved quantum interference. As you may imagine, the initial state could be a superposition of multiple initial states, and this might lead to a speedup. Now the tough part: how do we translate into the quantum physical world a conceptual thing such as a walk on a graph?
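As a tiny numerical illustration of the Born-rule hopping probability, take two non-orthogonal clip states. The two-dimensional vectors below are an example I made up, not taken from the PS papers.

```python
import numpy as np

# Two clip states in a toy 2-dimensional Hilbert space.
c_i = np.array([1.0, 0.0])
c_j = np.array([1.0, 1.0]) / np.sqrt(2)   # normalized, non-orthogonal to c_i

# Born's rule: p = |<c_i|c_j>|^2
p = abs(np.vdot(c_i, c_j)) ** 2
print(round(p, 2))   # 0.5
```

The overlap is nonzero, so the excitation can hop between the two clips with probability $1/2$; orthogonal clips would give $p=0$ and no hop.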
The Hamiltonian for such an operation is pretty complex: $$ H = \sum_{ \{ j,k \} \in E} \lambda_{jk}(\hat{c}_k^{\dagger}\hat{c}_{j} + \hat{c}_k \hat{c}^{\dagger}_j ) + \sum_{j \in V} \epsilon \hat{c}_j^{\dagger}\hat{c}_j $$ If you are curious about the operators $\hat{c}_j$ and $\hat{c}_j^{\dagger}$, take a look at my previous blog post: those are exactly the same operators used to probabilistically move an excitation along a chain of qubits.

However, there is a problem with this definition: our Hamiltonian describes the evolution of a reversible system, which means that the transition probability between two states is symmetric. While this quantum setting is perfect for undirected graphs, we have a hard time modeling directed graphs [1]. We thus have to model a reversible, irreducible, and aperiodic Markov chain. A quantum speedup for the reflection step has been shown in [11], using Grover's algorithm - but this is a whole new story, worth a dedicated blog post that I will be glad to write some other time. Moreover, a broader class of graphs for which there is an exponential speedup of the hitting time of the Markov chain has recently been found. That's cool, since the core of our algorithm is a random walk on a graph-like structure, and the hitting time (in our case) is the average time for the excitation to go from the initial percept clip to an action clip.
To date, we know that discrete quantum walks on hypercubes are exponentially faster on quantum computers, and this class of graphs has been extended to hypercubes embedded in certain graphs [12].

Parameters

Straight from [2], we have:

| Parameter | Range | Field | Default | Explanation |
|---|---|---|---|---|
| Damping | $0 \leq \gamma \leq 1$ | $\mathbb{R}$ | 0 or $\frac{1}{10}$ | Forgetting factor |
| Reflection | $1 \leq R$ | $\mathbb{N}$ | 1 | Number of reflection cycles |
| Glowing | $\eta \leq 1$ | $\mathbb{R}$ | 1 | Glowing factor for weighting rewards |
| Associative growth rate | $K > 0$ | $\mathbb{R}$ | - | Growth rate of associative connections of composite paths |

Conclusions

PS can be thought of as a generalization of RL. The job of updating the transition probabilities between edges can be done by Bayesian updating, which is basically an RL algorithm. The ECM memory is an important part of the PS model: it is what allows an agent to do reflection, which is a kind of simulation, made by the agent, of its actions in the world. Emotions and the ECM are what allow the agent to detach from primary experience and to project itself into conceivable situations (fictitious memory), without taking any real action. A further generalization of the PS scheme can be found in [10].

To sum everything up, learning in PS is achieved in three different ways:

- modifying, via Bayesian updating, the transition probabilities between the clips of the network (aided by an RL algorithm)
- creating new clips when new percepts are received
- creating new clips from existing ones according to a compositional principle.

Much of the "magic" of this algorithm is embedded in our definition of clips, but that is something which must be specified case by case. That is because PS agents are defined on a more conceptual level, as agents whose internal states represent episodic and compositional memory and whose deliberation comprises association-driven hops between memory sequences (the clips) [13]. I am not aware of any implementation of the algorithm in Quipper or LIQUi|> yet.
We could, in fact, write down the Hamiltonian of the physical system, give it as input to the GSE algorithm, and run our software in our favorite quantum programming language [5][6]. GSE is an algorithm for efficiently simulating a physical system (given its Hamiltonian) on a gate-based quantum computer. There is already an implementation of GSE, based on the work of [4]. Speaking of simulating physics with computers, I built a Hamiltonian with my own bare hands, and you can play with it here!

I would like to close this article with a quote [7] which I believe subsumes the Zeitgeist among many scientists nowadays:

If we accept that free will is compatible with physical law, we also have to accept that it must be possible, in principle, to build a machine that would exhibit similar forms of freedom as the one we usually ascribe to humans and certain animals

I do not claim copyright for any of the pictures in this post. They all belong to the authors of [1]. Heads up: I would appreciate an email if you find any mistake. Feedback is always welcome, I want to improve! This post may get updated over time, as I learn new things or become unsatisfied with my previous explanations.

Sybreed - God is an Automaton (piano cover by a random guy on youtube)

References

[1] Briegel, Hans J., and Gemma De las Cuevas. "Projective simulation for artificial intelligence." Scientific Reports 2 (2012). http://www.nature.com/articles/srep00400
[2] Bjerland, Øystein Førsund. "Projective Simulation compared to reinforcement learning." http://bora.uib.no/bitstream/handle/1956/10391/135269761.pdf?sequence=1&isAllowed=y
[3] Java implementation of the PS algorithm by Bjerland, Øystein Førsund. https://bitbucket.org/mroystein/projectivesimulation
[4] Whitfield, James D., Jacob Biamonte, and Alán Aspuru-Guzik. "Simulation of Electronic Structure Hamiltonians Using Quantum Computers." http://arxiv.org/abs/1001.3855
[5] Green, Alexander S., et al.
"An introduction to quantum programming in Quipper." Reversible Computation. Springer Berlin Heidelberg, 2013. 110-124. http://arxiv.org/pdf/1304.5485v1.pdf
[6] Liquid User Manual. https://msr-quarc.github.io/Liquid/LIQUiD.pdf
[7] Briegel, Hans J. "On creative machines and the physical origins of freedom." Scientific Reports 2 (2012). http://www.nature.com/articles/srep00522
[8] Kandel, Eric R. "The molecular biology of memory storage: a dialogue between genes and synapses." Science 294.5544 (2001): 1030-1038. http://www.ncbi.nlm.nih.gov/pubmed/11691980
[9] Antonov, Igor, et al. "Activity-dependent presynaptic facilitation and Hebbian LTP are both required and interact during classical conditioning in Aplysia." Neuron 37.1 (2003): 135-147.
[10] Melnikov, Alexey A., Adi Makmal, Vedran Dunjko, and Hans J. Briegel. "Projective simulation with generalization."
[11] Paparo, Giuseppe Davide, et al. "Quantum speedup for active learning agents." Physical Review X 4.3 (2014): 031002. http://journals.aps.org/prx/abstract/10.1103/PhysRevX.4.031002
[12] Makmal, Adi, et al. "Quantum walks on embedded hypercubes." Physical Review A 90.2 (2014): 022314. https://arxiv.org/pdf/1309.5253.pdf
[13] Hines, Andrew P., and P. C. E. Stamp. "Quantum walks, quantum gates, and quantum computers." Physical Review A 75.6 (2007): 062321. http://arxiv.org/abs/quant-ph/0701088