
# The Butterfly Effect

### November 2016

In this series, we are examining the semantic errors of a hippocampal simulation of neurointerfaces and the sampling grid approach used to model its unsupervised feature maps. This section covers the linear algebra and calculus behind the sampling grids and how they relate to a variate error in the final system.

### Parametrized Sampling Grid

A sampling grid, neuroanatomically a receptive network, will be parameterized to allow the mutation of various neurobiological parameters, such as dopamine, oxytocin, or adrenaline, and to produce a synthetic, reactionary response in the neurointerface stack. A modulation of the initial sampling grid will be used to classify the transformations to their respective locations in the comprehensive memory field. To perform the spatial transform of the normalized input feature map, a sampler must sample a set of parameters from ${ \tau }_{ \theta }({ G }_{ i })$, where $G$ represents a static translational grid of the applied transforms. The input feature map $U$, the raw equivalent of the receptive fields, and its primed resultant under the function ${ f }_{ loc }(x) = V$ will be accounted for in the translational grid as well. Each coordinate in $G$ is represented as ${ \left( { x }_{ j }^{ s },{ y }_{ j }^{ s } \right) }_{ j }$, giving a gradient dimensionality $j$ to the spatial grid input. A gradient dimensionality allows the sparse network to have an unbounded number of spatial perspectives, which matters for the concentric bias simulation for mental illnesses that I will be posting about soon.

Each coordinate in ${ \tau }_{ \theta }({ G }_{ i })$ represents a spatial location in the input where the sampling kernel can be applied concentrically to get a projected, subsequent value in $V$. For stimuli transforms, this can be written as:

${ V }_{ i }^{ c }(j)=\frac { \sum _{ n }^{ H }{ \sum _{ m }^{ W }{ { U }_{ nm }^{ c }\, k\left( { x }_{ i }^{ s }-m;{ \Phi }_{ x } \right) k\left( { y }_{ i }^{ s }-n;{ \Phi }_{ y } \right) } } }{ \left< j|{ H }^{ ' }|{ W }^{ ' } \right> } \quad \forall i\in \left[ 1\dots { H }^{ ' }{ W }^{ ' } \right] \quad \forall c\in \left[ 1\dots C \right]$

Here, $\Phi$ represents the parameterized potential of the sampling kernel of the spatial transformer which will be used to forward neuroanatomical equivalences through recall gradients.
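To make the double sum concrete, here is a minimal sketch of the sampler in plain Python. It is not the implementation used in this series: the function names are mine, coordinates are assumed to be in zero-indexed pixel units, the kernel parameters $\Phi$ are assumed to be folded into the `kernel` callable, and the normalizing denominator is left out.

```python
def bilinear_kernel(d):
    """Triangle kernel max(0, 1 - |d|), used here as the default k."""
    return max(0.0, 1.0 - abs(d))


def box_kernel(d):
    """Nearest-neighbour style kernel, shown only to illustrate swapping k."""
    return 1.0 if abs(d) < 0.5 else 0.0


def sample_channel(U, xs, ys, kernel=bilinear_kernel):
    """Sample one channel U (H rows, W columns) at source points (xs[i], ys[i]).

    Computes V_i = sum_n sum_m U[n][m] * k(x_i - m) * k(y_i - n).
    Assumption: the normalizing denominator from the text is omitted.
    """
    H, W = len(U), len(U[0])
    V = []
    for x, y in zip(xs, ys):
        total = 0.0
        for n in range(H):
            ky = kernel(y - n)
            if ky == 0.0:
                continue  # this row contributes nothing to the sum
            for m in range(W):
                total += U[n][m] * kernel(x - m) * ky
        V.append(total)
    return V


U = [[0.0, 1.0],
     [2.0, 3.0]]

# Sample at two grid corners and at the centre of the 2x2 map.
V = sample_channel(U, xs=[0.0, 0.5, 1.0], ys=[0.0, 0.5, 1.0])

# Same sampler with a different kernel: picks the nearest cell.
V_box = sample_channel(U, xs=[0.8], ys=[0.2], kernel=box_kernel)
```

With the bilinear kernel the centre point averages the four neighbouring cells, while the box kernel simply returns the closest one. Swapping the kernel is the only change needed, which is the flexibility the equation is meant to express.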

The choice of sampling kernel can vary as long as all levels of gradients can be simplified to functions of ${ \left( { x }_{ j }^{ s },{ y }_{ j }^{ s } \right) }_{ j }$. For the purposes of our experimentation, a bilinear sampling kernel will be used to process inputs in parallel, allowing for a larger parametrization of learning transforms. To allow backpropagation of loss through this sampling mechanism, the gradients must be defined with respect to $U$ and $G$. This observation was initially established as a means to allow sub-differentiable sampling in a similar bilinear sampling method:

$\frac { \partial { V }_{ i }^{ c } }{ \partial { U }_{ nm }^{ c } } =\sum _{ n }^{ H }{ \sum _{ m }^{ W }{ \max _{ j }{ (0,1-\left| { x }_{ i }^{ s }-m \right| ) } \max _{ j }{ (0,1-\left| { y }_{ i }^{ s }-n \right| ) } } }$

$\frac { \partial { V }_{ i }^{ c } }{ \partial { x }_{ i }^{ s } } =\sum _{ n }^{ H }{ \sum _{ m }^{ W }{ { U }_{ nm }^{ c }\max _{ j }{ (0,1-\left| { y }_{ i }^{ s }-n \right| ) } \begin{cases} 0 & \text{if } \left| m-{ x }_{ i }^{ s } \right| \ge 1 \\ 1 & \text{if } m\ge { x }_{ i }^{ s } \\ -1 & \text{if } m<{ x }_{ i }^{ s } \end{cases} } }$
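These two gradients can be checked numerically. The sketch below is a plain-Python illustration under my own assumptions: the helper names are hypothetical, the unmodified bilinear kernel is used (not the concentric variant), coordinates are zero-indexed pixels, and the gradient with respect to $U$ is read per element, so each entry is the weight that cell contributes to the sample.

```python
def tri(d):
    """Bilinear kernel max(0, 1 - |d|)."""
    return max(0.0, 1.0 - abs(d))


def bilinear_sample(U, x, y):
    """V = sum_n sum_m U[n][m] * tri(x - m) * tri(y - n)."""
    return sum(U[n][m] * tri(x - m) * tri(y - n)
               for n in range(len(U)) for m in range(len(U[0])))


def grad_wrt_U(U, x, y):
    """Per-element dV/dU[n][m]: the bilinear weight of cell (n, m)."""
    return [[tri(x - m) * tri(y - n) for m in range(len(U[0]))]
            for n in range(len(U))]


def grad_wrt_x(U, x, y):
    """dV/dx using the piecewise slope of the kernel (sub-gradient at kinks)."""
    total = 0.0
    for n in range(len(U)):
        for m in range(len(U[0])):
            if abs(m - x) >= 1:
                slope = 0.0   # cell is outside the kernel support
            elif m >= x:
                slope = 1.0   # moving x right increases this cell's weight
            else:
                slope = -1.0  # moving x right decreases this cell's weight
            total += U[n][m] * tri(y - n) * slope
    return total


U = [[0.0, 1.0],
     [2.0, 3.0]]
x, y, h = 0.25, 0.5, 1e-6

analytic = grad_wrt_x(U, x, y)
# Central finite difference as an independent check of the analytic value.
numeric = (bilinear_sample(U, x + h, y) - bilinear_sample(U, x - h, y)) / (2 * h)
weights = grad_wrt_U(U, x, y)
```

Away from integer coordinates the analytic and finite-difference values agree. At integer coordinates the kernel has a kink, which is why the text calls the sampler sub-differentiable.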

Therefore, loss gradients can be attributed not only to the spatial transformers, but also to the input feature map, the sampling grid, and, finally, the parameters $\Phi$ and $\theta$. The bilinear sampler has been slightly modified in this case to allow concentric recall functions to be applied to its resultant fields. It is worth noting that, due to this feature, the spatial network's representation of the learned behavior is unique in its rate and method of preservation, much like how each person is unique in their ability to learn and process information. The observable synthetic activation complexes can also be modeled by monitoring these parameters as they elastically adapt to the stimulus. The knowledge of how to transform is encoded in localization networks, which are fundamentally non-static as well.

### Sparse Learning Recall Networks

Recall-based functions are classically indicative of a mirror neuron system in which each approximation of the neural representation remains equally utilized, functioning as a load balancing mechanism. Commonly attributed to the preemptive execution of a planned task, the retention of memory in mirror neural systems tends to be modular in persistence and metaphysical in nature. Sparse neural systems interpret signals from cortical portions of the brain, allowing learned behaviors from multiple portions of the brain to execute simultaneously as observed in Fink’s studies on cerebral memory structures. It is theorized that the schematic representation of memory in these portions of the brain exists in memory fields only after a number of transformations have occurred in response to the incoming stimulus. Within these transformations lies the inherent differentiating factor in functional learning behavior: specifically, those which cause the flawed memory functions in the patients of such mental illnesses.

#### Semantic Learning Transformation

Now, similar to my fluid intelligence paper, we will need to semantically represent all types of ideas in a way that most directly allows for future transformations and biases to be included. For this, we will use a mutated version of the semantic lexical transformations.

The transformation of raw stimulus, in this case a verbal and unstructured story-like input, to a recallable and normalized memory field will be simulated by a spatial transformer network. These mutations in raw input are the inherent reason for differentiated recall mechanisms between all humans.

An altered version of the spatial transformer network, as developed in \cite{JaderbergSpatialNetworks} in Google’s DeepMind initiative, will be used to explicitly allow the spatial manipulation of data within the neural stack. Recall gradients mapped from our specialized network find their activation complexes similar to those of the prefrontal cortex in the brain, tasked with directing and encoding raw stimulus.

##### The Spatial Transformer Network (Unsupervised)

The spatial transformer was originally designed for pixel transformations inside a neural network; here, the sampling grid or the input feature map will be parameterized to fit the translational needs of comprehension. The formulation of such a network will incorporate an elastic set of spatial transformers, each with a localisation network and a grid generator. Together, these will function as the receptive fields interfacing with the hypercolumns.

These transformer networks allow us to parameterize any type of raw stimulus so that it can be parsed and propagated through a more abstracted and generalized network capable of modulating fluid outputs.

The localisation network will take a mutated input feature map $U\in { \textbf{R}}^{ { H }_{ i }\times { W }_{ i }\times { C }_{ i } }$, with width $W$, height $H$, and channels $C$, and output ${\theta }_{i }$, where $i$ represents a differentiated gradient-dimensional resultant prioritized for storage in the stack. This net feature map allows the convolution of learned transformations to a neural stack in a compartmentalized system. A key characteristic of this modular transformation, as noted in Jaderberg’s spatial networks, is that the number of transformation parameters, that is, the size of $\theta$, can vary depending on the transformation type. This allows the sparse network to retain the elasticity needed to react to any type of stimulus, giving opportunity for compartmentalized learning space. The net dimensionality of the transformation ${ \tau }_{ \theta }$ on the feature map can be represented as $\theta ={ f }_{ loc }\left( x \right)$. The function ${ f }_{ loc }\left( \right)$ can take any form, especially that of a learning network. For example, for a simple affine transform, $\theta$ will be 6-dimensional, and ${ f }_{ loc }\left( \right)$ will take the form of a convolutional network or a fully connected network (\cite{AndrewsIntegratingRepresentations}). The form of ${ f }_{ loc }\left( \right)$ is unbounded and nonrestrictive in domain, allowing all forms of memory persistence to coexist in the spatial stack.
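For the 6-dimensional $\theta$ case, one common reading is a 2×3 matrix applied to a regular target grid. The sketch below shows a grid generator under that assumption. The function names, the row-major layout of $\theta$, and the normalized $[-1, 1]$ coordinate convention are illustrative choices of mine, not details fixed by this series, and the localisation network that would produce $\theta$ is replaced by hand-written values.

```python
def make_target_grid(h_out, w_out):
    """Regular grid G of (x_t, y_t) points in normalized [-1, 1] coordinates."""
    def lin(k, size):
        # Evenly spaced values from -1 to 1; a single point sits at 0.
        return 0.0 if size == 1 else -1.0 + 2.0 * k / (size - 1)
    return [(lin(col, w_out), lin(row, h_out))
            for row in range(h_out) for col in range(w_out)]


def transform_grid(theta, grid):
    """Apply tau_theta to G, reading theta row-major as [[a, b, tx], [c, d, ty]]."""
    a, b, tx, c, d, ty = theta
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in grid]


grid = make_target_grid(2, 3)

# Hand-picked stand-ins for the output of f_loc (assumption for illustration).
identity = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
zoom_in = [0.5, 0.0, 0.0, 0.0, 0.5, 0.0]   # sample the central half of the input
shift_x = [1.0, 0.0, 0.5, 0.0, 1.0, 0.0]   # translate sampling points along x

same = transform_grid(identity, grid)
zoomed = transform_grid(zoom_in, grid)
shifted = transform_grid(shift_x, grid)
```

The source points returned by `transform_grid` are what a sampler would then read from the input feature map. A larger or smaller $\theta$ only changes `transform_grid`, which is how the size of $\theta$ can vary with the transformation type.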