The Butterfly Effect

for the idea in all of us


Artificial Intelligence

Sparse Learning in Hippocampal Simulations

Sparse Learning Recall Networks

Recall-based functions are classically indicative of a mirror neuron system in which each approximation of the neural representation remains equally utilized, functioning as a load balancing mechanism. Commonly attributed to the preemptive execution of a planned task, the retention of memory in mirror neural systems tends to be modular in persistence and metaphysical in nature. Sparse neural systems interpret signals from cortical portions of the brain, allowing learned behaviors from multiple portions of the brain to execute simultaneously, as observed in Fink’s studies on cerebral memory structures. It is theorized that the schematic representation of memory in these portions of the brain exists in memory fields only after a number of transformations have occurred in response to the incoming stimulus. Within these transformations lies the inherent differentiating factor in functional learning behavior: specifically, those which cause the flawed memory functions in patients with mental illnesses.

Semantic Learning Transformation

Now, similar to my fluid intelligence paper, we will need to semantically represent all types of ideas in a way that most directly allows for future transformations and biases to be included. For this, we will use a mutated version of the semantic lexical transformations.

The transformation of raw stimulus, in this case a verbal and unstructured story-like input, to a recallable and normalized memory field will be simulated by a spatial transformer network. These mutations in raw input are the inherent reason for differentiated recall mechanisms between all humans.

An altered version of the spatial transformer network, as developed in \cite{JaderbergSpatialNetworks} at Google’s DeepMind, will be used to explicitly allow the spatial manipulation of data within the neural stack. Recall gradients mapped from our specialized network find their activation complexes similar to that of the prefrontal cortex in the brain, tasked with directing and encoding raw stimulus.

The Spatial Transformer Network (Unsupervised)

Originally designed for pixel transformations inside a neural network, the sampling grid or the input feature map will be parameterized to fit the translational needs of comprehension. The formulation of such a network will incorporate an elastic set of spatial transformers, each with a localisation network and a grid generator. Together, these will function as the receptive fields interfacing with the hypercolumns.

These transformer networks allow us to parameterize any type of raw stimulus so that it can be parsed and propagated through a more abstracted and generalized network capable of modulating fluid outputs.

The localisation network will take a mutated input feature map $U \in \mathbb{R}^{H_i \times W_i \times C_i}$, with width $W$, height $H$, and channels $C$, and output $\theta_i$. Here $i$ represents a differentiated gradient-dimensional resultant prioritized for storage in the stack. This net feature map allows the convolution of learned transformations to a neural stack in a compartmentalized system. A key characteristic of this modular transformation, as noted in Jaderberg’s spatial networks, is that the parameters of the transformation applied to the input feature map, and hence the size of $\theta$, can vary depending on the transformation type. This allows the sparse network to easily retain the elasticity needed to react to any type of stimulus, giving opportunity for compartmentalized learning space. The net dimensionality of the transformation $\tau_{\theta}$ on the feature map can be represented as $\theta = f_{loc}(U)$. In any case, $f_{loc}()$ can take any form, especially that of a learning network. For example, for a simple affine transformation, $\theta$ will be 6-dimensional, and $f_{loc}()$ will take the form of a convolutional network or a fully connected network (\cite{AndrewsIntegratingRepresentations}). The form of $f_{loc}()$ is unbounded and nonrestrictive in domain, allowing all forms of memory persistence to coexist in the spatial stack.
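To make the 6-dimensional $\theta$ concrete: in Jaderberg’s spatial transformer, a 6-parameter $\theta$ is the 2×3 matrix of a 2D affine transformation, and the grid generator uses it to map each output location back to a source location in the input feature map. The sketch below is a minimal pure-Python illustration of that grid-generation step only. The function names, the $[-1, 1]$ coordinate normalization, and the example values of `theta` are assumptions made for illustration; this is not the altered network described in this post, and it omits the localisation network and the sampler.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float]


def _normalized(index: int, size: int) -> float:
    """Map a pixel index in [0, size - 1] to a coordinate in [-1, 1]."""
    return 0.0 if size == 1 else -1.0 + 2.0 * index / (size - 1)


def affine_grid(theta: Sequence[float], height: int, width: int) -> List[List[Coord]]:
    """For each target location, return the (x, y) source coordinate to sample.

    theta holds the six entries of a 2x3 affine matrix in row-major order
    (hypothetical layout chosen for this sketch).
    """
    if len(theta) != 6:
        raise ValueError("an affine theta has exactly 6 parameters")
    t11, t12, t13, t21, t22, t23 = theta
    grid = []
    for row in range(height):
        y_t = _normalized(row, height)
        grid_row = []
        for col in range(width):
            x_t = _normalized(col, width)
            # Affine map applied to the homogeneous target coordinate (x_t, y_t, 1).
            grid_row.append((t11 * x_t + t12 * y_t + t13,
                             t21 * x_t + t22 * y_t + t23))
        grid.append(grid_row)
    return grid


# Illustrative parameter choices (assumptions, not values from the paper).
identity = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)
shift_right = (1.0, 0.0, 0.5, 0.0, 1.0, 0.0)

identity_grid = affine_grid(identity, 3, 3)
shifted_grid = affine_grid(shift_right, 3, 3)
```

With `identity`, every target location samples from itself; with `shift_right`, every source x-coordinate is offset by 0.5 in normalized units. In a full spatial transformer, $f_{loc}$ would predict `theta` from $U$ instead of it being fixed by hand.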





TEDx Talk

If you didn’t get a chance to see my TED talk live, the video has just been produced and uploaded to the TEDx channel on YouTube (below).

The talk is about some of my work in artificial intelligence: specifically the results we’ve observed in our research in synthetic neurointerfaces. Our goal was to functionally and synthetically model the human neocortical columns in an artificial intelligence to give a more differentiable insight into the cognitive behaviors we, as humans, exhibit on a daily basis.

If you would like to know more, I have published the working paper here.

Please let me know what you all think in the comments section below or on YouTube. I would love all the feedback I can get!

Synthetic Neurointerfaces: Abstract

Earlier this week, I published my working paper on simulating synthetic neurointerfaces. It’s been quite a journey getting here, and I apologize for the delay in posting about it. I’m going to submit the paper to the 2017 International Conference on Learning Representations (ICLR). What I have posted is a working paper, meaning that there will be more drafts and revisions to come before January. If you have any questions, please feel free to contact me. I would also like to give a disclaimer that my work comes purely from a mathematics and computer science background. This is a draft, and there are field experts who helped me with the computational neuroscience portion of this project. In the end, my goal was to make the brain itself a formal system, and I have treated the brain as such throughout.

I’m very excited about this project not only because of its potential but because of what it has already shown us. We are now able to get some basic neural representations of simple cognitive functions and modulate the functional anatomy of a synthetic neocortical column with ease, a step that we couldn’t achieve otherwise.

In this study, we explore the potential of an unbounded, self-organizing spatial network to simulate translational awareness lent by the brain’s neocortical hypercolumns as a means to better understand the nature of awareness and memory. We modularly examine the prefrontal cortical function, amygdalar responses, and cortical activation complexes to model a synthetic recall system capable of functioning as a compartmentalized and virtual equivalent of the human memory functions. The produced neurointerfaces are able to consistently reproduce the reductive learning quotients of humans in various learning complexities and increase generalizing potentials across all learned behaviors. The cognitive system is validated by examining its persistence under the induction of various mental illnesses and mapping the synthetic changes to their equivalent neuroanatomical mutations. The resultant set of neurointerfaces is a form of artificial general intelligence that produces wave forms empirically similar to that of a patient’s brain. The interfaces also allow us to pinpoint, geometrically and neuroanatomically, the source of any functional behavior.

The rest of the paper can be found here:



Coming Soon: Synthetic Neurointerfaces

I’m getting ready to release my work on persisting synthetic neurointerfaces in unbounded spatial networks. I truly believe that computational tools such as this can be used to study the structure of intelligent computation in high-dimensional neural systems. What I tried to emulate in this project was a neuron-by-neuron representation of some basic cognitive functions by persisting a memory field in which self-organizing neocortical hypercolumns could be functionally represented. The project was inspired by biological neural dynamical systems and foundationally rooted in some of the brilliant work Google’s DeepMind project has been doing. Before I publish any results, I would like to give a special thanks to my mentor and longtime friend, Dr. Celia Rhodes Davis. I would also like to especially thank the Stanford Department of Computational Neuroscience (Center for Brain, Mind & Computation) for functioning as an advisory board throughout my independent research and as a sounding board for general guidance.

Below are a problem definition, goals, and a small sneak peek at the immediate potential and execution of my project:


The interface between the neuroanatomical activation of neocortical hypercolumns and their expressive function is a realm largely unobserved, due to the inability to efficiently and ethically study causal relationships between phenomena that were previously only observed. The field of general neuroscience explores the anatomical significance of cortical portions of the brain, extending anatomy as a means to explain the persistence of various nervous and physically expressive systems. Psychological approaches focus purely on expressive behaviors as a means to extend, with greater fidelity, the existence and constancy of the brain-mind interface. The interface between the anatomical realms of the mind and their expressive behaviors is a field widely unexplored, with surgeries such as the lobotomy and other controversial, experimental, and life-threatening procedures at the forefront of such study. However, the understanding of these neurological interfaces has the potential to function as a window into the neural circuitry of mental illnesses, opening the door for cures and an ultimately more complete understanding of our brain.


We propose a method to simulate unbounded memory fields upon which recall functions can be parameterized. This model will be able to simulate cortical functions of the amygdala in its reaction to various unfiltered stimuli. An observer network will be created in parallel to analyze geometric anomalies in the neuroanatomical interface in memory recall functions, and to extend equivalences between recall function parameters and memory recall gradients. This enables it to extend hypotheses to neuroanatomical functions.

Fluid Intelligence: Introduction


Fluid intelligence: the capacity to think logically and solve problems in novel situations, independent of acquired knowledge

Psychology has found the basis of fluid intelligence in the juxtaposition of layered memory and application as a means to essentially “connect two fluid ideas with an abstractly analogous property”. A mathematical design for this would therefore have to derive temporal relationships with weighted bonds between two coherently disparate concepts by means of similar properties. These properties within node types will have to be self-defined and self-propagated within idea types.


In a pursuit towards a truly dynamic artificial intelligence, it is necessary to establish a recurrent method to decipher the presence of concrete yet abstract entities (“ideas”) independent of a related and coherent topic set.
A considerable amount of work venturing into this field has culminated in the prevalence of statistical methods to extract probabilistic models dependent on large amounts of unstructured data. These Bayesian data analytic techniques often result in an understanding that is superficial in the context of a true relational understanding. Furthermore, this “bag-of-words” approach, when applied to large amounts of unstructured data (quantifiable by correct relationships derived between the idea nodes), often yields a single-dimensional understanding of the topics at hand. Traditionally, when these topics are transformed, it is difficult to extract hierarchy and queryable relations using matrix transformations from a derived data set.

The project that I will be describing in the subsequent posts is an effort to change the approach from which dynamic fluid intelligence is derived, finding a backbone in streaming big data. Ideally, this model would be able to take a layered, multi-dimensional approach to autonomous identification of properties of dynamically changing ideas from portions of said data set. It would also be able to find types of relationships, deriving a set of previously undefined relational schemas through unsupervised machine learning techniques that would ultimately allow for a queryable graph with properties and nodes initially undefined.

Big Data Correlation: Purpose


About 1.8 zettabytes (1.8 trillion gigabytes) of data is being created every year. In all this data there are answers to problems we have been wondering about for ages. The challenge is how to process the information most efficiently and derive correlations from the complexity of the data on the internet. You may not be able to prove anything scientifically, but you may be able to prove hypotheses statistically with the huge amounts of data hidden somewhere in this intimidating data set. So is it possible to mine hidden information at these huge scales? Can one use existing technologies such as Apache Hadoop, Nutch, MapReduce, and Google API to develop an engine that can derive comprehensible correlational data autonomously and efficiently?


With all this data being produced every year, finding a radical and innovative way of processing large and complex data sets is a need that is unfulfilled. For any computer, processing unstructured data is a very arduous and long process (most of the internet’s data is unstructured). This exercise of an engine implementation is an attempt at combining multiple high-end technologies to work in unison to crunch and sift through large and complex data sets to …

In the Comparison of Genetic Operators For Solving the Traveling Salesman Problem: Selection

In comparing selection methods, it was in our best interest to leave as little as possible to randomness except in the selection method itself. The mutation method was the center inverse mutation throughout all the trials, and a center mutation point was chosen every time. The cutoff percentage was the same (30%) for each trial, and the number of generations was fixed at 5000.

The numbers displayed below are the average of 10 trials conducted with the same input graph but a different initial population for each trial.

[Figure: Selection Comparison]

Genetic Algorithm: Selection

In every generation, a selection agent comes into play which sifts out the fit chromosomes from the unfit chromosomes. The selection agent “kills off” a user-specified percentage of organisms in the population. However, it is at the discretion of the selection agent to determine which chromosomes to kill. As mentioned earlier, fitness is defined by having the lowest weight in the circumstances put forth by the TSP. However, selection need not be based on that alone. This can be seen when comparing the two most prevalent types of selection operators:
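Independent of which selection operator is used, the cutoff step described above can be sketched as follows. This is a minimal illustration that assumes the simplest possible policy: rank tours by total weight and kill off the heaviest fraction. The function names, the `dist` lookup keyed by unordered node pairs, and the toy four-node graph are all hypothetical and are not taken from the project’s code. As the post notes, a real selection agent need not decide on weight alone.

```python
from typing import Dict, FrozenSet, List

Tour = List[str]
Distances = Dict[FrozenSet[str], float]


def tour_weight(path: Tour, dist: Distances) -> float:
    """Total weight of a closed tour (the start node is repeated at the end)."""
    return sum(dist[frozenset((a, b))] for a, b in zip(path, path[1:]))


def cutoff_selection(population: List[Tour], dist: Distances, cutoff: float) -> List[Tour]:
    """Kill off the heaviest `cutoff` fraction of tours; return the survivors."""
    ranked = sorted(population, key=lambda tour: tour_weight(tour, dist))
    n_killed = int(len(ranked) * cutoff)  # rounds down (an assumed convention)
    return ranked[: len(ranked) - n_killed]


# Toy symmetric graph on four nodes (illustrative values only).
dist = {
    frozenset(("A", "B")): 1.0, frozenset(("B", "C")): 1.0,
    frozenset(("C", "D")): 1.0, frozenset(("D", "A")): 1.0,
    frozenset(("A", "C")): 5.0, frozenset(("B", "D")): 5.0,
}
population = [
    ["A", "C", "B", "D", "A"],  # weight 12
    ["A", "B", "C", "D", "A"],  # weight 4
    ["A", "B", "D", "C", "A"],  # weight 12
    ["A", "D", "C", "B", "A"],  # weight 4
]
survivors = cutoff_selection(population, dist, cutoff=0.5)
```

Here the two weight-4 tours survive and the two weight-12 tours are removed. A stochastic operator would replace the deterministic ranking with a weighted random choice while keeping the same cutoff fraction.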

Genetic Algorithm: Mutation

During the progression of a genetic algorithm, the population can hit a local optimum (or extremum). Nature copes with such local optima by adding random genetic diversity to the population set “every-so-often” with the help of mutation. Our genetic algorithm accomplishes this via the mutation operator. Although there is a plethora of mutation types, our GA focused on a select two:

1. Reverse Sequence Mutation – In the reverse sequence mutation operator, we take a random point in the sequence or organism. We split the path (P1) at the selected point. The second half of the split path (P1H2) is then inverted and appended to the end of the first half (P1H1) with the necessary corrections made to make sure the last node is the same as the start node to get a final mutated path (M1).

P1: {A, C, J | D, G, H, E, B, F, I, A}  ⇒ M1: {A, C, J, I, F, B, E, H, G, D, A}

2. Center Inverse Mutation – The chromosome (path or organism) is divided into two sections at the middle of the chromosome. Each of these sections is then inverted and added to a new path. The order of the halves remains constant, meaning the first inverted half remains the first half in the mutated path. The necessary corrections are made to amend the mutated path into a viable path to solve the TSP.

P1: {A, C, J, D, G, H, E | B, F, I, A}  ⇒ M1: {A, E, H, G, D, J, C, I, F, B, A}
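Both operators can be sketched in a few lines. The sketch below assumes a path is a list of node labels whose last element repeats the start node, and it interprets the “necessary corrections” as keeping the start node fixed at both ends of the path. The function names and the default split point in `center_inverse_mutation` are assumptions made for illustration; the split points in the usage lines are the ones marked with `|` in the two examples above.

```python
from typing import List, Optional


def reverse_sequence_mutation(path: List[str], point: int) -> List[str]:
    """Invert everything after `point` (point >= 1), keeping the tour closed."""
    start, tour = path[0], path[:-1]       # drop the repeated closing node
    head, tail = tour[:point], tour[point:]
    return head + tail[::-1] + [start]     # re-close the tour at the start node


def center_inverse_mutation(path: List[str], point: Optional[int] = None) -> List[str]:
    """Invert each half of the path in place, keeping the start node fixed."""
    start, tour = path[0], path[:-1]
    if point is None:
        point = len(path) // 2             # assumed definition of "the middle"
    first, second = tour[1:point], tour[point:]
    return [start] + first[::-1] + second[::-1] + [start]


p1 = ["A", "C", "J", "D", "G", "H", "E", "B", "F", "I", "A"]

m1_reverse = reverse_sequence_mutation(p1, point=3)
# ['A', 'C', 'J', 'I', 'F', 'B', 'E', 'H', 'G', 'D', 'A']

m1_center = center_inverse_mutation(p1, point=7)
# ['A', 'E', 'H', 'G', 'D', 'J', 'C', 'I', 'F', 'B', 'A']
```

Both calls reproduce the M1 paths from the examples. Because each function only reorders the interior nodes, every mutated path still visits each node exactly once and returns to the start.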
