Building a homemade Long Short Term Memory with FSMs

Trollfaz · Apr 7, 2024

I am doing a project to build a Long Short Term Memory algorithm from scratch. LSTMs are capable of retaining memory of the past inputs and carrying them for future operations thanks to Recurring Neural Networks to process a series of inputs such as sounds and text.

One possible way I can think of such methods is Finite State Machines (FSMs) . In the simplest model the FSM at any point in time can be in any state ##s \epsilon S ##. After reading an input at time t, the state of the node transits from ##s_{t-1}## to ##s_t## via a function ##f_{in}(s_{t-1},x_t)## for a valid input ##x\epsilon X##. The node then produces an output ##o_t=f_{out}(s_t)## while it will remain in the transited state for the next iteration. In this way it can retain some memory or information of the past input.

Now in complex modelling such as text, does a large numbers of FSMs build a good LSTM model?

Trollfaz · Apr 7, 2024

I shall now elaborate on how the network of FSMs work. Allow the system to contain N FSMs for a large value N say ##10^4##. Each FSM has it's output assigned to a random weight and multiplied by it. Hence the aggregate output of the system gives
$$\sum_{i=1}^N w_i o_i= \textbf{w}^T\textbf{o}_t$$
where ##\textbf{w},\textbf{o}_t## is the vector of assigned weights and output of the nodes at t respectively. The weights are free to adjust when we teach the algorithm and are initially set to small random values. During training, we minimize the loss L=##\sum (predicted-actual)^2## by gradient descent with respect to the weights.

Tom.G · Apr 7, 2024

Sounds like what is (or was, see below) normally done with the terminology of 'gates', or 'neurons', being replaced by the words 'finite state machine'. 'Neural network' is another common term that seems to apply to the same general approach.

Disclaimer: I'm Not an expert by any means! I've only dabbled in the field out of curiousity, and that was many years ago.

Cheers,
Tom

Building a homemade Long Short Term Memory with FSMs

Is A.I. more than the sum of its parts?

AI vs. Humans as Processors in an Environment

Sweetspot of data compression

Other than just FizzBuzz to test programmer candidates

How to show RS(U+TRS)* is equivalent to (R+SUT)SU?

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect