Python: Building a homemade Long Short-Term Memory with FSMs

  • Thread starter: Trollfaz
  • Tags: Homemade
AI Thread Summary
The discussion centers on building a Long Short Term Memory (LSTM) algorithm from scratch, emphasizing the ability of LSTMs to retain memory of past inputs through Recurrent Neural Networks. The author proposes using Finite State Machines (FSMs) as a foundational model, explaining that an FSM can transition between states based on inputs, thereby retaining some memory. The concept involves creating a network of multiple FSMs, where each FSM's output is weighted and aggregated to form the overall system output. The weights are adjusted during training using gradient descent to minimize prediction errors. The author draws parallels between the terminology used in LSTM development and FSMs, noting that while the terms 'gates' and 'neurons' are common in neural networks, FSMs can serve a similar purpose in this context. The author acknowledges their limited expertise in the field, highlighting a personal curiosity in the subject.
Trollfaz
I am doing a project to build a Long Short-Term Memory (LSTM) algorithm from scratch. LSTMs are a kind of Recurrent Neural Network that can retain memory of past inputs and carry it forward for future operations, which makes them suited to processing series of inputs such as sound and text.

One possible way I can think of achieving this is with Finite State Machines (FSMs). In the simplest model, the FSM at any point in time is in some state ##s \in S##. After reading an input at time t, the state of the node transitions from ##s_{t-1}## to ##s_t## via a function ##f_{in}(s_{t-1},x_t)## for a valid input ##x_t \in X##. The node then produces an output ##o_t=f_{out}(s_t)## and remains in the new state for the next iteration. In this way it retains some memory, or information, of the past inputs.
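
To make the idea concrete, here is a minimal sketch in Python of a single FSM node of the kind described above. The class name, the table-based ##f_{in}##/##f_{out}##, and the toy two-state example are illustrative choices of mine, not a fixed design:

```python
# A single FSM node: transition maps (previous state, input) -> new state,
# output maps the current state -> an output value.

class FSMNode:
    def __init__(self, transition, output, initial_state):
        self.transition = transition      # dict: (state, input) -> state, i.e. f_in
        self.output = output              # dict: state -> output value, i.e. f_out
        self.state = initial_state        # s_{t-1}, carried across time steps

    def step(self, x_t):
        """Read input x_t, move s_{t-1} -> s_t, and emit o_t = f_out(s_t)."""
        self.state = self.transition[(self.state, x_t)]
        return self.output[self.state]

# Toy example: a two-state machine that "remembers" whether it has ever seen a 1.
node = FSMNode(
    transition={("A", 0): "A", ("A", 1): "B", ("B", 0): "B", ("B", 1): "B"},
    output={"A": 0.0, "B": 1.0},
    initial_state="A",
)
print([node.step(x) for x in [0, 0, 1, 0]])  # [0.0, 0.0, 1.0, 1.0]
```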

Now, for complex modelling tasks such as text, would a large number of FSMs build a good LSTM model?
 
I shall now elaborate on how the network of FSMs works. Let the system contain N FSMs for a large value of N, say ##10^4##. Each FSM has its output multiplied by a randomly assigned weight. Hence the aggregate output of the system is
$$\sum_{i=1}^N w_i o_i= \textbf{w}^T\textbf{o}_t$$
where ##\textbf{w}## and ##\textbf{o}_t## are the vectors of assigned weights and node outputs at time t, respectively. The weights are initially set to small random values and are free to adjust as we train the algorithm. During training, we minimise the loss ##L=\sum (\text{predicted}-\text{actual})^2## by gradient descent with respect to the weights.
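
A rough sketch of that readout and update step in Python/NumPy, assuming the FSM outputs at each time step have already been collected into an array. N, the learning rate, and the random stand-in data are placeholders of my own:

```python
import numpy as np

N = 1000                                   # number of FSM nodes (10^4 in the post)
rng = np.random.default_rng(0)
w = rng.normal(scale=0.01, size=N)         # small random initial weights

def predict(o_t, w):
    """Aggregate output w^T o_t for a single time step."""
    return w @ o_t

def train_step(outputs, targets, w, lr=1e-4):
    """One gradient-descent step on L = sum_t (predicted - actual)^2.

    outputs: shape (T, N), the FSM outputs o_t at each time step
    targets: shape (T,), the actual values
    lr: placeholder learning rate
    """
    preds = outputs @ w                            # predictions for every time step
    grad = 2.0 * outputs.T @ (preds - targets)     # dL/dw
    return w - lr * grad

# Toy usage with random stand-in data in place of real FSM outputs:
outputs = rng.normal(size=(50, N))
targets = rng.normal(size=50)
for _ in range(100):
    w = train_step(outputs, targets, w)
```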
 
Sounds like what is (or was, see below) normally done, with the terminology of 'gates' or 'neurons' replaced by the words 'finite state machine'. 'Neural network' is another common term that seems to apply to the same general approach.

Disclaimer: I'm not an expert by any means! I've only dabbled in the field out of curiosity, and that was many years ago.

Cheers,
Tom
 