Build a Natural Language Processing Transformer from Scratch

Discussion Overview

The discussion revolves around building and training a Natural Language Processing (NLP) transformer from scratch, covering both theoretical understanding and practical implementation. Participants discuss the underlying theory as well as the scarcity of straightforward pure-Python implementations that avoid libraries such as PyTorch or TensorFlow.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • Some participants highlight the importance of understanding the theory behind transformers, suggesting that without this understanding, the technology may seem opaque or "magical."
  • Others express frustration over the lack of basic Python implementations available without using libraries like PyTorch or TensorFlow.
  • Some participants suggest that examining the source code of existing libraries could provide insights into how transformers are constructed, even if one aims to build from scratch.
  • A participant compares the complexity of understanding transformers to the challenges of grasping quantum mechanics, emphasizing the need for foundational knowledge.
  • Another participant requests numerical examples to aid understanding, arguing that practical examples can clarify theoretical concepts.
  • Links to resources and articles are shared by participants as potential aids for those seeking to learn more about transformers.

Areas of Agreement / Disagreement

Participants generally agree on the necessity of understanding the theory behind transformers, but there is disagreement on the availability and need for practical Python implementations without libraries. The discussion remains unresolved regarding the best approach to learning and implementing transformers from scratch.

Contextual Notes

Participants express varying levels of frustration regarding the perceived opacity of transformer technology and the terminology used in deep learning. There is an acknowledgment that while theory is important, practical examples are also sought after to enhance understanding.

jonjacson
TL;DR
I wonder if anybody knows how to build and train one from scratch or if there is any book, video, or website explaining it.
I have read that transformers are the key behind recent success in artificial intelligence but the problem is that it is quite opaque.

Thanks
 
jonjacson said:
I have read that transformers are the key behind recent success in artificial intelligence but the problem is that it is quite opaque.
Then you need to understand the theory.

jonjacson said:
But I don't see a python implementation, just the theory.
You did not ask for python code.
Google: Python code for NLP transformer

There will be more answers from others.
 
jonjacson said:
But I don't see a python implementation, just the theory.
But you didn't ask for a Python implementation, you asked about building one from scratch!

If I wanted to find a Python machine learning algorithm related to [X] I would input "Tensorflow X" into a search engine. Have you tried this?
 
Baluncore said:
Then you need to understand the theory. You did not ask for python code.
Google: Python code for NLP transformer

There will be more answers from others.
I see answers but they use libraries like pytorch or tensorflow. I mean from scratch, pure python.

pbuk said:
But you didn't ask for a Python implementation, you asked about building one from scratch!

If I wanted to find a Python machine learning algorithm related to [X] I would input "Tensorflow X" into a search engine. Have you tried this?
I don't want to use libraries.
 
jonjacson said:
I see answers but they use libraries like pytorch or tensorflow. I mean from scratch, pure python.
Even if you don't use libraries, looking at the source code for the libraries might be a good way of learning how these things are done in Python.

If searching the web doesn't turn up any Python implementations that don't use libraries, that's probably a clue that everyone else who has tried what you are trying has found it easier to use the well-tested implementations in the libraries than to try and roll their own.
 
PeterDonis said:
Even if you don't use libraries, looking at the source code for the libraries might be a good way of learning how these things are done in Python.

If searching the web doesn't turn up any Python implementations that don't use libraries, that's probably a clue that everyone else who has tried what you are trying has found it easier to use the well-tested implementations in the libraries than to try and roll their own.

The problem is that this looks like a magic thing, and I don't know why it is "hidden" behind the bogus language: "deep learning", "encoder", "decoder", "tokenized input embedding", "multi-head self-attention", "layer normalization", "feed-forward network", "residual connection"... and all that stuff.

In the end I guess this will be a whole bunch of vectors, matrices and operations on them.

Hopefully now you understand what I want to know.
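That intuition is right: the core attention step really is just matrix products and a softmax. As a rough illustration (not anyone's official implementation; all names and shapes are made up for the example), scaled dot-product self-attention can be sketched in pure Python with no libraries at all:

```python
import math

def softmax(xs):
    # Shift by the max for numerical stability, then normalize to sum to 1.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain-Python matrix multiply: (n x k) @ (k x m) -> (n x m).
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def attention(q, k, v):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = len(q[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)
```

Multi-head attention is essentially several copies of this run in parallel on learned linear projections of the same input, with the results concatenated.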
 
jonjacson said:
The problem is that this looks like a magic thing
That problem doesn't look to me like a "find Python code" problem. It looks to me like a "learn and understand the theory" problem, as @Baluncore has already pointed out.
 
jonjacson said:
The problem is that this looks like a magic thing, ...
“Any sufficiently advanced technology is indistinguishable from magic”.
Arthur C. Clarke's third law.
 
Baluncore said:
“Any sufficiently advanced technology is indistinguishable from magic”.
Arthur C. Clarke's third law.

Nice, but still there is no basic example of this anywhere.
 
jonjacson said:
Nice, but still there is no basic example of this anywhere.
It is only magic because you do not yet understand the theory. If you were given some version of the Python code, you would still not understand the theory. It would still be magic, and a danger to the uninitiated.
 
jonjacson said:
The problem is that this looks like a magic thing, and I don't know why it is "hidden" behind the bogus language: "deep learning", "encoder", "decoder", "tokenized input embedding", "multi-head self-attention", "layer normalization", "feed-forward network", "residual connection"... and all that stuff.
For the same reason that quantum mechanics is hidden behind the bogus language "complex projective space", "Hermitian operators", "Hamiltonians", "eigenstates", "superpositions" and all that stuff.

In the end this is just a whole bunch of vectors, matrices and operations on them.

jonjacson said:
Hopefully now you understand what I want to know.
Yes, you want to do QM without learning the theory. Good luck.

Edit: or is this the kind of thing you are looking for: https://habr.com/en/companies/ods/articles/708672/
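The other named pieces are equally small once written out. A pure-Python sketch of layer normalization, the position-wise feed-forward network, and the residual "Add & Norm" step (function names are illustrative, and the weights here are placeholders rather than trained values):

```python
import math

def layer_norm(x, eps=1e-5):
    # Rescale a vector to zero mean and (roughly) unit variance.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def feed_forward(x, w1, w2):
    # Position-wise feed-forward network: ReLU(x W1) W2, applied to one token.
    hidden = [max(0.0, sum(xi * w1[i][j] for i, xi in enumerate(x)))
              for j in range(len(w1[0]))]
    return [sum(h * w2[j][k] for j, h in enumerate(hidden))
            for k in range(len(w2[0]))]

def add_and_norm(x, sublayer_out):
    # Residual connection followed by layer normalization.
    return layer_norm([a + b for a, b in zip(x, sublayer_out)])
```

A transformer block is then just attention and feed-forward sublayers, each wrapped in such an add-and-norm step.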
 
pbuk said:
For the same reason that quantum mechanics is hidden behind the bogus language "complex projective space", "Hermitian operators", "Hamiltonians", "eigenstates", "superpositions" and all that stuff.

In the end this is just a whole bunch of vectors, matrices and operations on them. Yes, you want to do QM without learning the theory. Good luck.

Edit: or is this the kind of thing you are looking for: https://habr.com/en/companies/ods/articles/708672/

I am not saying that theory is bad or unnecessary. What I am looking for is a numerical example.

The Schrödinger equation is fine, but once you compute the orbitals of the hydrogen atom you get a better understanding.

I don't understand why it is bad to ask for numerical examples and numbers.
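As a tiny worked example of exactly that sort (all numbers made up): given similarity scores between one query and three keys, softmax converts them into attention weights, which then mix the corresponding values:

```python
import math

# Made-up similarity scores between one query and three keys
# (in a real model these are dot products divided by sqrt(d)).
scores = [2.0, 1.0, 0.1]

# Softmax turns the scores into positive weights that sum to 1.
exps = [math.exp(s) for s in scores]
total = sum(exps)
weights = [e / total for e in exps]
print([round(w, 3) for w in weights])  # → [0.659, 0.242, 0.099]

# The output is the weighted mix of the (made-up) values.
values = [10.0, 0.0, 5.0]
output = sum(w * v for w, v in zip(weights, values))
print(round(output, 2))  # → 7.08
```

So the token mostly "attends to" the first value, with a smaller contribution from the third.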

Your edit was great and it is what I was looking for; I'll add the link from the end of that article:

https://jalammar.github.io/illustrated-transformer/

And something I just found:

https://e2eml.school/transformers.html

I hope this helps anybody interested in this topic.

Thanks to all for your replies.
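One more small piece in the same pure-Python spirit: the "tokenized input embedding" stage also adds positional information. A minimal sketch of the sinusoidal positional encoding that the tutorials linked above describe (parameter names are illustrative):

```python
import math

def positional_encoding(position, d_model):
    # Sinusoidal positional encoding from "Attention Is All You Need":
    # even indices use sine, odd indices use cosine, with wavelengths
    # growing geometrically across the embedding dimensions.
    pe = []
    for i in range(d_model):
        angle = position / (10000 ** (2 * (i // 2) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe
```

Each token's embedding vector gets the encoding for its position added to it before the first attention layer, so the model can tell word order apart.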

