Similarities among ML domains
NN | Transformer | CNN |
---|---|---|
- | Multi-head | Multi-channel |
- | Skip connection | Residual connection (ResNet) |
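
The skip-connection row can be illustrated with a minimal sketch, assuming a sub-layer that maps a vector to a vector of the same size. The names `residual` and `toy_layer` are hypothetical and chosen only for illustration; in a Transformer the sub-layer would be attention or a feed-forward block, in a ResNet a stack of convolutions.

```python
def residual(x, layer):
    """Apply `layer` and add its output back onto the input: y = x + F(x)."""
    fx = layer(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def toy_layer(x):
    # Hypothetical stand-in for an attention or convolution sub-layer.
    return [0.1 * xi for xi in x]


y = residual([1.0, 2.0], toy_layer)
```

Because the input is added back unchanged, a sub-layer that outputs zeros leaves the signal intact, which is the shared idea behind both architectures.
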
Progress of Natural Language Processing
Model | Main Disadvantage | Solved by | How? |
---|---|---|---|
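NN | Can’t handle variable-length input | RNN | Applies the same recurrent cell at every time step, so the sequence can be any length |
RNN | Vanishing gradient problem | LSTM | Gated cell state gives gradients a more direct path across time steps, which mitigates vanishing gradients |
LSTM | Not parallelizable across time steps | Transformer | Self-attention processes all positions at once instead of step by step |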
Transformer | Loses sequence order | Transformer | Positional encoding adds position information to the token embeddings |
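
The last row can be sketched as follows. This is a minimal, pure-Python version of the sinusoidal scheme, which is one common choice (learned position embeddings are another). The function names and the base of 10000 are assumptions made for the example, not something taken from the table.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def add_positions(embeddings):
    """Add the position encoding to each token embedding, element-wise."""
    pe = sinusoidal_positional_encoding(len(embeddings), len(embeddings[0]))
    return [[e + p for e, p in zip(e_row, p_row)]
            for e_row, p_row in zip(embeddings, pe)]


# Two identical token embeddings at positions 0 and 1.
tokens = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
with_positions = add_positions(tokens)
```

After `add_positions`, the two identical embeddings differ, so attention can tell the first token from the second even though it processes all positions at once.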