From: Transforming the generative pretrained transformer into augmented business text writer
Matrix | Dimension |
---|---|
\(X_1, \ldots, X_n\) | Up to 512, depending on the length of the sentence |
Each W | 64 |
\(X \times W\) | DX\(\times\)64 |
Z | DX\(\times\)64 |
\(W_o\) | DX\(\times\)64 |
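
The shapes in the table can be checked with a small sketch of one attention head. This is an illustration under stated assumptions, not the authors' implementation: it reads DX as the number of rows of X (one row per token, at most 512), assumes Z is produced by standard scaled dot-product attention, and uses a hypothetical embedding width `d_embed`, which the table does not give. The helper names are invented for this example, and the sketch stops at Z without modelling \(W_o\).

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols_b] for row in a]


def shape(m):
    """Return (rows, columns) of a list-of-rows matrix."""
    return (len(m), len(m[0]))


def softmax_rows(m):
    """Apply a numerically stable softmax to every row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def random_matrix(rows, cols, rng):
    """Small random matrix; the values only serve to exercise the shapes."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def attention_head(x, w_q, w_k, w_v):
    """One head of scaled dot-product attention (assumed form of Z)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    return matmul(softmax_rows(scores), v)


rng = random.Random(0)
n_tokens = 5   # "DX" in the table: rows of X, at most 512
d_embed = 16   # hypothetical embedding width, not given in the table
d_head = 64    # column count of each W in the table

x = random_matrix(n_tokens, d_embed, rng)
w_q, w_k, w_v = (random_matrix(d_embed, d_head, rng) for _ in range(3))
z = attention_head(x, w_q, w_k, w_v)

print(shape(matmul(x, w_q)), shape(z))  # (5, 64) (5, 64)
```

Both the projection \(X \times W\) and the head output Z keep one row per token and 64 columns, which matches the DX\(\times\)64 entries above for any sentence length up to the 512-token limit.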