Special Tokens
- <SOS> (Start of Sequence): Marks the beginning of a sequence for the model to start processing.
- <EOS> (End of Sequence): Tells the model when to stop generating text or processing.
- <PAD> (Padding Token): Pads sequences to the same length for batch processing.
- <UNK> (Unknown Token): Represents words not in the model's vocabulary.
- <MASK> (Mask Token): Used in tasks like predicting missing words in masked language models.
- <SEP> (Separator Token): Separates different segments in the input, such as a question from its context (see the sketch below).
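
As one concrete illustration of these roles, here is a minimal sketch assuming the Hugging Face transformers library and the bert-base-uncased tokenizer (both assumptions, neither is named above). BERT spells its special tokens [CLS], [SEP], [PAD], [UNK], and [MASK] rather than <SOS>/<EOS>, but they play the same roles as the generic tokens listed here.

```python
# A minimal sketch, assuming the Hugging Face "transformers" library and the
# "bert-base-uncased" tokenizer (neither is named above). BERT uses
# [CLS]/[SEP]/[PAD]/[UNK]/[MASK], with [CLS] playing the sequence-start role.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# The tokenizer exposes its special tokens as attributes.
print(tokenizer.cls_token, tokenizer.sep_token, tokenizer.pad_token,
      tokenizer.unk_token, tokenizer.mask_token)

# Encode two (question, context) pairs of different lengths: [SEP] separates
# the two segments, and padding=True pads the shorter example with [PAD]
# so both sequences in the batch have the same length.
batch = tokenizer(
    ["What is a special token?", "Hi"],
    ["Special tokens mark structure for the model.", "Hello"],
    padding=True,
)

for ids in batch["input_ids"]:
    print(tokenizer.convert_ids_to_tokens(ids))
```

Printing the token strings makes the structure visible: each sequence starts with [CLS], the two segments are joined by [SEP], and the shorter pair is padded out with [PAD].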