A quest for very long context: Part 1
Experiments on extending transformer context length, including training observations, tradeoffs, and lessons from long-context tuning.
1 piece in this tag.