mirror of
https://forgejo.ellis.link/continuwuation/continuwuity.git
synced 2026-05-26 20:49:55 +00:00
20a54aacd6
This ensures that the tokenization algorithm remains in sync between querying, indexing, and deindexing. The existing code behaved slightly differently when querying, because it did not discard words longer than 50 bytes. The difference was inconsequential, because tokens longer than 50 bytes are never present in the index.

Signed-off-by: strawberry <strawberry@puppygock.gay>
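
As an illustration of the idea, here is a minimal sketch of one tokenizer shared by all three paths. It is not the code from this commit: `tokenize`, `WORD_MAX_LEN`, and `ToyIndex` are hypothetical names, and splitting on non-alphanumeric characters and lowercasing are assumptions. Only the 50-byte cutoff comes from the message above.

```rust
use std::collections::{HashMap, HashSet};

/// Tokens longer than this many bytes are discarded (the 50-byte limit
/// mentioned in the commit message; the constant name is hypothetical).
const WORD_MAX_LEN: usize = 50;

/// Hypothetical shared tokenizer. The splitting and lowercasing rules are
/// assumptions made for this sketch, not the project's confirmed behavior.
fn tokenize(body: &str) -> impl Iterator<Item = String> + '_ {
    body.split(|c: char| !c.is_alphanumeric())
        .filter(|word| !word.is_empty())
        // `len()` counts bytes, so this is a byte limit, not a char limit.
        .filter(|word| word.len() <= WORD_MAX_LEN)
        .map(str::to_lowercase)
}

/// Toy in-memory inverted index: token -> ids of documents containing it.
#[derive(Default)]
struct ToyIndex {
    postings: HashMap<String, HashSet<u64>>,
}

impl ToyIndex {
    /// Indexing goes through `tokenize`.
    fn index(&mut self, id: u64, body: &str) {
        for word in tokenize(body) {
            self.postings.entry(word).or_default().insert(id);
        }
    }

    /// Deindexing uses the same tokenizer, so it removes exactly the
    /// entries that `index` created for this body.
    fn deindex(&mut self, id: u64, body: &str) {
        for word in tokenize(body) {
            if let Some(ids) = self.postings.get_mut(&word) {
                ids.remove(&id);
                if ids.is_empty() {
                    self.postings.remove(&word);
                }
            }
        }
    }

    /// Querying also uses the same tokenizer, then intersects the
    /// posting sets of all query tokens.
    fn query(&self, terms: &str) -> HashSet<u64> {
        let mut words = tokenize(terms);
        let Some(first) = words.next() else {
            return HashSet::new();
        };
        let mut hits = self.postings.get(&first).cloned().unwrap_or_default();
        for word in words {
            match self.postings.get(&word) {
                Some(ids) => hits.retain(|id| ids.contains(id)),
                None => hits.clear(),
            }
        }
        hits
    }
}
```

Because `index`, `deindex`, and `query` all call the same function, a later change to the splitting rule or the length limit cannot make one path drift from the others.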