Commit 8d7fd48
Convert &amp; to & as a Characters token
This fixes a problem in LinkifyFilter when it is used with the Cleaner. The
Cleaner sets up the tokenizer to not consume entities, so character
entities end up in their own Entity tokens, and LinkifyFilter can't match
links that cross token boundaries. If a URL contains an &amp;, LinkifyFilter
won't match across it.
This fixes that by converting &amp; to & in the sanitizer when it's pulling out
entities and putting them in separate Entity tokens. The & Characters tokens
will get merged by BleachSanitizerFilter.__iter__ and & will get converted
back to &amp; in the serializer.
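
A minimal sketch of the idea, not the code from this commit. It assumes html5lib-style dict tokens ({"type": "Characters", "data": ...} and {"type": "Entity", "name": ...}). The helper names and URL_RE are hypothetical stand-ins for the sanitizer and LinkifyFilter logic.

```python
import re

# Simplified stand-in for a URL matcher; the real linkifier regex is more involved.
URL_RE = re.compile(r"https?://[^\s<>]+")


def convert_amp_entities(tokens):
    """Replace Entity tokens named 'amp' with Characters tokens holding '&'."""
    for token in tokens:
        if token["type"] == "Entity" and token["name"] == "amp":
            yield {"type": "Characters", "data": "&"}
        else:
            yield token


def merge_characters(tokens):
    """Merge runs of adjacent Characters tokens into a single token."""
    buffer = []
    for token in tokens:
        if token["type"] == "Characters":
            buffer.append(token["data"])
            continue
        if buffer:
            yield {"type": "Characters", "data": "".join(buffer)}
            buffer = []
        yield token
    if buffer:
        yield {"type": "Characters", "data": "".join(buffer)}


def find_links(tokens):
    """Apply the URL matcher to each Characters token separately."""
    links = []
    for token in tokens:
        if token["type"] == "Characters":
            links.extend(URL_RE.findall(token["data"]))
    return links


# Assumed token stream for "http://example.com/?a=1&amp;b=2" when the
# tokenizer does not consume entities.
tokens = [
    {"type": "Characters", "data": "http://example.com/?a=1"},
    {"type": "Entity", "name": "amp"},
    {"type": "Characters", "data": "b=2"},
]

before = find_links(tokens)
after = find_links(merge_characters(convert_amp_entities(tokens)))
```

In this sketch, `before` holds only the part of the URL in front of the entity, while `after` holds the whole URL, because the & now sits inside one merged Characters token.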
Fixes #422
1 parent 3097fd3 commit 8d7fd48
2 files changed
Lines changed: 16 additions & 1 deletion
File 1 (name and diff content not preserved): original line 398 removed, new lines 398 to 409 added.
File 2 (name and diff content not preserved): new lines 697 to 700 added.