[ Source: html-text ]
Package: python3-html-text (0.5.2-2)
Extract text from HTML.
How is html_text different from .xpath('//text()') from lxml or .get_text() from Beautiful Soup?
* Text extracted with html_text does not contain inline styles,
JavaScript, comments and other text that is not normally visible to
users;
* html_text normalizes whitespace, but in a smarter way than
.xpath('normalize-space()'), adding spaces around inline elements (which
are often used as block elements in HTML markup), and trying to avoid
adding extra spaces for punctuation;
* html_text can add newlines (e.g. after headers or paragraphs), so that
the output text looks more like how it is rendered in browsers.
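The three points above can be illustrated with a small standard-library sketch. This is not html_text's implementation: the class `VisibleTextParser`, the function `visible_text`, and both tag sets are hypothetical names chosen for this example, and the sketch does not attempt the inline-element or punctuation spacing heuristics. With the package installed, the library itself is called through `html_text.extract_text(html)`.

```python
import re
from html.parser import HTMLParser

# Hypothetical tag sets for this sketch; html_text's own lists may differ.
INVISIBLE_TAGS = {"script", "style", "head", "noscript"}
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br", "tr"}


class VisibleTextParser(HTMLParser):
    """Collect text a reader would see, skipping scripts, styles and comments."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside an invisible element

    def handle_starttag(self, tag, attrs):
        if tag in INVISIBLE_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")  # layout newline before a block element

    def handle_endtag(self, tag):
        if tag in INVISIBLE_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")  # layout newline after a block element

    def handle_data(self, data):
        if not self.skip_depth:
            # Collapse source whitespace so only layout newlines remain.
            self.parts.append(re.sub(r"\s+", " ", data))

    # handle_comment is not overridden, so HTML comments are dropped.


def visible_text(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    lines = "".join(parser.parts).split("\n")
    cleaned = [re.sub(r" +", " ", line).strip() for line in lines]
    return "\n".join(line for line in cleaned if line)


sample = """
<html><head><style>p { color: red; }</style></head>
<body>
  <h1>Title</h1>
  <!-- hidden comment -->
  <p>Hello,   <b>world</b>!</p>
  <script>alert("hi");</script>
</body></html>
"""
print(visible_text(sample))
```

For the sample document this prints `Title` and `Hello, world!` on separate lines: the style rule, the comment and the script body are gone, and the run of spaces after the comma is collapsed.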
Other packages related to python3-html-text
- dep: python3
- interactive high-level object-oriented language (default python3 version)
- dep: python3-lxml
- Python bindings for the libxml2 and libxslt libraries
Download python3-html-text
| Architecture | Package size | Installed size | Files |
|---|---|---|---|
| all | 9.0 kB | 38.0 kB | [list of files] |
