[ Source: html-text ]

Package: python3-html-text (0.5.2-2)


Extract text from HTML.

How is html_text different from .xpath('//text()') from lxml or .get_text() from Beautiful Soup?

 * Text extracted with html_text does not contain inline styles,
   JavaScript, comments, or other text that is not normally visible to
   users;
 * html_text normalizes whitespace, but more carefully than
   .xpath('normalize-space()'): it adds spaces around inline elements
   (which are often used as block elements in HTML markup) and tries to
   avoid adding extra spaces before punctuation;
 * html_text can add newlines (e.g. after headers or paragraphs), so that
   the output text looks more like how it is rendered in browsers.
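
The three behaviors listed above can be illustrated with a rough, standard-library-only sketch. This is not html_text's implementation: the class name VisibleTextParser, the function extract_visible_text, and the two tag sets are hypothetical and chosen only for illustration, and the punctuation rule is a deliberately crude assumption.

```python
import re
from html.parser import HTMLParser

# Assumption: illustrative tag sets, not the lists html_text actually uses.
HIDDEN_TAGS = {"script", "style"}
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br"}


class VisibleTextParser(HTMLParser):
    """Collect user-visible text, skipping scripts, styles and comments."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self.hidden_depth += 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")  # mark a line boundary

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS:
            self.hidden_depth = max(0, self.hidden_depth - 1)
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if self.hidden_depth == 0:
            # Collapse source whitespace so only tag-derived "\n" marks remain.
            self.chunks.append(" ".join(data.split()))

    # HTMLParser.handle_comment does nothing by default, so comments are dropped.


def extract_visible_text(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    # Joining with spaces separates text from adjacent inline elements.
    text = " ".join(parser.chunks)
    lines = []
    for line in text.split("\n"):
        line = " ".join(line.split())
        # Crude rule: drop the space inserted before closing punctuation.
        line = re.sub(r" ([,.;:!?)])", r"\1", line)
        if line:
            lines.append(line)
    return "\n".join(lines)


sample = (
    "<style>p{color:red}</style><h1>Title</h1><!-- note -->"
    "<p>Hello <b>world</b>!</p><script>var x = 1;</script>"
)
print(extract_visible_text(sample))
```

With python3-html-text installed, the library's own entry point is html_text.extract_text(html); its output may differ from this sketch, for example in how many newlines follow a header.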


Download python3-html-text

Download for all available architectures
Architecture  Package Size  Installed Size
all           9.0 kB        38.0 kB