| author | 2023-06-26 23:28:24 +0200 | |
|---|---|---|
| committer | 2023-06-26 23:28:24 +0200 | |
| commit | 2f485096784c9638cd5909bb826aecaeed004988 (patch) | |
| tree | 63ceaf0311d74edd76907a05ac0465e987161610 /docs | |
| parent | 666e951fa3e30c3a1b7f5ae68a2be4e06577b75d (diff) | |
docs: web scrapping with XPath (#5494)
* added docs
* add correct link
* typo
* A bit of typography
---------
Co-authored-by: Alexandre Alapetite <alexandre@alapetite.fr>
Diffstat (limited to 'docs')
| Mode | File | Lines changed |
|---|---|---|
| -rw-r--r-- | docs/en/users/02_First_steps.md | 2 |
| -rw-r--r-- | docs/en/users/11_website_scraping.md | 24 |
2 files changed, 26 insertions, 0 deletions
diff --git a/docs/en/users/02_First_steps.md b/docs/en/users/02_First_steps.md
index 7f176af77..17c3cb88a 100644
--- a/docs/en/users/02_First_steps.md
+++ b/docs/en/users/02_First_steps.md
@@ -25,3 +25,5 @@ Now that you’ve mastered basic use, it’s time to configure FreshRSS to impro
 * [Access your feeds on a mobile device](06_Mobile_access.md)
 * [Add some extensions](https://github.com/FreshRSS/Extensions)
 * [Frequently asked questions](07_Frequently_Asked_Questions.md)
+
+FreshRSS has a built-in engine that [scraps a website to create an own feed](11_website_scraping.md).
diff --git a/docs/en/users/11_website_scraping.md b/docs/en/users/11_website_scraping.md
new file mode 100644
index 000000000..9d27981b2
--- /dev/null
+++ b/docs/en/users/11_website_scraping.md
@@ -0,0 +1,24 @@
+# Website scraping
+
+FreshRSS has a built-in [Web scrapping](https://en.wikipedia.org/wiki/Web_scraping) engine that generates a feed from websites that have no RSS/Atom feed published.
+
+## How to add
+
+Go to “Subscription Management” where a new feed can be added.
+Change the “Type of feed source” to “HTML + XPath (Web scrapping)”.
+An additional list of text boxes to configure the web scraping.
+[XPath 1.0](https://www.w3.org/TR/xpath-10/) is used as traversing language.
+
+### Get the XPath path
+
+Firefox: the built-in “inspect” tool may be used to help create a valid XPath expression.
+Select the node in the HTML, right click with your mouse and chose “Copy” and “XPath”.
+The XPath is stored in your clipboard now.
+
+## Tipps & tricks
+
+- [Timezone of date](https://github.com/FreshRSS/FreshRSS/discussions/5483)
+
+## Recommended external manuals
+
+- [XPath Scraping with FreshRSS, by Dan Q](https://danq.me/2022/09/27/freshrss-xpath/) (September 2022)
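Reviewer note: the new page explains where to enter XPath expressions but does not show one in action. The sketch below is a rough illustration of the general idea, not FreshRSS's own implementation. It assumes a hypothetical page whose entries are `div` elements with `class="post"`, and it uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath 1.0 and needs well-formed markup. The markup, the `scrape_items` helper, and the field names are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page markup; a real site has its own structure.
PAGE = """
<html>
  <body>
    <div class="post">
      <h2><a href="/first">First post</a></h2>
      <time datetime="2023-06-26">26 June 2023</time>
    </div>
    <div class="post">
      <h2><a href="/second">Second post</a></h2>
      <time datetime="2023-06-27">27 June 2023</time>
    </div>
  </body>
</html>
"""


def scrape_items(markup):
    """Return one dict per entry found in the markup (illustrative helper)."""
    root = ET.fromstring(markup.strip())
    items = []
    # One expression selects every entry node on the page...
    for node in root.findall(".//div[@class='post']"):
        # ...and further expressions are evaluated relative to that node.
        link = node.find("./h2/a")
        stamp = node.find("./time")
        items.append({
            "title": link.text,
            "link": link.get("href"),
            "date": stamp.get("datetime"),
        })
    return items


items = scrape_items(PAGE)
for item in items:
    print(item)
```

An expression copied from the browser's inspect tool is usually an absolute path to a single node, so it typically has to be generalised by hand (for example to `.//div[@class='post']`) before it matches every entry on the page.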
