From 2f485096784c9638cd5909bb826aecaeed004988 Mon Sep 17 00:00:00 2001
From: maTh
Date: Mon, 26 Jun 2023 23:28:24 +0200
Subject: docs: web scrapping with XPath (#5494)

* added docs

* add correct link

* typo

* A bit of typography

---------

Co-authored-by: Alexandre Alapetite
---
 docs/en/users/02_First_steps.md      |  2 ++
 docs/en/users/11_website_scraping.md | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+)
 create mode 100644 docs/en/users/11_website_scraping.md

(limited to 'docs')

diff --git a/docs/en/users/02_First_steps.md b/docs/en/users/02_First_steps.md
index 7f176af77..17c3cb88a 100644
--- a/docs/en/users/02_First_steps.md
+++ b/docs/en/users/02_First_steps.md
@@ -25,3 +25,5 @@ Now that you’ve mastered basic use, it’s time to configure FreshRSS to impro
 * [Access your feeds on a mobile device](06_Mobile_access.md)
 * [Add some extensions](https://github.com/FreshRSS/Extensions)
 * [Frequently asked questions](07_Frequently_Asked_Questions.md)
+
+FreshRSS has a built-in engine that [scrapes a website to create a feed of your own](11_website_scraping.md).
diff --git a/docs/en/users/11_website_scraping.md b/docs/en/users/11_website_scraping.md
new file mode 100644
index 000000000..9d27981b2
--- /dev/null
+++ b/docs/en/users/11_website_scraping.md
@@ -0,0 +1,24 @@
+# Website scraping
+
+FreshRSS has a built-in [web scraping](https://en.wikipedia.org/wiki/Web_scraping) engine that generates a feed from websites that do not publish an RSS/Atom feed.
+
+## How to add
+
+Go to “Subscription Management”, where a new feed can be added.
+Change the “Type of feed source” to “HTML + XPath (Web scraping)”.
+An additional list of text boxes then appears for configuring the web scraping.
+[XPath 1.0](https://www.w3.org/TR/xpath-10/) is used as the traversal language.
+
+### Get the XPath path
+
+Firefox: the built-in “inspect” tool may be used to help create a valid XPath expression.
+Select the node in the HTML, right-click it, and choose “Copy”, then “XPath”.
+The XPath is now in your clipboard.
+
+## Tips & tricks
+
+- [Timezone of date](https://github.com/FreshRSS/FreshRSS/discussions/5483)
+
+## Recommended external manuals
+
+- [XPath Scraping with FreshRSS, by Dan Q](https://danq.me/2022/09/27/freshrss-xpath/) (September 2022)
-- 
cgit v1.2.3
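
Reviewer note (not part of the patch): an XPath expression copied from the browser can be tried out before it is pasted into FreshRSS. The sketch below is only an illustration under stated assumptions. The page markup, the class names and the `scrape_items` helper are all invented for the example, and it is not how FreshRSS itself evaluates expressions. Python's standard `xml.etree.ElementTree` implements only a subset of XPath 1.0 and needs well-formed markup, so real pages would likely need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical page markup, invented for illustration. It is well-formed XML
# so that the standard-library parser accepts it.
PAGE = """
<html>
  <body>
    <div class="post">
      <h2><a href="/posts/1">First post</a></h2>
      <p class="summary">Hello world</p>
    </div>
    <div class="post">
      <h2><a href="/posts/2">Second post</a></h2>
      <p class="summary">More news</p>
    </div>
  </body>
</html>
"""


def scrape_items(markup):
    """Return one dict per matched item (hypothetical helper)."""
    root = ET.fromstring(markup.strip())
    items = []
    # One expression selects every item node on the page...
    for node in root.findall(".//div[@class='post']"):
        # ...and relative expressions pick the fields inside each item.
        link = node.find("h2/a")
        items.append({
            "title": link.text,
            "link": link.get("href"),
            "content": node.findtext("p[@class='summary']"),
        })
    return items


if __name__ == "__main__":
    for item in scrape_items(PAGE):
        print(item)
```

The split into one expression for the item nodes and relative expressions for each field is an assumption about how such a scraper is usually configured. The exact text boxes FreshRSS offers should be checked in its own form.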