Hi, I'm trying to fetch an RSS feed that is behind Cloudflare.
Unfortunately, the response is not the raw XML but the rendered source code:

<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><style id=\"xml-viewer-style\">/* Copyright 2014 The Chromium Authors\n * Use of this source code is governed by a BSD-style license that can be\n * found in the LICENSE file.\n */\n\n:root {\n  color-scheme: light dark;\n}\n\ndiv.header {\n    border-bottom: 2px solid black;\n    padding-bottom: 5px;\n    margin: 10px;\n}\n\n@media (prefers-color-scheme: dark) {\n  div.header {\n    border-bottom: 2px solid white;\n  }\n}\n\ndiv.folder &gt; div.hidden {\n    display:none;\n}\n\ndiv.folder &gt; span.hidden {\n    display:none;\n}\n\n.pretty-print {\n    margin-top: 1em;\n    margin-left: 20px;\n    font-family: monospace;\n    font-size: 13px;\n}\n\n#webkit-xml-viewer-source-xml {\n    display: none;\n}\n\n.opened {\n    margin-left: 1em;\n}\n\n.comment {\n    white-space: pre;\n}\n\n.folder-button {\n    user-select: none;\n    cursor: pointer;\n    display: inline-block;\n    margin-left: -10px;\n    width: 10px;\n    background-repeat: no-repeat;\n    background-position: left top;\n    vertical-align: bottom;\n}\n\n.fold {\n    background: url(\"data:image/svg+xml,&lt;svg xmlns='http://www.w3.org/2000/svg' fill='%23909090' width='10' height='10'&gt;&lt;path d='M0 0 L8 0 L4 7 Z'/&gt;&lt;/svg&gt;\");\n    height: 10px;\n}\n\n.open {\n    background: url(\"data:image/svg+xml,&lt;svg xmlns='http://www.w3.org/2000/svg' fill='%23909090' width='10' height='10'&gt;&lt;path d='M0 0 L0 8 L7 4 Z'/&gt;&lt;/svg&gt;\");\n    height: 10px;\n}\n</style></head><body><div id=\"webkit-xml-viewer-source-xml\"><rss xmlns=\"\" xmlns:content=\"http://purl.org/rss/1.0/modules/content/\" xmlns:wfw=\"http://wellformedweb.org/CommentAPI/\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:atom=\"http://www.w3.org/2005/Atom\" xmlns:sy=\"http://purl.org/rss/1.0/modules/syndication/\" xmlns:slash=\"http://purl.org/rss/1.0/modules/slash/\" version=\"2.0\">\n\n<channel>\n\t<title>#####</title>\n\t<atom:link href=\"https://######.com/feed/\" 
rel=\"self\" type=\"application/rss+xml\"/>

I've tried the returnRawHtml option, but it didn't change anything.

Is there a way to get the real feed source code?

It should start with:

<?xml version="1.0" encoding="UTF-8"?>

Thank you in advance.

Replies: 3 comments · 5 replies

Did you solve it? I also encountered the same problem.

I didn't dig into the code, so no. I hoped that one of the devs could take a look.

@gl0zzy

I have written a program to solve this problem.

@Organizer21

"I have currently written a program to solve this problem"

Are you able to share it? I'm also facing this issue, as I need the pure XML returned from an RSS feed.

@gl0zzy

I can't guarantee that the output produced by the program is correct. I only tested it on two RSS websites.
After you start the app, you can visit http://localhost:8192/http://example.com/rss.xml to get the RSS feed.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"github.com/antchfx/htmlquery"
	"github.com/antchfx/xmlquery"
	"github.com/gin-gonic/gin"
	"html"
	"io"
	"net/http"
	"strings"
)

type Response struct {
	Status   string `json:"status"`
	Message  string `json:"message"`
	Solution struct {
		Response string            `json:"response"`
		Headers  map[string]string `json:"headers"`
	} `json:"solution"`
}

func main() {
	app := gin.Default()
	app.GET("/*url", func(c *gin.Context) {
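		// The wildcard parameter starts with "/", so strip it to recover the target URL.
		target := c.Param("url")[1:]
		// Re-attach the query string only when the request actually has one.
		if q := c.Request.URL.RawQuery; q != "" {
			target += "?" + q
		}
		// Marshal the payload so quotes or backslashes in the URL cannot break the JSON.
		payload, err := json.Marshal(map[string]string{"cmd": "request.get", "url": target})
		if err != nil {
			c.String(500, err.Error())
			return
		}
		post, err := http.Post("http://localhost:8191/v1", "application/json", bytes.NewBuffer(payload))
		if err != nil {
			c.String(502, fmt.Sprintf("FlareSolverr request failed: %v", err))
			return
		}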
		defer post.Body.Close()
		all, err := io.ReadAll(post.Body)
		if err != nil {
			// Report the failure to the client instead of returning an empty reply.
			c.String(502, err.Error())
			return
		}
		var response Response
		err = json.Unmarshal(all, &response)
		if err != nil {
			c.String(200, string(all))
			return
		}
		if response.Status != "ok" {
			c.String(500, response.Message)
			return
		}
		c.Header("Content-Type", response.Solution.Headers["content-type"])
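		// Chrome's XML viewer wraps the original feed in a hidden div; return that div's contents.
		if strings.Contains(response.Solution.Response, "<body><div id=\"webkit-xml-viewer-source-xml\">") {
			doc, err := xmlquery.Parse(strings.NewReader(response.Solution.Response))
			if err != nil {
				c.String(500, err.Error())
				return
			}
			node := xmlquery.FindOne(doc, "//div[@id=\"webkit-xml-viewer-source-xml\"]")
			if node == nil {
				// Guard against a nil dereference when the query finds nothing.
				c.String(500, "webkit-xml-viewer-source-xml element not found")
				return
			}
			c.Header("Content-Type", "text/xml;charset=UTF-8")
			c.String(200, node.OutputXML(false))
			return
		}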
		// Other responses carry the escaped XML inside a <pre> element; unescape it.
		if strings.Contains(response.Solution.Response, "?xml version") &&
			strings.Contains(response.Solution.Response, "<body><pre") {
			root, err := htmlquery.Parse(strings.NewReader(response.Solution.Response))
			if err != nil {
				c.String(500, err.Error())
				return
			}
			pre := htmlquery.FindOne(root, "//pre")
			if pre == nil {
				c.String(500, "pre element not found")
				return
			}
			c.Header("Content-Type", "text/xml;charset=UTF-8")
			c.String(200, html.UnescapeString(htmlquery.OutputHTML(pre, false)))
			return
		}
		c.String(200, response.Solution.Response)
	})
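	// Run blocks until the server stops; surface startup errors such as a port already in use.
	if err := app.Run(":8192"); err != nil {
		panic(err)
	}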
}
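
For readers who only want the extraction step, here is a minimal sketch of the same idea using only the Go standard library. It is not part of the program above: the names `viewerPage` and `extractViewerXML` are hypothetical, and it assumes the viewer page is well-formed XHTML with the `webkit-xml-viewer-source-xml` div directly under `<body>`, as in the response quoted in the question. `encoding/xml` is strict, so pages that are not well-formed will simply fail to parse. The extracted text does not include the `<?xml ...?>` declaration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// viewerDivID is the id of the hidden div seen in the quoted response.
const viewerDivID = "webkit-xml-viewer-source-xml"

// viewerPage maps only the parts of the viewer page needed here (hypothetical type).
type viewerPage struct {
	Body struct {
		Divs []struct {
			ID    string `xml:"id,attr"`
			Inner string `xml:",innerxml"` // raw markup nested inside the div
		} `xml:"div"`
	} `xml:"body"`
}

// extractViewerXML returns the markup wrapped by the viewer's source div.
// The boolean is false when the page does not parse or the div is missing.
func extractViewerXML(page string) (string, bool) {
	var p viewerPage
	if err := xml.Unmarshal([]byte(page), &p); err != nil {
		return "", false
	}
	for _, d := range p.Body.Divs {
		if d.ID == viewerDivID {
			return d.Inner, true
		}
	}
	return "", false
}

func main() {
	// A shortened stand-in for the kind of page shown in the question.
	page := `<html xmlns="http://www.w3.org/1999/xhtml"><head></head><body>` +
		`<div id="webkit-xml-viewer-source-xml"><rss version="2.0"><channel>` +
		`<title>Demo</title></channel></rss></div></body></html>`
	if feed, ok := extractViewerXML(page); ok {
		fmt.Println(feed)
	} else {
		fmt.Println("no viewer wrapper found")
	}
}
```

When the boolean is false, a caller could fall back to returning the response unchanged, which is what the program above does for bodies it does not recognize.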

@gl0zzy thank you, I see how it works (I'm just not a coder myself). On that topic, I'm running FlareSolverr in a Docker container on a NAS. I've pretty much not touched the code and simply get my results via a remote curl to the NAS IP on port 8196. Could you give me a quick nudge in the direction I'd have to take to get your addition above running and listening on 8192 with such a setup? Consider me a total noob with non-HTML code and someone who mainly uses GUIs, so any details that help me get going in the right direction are welcome. :)

@gl0zzy

I store the source code in this repository: https://github.com/gl0zzy/flaresolverr-proxy
You can clone it to your NAS and run go build in the project directory (make sure you have Go installed). This produces an executable file; run it to start the proxy.
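
Once the executable is running, a client fetches a feed by appending the feed URL to the proxy address. The sketch below shows that call from Go. The names `proxyURL` and `fetchFeed` and the 90-second timeout are assumptions for illustration, not part of the repository. On a NAS, `localhost` in the client would be replaced by the NAS address. Note that the program posted above sends its requests to `http://localhost:8191/v1`, so FlareSolverr must be reachable at that address from wherever the proxy runs, or that address has to be changed in the source.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// proxyURL appends the feed URL to the proxy's base address, giving the
// http://localhost:8192/http://example.com/rss.xml form used by the proxy.
func proxyURL(proxyBase, feedURL string) string {
	return strings.TrimRight(proxyBase, "/") + "/" + feedURL
}

// fetchFeed requests the feed through the proxy and returns the response body.
func fetchFeed(client *http.Client, proxyBase, feedURL string) (string, error) {
	resp, err := client.Get(proxyURL(proxyBase, feedURL))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("proxy returned %s: %s", resp.Status, body)
	}
	return string(body), nil
}

func main() {
	// Assumed timeout: solving a challenge can take a while.
	client := &http.Client{Timeout: 90 * time.Second}
	feed, err := fetchFeed(client, "http://localhost:8192", "http://example.com/rss.xml")
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Println(feed)
}
```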

@Organizer21

Thank you, let me see how far I get :)
