Fixed HTML regex filter. Edited some logging levels. See full msg.

The HTML regex wasn't working because its character class was
missing some really obvious characters: digits and the hyphen.
The regex filter is really only for the kernel.org Releases Feed,
just to make it look a bit cleaner. We don't actually need the
direct links because the post's title already links directly to
the front page. We mostly just want to know there's been a release.
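
A quick sketch of the difference, using the two patterns from this
commit. The sample string and its example.org URL are made up for
illustration; they are not taken from the real feed.

```python
import re

# Pattern before this commit: no digits or hyphen in the character class.
old_filter = re.compile(r"\<\/?([A-Za-z \:\.\/\"\=])*\>")
# Pattern after this commit: digits and the hyphen are allowed inside a tag.
new_filter = re.compile(r"\<\/?([A-Za-z0-9 \:\.\-\/\"\=])*\>")

# Hypothetical summary fragment; the URL is a placeholder.
sample = '<a href="https://example.org/linux-6.1.tar.xz">tarball</a>'

# The old pattern gives up on the opening tag at the first hyphen,
# so only the closing </a> is stripped.
old_result = old_filter.sub("", sample)
# The new pattern matches both tags and leaves just the link text.
new_result = new_filter.sub("", sample)

print(old_result)
print(new_result)
```

Tags containing anything still outside the character class (an
underscore or an ampersand, for example) would be left in place by
the new pattern too.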

Some log calls were moved to a lower level (info to debug) to
make the log less cluttered by default. I'm going to write a
logrotate config, to be included in the README or something, to
help people with rotating the logs automatically. Also added a
few small clarifying comments, as well as an IDEA!
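
A minimal sketch of why the level change cuts the clutter. It
assumes the logger runs at INFO by default; the logger setup is not
part of this diff, so the logger name, handler and format here are
all hypothetical.

```python
import io
import logging

# Capture output in memory so the effect is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("discorss_sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)  # assumed default level
logger.addHandler(handler)
logger.propagate = False  # keep the root logger out of this sketch

# Below INFO, so this per-feed message is dropped.
logger.debug("Parsing feed %s...", "example feed")
# At or above INFO, so errors still reach the log.
logger.error("Error %d while trying to post %s", 404, "example feed")

output = stream.getvalue()
print(output, end="")
```

Calling logger.setLevel(logging.DEBUG) would bring the per-feed
messages back when troubleshooting.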
A.M. Rowsell 2025-03-03 07:41:41 -05:00
parent 1c78edd38e
commit 8ff64608cd
Signed by: amr
GPG key ID: 0B6E2D8375CF79A9

@@ -32,10 +32,14 @@ log_dir = r"/var/log/discorss"
 log_file_path = r"/app.log"
 # Yes, I know you "can't parse HTML with regex", but
 # just watch me.
-html_filter = re.compile(r"\<\/?([A-Za-z \:\.\/\"\=])*\>")
+html_filter = re.compile(r"\<\/?([A-Za-z0-9 \:\.\-\/\"\=])*\>")
 success_codes = [200, 201, 202, 203, 204, 205, 206]
 app_config = {}
+# IDEA: Consider making this into a class-based program
+# This would solve a couple issues around global variables and generally
+# make things a bit neater
 # This function gets and formats the brief excerpt that goes in the embed
 # Different feeds put summaries in different fields, so we pick the best
@@ -109,7 +113,7 @@ def main():
     except KeyError:
         last_check = now - 21600  # first run, no lastupdate, check up to 6 hours ago
     for i, hook in enumerate(app_config["feeds"]):  # Feed loop start
-        logger.info("Parsing feed %s...", hook["name"])
+        logger.debug("Parsing feed %s...", hook["name"])
         feeds = feedparser.parse(hook["url"])
         latest_post = []
         prev_best = 0
@@ -183,10 +187,14 @@ def main():
         r = requests.post(hook["webhook"], data=webhook_string, headers=custom_header)
         if r.status_code not in success_codes:
             logger.error(
-                "Error %d while trying to post %s", r.status_code, hook["webhook"]
+                "Error %d while trying to post %s", r.status_code, hook["name"]
             )
+        else:
+            logger.debug("Got %d when posting %s", r.status_code, hook["name"])
         # End of feed loop
     # Dump updated config back to json file
     app_config["lastupdate"] = now
     with open(config_file_path, "w") as config_file:
         json.dump(app_config, config_file, indent=4)