I built a simple RSS reader in Python and it is not working. In addition, I want to get the featured image source link of every post, and I haven't found a way to do so.
It shows me the error:
Traceback (most recent call last):
  File "RSS_reader.py", line 7, in <module>
    feed_title = feed['feed']['title']
Some other RSS feeds work fine, so I don't understand why some feeds work and others don't.
I would like to understand why the code doesn't work and also how to get the featured image source link of a post. I have attached the code; it is written in Python 3.7.
import feedparser
import webbrowser

feed = feedparser.parse("https://finance.yahoo.com/rss/")
feed_title = feed['feed']['title']
feed_entries = feed.entries

for entry in feed.entries:
    article_title = entry.title
    article_link = entry.link
    article_published_at = entry.published  # Unicode string
    article_published_at_parsed = entry.published_parsed  # Time object
    article_author = entry.author
    content = entry.summary
    article_tags = entry.tags

    print("{}[{}]".format(article_title, article_link))
    print("Published at {}".format(article_published_at))
    print("Published by {}".format(article_author))
    print("Content {}".format(content))
    print("catagory{}".format(article_tags))
Best Answer
A few things.
1) First, feed['feed']['title'] does not exist.
2) At least for this site, entry.author and entry.tags do not exist.
3) It seems feedparser is not compatible with Python 3.7 (it gives me a KeyError: "object doesn't have key 'category'").
So as a starting point, try running the following code in Python 3.6 and go from there.
import feedparser
import webbrowser

feed = feedparser.parse("https://finance.yahoo.com/rss/")
# feed_title = feed['feed']['title']  # NOT VALID
feed_entries = feed.entries

for entry in feed.entries:
    article_title = entry.title
    article_link = entry.link
    article_published_at = entry.published  # Unicode string
    article_published_at_parsed = entry.published_parsed  # Time object
    # article_author = entry.author  # DOES NOT EXIST
    content = entry.summary
    # article_tags = entry.tags  # DOES NOT EXIST

    print("{}[{}]".format(article_title, article_link))
    print("Published at {}".format(article_published_at))
    # print("Published by {}".format(article_author))
    print("Content {}".format(content))
    # print("catagory{}".format(article_tags))
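Instead of commenting out the fields a feed does not provide, the loop can read them defensively. The following is only a minimal sketch: the helper names `summarize_entry` and `print_feed` and the placeholder strings are made up for illustration. It relies on feedparser entries being dict-like, so `.get()` with a default can stand in for attribute access.

```python
def summarize_entry(entry):
    """Build a printable summary from a dict-like feed entry.

    Uses .get() so that fields a feed omits (author, tags, ...) fall back
    to a placeholder instead of raising an error.
    """
    title = entry.get("title", "(no title)")
    link = entry.get("link", "(no link)")
    published = entry.get("published", "(no date)")
    author = entry.get("author", "(unknown author)")
    # feedparser exposes tags as a list of dicts that carry a "term" key
    tags = [tag.get("term") for tag in entry.get("tags", []) if tag.get("term")]
    lines = [
        "{} [{}]".format(title, link),
        "Published at {}".format(published),
        "Published by {}".format(author),
        "Categories: {}".format(", ".join(tags) if tags else "(none)"),
    ]
    return "\n".join(lines)


def print_feed(url):
    """Fetch a feed with feedparser and print every entry (needs network access)."""
    import feedparser  # third-party: pip install feedparser

    feed = feedparser.parse(url)
    print(feed.feed.get("title", "(untitled feed)"))
    for entry in feed.entries:
        print(summarize_entry(entry))
```

Calling `print_feed("https://finance.yahoo.com/rss/")` should then print whatever each entry provides and a placeholder for the rest, rather than stopping at the first missing key.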
Good luck.
You can also use XML parser libraries like BeautifulSoup (https://www.crummy.com/software/BeautifulSoup/bs4/doc/) and create custom parsers. Sample custom parser code can be found here (https://github.com/vintageplayer/RSS-Parser). A walkthrough of the same can be read here (https://towardsdatascience.com/rss-feed-parser-in-python-553b1857055c)
Though dedicated feed libraries can be useful, BeautifulSoup is an extremely handy library to try out.
I have used BeautifulSoup for a beginner RSS feed reader project (you need to install lxml for it to work, since we are dealing with XML):
from bs4 import BeautifulSoup
import requests

url = requests.get('https://realpython.com/atom.xml')
soup = BeautifulSoup(url.content, 'xml')
entries = soup.find_all('entry')

for i in entries:
    title = i.title.text
    link = i.link['href']
    summary = i.summary.text
    print(f'Title: {title}\n\nSummary: {summary}\n\nLink: {link}\n\n------------------------\n')
You can find the YouTube video here: https://www.youtube.com/watch?v=8HbqO-TfjlI
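For the featured image part of the question, here is a minimal custom-parser sketch. It uses the standard library's xml.etree.ElementTree rather than BeautifulSoup, so it runs without extra installs. It assumes the feed carries the image in a Media RSS element (media:content or media:thumbnail) or in an RSS enclosure; feeds that embed the image only inside the HTML summary would need different handling. The sample feed and the names `find_image_url` and `list_images` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Media RSS namespace, used by the media:content and media:thumbnail elements
MEDIA_NS = "{http://search.yahoo.com/mrss/}"

# Made-up sample feed, for illustration only
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Post with media:content</title>
      <media:content url="https://example.com/a.jpg" medium="image"/>
    </item>
    <item>
      <title>Post with enclosure</title>
      <enclosure url="https://example.com/b.png" type="image/png" length="1234"/>
    </item>
    <item>
      <title>Post without image</title>
    </item>
  </channel>
</rss>"""


def find_image_url(item):
    """Return the first image URL found on an RSS <item>, or None."""
    for tag in (MEDIA_NS + "content", MEDIA_NS + "thumbnail"):
        element = item.find(tag)
        if element is None or not element.get("url"):
            continue
        # Skip media that is explicitly marked as something other than an image
        if element.get("medium", "image") != "image":
            continue
        return element.get("url")
    # Fall back to a plain RSS enclosure with an image MIME type
    enclosure = item.find("enclosure")
    if enclosure is not None and enclosure.get("type", "").startswith("image/"):
        return enclosure.get("url")
    return None


def list_images(rss_text):
    """Return (title, image URL or None) for every <item> in the feed text."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), find_image_url(item)) for item in root.iter("item")]


for title, image_url in list_images(SAMPLE_RSS):
    print(title, "->", image_url)
```

With a real feed, the text passed to `list_images` would be the downloaded response body, for example `requests.get(...).content` as in the BeautifulSoup snippet.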