I'm struggling to get this working with MSN news articles. Here's the approach I'm using:
import requests
from bs4 import BeautifulSoup as bs
from readability import Document

def fetch_url(url: str, timeout: int = 10) -> str:
    """Get the content from a page at URL, if it is a URL."""
    if not is_url(url):  # is_url is a helper defined elsewhere (not shown)
        return url
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    soup = bs(response.content, "html.parser")
    return soup.get_text()

def summarize(content: str) -> str:
    """Take content and use readability to return a document summary."""
    doc = Document(content)
    title: str = doc.short_title()
    summary: str = bs(doc.summary(), "lxml").text
    return f"{title}\n{summary}"
This works well on all the other news sites I've tried, but with MSN it's different.
For example, with this URL I only get "MSN" for the title, and the summary is empty.
Any suggestions?
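One check that might help narrow this down, as a rough sketch: measure how much visible text the fetched HTML contains outside `<script>`, `<style>`, and `<noscript>` tags. If there is almost none, the article body is probably filled in by JavaScript after the page loads, and readability would have nothing to summarize. That cause is only an assumption here, not something confirmed for MSN. The names `VisibleTextCounter` and `looks_script_rendered` and the 200-character threshold are made up for illustration.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Count characters of text outside <script>, <style>, and <noscript>."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0      # > 0 while inside a skipped element
        self.visible_chars = 0    # running total of visible text length

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that a reader would actually see.
        if self._skip_depth == 0:
            self.visible_chars += len(data.strip())


def looks_script_rendered(html: str, min_chars: int = 200) -> bool:
    """Return True if the HTML has fewer than min_chars of visible text.

    The threshold is an arbitrary, hypothetical cutoff; tune it as needed.
    """
    parser = VisibleTextCounter()
    parser.feed(html)
    parser.close()
    return parser.visible_chars < min_chars
```

Calling `looks_script_rendered(response.text)` on the raw response, before any tags are stripped, would show whether the static HTML carries the article at all. If it returns `True`, the problem is likely upstream of readability.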