Failing to Automatically Retrieve Content from the Washington Post


#1

Whenever I try to link to a story from the Washington Post in one of my lists, Listly consistently fails to automatically retrieve the title, image and text from the story. Inevitably, I have to use my own stock photos for these entries and manually enter the title and preview text from the article.

I have not had this problem with any other news site (e.g. The New York Times, the Wall Street Journal, Politico, The Hill, Vox, CNN, USA Today, NPR), but it happens every time I link to content from the Post.

Can you please look into this? Many thanks!


#2

@drosen we will investigate this, but my guess is that WaPo blocks access to “robots” such as our code that grabs title, description and images from their pages. There might be a workaround. We will check. Meanwhile, can you please share your list(s) here?


#3

Thanks for looking into this. Here’s the list.


#4

Just thinking out loud: Some of this material has to be available/accessible in order for the Post’s articles to render correctly on social media sites like Facebook and Twitter, right? Is that somehow accessible?
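
For context, the preview cards on Facebook and Twitter are normally built from Open Graph `<meta>` tags (`og:title`, `og:description`, `og:image`) in the page's HTML head. The following is only a rough sketch of how such tags could be read with Python's standard library. The names `OpenGraphParser`, `extract_preview` and `SAMPLE_HTML` are made up for illustration, the sample markup is invented, and nothing here reflects how Listly's code works or what the Post's servers actually return to automated requests.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Open Graph uses the "property" attribute; some pages use "name" instead.
        prop = attr.get("property") or attr.get("name") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def extract_preview(html):
    """Return the title, description and image URL declared via Open Graph."""
    parser = OpenGraphParser()
    parser.feed(html)
    return {
        "title": parser.tags.get("og:title"),
        "description": parser.tags.get("og:description"),
        "image": parser.tags.get("og:image"),
    }


# Invented sample markup, not taken from any real article.
SAMPLE_HTML = """
<html><head>
  <meta property="og:title" content="Example headline">
  <meta property="og:description" content="Example summary text.">
  <meta property="og:image" content="https://example.com/photo.jpg">
</head><body></body></html>
"""

preview = extract_preview(SAMPLE_HTML)
print(preview["title"])
```

Any field the page does not declare comes back as `None`, so a caller could fall back to the `<title>` element or to manual entry.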


#5

@drosen you are correct. Our team investigated, and it looks like WaPo takes too long to respond (more than 10 seconds), so our code times out the connection and assumes the site is unavailable. WaPo seems to be consistently slower to respond than other sites. We are looking for a solution.
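
To illustrate the kind of fix under consideration, here is a minimal sketch (not our production code) of retrying a slow site with a longer timeout before giving up. The function names `http_get` and `fetch_with_escalating_timeout`, the user-agent string, and the 30-second second attempt are all assumptions chosen for the example; only the 10-second first attempt comes from the behaviour described above.

```python
import socket
import urllib.error
import urllib.request


def http_get(url, timeout):
    """Fetch a URL and return its body as text; raise TimeoutError on a timeout."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "preview-fetcher-sketch/0.1"}  # hypothetical value
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except socket.timeout as exc:
        # Raised when the server accepts the connection but is slow to send data.
        raise TimeoutError(str(exc)) from exc
    except urllib.error.URLError as exc:
        # A timeout while connecting is wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            raise TimeoutError(str(exc.reason)) from exc
        raise


def fetch_with_escalating_timeout(url, timeouts=(10, 30), fetch=http_get):
    """Try each timeout in turn, retrying only when the previous attempt timed out."""
    last_error = None
    for timeout in timeouts:
        try:
            return fetch(url, timeout)
        except TimeoutError as exc:
            last_error = exc
    raise TimeoutError(f"no response from {url} within {timeouts[-1]}s") from last_error
```

A call such as `fetch_with_escalating_timeout("https://example.com/article")` would wait up to 10 seconds, then up to 30 more on a second attempt. The trade-off is that a genuinely unavailable site now takes longer to be reported as such, so the retry would probably belong in a background job rather than in the request that adds the link.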


#6

Okay, appreciate it. Hope you can find a fix, because we gather a fair amount of content from them.