

Seeking help! No matter how many keywords/names u search related to this story the article doesn't appear on Google

This was a front page A1 story I wrote for WaPo on how smear campaigns and abuse women journalists endure are a press freedom issue. Can someone explain why the article does not appear on Google? washingtonpost.com/investigati…


in reply to Taylor Lorenz

One of the journalists featured in this story raised it to me, she is wondering if bad actors have been able to get the story hidden from Google. These women journalists were so brave to detail their abuse and harassment, and now the story is essentially wiped from the web. What is going on!?


in reply to Taylor Lorenz

Even searching for the URL directly in Google won't show it. It only returns links to posts that link to the article.

The only reason I can think is this feature somehow made it on their suppression list. This is just wrong.

in reply to Taylor Lorenz

I’ve been able to find syndicated copies of the story (on sites like rsn.org and thefridaytimes.com) which credit WaPo for the original. But definitely not finding the original easily. Seems like it must be on an explicit “no index” list, which is certainly something either Google or WaPo could do. Given that the behaviour is consistent across other search engines, I’d suspect WaPo has set a “no index” flag. Interesting that they still have the article online if you know the link.
in reply to Taylor Lorenz

Perhaps Alphabet has patriarchal elements that are hidden from the public gaze...
in reply to Taylor Lorenz

Is it paywalled? I can't get more than the first couple of paragraphs.
in reply to Taylor Lorenz

It would be extremely interesting to know when media orgs do this. I doubt this is the first time.
in reply to Taylor Lorenz

if I had to guess, which I am, there's probably an active lawsuit open between the publication and AI companies, so it's on a 'no-fly' list publicly even though they're still gonna scrape the data
in reply to Taylor Lorenz

So I happen to use Firefox on Ubuntu, but google search, and it shows up for me right now.
in reply to Taylor Lorenz

it's because there's a "noindex" tag in the head. This is telling search engines not to display the article in search results


in reply to fromjason.xyz ❤️ 💻

@fromjason
It's difficult to reconcile "democracy dies in darkness" with "content=noindex, content=noarchive".
in reply to Taylor Lorenz

Depends on what keywords you use. "women journalists harassed" gets a lot of hits. If I toss "Washington post" in front, I get this article on top of a bunch of other articles about harassment of women journalists.
in reply to Taylor Lorenz

Does WaPo’s SEO suck? (Probably, I don’t know) Any way to get them to look into it?

Too bad google bombing doesn’t work anymore. We could mount a linking campaign.

in reply to Sarah Brown

@goatsarah @james Turns out “Democracy Dies in Darkness” was more a statement of intent than a warning


in reply to Sarah Brown

@goatsarah @james now we just need to know if this is standard practice for all older articles at WaPo or if it is selective.
in reply to Taylor Lorenz

not sure this is what may be playing a role, but WaPo’s robots.txt file (one way webmasters can request robots handle things on their domain) has this:

User-agent: Google-Extended
Disallow: /

That asks Google not to use anything on that domain for its AI products. As far as I can tell, “Google-Extended” is a robots.txt control token rather than the crawler behind Google Search, so this rule by itself should not keep the article out of search results.

Odd thing to have in a robots.txt file, but it is there at washingtonpost.com/robots.txt
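
To see which crawlers those two lines actually match, here is a minimal sketch using Python's standard `urllib.robotparser`. It only evaluates the quoted rule, not the full live robots.txt, and the article URL is a hypothetical placeholder. Python's matcher is not Google's, so treat this as an illustration of the matching logic only.

```python
from urllib.robotparser import RobotFileParser

# The two lines quoted above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""

# Hypothetical article URL, used only for illustration.
URL = "https://www.washingtonpost.com/example-article/"


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, "allowed" if allowed(agent, URL) else "blocked")
```

The rule names one token only, so an agent identifying as Googlebot is not covered by it. Note also that robots.txt governs crawling; asking for a page to be left out of search results is normally done with a `noindex` directive instead.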

in reply to Taylor Lorenz

I found it in duckduckgo, which is Bing with a better tailor. But it's buried without adding "Washington Post" to the search. It popped right up when I did.

I didn't find it on Alexandra search, but they don't seem to include Washington Post articles. They have some bizarre sources I just noticed. Newt Gingrich 360? Wtaf?

Google's algorithms must not like you.

Maybe the fediverse needs a search engine.

in reply to Taylor Lorenz

It's marked as "noindex".

You can look up the source code.
<meta name="robots" content="noindex"/><meta name="robots" content="noarchive, max-image-preview:large"/>
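
If you'd rather not read the page source by eye, a small sketch like the following pulls the directives out of any `<meta name="robots">` tags in an HTML string. It uses only Python's standard `html.parser`; the class and function names are made up for this example, and the sample HTML is just the two tags quoted above.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from every <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # A content attribute may hold several comma-separated directives.
        for item in attr.get("content", "").split(","):
            if item.strip():
                self.directives.add(item.strip().lower())


def robots_directives(html: str) -> set[str]:
    """Return the set of robots meta directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


SAMPLE = (
    '<meta name="robots" content="noindex"/>'
    '<meta name="robots" content="noarchive, max-image-preview:large"/>'
)

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE)))
```

This only inspects meta tags. A site can also send the same directives in an `X-Robots-Tag` HTTP response header, which this sketch does not look at.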

in reply to Taylor Lorenz

Perhaps because Bezos prefers plastique women over the real thing.
mastodon.online/@davidaugust/1…


Amazon is flooding striking workers at DBK4 in Queens with freezing water in sub-zero weather, endangering everyone.

If you can, please do not shop at Amazon right now.

Social media posts confirming the flooding:
instagram.com/reel/DD2nNx1Sl6i…

instagram.com/reel/DD2oNicJ96-…

facebook.com/share/r/bjYgQLcLV…

twitter.com/nycdsa/status/1870…

#UnionStrong #union #amazon #strike #law


in reply to Taylor Lorenz

seems like it's not in the index, but you have to be the owner of the site to use the tool to inspect the index I think.

support.google.com/webmasters/…

in reply to Taylor Lorenz

Maybe a failure of SEO optimization? That sometimes happened with my articles in a big newspaper.
in reply to Taylor Lorenz

Forgive my laziness in not checking myself, but have they told Google not to index the article via a robots.txt file?
in reply to Taylor Lorenz

the page is set to noindex. This means that WaPo asks google not to put this article into search results. This is either a manual error or on purpose, but definitely something from WaPo. Happy to provide more info if you dm me.
in reply to Taylor Lorenz

I searched for the headline, in quotes, on #Qwant. The top 12 results link to it but none are the original article. The Washington Post link was in 13th place. I suspect the paper has done something in the indexing settings for that page.
in reply to Taylor Lorenz

google returned the article in a search of the first sentence of the headline.
in reply to Taylor Lorenz

It looks like you're already getting answers about an internal block. I'll just add that when searching for the title of the article on Kagi, I don't get any results to WAPO either, only mentions of the article on other sites.
in reply to Taylor Lorenz

Neither duckduckgo, startpage, qwant show this story, so I think this might be something on WaPo's side.
in reply to Taylor Lorenz

I got it on first page of Google search results but not via Washington Post
I used search “by Taylor Lorenz” women journalists.
However, I couldn’t get the link to open.

unifor2000.ca/these-women-jour…

in reply to Taylor Lorenz

Search engine AI learns many people don’t click on links to paywall sites

So they drop in the rankings and get clicked /featured even less

in reply to Taylor Lorenz

I got it using quotation marks but had to include "Washington Post." It should be easier to find than that.
in reply to Taylor Lorenz

Is this blatant manipulation of the algorithms to prevent finding such stories? I've noticed incidents when seeking information on Trump is not allowed and have used newspapers instead.
in reply to Taylor Lorenz

Google has been memory-holing results that do not fit the narrative.

The beauty of owning the search engine is that it can become, accidentally on purpose, an un-search engine. Google keeps its algo super secret, as many SEO experts will attest...

...try Duck duck go, often it's better for subjects google wants to forget.

in reply to Taylor Lorenz

Looking at the article, I would say it has been marked that way in the code.
1/2
in reply to Taylor Lorenz

I suspect the problem is that #WaPo is owned by #billionaire #Bezos. He prevented the paper from endorsing #KamalaHarris' campaign in order to curry favor from #convictedfelon #DonaldTrump. WaPo lost credibility because of that, to the detriment of many fine journalists working there.
Maybe the article needs to be published by some other outlet that is more trustworthy.

Update: The simple reason is that WaPo puts "noindex" tags on their pages!

in reply to Taylor Lorenz

It showed up as the fourth link in mojeek (mojeek.com). Mojeek is an independent web crawler, so it doesn't rely upon Google or Bing for results.
in reply to Taylor Lorenz

It’s also not in @kagihq - however, interestingly, a link to archive.md/3g14T does appear, which links to the story while also potentially explaining what happened to it. (The link in Kagi is to the WaPo, but I’m linking here beyond the paywall to help others understand what’s going on).
in reply to Taylor Lorenz

Meanwhile, Taylor learns HTTP.

Get that website up and working Taylor. Even if it's just a splash page.

in reply to Taylor Lorenz

Never in my life have I seen an article that doesn't immediately appear when searching for the exact title.

Weirdly enough, it doesn't appear on DuckDuckGo either. Could it be that WaPo turned off indexing for this?

in reply to Erik Uden 🍑

there is a meta: noindex,noarchive tag in there, so kinda seems like it?

In the head, look for: (it's HTML-escaped, else Mastodon removes it)
<meta name="robots" content="noindex">
<meta name="robots" content="noarchive, max-image-preview:large">

in reply to Erik Uden 🍑

Hey @taylorlorenz, this may be even worse than you thought: The Washington Post un-indexed the article, so that no search engine is allowed to index it.

I used this index checker to both check an arbitrary WaPo article and the one you sent. You could try more articles to see if this is more of a common thing.

The way it looks to me now is that the Washington Post basically shadow-banned this article.


in reply to Erik Uden 🍑

Hey @dougv, could you maybe write a small script to check how many pages serve the noindex / noarchive tag? If it's just this one, it'd be pretty bad. It'd be important to collect the data now before they switch it up. Possibly you can just collect a bunch of links and use the tool I mentioned above 😁
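
A minimal sketch of what such a script could look like, using only the Python standard library. The function names, the User-Agent string, and the idea of passing in your own list of article URLs are all assumptions for illustration. It checks meta tags only, not the `X-Robots-Tag` header, and a paywalled site may serve different HTML to a script than to a browser.

```python
from html.parser import HTMLParser
from typing import Callable, Iterable
from urllib.request import Request, urlopen


class _RobotsMeta(HTMLParser):
    """Gather directives from <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").strip().lower() == "robots":
            content = attr.get("content") or ""
            self.directives |= {
                d.strip().lower() for d in content.split(",") if d.strip()
            }


def fetch_html(url: str) -> str:
    """Download a page; the User-Agent value is an arbitrary placeholder."""
    request = Request(url, headers={"User-Agent": "noindex-survey/0.1"})
    with urlopen(request, timeout=20) as response:
        return response.read().decode("utf-8", errors="replace")


def survey(
    urls: Iterable[str], fetch: Callable[[str], str] = fetch_html
) -> dict[str, set[str]]:
    """Map each URL to the robots meta directives found in its HTML."""
    results: dict[str, set[str]] = {}
    for url in urls:
        parser = _RobotsMeta()
        parser.feed(fetch(url))
        results[url] = parser.directives
    return results


def noindexed(results: dict[str, set[str]]) -> list[str]:
    """Return the URLs whose pages carry a noindex directive."""
    return sorted(url for url, found in results.items() if "noindex" in found)
```

Usage would be something like `noindexed(survey(list_of_article_urls))`. The `fetch` parameter is there so you can swap in saved HTML files or a slower, rate-limited downloader instead of hitting the site directly.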
in reply to Erik Uden 🍑

@ErikUden @dougv you could start with Screaming Frog. It will crawl a site and return SEO issues including noindex.
The free version stops at 500 pages
in reply to Erik Uden 🍑

HOLY fucking christ, what rabbit wormhole to hell did we just open here
in reply to Taylor Lorenz

The Post is serving this page with a <meta name="robots" content="noindex"> tag developers.google.com/search/d…
in reply to Doug Valenta

@dougv Holy cow. Most other articles don't have this, right?

mastodon.de/@ErikUden/11370270…




in reply to Taylor Lorenz

I don't see it on Google.
On DuckDuckGo I see it when I put 'Story Killers:' in front of the title. It's not on top though, it's one among other 'Story Killers' articles. I guess that's because it's already a year old.
It does show up like on DuckDuckGo if I put the whole thing in quotation marks.
in reply to Taylor Lorenz

not showing up in mojeek or stract… maybe they included a robots.txt exemption for scraping to avoid ai training or something
Unknown parent

sysop
@GossiTheDog
Proving again it only takes a couple words to make something "disappear" off the internet...
Unknown parent

Zack Whittaker
@GossiTheDog this is correct, i'm also seeing this. this is "noindex" code that hides the page from google and other search engines, and almost certainly set in the WaPo CMS.
Unknown parent

Expertenkommision Cyberunfall

Someone might crawl those news sites, building an index of “noindexed” topics?
Looking at @ZEITONLINE
@evawolfangel

@taylorlorenz

Unknown parent

FinchHaven

This makes me laugh almost without end

robots dot txt were put in use in the late 1990s if memory serves -- mine at my original web site dates to about 1998-99

Big news then was that Google was *not* respecting them, and was pulling down whatever it damn well pleased

Fast-forward 25 years and Google is still doing what it damn well pleases, but probably honoring private, backdoor deals/agreements as to whose web sites or web pages it will or will not expose

cc Jeff Bay-zohsss...

cc @taylorlorenz

in reply to Taylor Lorenz

Hi, it does appear for me, with the link you provided, I use Chromium on Linux Mint and with a bunch of blockers in case that works for you.
in reply to Taylor Lorenz

Hm. First link on Duck Duck Go for the headline "These women journalists were doing their jobs. That made them targets." is to a site called "TV News Check". Which contains a short blurb and a direct link to the article, published February 2023.

A search for "Women journalists washington post" brings up nothing for at least 6 "More Results" screens down.

in reply to Taylor Lorenz

Your direct link works, but I see only the first three paragraphs.

On RSN I see all:
rsn.org/001/these-women-journa…

So … let's share that link. The story is too important to suppress it, right?

in reply to Taylor Lorenz

fwiw with so many responses, google.com found it (I'm in the Netherlands) as number 10 (bottom of first page) using the following search string (not mentioning wapo or washington post):

"These women journalists were doing their jobs. That made them targets." "Taylor Lorenz"

#Google #SearchEngines #SearchEngineEffectiveness

in reply to Taylor Lorenz

Also not findable in Apple News, despite numerous/most other WaPo stories being readily accessible via that channel.
in reply to Taylor Lorenz

what would effectively do that is a no-index tag in the header. I can’t check that from my phone. It could have only been added by the WaPo.
in reply to Taylor Lorenz

Searching for [gharidah farooqi taylor lorenz] brings up a syndicated copy of the article
in reply to Taylor Lorenz

It works from here. How are you not trying to boost the article? I don't mind, but the experience tastes awkward.
in reply to Taylor Lorenz

Searching on Google for “Gharidah Farooqi attacked for clothing” didn’t return a link to that specific page but did return a link to a longer, presumably syndicated, version on The Friday Times (thefridaytimes.com/18-Feb-2023… ). The first part of the article looks to be verbatim the same as the Washington Post article but then it continues in the same vein giving further details. Other links include other articles on the same and related topics.
in reply to Stephen Booth

@stephenbooth yeah this is a weird, AI generated (it seems!?) rip off of my story with other weird text added in. So strange.
in reply to Taylor Lorenz

because it attributes the article to you and a WaPo series of articles I figured syndication or maybe it being sold on. Could WaPo have sold it on without you knowing?