
Robots.txt edits · the-seo-framework (open, 15 comments)

sybrew commented on May 31, 2024
Robots.txt edits

from the-seo-framework.

Comments (15)

commented on May 31, 2024

Hi @sybrew, thanks! I added a robots.txt in my root folder and that worked.
Thanks for replying.
Maybe add a robots.txt editor in an upcoming version 😊😁


sybrew commented on May 31, 2024

Hi @chandlerbing26

You can do either of the following:

  1. Add a robots.txt file to the root of your website anyway; then you'll have complete control over its contents. This is probably your best bet, but it does not translate well with WordPress Multisite's domain mapping (a corner case).
  2. Add to or overwrite our filters as you described. Yes, that can be added to the functions.php file. See https://tsf.fyi/docs/filters#where to learn about alternative methods.


sybrew commented on May 31, 2024

From #647: Add more directives for AI-blocking, including opt-out for "Google-Extended" -- see https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers#user-agents-in-robots.txt.
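Per the Google crawler documentation linked above, the "Google-Extended" opt-out is an ordinary robots.txt group; a minimal example of what such a directive would look like:

```
User-agent: Google-Extended
Disallow: /
```

This only opts the site out of Google's AI training uses; it does not affect Google Search crawling, which is governed by the regular Googlebot rules.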


proweb commented on May 31, 2024

how about multisite support?
Different robots.txt for different domains.


sybrew commented on May 31, 2024

Multisite support is fundamental and by default for all extensions unless otherwise stated.
That's why I noted that it will only work when no static file is present πŸ˜„.

Also, all sites I own are on a Multisite network so you don't have to worry about that!


proweb commented on May 31, 2024

So I can generate a different robots.txt for each site in my network?
How do I do that? Is there a manual anywhere?


sybrew commented on May 31, 2024

All open issues are just drafts for now; there's no operational code yet. That includes this issue.

The idea is that it will be different for each site in the network, yes.
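In the meantime, WordPress's stock `robots_txt` filter combined with `get_current_blog_id()` can already vary the output per site. This is only a sketch of that interim approach, not the planned extension; the blog IDs and bot names below are illustrative:

```php
add_filter( 'robots_txt', function ( $robots ) {
	// Illustrative per-site rules, keyed by blog ID (IDs are examples).
	$extra = [
		1 => "User-agent: some-bot\nDisallow: /\n\n",
		2 => "User-agent: another-bot\nDisallow: /private/\n\n",
	];

	// Prepend this site's rules (if any) to the generated output.
	return ( $extra[ get_current_blog_id() ] ?? '' ) . $robots;
}, 11 );
```

As noted above, this only takes effect when no static robots.txt file is present in the site root.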


proweb commented on May 31, 2024

OK, thanks @sybrew


trainoasis commented on May 31, 2024

At the moment, it's not possible to add a Disallow in robots.txt via the plugin, right?


sybrew commented on May 31, 2024

Hi @trainoasis

That's correct. You'd want to use a WordPress filter, at priority >10, instead:

add_filter( 'robots_txt', function ( $robots ) {

	// Nowdoc containing the custom rules; the trailing blank line
	// separates this group from the generated rules that follow.
	$my_robots = <<<'MYROBOTS'
User-agent: some-bot
Disallow: /

MYROBOTS;

	// Prepend the custom rules to the plugin's generated output.
	return $my_robots . $robots;
}, 11 ); // Priority 11 runs after the default output is built.


commented on May 31, 2024

Hey @sybrew
I need to manually add a Disallow rule to robots.txt, but I figured out that the plugin currently doesn't allow that, and there is no robots.txt in my root folder to edit.
So, can you tell me how to add it?
If I have to use the above WordPress filter, where do I add it?
In functions.php, or somewhere else?
Sorry if this sounds silly, but I googled and couldn't find anything reliable.


vir-gomez commented on May 31, 2024

Hi @sybrew, I recently added the Blackhole for Bad Bots plugin by Jeff Starr, and I must add some lines with a directive to robots.txt.

I remember that with Yoast and others, I had my robots.txt in the public_html root directory, but now, with The SEO Framework, the robots.txt is generated dynamically and I don't know how to edit it manually.

Any suggestions? How could I add a small directive like the following for the Bing crawler or Google spiders?

User-agent: *
Disallow: /?blackhole


vir-gomez commented on May 31, 2024

> Hi @chandlerbing26
>
> You can do either of the following:
>
>   1. Add a robots.txt file to the root of your website anyway; then you'll have complete control over its contents. This is probably your best bet, but it does not translate well with WordPress Multisite's domain mapping (a corner case).
>   2. Add to or overwrite our filters as you described. Yes, that can be added to the functions.php file. See https://tsf.fyi/docs/filters#where to learn about alternative methods.

If we have the robots.txt created dynamically by The SEO Framework and another one created manually by us, which of them should we add to Google/Bing Webmaster Tools?


sybrew commented on May 31, 2024

Hi @vir-gomez,

When there's a static robots.txt file in the root folder of your website, the virtual "file" cannot be output. So, with a static robots.txt file present, The SEO Framework's output won't work.

The virtual robots.txt "file" will look a bit like this.

Now, the robots.txt file may just as well be empty because there are many other signals utilized to steer robots away from administrative and duplicated pages. Like the X-Robots-Tag HTTP header and the <meta name=robots /> HTML tag. So, feel free to use a custom robots.txt file with the blackhole directive in place.
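For illustration, those two signals look roughly like this (an HTTP response header and an HTML head tag, respectively; the noindex/nofollow values are just example directives):

```
X-Robots-Tag: noindex, nofollow

<meta name="robots" content="noindex,nofollow" />
```

Both tell well-behaved crawlers not to index or follow a page even when robots.txt does not block it, which is why a sparse robots.txt is usually sufficient.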

P.S. Please send us future requests via our WordPress.org support forums. This issue is about a feature proposal, not a support topic.


sybrew commented on May 31, 2024

From mouste63's request:

Add a rule to block GPTBot from scraping.
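Per OpenAI's published crawler user agent, the requested rule would be an ordinary robots.txt group:

```
User-agent: GPTBot
Disallow: /
```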

