Will a robots.txt 'disallow' of a directory keep Google from seeing 301 redirects for pages/files within the directory?
-
Hi, I have a client that had thousands of dynamic php pages indexed by Google that shouldn't have been. He has since blocked these php pages via a robots.txt disallow. Unfortunately, many of those php pages were linked to by high quality sites multiple times (instead of the static URLs) before he put up the php 'disallow'.
If we create 301 redirects for some of these php URLs that are still showing high value backlinks and send them to the correct static URLs, will Google even see these 301 redirects and pass link value to the proper static URLs? Or will the robots.txt keep Google away, so that we lose all these high quality backlinks? I guess the same question applies if we use the canonical tag instead of the 301. Will the robots.txt keep Google from seeing the canonical tags on the php pages?
Thanks very much,
V
-
No problem
-
Hello Dmitrii,
Yes, that clarifies things perfectly. Thanks very much for your explanation. And I missed this particular WBF, so I will give it a close look as well.
Thanks again for your quick help.
-
Hello, my friend.
It helps to understand exactly how .htaccess 301 redirects work. They are server-side operations. When a bot requests a page, it waits for the server's response, and in the case of a 301 the response is "Don't go here, go there". The robots.txt file says "you're not allowed to look at the contents of this file/directory". It does not change what the server would return for that URL, but a crawler that obeys the disallow does not request the URL in the first place, so it never receives the 301. That's also why you can sometimes see indexed pages that say "blocked by robots": they are indexed from the links pointing at them, without the page itself being crawled. For Google to see the redirects, the disallow would have to be lifted for those php URLs.
In the case of canonical links you are correct: the canonical tag is IN the content of the page, so a bot that is blocked from the page can't read it and is never told that there is a canonical page.
There is a recent WBF on this subject - https://moz.com/blog/controlling-search-engine-crawlers-for-better-indexation-and-rankings-whiteboard-friday
Hope this clarifies some things.
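If you want to check how a particular URL is treated by a set of robots.txt rules, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `/dynamic/` directory name and the example.com URLs are all made up for illustration and are not taken from the client's site. Note that this parser does plain prefix matching, so it may not behave exactly like Google's own matcher (for example with wildcard rules).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the /dynamic/ directory is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dynamic/
"""


def may_request(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules allow user_agent to request url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs: one inside the disallowed directory, one outside it.
    for url in (
        "https://www.example.com/dynamic/page.php?id=7",
        "https://www.example.com/static/page.html",
    ):
        verdict = "allowed" if may_request(ROBOTS_TXT, "Googlebot", url) else "disallowed"
        print(url, "->", verdict)
```

Any URL reported as disallowed here is one that a rule-following crawler would skip, which makes it a quick way to list which of the redirected php URLs are affected by the disallow.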