How can I make it so that robots.txt is not ignored due to a URL re-direct?
-
Recently a site moved from blog.site.com to site.com/blog with an instruction like this one:
/etc/httpd/conf.d/site_com.conf:94: ProxyPass /blog http://blog.site.com
/etc/httpd/conf.d/site_com.conf:95: ProxyPassReverse /blog http://blog.site.com
It's a WordPress.org blog that was set up as a subdomain and is now being redirected to look like a directory. Since the move, the robots.txt file seems to be ignored by Googlebot. That file has a Disallow: /tag/ rule to avoid "duplicate content" on the site. I have used the same rule on other WordPress subdomains and it works like a charm, except this time, when the blog is rendered as a subdirectory. Any ideas why? Thanks!
-
Hi there,
No, haven't tried it yet, but we'll give it a shot. Thanks!
-
Have you thought about adding rel canonicals, by chance? Also, how do you know the robots.txt is being ignored? Are the pages showing up in search results? If so, maybe the syntax in your robots.txt file is incorrect. Check out robotstxt.org
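One thing worth checking before blaming the syntax: crawlers request robots.txt only from the root of a host, so once the blog is served under site.com/blog, the rules that apply are the ones in site.com/robots.txt, and a path rule has to include the /blog prefix to match. Here is a rough sketch using Python's standard urllib.robotparser to compare the two rules. The helper name is_allowed, the example tag URL, and the root_rules text are assumptions for illustration, and Python's parser only approximates how Googlebot matches rules.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Rule as written for the old subdomain (blog.site.com/robots.txt).
subdomain_rules = "User-agent: *\nDisallow: /tag/\n"

# Hypothetical rule for the root file at site.com/robots.txt.
root_rules = "User-agent: *\nDisallow: /blog/tag/\n"

# Hypothetical tag archive URL after the move to a subdirectory.
tag_url = "http://site.com/blog/tag/example/"

print(is_allowed(subdomain_rules, tag_url))  # True: /tag/ does not match /blog/tag/
print(is_allowed(root_rules, tag_url))       # False: the prefixed rule blocks it
```

If the first check comes back allowed for your real URLs, moving the rule into the root robots.txt with the /blog prefix is probably the thing to try.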
-
Hi Rocio,
Have you tried the Yoast SEO plugin? It has an option to add noindex to the tags.
That's the easiest way I'd go. Best of luck.
GR.