Help needed with robots.txt regarding WordPress!
-
Here is my robots.txt from Google Webmaster Tools. These are the pages that are being blocked, and I am not sure which rules to remove in order to unblock the blog posts so they can show up in search.
http://ensoplastics.com/theblog/?cat=743
http://ensoplastics.com/theblog/?p=240
These category pages and blog posts are blocked, so do I delete the /? rules? I am new to SEO and web development, so I am not sure why the developer of this robots.txt file would block pages and posts in WordPress. It seems to me that the whole reason someone has a blog is so it can be searched and gain more exposure for SEO purposes.
Is there a reason I should block any pages contained in WordPress?
Sitemap: http://www.ensobottles.com/blog/sitemap.xml
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /trackback
Disallow: /comments
Disallow: /feed
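For what it's worth, the rule actually catching those URLs is Disallow: /*?, which Googlebot reads as "block any URL containing a question mark", so every ?p= and ?cat= address is excluded (Disallow: /? only matches query strings on the site root). A minimal corrected sketch that keeps the private WordPress areas blocked but lets the posts and categories through might look like the following; the Sitemap URL is a hypothetical stand-in, since the ensoplastics blog does not have its own sitemap yet:

Sitemap: http://ensoplastics.com/theblog/sitemap.xml

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /trackback
Disallow: /comments
Disallow: /feed

Dropping Disallow: /?, Disallow: /*?, and Disallow: /page/ (and the now-redundant Googlebot-specific group) is what re-opens the posts and category pages to crawling.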
-
I've just looked at the home pages of the two sites, and they are pretty much the same apart from substituting "plastics" with "bottles". I'm not an expert, but I would have thought Google might treat this as duplicate content.
In my opinion, I would concentrate on one of the sites, say plastics, and have the bottle-specific stuff as a subsection. I'm not sure how the sites rank, etc., so that may be easier said than done.
As for the sitemap/robots question: if you continue with two sites, then I would recommend generating a new sitemap for the copied site.
-
So basically this site was duplicated, and apparently the robots.txt file was duplicated along with it. There is no sitemap for the blog on the enso plastics site, so I am not sure how to proceed at this point. Should I just create a new robots.txt file for ensoplastics and replace this one? Or do I edit this one and go create a sitemap for my blog?
-
Well that is a problem, isn't it? Like I said, I am new to a lot of this and I didn't develop either site; this robots.txt file is pointing to the wrong sitemap. So I am going to change that.
However, I am guessing I may need to change some of the rules as well so that it is no longer blocking WordPress content.
-
I'm a bit confused. You reference ensoplastics.com at the top and then show the robots.txt from ensobottles.com.
Are they using the same robots.txt content? The sites use different URL naming: ensobottles uses URL rewriting, whereas the other site uses ?p= query strings.
Related Questions
-
Crawl solutions for landing pages that don't contain a robots.txt file?
My site (www.nomader.com) is currently built on Instapage, which does not offer the ability to add a robots.txt file. I plan to migrate to a Shopify site in the coming months, but for now the Instapage site is my primary website. In the interim, would you suggest that I manually request a Google crawl through the search console tool? If so, how often? Any other suggestions for countering this Meta Noindex issue?
Technical SEO | | Nomader1
-
Disallow wildcard match in Robots.txt
This is in my robots.txt file; does anyone know what it is supposed to accomplish? It doesn't appear to be blocking URLs with question marks:
Disallow: /?crawler=1
Disallow: /?mobile=1
Thank you
Technical SEO | | AmandaBridge0
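For what it's worth, Disallow: /?crawler=1 is a literal prefix match from the root, so it only blocks URLs that begin exactly with /?crawler=1; it does nothing for a URL like /some-page?crawler=1. Google and Bing support the * wildcard, so a sketch that blocks the parameter wherever it appears might be:

User-agent: *
# * matches any run of characters, so these catch the parameter on any path
Disallow: /*crawler=1
Disallow: /*mobile=1
-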
Best way to create robots.txt for my website
How can I create a robots.txt file for my website, guitarcontrol.com? It has a login area and guitar lessons.
Technical SEO | | zoe.wilson170
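As a generic pattern only (the paths below are hypothetical stand-ins, since the site's actual structure isn't known), a simple robots.txt usually just blocks the private areas and points crawlers at the sitemap:

User-agent: *
# hypothetical paths for the login and members-only lesson areas
Disallow: /login/
Disallow: /members/

# assumes a sitemap exists at this location
Sitemap: http://guitarcontrol.com/sitemap.xml
-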
Will an XML sitemap override a robots.txt
I have a client whose robots.txt file is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an XML sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedence over the XML sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this, or any experience with a website that has had a clear Disallow: / for months, yet somehow has pages in the index?
Technical SEO | | KCBackofen0
-
301 Redirect Help
How would you 301 redirect an entire folder to a specific file within the same domain? Scenario: www.domain.com/folder to www.domain.com/file.html. Thanks for your input...
Technical SEO | | dhidalgo11
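On an Apache server this is commonly done with a single RedirectMatch rule in the site root's .htaccess (a sketch, assuming mod_alias is enabled; the folder and file names come from the scenario above):

# 301-redirect /folder, /folder/, and everything under it to one file
RedirectMatch 301 ^/folder(/.*)?$ http://www.domain.com/file.html
-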
Help With Analytics Data
Hello, I'm seeing the following Analytics data for some of my keywords:
Multiple Visits
Pages/Visit: 1
Avg Visit Duration: 00:00
% New Visits: 100%
Bounce Rate: 100%
The data is the same on all "affected keywords". What is going on and how do I fix it? Thanks for the help!
Technical SEO | | AWCthreads0
-
Warnings for "blocked by meta-robots" / "meta robots nofollow"... how to resolve?
Hello, I see hundreds of notices for "blocked by meta-robots" / "meta robots nofollow", and it appears they are linked to the comments on my site, which I assume I would not want crawled. Is that the case, and are these notices actually a positive thing? Please advise how to clear them up if they can be potentially harmful for my SEO. Thanks, Talia
Technical SEO | | M80Marketing0
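For context, these notices are normally triggered by a tag like the one below in the <head> of the flagged pages; WordPress themes and plugins often add it to comment and reply URLs deliberately, so the notices are usually harmless (a generic example, not this site's actual markup):

<meta name="robots" content="noindex,nofollow">
-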
Robots.txt question
I want to block spiders from a specific part of the website (say, the abc folder). In robots.txt, do I have to write:
User-agent: *
Disallow: /abc/
Do I have to include the last slash, or will this do:
User-agent: *
Disallow: /abc
Technical SEO | | seoug_20050
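As a general note, robots.txt rules are prefix matches, so the two forms behave differently:

Disallow: /abc/   # blocks only URLs inside the folder, e.g. /abc/page.html
Disallow: /abc    # also blocks /abc itself, /abc.html, and even /abcdef

If the goal is just the folder's contents, the trailing-slash form is the safer choice.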