Question about Robot.txt
-
I just started my own e-commerce website and I hosted it to one of the popular e-commerce platform Pinnacle Cart. It has a lot of functions like, page sorting, mobile website, etc. After adjusting the URL parameters in Google webmaster last 3 weeks ago, I still get the same duplicate errors on meta titles and descriptions based from Google Crawl and SEOMOZ crawl. I am not sure if I made a mistake of choosing pinnacle cart because it is not that flexible in terms of editing the core website pages. There is now way to adjust the canonical, to insert robot.txt on every pages etc. however it has a function to submit just one page of robot.txt. and edit the .htcaccess. The website pages is in PHP format.
For example this URL:
www.mycompany.com has a duplicate title and description with www.mycompany.com/site-map.html (there is no way of editing the title and description of my sitemap)
Another error is
www.mycompany.com has a duplicate title and description with http://www.mycompany.com/brands?url=brands
Is it possible to exclude those website with "url=" and my "sitemap.html" in the robot.txt? or the URL parameters from Google is enough and it just takes a lot of time.
Can somebody help me on the format of Robot.txt. Please? thanks
-
Thank you for your reply. This surely helps. I will probably edit the htaccess.
-
That's the problem with most sitebuilder type prgrams, they are very limited.
Perhaps look at your site title, and page titles. Usually the site title will be the included on all of your webpages followed by the page title so you could simply name your site www.yourcompany.com then add an individual page title to each page.
A robots.txt file is not supposed to be added to every page and only tells the bots what to crawl, and what not to.
If you can edit the htaccess, you should be able to get to the individual pages and insert/change the code for titles, just be aware that doing it manually can work, but sometimes when you go back to make an edit in the builder it may undo all of your manual changes, if that's the case, get your site perfect, then do the individual code changes as the last change.
Hope this helps.
-
I have no way of adding those too. Ooops thanks for the warning. I guess I would have to wait for Google to filter out the parameters.
Thanks for your answer.
-
You certainly don't want to block your sitemap file in robots.txt. It takes some time for Google to filter out the parameters and that is the right approach. If there is no way to change the title, I wouldn't be so concerned over a few pages with duplicate titles. Do you have the ability to add a noindex,follow meta tag on these pages?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Newby question about 301 redericts
I work for a design firm who has been updating a website for a client. In addition to a new look, we've consolidated redundant pages for a more streamlined site. My question is this: when I have replaced 3 somewhat redundant pages on the old site with 1 page on the new site, should I 301 redirect all the former pages to the one new page. I know this question is beyond basic but I'm pretty new to SEO, so be gentle.
Technical SEO | | TheKatzMeow0 -
Robots.txt on http vs. https
We recently changed our domain from http to https. When a user enters any URL on http, there is an global 301 redirect to the same page on https. I cannot find instructions about what to do with robots.txt. Now that https is the canonical version, should I block the http-Version with robots.txt? Strangely, I cannot find a single ressource about this...
Technical SEO | | zeepartner0 -
301 Redirect - Technical Question
I have recently updated a site and for the url's that had changed or were not transferring I set up 301 redirects in the htaccess file as follows This one works - Redirect 301 /industry-sectors http://www.tornadowire.co.uk/fencing But this one doesn't - Redirect 301 /industry-sectors/equine http://www.tornadowire.co.uk/fencing/application/equestrian/ What it does is change the url to this instead http://www.tornadowire.co.uk/fencing/equine ..... which returns a 404 page not found error The server is nginx based server and we have moved from a joomal platform to a wordpress platform I would be grateful for any ideas
Technical SEO | | paulie650 -
Is having no robots.txt file the same as having one and allowing all agents?
The site I am working on currently has no robots.txt file. However, I have just uploaded a sitemap and would like to point the robots.txt file to it. Once I upload the robots.txt file, if I allow access to all agents, is this the same as when the site had no robots.txt file at all; do I need to specify crawler access on can the robots.txt file just contain the link to the sitemap?
Technical SEO | | pugh0 -
SEOMoz Crawler vs Googlebot Question
I read somewhere that SEOMoz’s crawler marks a page in its Crawl Diagnostics as duplicate content if it doesn’t have more than 5% unique content.(I can’t find that statistic anywhere on SEOMoz to confirm though). We are an eCommerce site, so many of our pages share the same sidebar, header, and footer links. The pages flagged by SEOMoz as duplicates have these same links, but they have unique URLs and category names. Because they’re not actual duplicates of each other, canonical tags aren’t the answer. Also because inventory might automatically come back in stock, we can’t use 301 redirects on these “duplicate” pages. It seems like it’s the sidebar, header, and footer links that are what’s causing these pages to be flagged as duplicates. Does the SEOMoz crawler mimic the way Googlebot works? Also, is Googlebot smart enough not to count the sidebar and header/footer links when looking for duplicate content?
Technical SEO | | ElDude0 -
Domains and Hosting Question
I bought hosting for unlimited domains on Godaddy. It's not a dedicated server. It was just $85 a year. I have unlimited latency but a limited amount of "space." I don't know a lot about hosting servers etc... My question is relatively simple. When I go in GoDaddy to my hosting. There is a site that shows up as hosted, and all of the other sites show up under that site in it's directory. If you type the name of the site I bought the hosted package on, then type a forward slash and the name of one of the other sites on the hosting package, you will actually go to the other website. What is this relationship? Is it normal? Does that make all of my websites subdomains of the main site (that I bought the hosting package on)? I don't fully comprehend how this effects everything...
Technical SEO | | JML11790 -
Question for experts - a website is ranking defying all google rules
Hi all Today I saw a strange result in google's search page. When I typed a keyword: Rezeptfrei bestellen, it gave me results like number 1- http://www.rezeptfreibestellen.com/ If I check this site in opensiteexplorer, this is nothing but a new site with no backlinks, or any other offpage work. Onpage work is simply keyword stuffing and domain name is also same as keyword. Now why google ranked this website above others? Where are google EMD and Penguin, Panda? Amit
Technical SEO | | expertvillagemedia1 -
Robots.txt Sitemap with Relative Path
Hi Everyone, In robots.txt, can the sitemap be indicated with a relative path? I'm trying to roll out a robots file to ~200 websites, and they all have the same relative path for a sitemap but each is hosted on its own domain. Basically I'm trying to avoid needing to create 200 different robots.txt files just to change the domain. If I do need to do that, though, is there an easier way than just trudging through it?
Technical SEO | | MRCSearch0