Question about Robot.txt
-
I just started my own e-commerce website and I hosted it to one of the popular e-commerce platform Pinnacle Cart. It has a lot of functions like, page sorting, mobile website, etc. After adjusting the URL parameters in Google webmaster last 3 weeks ago, I still get the same duplicate errors on meta titles and descriptions based from Google Crawl and SEOMOZ crawl. I am not sure if I made a mistake of choosing pinnacle cart because it is not that flexible in terms of editing the core website pages. There is now way to adjust the canonical, to insert robot.txt on every pages etc. however it has a function to submit just one page of robot.txt. and edit the .htcaccess. The website pages is in PHP format.
For example this URL:
www.mycompany.com has a duplicate title and description with www.mycompany.com/site-map.html (there is no way of editing the title and description of my sitemap)
Another error is
www.mycompany.com has a duplicate title and description with http://www.mycompany.com/brands?url=brands
Is it possible to exclude those website with "url=" and my "sitemap.html" in the robot.txt? or the URL parameters from Google is enough and it just takes a lot of time.
Can somebody help me on the format of Robot.txt. Please? thanks
-
Thank you for your reply. This surely helps. I will probably edit the htaccess.
-
That's the problem with most sitebuilder type prgrams, they are very limited.
Perhaps look at your site title, and page titles. Usually the site title will be the included on all of your webpages followed by the page title so you could simply name your site www.yourcompany.com then add an individual page title to each page.
A robots.txt file is not supposed to be added to every page and only tells the bots what to crawl, and what not to.
If you can edit the htaccess, you should be able to get to the individual pages and insert/change the code for titles, just be aware that doing it manually can work, but sometimes when you go back to make an edit in the builder it may undo all of your manual changes, if that's the case, get your site perfect, then do the individual code changes as the last change.
Hope this helps.
-
I have no way of adding those too. Ooops thanks for the warning. I guess I would have to wait for Google to filter out the parameters.
Thanks for your answer.
-
You certainly don't want to block your sitemap file in robots.txt. It takes some time for Google to filter out the parameters and that is the right approach. If there is no way to change the title, I wouldn't be so concerned over a few pages with duplicate titles. Do you have the ability to add a noindex,follow meta tag on these pages?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Website URL, Robots.txt and Google Search Console (www. vs non www.)
Hi MOZ Community,
Technical SEO | | Badiuzz
I would like to request your kind assistance on domain URLs - www. VS non www. Recently, my team have moved to a new website where a 301 Redirection has been done. Original URL : https://www.example.com.my/ (with www.) New URL : https://example.com.my/ (without www.) Our current robots.txt sitemap : https://www.example.com.my/sitemap.xml (with www.)
Our Google Search Console property : https://www.example.com.my/ (with www.) Question:
1. How/Should I standardize these so that Google crawler can effectively crawl my website?
2. Do I have to change back my website URLs to (with www.) or I just need to update my robots.txt?
3. How can I update my Google Search Console property to reflect accordingly (without www.), because I cannot see the options in the dashboard.
4. Is there any to dos such as Canonicalization needed, or should I wait for Google to automatically detect and change it, especially in GSC property? Really appreciate your kind assistance. Thank you,
Badiuzz0 -
Some questions about URL structure and multi country website
Gajanand angela dayHi,
Technical SEO | | Shahjahaaan
I have a question from SEO experts and web developers.
I want to setup a job website for 5 countries. for each country i will provide daily jobs listing on the basis of
1. jobs by categories - for example : accounting jobs. IT jobs, Sales jobs
2. jobs by city - for example : jobs in boston, jobs in chicago
3. jobs by companies for example : jobs in facebook, jobs in emirates case :
a company name " emirates " located in "boston" having vacancy of "accounting job " having position of full time this case job will be present in following categories . 1. accounting jobs in boston
2. jobs in boston
3. jobs in emirates and open any above option there will be filter box on left side showing
position i.e full time
salary i.e 1000-1500
location i.e boston,chicago Q.1
i want to know when user search on google these terms "accounting jobs in boston " or "jobs in boston" or "jobs in emirates" same job will display which url structure is recommended in for each search term? Q.2 how we can do on page SEO for these terms because jobs listing will be changing daily because of new jobs addition and content is changing not Q.3 should i create website on separate domains for each country or same domain but with different folders in it
.co.uk or com/uk for UK and .ae OR .com/uae for UAE Note : i will also attach blog on it and each blog will focus on specific country knowledge for example for USA , how to find jobs in new york and for UAE how to find jobs in Dubai etc . Thanks in Advance0 -
Do I need a separate robots.txt file for my shop subdomain?
Hello Mozzers! Apologies if this question has been asked before, but I couldn't find an answer so here goes... Currently I have one robots.txt file hosted at https://www.mysitename.org.uk/robots.txt We host our shop on a separate subdomain https://shop.mysitename.org.uk Do I need a separate robots.txt file for my subdomain? (Some Google searches are telling me yes and some no and I've become awfully confused!
Technical SEO | | sjbridle0 -
301 Redirect - Technical Question
I have recently updated a site and for the url's that had changed or were not transferring I set up 301 redirects in the htaccess file as follows This one works - Redirect 301 /industry-sectors http://www.tornadowire.co.uk/fencing But this one doesn't - Redirect 301 /industry-sectors/equine http://www.tornadowire.co.uk/fencing/application/equestrian/ What it does is change the url to this instead http://www.tornadowire.co.uk/fencing/equine ..... which returns a 404 page not found error The server is nginx based server and we have moved from a joomal platform to a wordpress platform I would be grateful for any ideas
Technical SEO | | paulie650 -
Robots.txt issue - site resubmission needed?
We recently had an issue when a load of new files were transferred from our dev server to the live site, which unfortunately included the dev site's robots.txt file which had a disallow:/ instruction. Bad! Luckily I spotted it quickly and the file has been replaced. The extent of the damage seems to be that some descriptions aren't displaying and we're getting a message about robots.txt in the SERPs for a few keywords. I've done a site: search and generally it seems to be OK for 99% of our pages. Our positions don't seem to be affected right now but obviously it's not great for the CTRs on those keywords affected. My question is whether there is anything I can do to bring the updated robots.txt file to Google's attention? Or should we just wait and sit it out? Thanks in advance for your answers!
Technical SEO | | GBC0 -
301 Re Direct Question for www
Can smeone check this code to make sure it is right so thaqt my site uses www always. RewriteCond %{HTTP_HOST} ^exercisebiology.com [NC] RewriteRule ^(.*)$ http://www.exercisebiology.com/$1 [R=301,L] I had the hostgators customer service personnel perform this. But I cannot get it to redirect. But he says it works.
Technical SEO | | anoopbal0 -
Client accidently blocked entire site with robots.txt for a week
Our client was having a design firm do some website development work for them. The work was done on a staging server that was blocked with a robots.txt to prevent duplicate content issues. Unfortunately, when the design firm made the changes live, they also moved over the robots.txt file, which blocked the good, live site from search for a full week. We saw the error (!) as soon as the latest crawl report came in. The error has been corrected, but... Does anyone have any experience with a snafu like this? Any idea how long it will take for the damage to be reversed and the site to get back in the good graces of the search engines? Are there any steps we should take in the meantime that would help to rectify the situation more quickly? Thanks for all of your help.
Technical SEO | | pixelpointpress0 -
Question concerning a 302 Redirect
Hi! I've already done some research on redirects, but I still have a question concerning a 302 redirect implemented at the homepage of a website. The Website www.domainA.com has a 302 redirect to www.domainA.com/content/.... Also all subsequent pages have the /content/ directory in their URLs: e.g domainA.com/content/products First thing I was wondering about, was the use of a redirect to a new site using an additional directory /content/... Why would anyone do this? Would it be enough to replace the 302 with a 301 redirect, or would you recommend to change the entire structure and eliminate this /content/ directory? The most logical structure would be www.domainA.com/products/.., and not www.domainA.com/content/products, right? Second thing: Given that 302 means temporary redirect, what are the actual implications when redirecting from domainA.com to domainA.com/content? I've heard that 302 redirects don't pass linkjuice and are detrimental for the site's rankings... What are the actual implications concerning the example above (302 redirect from domainA.com to domainA.com/content ? Would be great to get some advice about the first problem and maybe some insights about the second one concerning 302s in general. Thanks in advance! Cheers, Chris
Technical SEO | | adwordize0