Do i have my robots.txt file set up properly
-
Hi, just doing some seo on my site and i am not sure if i have my robots file set correctly. i use joomla and my website is www.in2town.co.uk.
here is my robots file, does this look correct to you
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/many thanks
-
thanks for this, i will add a sitemap now
-
thanks for this. been having for a long time trouble with a site map. the reason is, i use joomla 1.5 and i am not sure the best way to have it set or which is the best tool to use.
my articles change all the time and not sure how many of the articles i should have in the site map or to have just the sections.
on an old site i had all the articles, well up to 2,000 and that gain me a lot of traffic but with the new site i took that down
-
Yes, this does look good. However, usually the robots.txt will define a location of a sitemap. Not absolutely needed, but good to know.
Here is an example of one of our client's wordpress sites.
User-agent: * Disallow: /wp-admin Disallow: /another-post Disallow: /dolor-and-the-sit-amet/ Disallow: /hello-world-2-2/ Disallow: /second-page-post/ Disallow: /hello-world-2-3/ Disallow: /tag/ Disallow: /events/ Disallow: /wp-content/ Sitemap: http://backcountrysnow.com/sitemap.xml.gz
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google has deindexed a page it thinks is set to 'noindex', but is in fact still set to 'index'
A page on our WordPress powered website has had an error message thrown up in GSC to say it is included in the sitemap but set to 'noindex'. The page has also been removed from Google's search results. Page is https://www.onlinemortgageadvisor.co.uk/bad-credit-mortgages/how-to-get-a-mortgage-with-bad-credit/ Looking at the page code, plus using Screaming Frog and Ahrefs crawlers, the page is very clearly still set to 'index'. The SEO plugin we use has not been changed to 'noindex' the page. I have asked for it to be reindexed via GSC but I'm concerned why Google thinks this page was asked to be noindexed. Can anyone help with this one? Has anyone seen this before, been hit with this recently, got any advice...?
Technical SEO | | d.bird0 -
Robots User-agent Query
Am I correct in saying that the allow/disallow is only applied to msnbot_mobile? mobile robots file User-agent: Googlebot-Mobile User-agent: YahooSeeker/M1A1-R2D2 User-agent: MSNBOT_Mobile Allow: / Disallow: /1 Disallow: /2/ Disallow: /3 Disallow: /4/
Technical SEO | | ThomasHarvey1 -
Robots txt. in page with 301 redirect
We currently have a a series of help pages that we would like to disallow from our robots txt. The thing is that these help pages are located in our old website, which now has a 301 redirect to current site. Which is the proper way to go around? 1- Add the pages we want to disallow to the robots.txt of the new website? 2- Break the redirect momentarily and add the pages to the robots.txt of the old one? Thanks
Technical SEO | | Kilgray0 -
Has anyone tested or knows whether it makes a difference to upload a disavow file to both www. and non-www. versions of your site in GWMT?
Although Google treats both as separate sites, I always assumed that uploading the disavow file to the canonical version of your site would solve the problem. Is this the case, or has anyone seen better results uploading to both versions?
Technical SEO | | CustardOnlineMarketing0 -
Exclude root url in robots.txt ?
Hi, I have the following setup: www.example.com/nl
Technical SEO | | mikehenze
www.example.com/de
www.example.com/uk
etc
www.example.com is 301'ed to www.example.com/nl But now www.example.com is ranking instead of www.example.com/nl
Should is block www.example.com in robots.txt so only the subfolders are being ranked?
Or will i lose my ranking by doing this.0 -
Question about construction of our sitemap URL in robots.txt file
Hi all, This is a Webmaster/SEO question. This is the sitemap URL currently in our robots.txt file: http://www.ccisolutions.com/sitemap.xml As you can see it leads to a page with two URLs on it. Is this a problem? Wouldn't it be better to list both of those XML files as separate line items in the robots.txt file? Thanks! Dana
Technical SEO | | danatanseo0 -
Robots.txt to disallow /index.php/ path
Hi SEOmoz, I have a problem with my Joomla site (yeah - me too!). I get a large amount of /index.php/ urls despite using a program to handle these issues. The URLs cause indexation errors with google (404). Now, I fixed this issue once before, but the problem persist. So I thought, instead of wasting more time, couldnt I just disallow all paths containing /index.php/ ?. I don't use that extension, but would it cause me any problems from an SEO perspective? How do I disallow all index.php's? Is it a simple: Disallow: /index.php/
Technical SEO | | Mikkehl0 -
Directory Naming & File Organization
We're redoing an entire site and are going to reorganize, and link to the site's pages by directory instead of page name. So instead of:xyz.com/services/fixingtvs.phpit will be:xyz.com/fixingtvsAt first I was thinking 1 index.php page per directory but that will make content management really confusing with a bunch of files with the same name.Anyone have a better idea?Thanks,Matt
Technical SEO | | mattloht0