Robots.txt file question? NEver seen this command before
-
Hey Everyone!
Perhaps someone can help me. I came across this command in the robots.txt file of our Canadian corporate domain. I looked around online but can't seem to find a definitive answer (slightly relevant).
the command line is as follows:
Disallow: /*?*
I'm guessing this might have something to do with blocking php string searches on the site?. It might also have something to do with blocking sub-domains, but the "?" mark puzzles me
Any help would be greatly appreciated!
Thanks, Rob
-
I don't think this is correct.
? is an attempt at using a RegEx in Robots file which I don't think works.
Further, if it was a properly formed regex, it would be ?
- is a special character for the user agent to mean all. For the disallow line, I believe you have to use a specific directory or page.
http://www.robotstxt.org/robotstxt.html
I could be wrong, but the info on this site has been my understanding from the past too.
-
It depends on how your site is structured.
For example if you have a page at
http://www.yourdomain.com/products.php
and this shows different things based on the parameter, like:
http://www.yourdomain.com/products.php?type=widgets
You will want to get rid of this line in your robots.txt
However if the parameter(s) doesn't change the content on the page, you can leave it in.
-
Thanks Ryan and Ryan! I'm just unfamiliar with this command set in the robots file, and getting settled into the company (5 weeks).. so I am still learning the site's structure and arch. With it all being new to me with limitations I am seeing from the CMS side, I was wondering if this might have been causing crawl issues for Bing and or Yahoo... I'm trying to gauge where we might be experiencing problems with the sites crawl functions.
-
Its not a bad idea in the robots.txt, but unless you are 100% confidant that you wont block something that you really want, i would consider just handling unwanted parameters and pages through the new Google Webmaster url handling toolset. that way you have more control over which ones do and dont get blocked.
-
So, for this parameter, should I keep it in the robots file?
-
Its preventing spiders from crawling pages with parameters in the URL. For example when you search on google you'll see a URL like so:
http://www.google.com/search?q=seo
This passes the parameter of q with a value of 'seo' to the page at google.com for it to work its magic with. This is almost definitely a good thing, unless the only way to access some content on your site is via URL parameters.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots and Canonicals on Moz
We noticed that Moz does not use a robots "index" or "follow" tags on the entire site, is this best practice? Also, for pagination we noticed that the rel = next/prev is not on the actual "button" rather in the header Is this best practice? Does it make a difference if it's added to the header rather than the actual next/previous buttons within the body?
Technical SEO | | PMPLawMarketing0 -
Robots.txt vs. meta noindex, follow
Hi guys, I wander what your opinion is concerning exclution via the robots.txt file.
Technical SEO | | AdenaSEO
Do you advise to keep using this? For example: User-agent: *
Disallow: /sale/*
Disallow: /cart/*
Disallow: /search/
Disallow: /account/
Disallow: /wishlist/* Or do you prefer using the meta tag 'noindex, follow' instead?
I keep hearing different suggestions.
I'm just curious what your opinion / suggestion is. Regards,
Tom Vledder0 -
Can Title Tag be seen in the page source, but not seen by search engines?
This is a follow up question derived from a previous question I posted - http://moz.com/community/q/does-title-tag-location-in-a-page-s-source-code-matter There have been several reputable crawl tools used on our (including Moz) site that state we are missing title tags on may pages. One such page is http://www.paintball-online.com/Paintball-Guns-And-Markers-0Y.aspx I can see the title tag on line 238 of the page source. I find it unlikely that there is an issue with the crawl tools. Any thoughts would be greatly appreciated. Nick
Technical SEO | | Istoresinc0 -
Meta-robots Nofollow on logins and admins
In my SEO MOZ reports I am getting over 400 errors as Meta-robots Nofollow. These are all leading to my admin login page which I do not want robots in. Should I put some code on these pages so the robots know this and don't attempt to and I do not get these errors in my reports?
Technical SEO | | Endora0 -
Indexing of flash files
When Google indexes a flash file, do they use a library for such a purpose ? What set me thinking was this blog post ( although old ) which states - "we expanded our SWF indexing capabilities thanks to our continued collaboration with Adobe and a new library that is more robust and compatible with features supported by Flash Player 10.1."
Technical SEO | | seoug_20050 -
Robots.txt
should I add anything else besides User-Agent: * to my robots.txt file? http://melo4.melotec.com:4010/
Technical SEO | | Romancing0 -
Weird Indexing Question
Google has indexed mysite.com/ and mysitem.com/\/ (no idea why). If you click on the /%5C? URL it takes you to mysite.com//. I have a rel=canonical tag on it that goes to mysite.com/ but I was wondering if there was another way to correct the issue.
Technical SEO | | BryanPhelps-BigLeapWeb0 -
Is having a sitemap.xml file still beneficial?
Hi, I'm pretty new to SEO and something I've noticed is that a lot of things become relevant and irrelevant like the weather. I was just wondering if having a sitemap.xml file for Google's use is still a good idea and beneficial? Logically thinking, my websites would get crawled faster by having one. Cheers.
Technical SEO | | davieshussein0