Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Robots.txt Syntax for Dynamic URLs
-
I want to Disallow certain dynamic pages in robots.txt and am unsure of the proper syntax. The pages I want to disallow all include the string ?Page=
Which is the proper syntax?
Disallow: ?Page=
Disallow: ?Page=*
Disallow: ?Page=
Or something else? -
Thanks, Alick300 — unfortunately, the slash doesn't appear like that in the URLs on this site: they look like this
www.domain.com/page.html?Page= .........In running through an online robots.txt tester, all three versions in my original question seem to work. Until proven otherwise, I'm using the first one because it's the simplest.
-
Hi Bill,
Disallow: /?Page= will work
Thanks
-
Hi, James. It's not pagination I'm trying to disallow. The site structure has URLs that include things like "Page=give&...", that opens up a blank form ... but it comes from scores of web pages we want to spider. Since the "give" page is an empty form, we're getting tons of duplicate content errors as a result.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt allows wp-admin/admin-ajax.php
Hello, Mozzers!
Technical SEO | | AndyKubrin
I noticed something peculiar in the robots.txt used by one of my clients: Allow: /wp-admin/admin-ajax.php What would be the purpose of allowing a search engine to crawl this file?
Is it OK? Should I do something about it?
Everything else on /wp-admin/ is disallowed.
Thanks in advance for your help.
-AK:2 -
I have two robots.txt pages for www and non-www version. Will that be a problem?
There are two robots.txt pages. One for www version and another for non-www version though I have moved to the non-www version.
Technical SEO | | ramb0 -
Multiple robots.txt files on server
Hi! I have previously hired a developer to put up my site and noticed afterwards that he did not know much about SEO. This lead me to starting to learn myself and applying some changes step by step. One of the things I am currently doing is inserting sitemap reference in robots.txt file (which was not there before). But just now when I wanted to upload the file via FTP to my server I found multiple ones - in different sizes - and I dont know what to do with them? Can I remove them? I have downloaded and opened them and they seem to be 2 textfiles and 2 dupplicates. Names: robots.txt (original dupplicate)
Technical SEO | | mjukhud
robots.txt-Original (original)
robots.txt-NEW (other content)
robots.txt-Working (other content dupplicate) Would really appreciate help and expertise suggestions. Thanks!0 -
Robots.txt and Multiple Sitemaps
Hello, I have a hopefully simple question but I wanted to ask to get a "second opinion" on what to do in this situation. I am working on a clients robots.txt and we have multiple sitemaps. Using yoast I have my sitemap_index.xml and I also have a sitemap-image.xml I do put them in google and bing by hand but wanted to have it added into the robots.txt for insurance. So my question is, when having multiple sitemaps called out on a robots.txt file does it matter if one is before the other? From my reading it looks like you can have multiple sitemaps called out, but I wasn't sure the best practice when writing it up in the file. Example: User-agent: * Disallow: Disallow: /cgi-bin/ Disallow: /wp-admin/ Disallow: /wp-content/plugins/ Sitemap: http://sitename.com/sitemap_index.xml Sitemap: http://sitename.com/sitemap-image.xml Thanks a ton for the feedback, I really appreciate it! :) J
Technical SEO | | allstatetransmission0 -
Spaces (actual spaces) in URL
Hi all, Is there a huge loss of SEO performance if a URL shows spaces with an actual space (i.e. %20) in the URL rather than a "-" (or indeed a "_")? I know the preferred option is to have a "-", but I am just wondering if it is worth our effort to manually change the "%20" to a "-" in all the instances? Thanks 🙂 Diana
Technical SEO | | Diana.varbanescu0 -
No indexing url including query string with Robots txt
Dear all, how can I block url/pages with query strings like page.html?dir=asc&order=name with robots txt? Thanks!
Technical SEO | | HMK-NL0 -
URL rewriting from subcategory to category
Hello everybody! I have quite simple question about URL rewriting from subcategory to category, yet I can't find any solution to this problem (due to lack of my deeper apache programming knowledge). Here is my problem/question: we have two website url structures that causes dublicate problems: www.website.lt/language/category/ www.website.lt/language/category/1/ 1 and 2 pages are absolutely same (both also returns 200 OK). What we need is 301 redirect from 2 to 1 without any other deeper categories redirects (like www.website.com/language/category/1/169/ redirecting to .../category/1/ or .../category/). Here goes .htaccess URL rewrite rules: RewriteRule ^([^/]{1,3})/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ /index.php?lang=$1&idr=$2&par1=$3&par2=$4&par3=$5&par4=$6&%{QUERY_STRING} [L] RewriteRule ^([^/]{1,3})/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ /index.php?lang=$1&idr=$2&par1=$3&par2=$4&par3=$5&%{QUERY_STRING} [L] RewriteRule ^([^/]{1,3})/([^/]+)/([^/]+)/([^/]+)/$ /index.php?lang=$1&idr=$2&par1=$3&par2=$4&%{QUERY_STRING} [L] RewriteRule ^([^/]{1,3})/([^/]+)/([^/]+)/$ /index.php?lang=$1&idr=$2&par1=$3&%{QUERY_STRING} [L] RewriteRule ^([^/]{1,3})/([^/]+)/$ /index.php?lang=$1&idr=$2&%{QUERY_STRING} [L] RewriteRule ^([^/]{1,3})/$ /index.php?lang=$1&%{QUERY_STRING} [L] There are other redirects that handles non-www to www and related issues: RedirectMatch 301 ^/lt/$ http://www.domain.lt/ RewriteCond %{HTTP_HOST} ^domain.lt RewriteRule (.*) http://www.domain.lt/$1 [R=301,L] RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_URI} !(.)/$RewriteRule ^(.)$ http://www.domain.lt/$1/ [R=301,L] At this moment we cannot solve this problem with rel canonical (due to our CMS limits). Thanks for your help guys! If You need any other details on our coding, just let me know.
Technical SEO | | jkundrotas0 -
Singular vs plural in urls
In keyword research for an ecommerce site, I've found that widget, singular gets a lot more searches than widgets, plural AND is much less competitive. Is it better for SEO purposes to have the URLs (and matching title tags) in the catalog as /brass-widget.html, /steel-widget.html, etc., or /brass-widgets.html, etc.? I'm worried that a) searches for widgets will pass by the singular urls but not vice versa, and b) the singular form will strike visitors as bad grammar. Any advice?
Technical SEO | | AmericanOutlets0