Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Using 2 wildcards in the robots.txt file
-
I have a URL string which I don't want to be indexed. It includes the characters _Q1 in the middle of the string.
So in the robots.txt, can I use 2 wildcards in the string to take out all of the URLs with that in it? So something like /*_Q1*. Will that pick up and block every URL with those characters in the string?
Also, this is not directly off the root, but in a secondary directory, so .com/.../_Q1. So do I have to format the robots.txt as /*/*_Q1* as it will be in the second folder, or will just using /*_Q1* pick up everything no matter what folder it is in?
Thanks.
-
I'm not 100% positive, however it does make sense to use it this way.
User-agent: *
Disallow: /*_Q1$
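One thing worth checking before relying on that rule: in Google-style robots.txt matching, `*` matches any run of characters and a trailing `$` anchors the end of the URL, so a pattern ending in `_Q1$` would only catch URLs that end in `_Q1`, not ones with `_Q1` in the middle. Below is a rough Python sketch of that matching logic, assuming the crawler follows those wildcard rules (not every crawler does). The helper names and the sample paths are made up for illustration.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: Google-style matching, where '*' matches any run of
    characters, a trailing '$' anchors the end of the URL path, and
    anything else is treated as a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    """Return True if the Disallow pattern would match the given path."""
    return robots_pattern_to_regex(pattern).match(path) is not None


# Hypothetical paths, purely for illustration.
paths = [
    "/catalog/item_Q1_blue.html",  # _Q1 in the middle, inside a folder
    "/catalog/sub/item_Q1",        # _Q1 at the very end
    "/item_Q1.html",               # _Q1 directly off the root
    "/catalog/item.html",          # no _Q1 at all
]

for pattern in ("/*_Q1$", "/*_Q1", "/*_Q1*", "/*/*_Q1*"):
    print(pattern, [is_blocked(pattern, p) for p in paths])
```

Under these assumptions, `/*_Q1` and `/*_Q1*` behave the same (rules are prefix matches, so the trailing `*` adds nothing) and both match `_Q1` at any folder depth. `/*/*_Q1*` is narrower because it requires a second slash, so it misses URLs directly off the root. It is still worth confirming the final rule with the search engine's own robots.txt testing tool.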