/$1 URL Showing Up
-
Whenever I crawl my site with a bot or a sitemap generator, it comes up with /$1 versions of my URLs. For example:
It gives me hdiconference.com and hdiconference.com/$1, and hdiconference.com/purchases and hdiconference.com/purchases/$1.
Then I get warnings saying that it's duplicate content. Here's the problem: I can't find these /$1 URLs anywhere. Even when I type them in, I get a 404 error. I don't know what they are, where they came from, and I can't find them when I scour my code.
So, I'm trying to figure out where the crawlers are picking this up. Where are these things? If sitemap generators and other site crawlers are seeing them, I have to assume that Googlebot is seeing them as well.
Any help? My developers are at a loss as well.
-
Perfect. Thanks for the help, guys!
-
If you can't find them, you could put a disallow in your robots.txt file to keep them from being crawled.
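As a rough sketch of that idea (not a tested fix for this site), something like the Python below could turn a crawl export into Disallow lines. The function name `disallow_rules` and the `https://` URLs are assumptions for illustration. Some crawlers treat `$` as an end-of-URL marker in robots.txt patterns, so check the generated rules in your crawler's robots.txt tester before relying on them.

```python
from urllib.parse import urlsplit


def disallow_rules(crawled_urls):
    """Return robots.txt lines that block every crawled path ending in '/$1'.

    Hypothetical helper: crawled_urls is assumed to be a list of absolute
    URLs exported from a crawler or sitemap generator.
    """
    # Collect the unique paths that end with the literal placeholder.
    paths = sorted(
        {urlsplit(url).path for url in crawled_urls
         if urlsplit(url).path.endswith("/$1")}
    )
    if not paths:
        return []
    return ["User-agent: *"] + [f"Disallow: {path}" for path in paths]


if __name__ == "__main__":
    crawl_export = [
        "https://hdiconference.com/$1",
        "https://hdiconference.com/purchases",
        "https://hdiconference.com/purchases/$1",
    ]
    print("\n".join(disallow_rules(crawl_export)))
```

Keep in mind this only hides the URLs from crawlers that honor robots.txt; it doesn't tell you where the /$1 links come from.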
-
I had a similar issue and found that (in the case of a Moz Pro crawl, at least) it was caused by the bot crawling a JavaScript command in the head. One of the commands included an anchor tag that was being read as a link rather than in the context of the JavaScript command. Check your JS files and scripts. It might be in there somewhere.
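If it helps, here's a minimal sketch of how you might hunt for that kind of thing, assuming the culprit is an href or src value containing a literal `$1` (for example, a regex replacement string that builds an anchor tag). A crawler that reads `href="$1"` as a relative link would resolve it against the current page, which would produce URLs like the ones in the question. The names `PLACEHOLDER_LINK` and `find_placeholder_links` are made up for this example.

```python
import re

# Matches href/src attribute values (single- or double-quoted) that contain
# a literal "$1". This is a simple pattern: it won't catch escaped quotes
# such as href=\"$1\" or attribute values built up by string concatenation.
PLACEHOLDER_LINK = re.compile(
    r"""(?:href|src)\s*=\s*(["'])([^"']*\$1[^"']*)\1""",
    re.IGNORECASE,
)


def find_placeholder_links(source):
    """Return (line_number, attribute_value) pairs for each href/src with '$1'."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PLACEHOLDER_LINK.finditer(line):
            hits.append((lineno, match.group(2)))
    return hits


if __name__ == "__main__":
    # Hypothetical page source with a JS replace() that emits an anchor tag.
    page = (
        "<script>\n"
        "var out = text.replace(/(\\S+)/g, '<a href=\"$1\">$1</a>');\n"
        "</script>"
    )
    for lineno, value in find_placeholder_links(page):
        print(f"line {lineno}: {value}")
```

Run it over the rendered HTML of a page as well as your .js files, since the script might be inline in the head rather than in a separate file.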