How best to deal with www.home.com and www.home.com/index.html
-
Firstly, this is for an .asp site - and all my usual ways of fixing this (e.g. via htaccess) don't seem to work.
I'm working on a site which has www.home.com and www.home.com/index.html - both URLs resolve to the same page/content.
If I simply drop a rel canonical into the page, will this solve my dupe content woes?
The canonical tag would then appear in both www.home.com and www.home.com/index.html cases.
If the above is OK, which version should I be going with: www.home.com or www.home.com/index.html?
Thanks in advance folks,
James @ Creatomatic -
It certainly does help, many thanks Paul - hugely appreciated.
-
In this situation, using a canonical to point to the primary URL is a workaround, but the correct way to handle it is with a 301 redirect. Canonicals are meant for cases where both versions of the page need to stay reachable, but all the influence is to be directed to a single URL.
In this case, there is no functional reason why you would want both URLs to remain in the index and be reachable by two different addresses, because they are the exact same page. Therefore the correct solution is to 301 redirect the /index.html URL to the primary URL. (This is also the cleanest to maintain, passes the maximum amount of authority, and is best for usability.)
ASP sites are hosted on Microsoft IIS servers. IIS does not use or recognize .htaccess files. Instead, you will need to use the URL Rewrite Module. It should be preinstalled on most IIS servers, or you can ask your host/server admin to add it. (If the server is older than IIS 7, you'll need a third-party ISAPI rewrite module instead of Microsoft's own module.)
Here's a TechRepublic article on using the Rewrite Module to perform the same sorts of functions as .htaccess on Apache servers (http://ow.ly/fXSAB). In many ways, its basics are easier than .htaccess.
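As a rough illustration of the redirect described above, a web.config rule for the URL Rewrite Module might look something like the sketch below. It assumes the module is installed, that the home page file is index.html at the site root, and that the rule name is free to choose (the name here is made up). Treat it as a starting point rather than a drop-in fix, and check after deploying that a request for / still loads normally and does not loop.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes IIS 7+ with the URL Rewrite Module installed -->
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Hypothetical rule name; matches only an explicit request for /index.html -->
        <rule name="Redirect index.html to root" stopProcessing="true">
          <!-- The pattern is tested against the requested path without its leading slash -->
          <match url="^index\.html$" />
          <!-- redirectType="Permanent" sends a 301 to the bare home page URL -->
          <action type="Redirect" url="/" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```

The pattern is anchored at both ends, so only the root-level index.html is redirected; index.html files in subfolders would need a broader pattern.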
Note that you should also be redirecting the non-www version of the site to the fully qualified domain name, if you haven't already.
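A non-www to www redirect can be sketched in web.config along these lines. It assumes the URL Rewrite Module is installed, uses home.com as the hostname from the question, and assumes the site is served over plain http; the rule name is made up. If the site already has a rewrite section, the rule element would go inside the existing rules element rather than in a second copy of the file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes IIS 7+ with the URL Rewrite Module installed -->
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Hypothetical rule name; hostname and scheme are assumptions taken from the question -->
        <rule name="Redirect non-www to www" stopProcessing="true">
          <!-- Capture the whole requested path so it can be carried over -->
          <match url="(.*)" />
          <conditions>
            <!-- Fire only when the request arrived on the bare domain -->
            <add input="{HTTP_HOST}" pattern="^home\.com$" />
          </conditions>
          <!-- {R:1} is the captured path; Permanent sends a 301 -->
          <action type="Redirect" url="http://www.home.com/{R:1}" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```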
Hope this helps?
Paul
-
That's correct - they are the same page.
To better explain, this is all done old-school via FTP, so any edits or changes I make to the file/page "index.html" apply to the following URLs
Is there any harm in telling search engines that the Canonical version of a page IS the same page?
(Actually, there were LOADS more but I've got fixes in place for most of these)
-
Adam, unfortunately the method you link to won't work, because the two URLs in question here are actually the same page. If it were handled this way, you'd be creating an infinite redirect loop.
Paul
-
Hi James,
First, run a crawl on your site. Is the /index.html getting picked up in the crawl? If so then it is being linked to internally. Check the navigation bar(s) to see if the link to 'Home' is linking to /index.html. Once you have found all the internal links linking to /index.html, you will then need to change these to point to the home page without the filepath (e.g. http://www.example.com/).
The second step would be to implement a canonical tag on both pages that points to the home page without the filepath (in your example, www.home.com/).
That is one way of solving any duplicate content issues without using 301 redirects via .htaccess. However, I believe there is a way to do this via .asp but you would have to search around for this. I did a quick search and found this page that might be of help.
Hope that helps,
Adam.