Canonicalization of index.html - please help
-
I've read up on the subject but am new at this, so I thought I would just put forth a simple question. We want our home page to be referred to as www.domain.com, and we want the search engines to find and return this URL in search results. But the page has to have a file name, and its actual URL is not www.domain.com but www.domain.com/index.html. This, I believe, is what can cause duplicate content issues (not really duplicate, but perceived by the search engines as duplicate content). Is it best to insert <link rel="canonical" href="http://www.domain.com/" /> in the HEAD section of the index.html page, or am I totally misunderstanding this concept?
-
When you set up your 301 redirects as outlined by John, don't forget to 301 redirect your non-www URL version to your www URL version (or vice versa).
Here is an example of all the URL variants that could resolve to the home page of your website:
http://www.domain.com
http://www.domain.com/index.html
http://domain.com
http://domain.com/index.html
-
Hi Tag,
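As John is suggesting, you could do a straight 301 from /index.html to /, but the problem is that this can lead to an infinite redirect loop and a page error: when the server answers a request for / by internally serving index.html, a rule that redirects every index.html request back to / fires again on that internal request. Your best bet is to use the technique here: http://www.askapache.com/htaccess/redirect-index-blog-root.html to avoid that. Happy hunting.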
Hope this helps.
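To make the logic concrete, here is a rough sketch in Python of how the redirect decision could be made without looping. It is for illustration only and is not the code from the linked article; on Apache this would normally be done with rewrite rules. The function name canonical_redirect and the two constants are made up for the example, and it assumes the www version is the one you prefer. The key point is that the decision is based on the URL the visitor actually requested, so the canonical URL itself never triggers another redirect.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch: the www host is the preferred version
# and index.html is the only directory index file name in use.
BARE_HOST = "domain.com"
CANONICAL_HOST = "www.domain.com"
INDEX_NAME = "index.html"


def canonical_redirect(requested_url: str) -> Optional[str]:
    """Return the URL to 301 to, or None if the request is already canonical."""
    parts = urlsplit(requested_url)
    host = parts.netloc.lower()
    path = parts.path or "/"  # "http://host" and "http://host/" are the same request

    # What the visitor asked for, normalized only for case and the empty path.
    requested = urlunsplit((parts.scheme, host, path, parts.query, ""))

    # Drop a trailing index.html so /index.html becomes / and /blog/index.html becomes /blog/.
    if path.endswith("/" + INDEX_NAME):
        path = path[: -len(INDEX_NAME)]

    # Send the non-www host to the www host.
    if host == BARE_HOST:
        host = CANONICAL_HOST

    target = urlunsplit((parts.scheme, host, path, parts.query, ""))

    # Only redirect when something actually changed; this is what prevents the loop.
    return None if target == requested else target


if __name__ == "__main__":
    for url in (
        "http://www.domain.com",
        "http://www.domain.com/index.html",
        "http://domain.com",
        "http://domain.com/index.html",
    ):
        print(url, "->", canonical_redirect(url))
```

All four home page variants end up at http://www.domain.com/ in a single hop, and calling the function on that final URL returns None, so no second redirect is issued.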
-
Yes, this does create a duplicate content issue. The best solution is to have /index.html 301 redirect to /. However, the canonical tag as you outlined above should also fix the issue if you don't have access to your server configuration for redirects.
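If you go the canonical tag route, it is worth checking that the tag really ends up in the page. Below is a small sketch in Python, using only the standard library, that pulls the canonical URL out of a page's HTML. The class name CanonicalFinder, the helper find_canonical, and the sample markup are all invented for the example; the sample simply assumes the home page declares http://www.domain.com/ as canonical.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed through here.
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical index.html content, for illustration only.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Home</title>
    <link rel="canonical" href="http://www.domain.com/" />
  </head>
  <body>Welcome</body>
</html>
"""

if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```

The same tag is served whether the page is requested as / or as /index.html, which is how both URLs come to point at the one preferred version.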