Lately I have noticed Google indexing many files on the site without the .html extension
-
Hello,
Our site, while we convert, remains in HTML 4.0.
Fle names such as http://www.sample.com/samples/index.shtml are being picked up in the SERPS as http://www.sample.com/samples/ even when I use the "rel="canonical" tag and specify the full file name therein as recommended. The link to the truncated URL (http://www.sample.com/samples/) results in what MOZ shows as fewer incoming links than the full file name is shown as having incoming.
I am not sure if this is causing a loss in placement (the MOZ stats are showing a decline of late), which I have seen recently (of course, I am aware of other possible reasons, such as not being in HTML5 yet).
Any help with this would be great.
Thank you in advance
-
Can you clarify what you're concerned about for 301 redirects in terms of link juice?
301 redirects don't carry as much link juice as a direct link, but it doesn't impact correct links, just the links that, otherwise, wouldn't get link juice to your end destination at all. (Though, if your canonical is working correctly, it'll pass the same amount of link juice as a 301 redirect.)
Dr. Pete goes into this a bit more over here: https://moz.com/community/q/do-canonical-tags-pass-all-of-the-link-juice-onto-the-url-they-point-to
-
Many thanks for taking the time to respond Kristina.
-
I don't like to do redirects, as so many have warned of the consequences in terms of link juice
-
No, I don't link to the pages in question using "/" rather than the ".shtml" version of the page indexed.
-
A few external sources use the "/" version (recent linkers) I have found, but they likely only did so as they saw it displayed as such in the SERPs previously. No commercial or other affiliate sites do.
The reason I was really confused is that some pages are indexed using the "/", while others are not -- with no apparent reason I could locate. The "/" version for pages still remains on the first page for keywords, even with far less domain authorities and pages linking to them (for now!). We will be moving to another platform with a different default extension, so I wonder how that will be handled. Endless mysteries.
Thank you again for your time and suggestions,
Greg
-
-
Hmm, that doesn't seem good. It's hard to say whether this is causing the decline in your rankings, but either way, you want to make sure that you're not splitting your link equity between your / and .shtml pages. Here's what I'd do:
- If you can, 301 redirect / pages to .shtml pages. Obviously, it'd be easier if the canonical worked, but it sounds like it doesn't.
- Use ScreamingFrog or DeepCrawl to look through internal pages on your site to see if you're ever linking to the / version of pages rather than the .shtml pages. When Google chooses a different version of a URL over the canonical one, it's often because that's how it sees internal links pointing to the page. Make sure that you only have links to the .shtml version of the page.
- Use a tool like Moz or Ahrefs to find all internal links to your site. For any links that you built or have a partnership with the owners, make sure that they're linking to the .shtml version of the page. I could especially see your ad partners using / because it's a cleaner before parameters than .shtml.
After that, wait and see if Google fixes the problem.
Also worth noting: have you thought about changing your default to /? That's more common today, so you're probably getting a lot of external links with / instead of .shtml, and you'll never be able to fix that problem. If that's a possible solution, you may want to explore it.
Good luck!
Kristina
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Staging/Development Site Indexed?
So, my company's site has been pretty tough to try to get moving in the right direction on Google's SERPs. I had believed that it was mainly due to having a shortage of back links and a horrible home page load time. Everything else seems to be set up pretty well. I was messing around and used the site: Google search operator for our staging site. I found stage.site.com and a lot of our other staging pages in the search results. I have to think that this is the problem and causing a duplicate content penalty of the entire site. I guess I now need to 301 redirect the entire site? Has anyone every had this issue before and have fixed it? Thanks for any help.
Intermediate & Advanced SEO | | aua0 -
Google Webmaster Tools -> Sitemap suddent "indexed" drop
Hello MOZ, We had an massive SEO drop in June due to unknown reasons and we have been trying to recover since then. I've just noticed this yesterday and I'm worried. See: http://imgur.com/xv2QgCQ Could anyone help by explaining what would cause this sudden drop and what does this drop translates to exactly? What is strange is that our index status is still strong at 310 pages, no drop there: http://imgur.com/a1sRAKo And when I do search on google site:globecar.com everything seems normal see: http://imgur.com/O7vPkqu Thanks,
Intermediate & Advanced SEO | | GlobeCar0 -
Google is not indexing an updated website
We just relaunched a website that has 5 years old, we maintain all the old URLs and articles but for some reason google is not picking up the new website https://www.navisyachts.com. In Google Webmaster Tools we can see the sitemap with over 1000 pages submitted but shows nothing as indexed. The site is loosing traffic rapidly and positions, from the SEO side all looks fine for me. What can be wrong? I’ll appreciate any help. The new website is built over Joomla 3.4, we have it here at MOZ and other than some minor details it doesn't show that something can be wrong with the website. Thank you.
Intermediate & Advanced SEO | | FWC_SEO0 -
Google+ Verification - Site Speed Optimization
So the Google+ Badge verifies our site for Google direct connect. However, the javascript code for the badge itself causes the page to load 3 to 4 seconds longer, which is a big deal. Any ideas for a work around?
Intermediate & Advanced SEO | | inc.com0 -
Do you bother cleaning duplicate content from Googles Index?
Hi, I'm in the process of instructing developers to stop producing duplicate content, however a lot of duplicate content is already in Google's Index and I'm wondering if I should bother getting it removed... I'd appreciate it if you could let me know what you'd do... For example one 'type' of page is being crawled thousands of times, but it only has 7 instances in the index which don't rank for anything. For this example I'm thinking of just stopping Google from accessing that page 'type'. Do you think this is right? Do you normally meta NoIndex,follow the page, wait for the pages to be removed from Google's Index, and then stop the duplicate content from being crawled? Or do you just stop the pages from being crawled and let Google sort out its own Index in its own time? Thanks FashionLux
Intermediate & Advanced SEO | | FashionLux0 -
Why does google not show my ecommerce category page when I have the same keywords for many products in the product title?
I have found that google removes the google serach listing of a category from my site (ecommerce) when products within the category have the same key words. I sell golf shirts and have a category called "Mens Golf Shirts" Within the category I have added many products but when the too many of the products say mens golf shirt my link on google gets removed. Before i had products named: FUNKTION Mens Short Sleeve Golf Shirt Red / Black but now I have had to change it to: FUNKTION Red / Black I can understand that they may see this a keyword stuffing but how do I get around this to ensure that each product can rank on google for mens golf shirt
Intermediate & Advanced SEO | | funktiongolf0 -
Is there a way to find out how many 301 redirects a site gets?
If you do a search on "personal loans" on Google the first non-local/personal result is onemainfinanical.com. They have far fewer links showing in OSE and YSE than the other sites. I know onemainfinanical.com is a Citbank site so I'm trying to determine if they are ranking so high b/c they are getting 301 link juice from old Citibank.com authority pages. Is there anyway to check to see what sites are sending link juice through a 301 redirect instead of a direct link?
Intermediate & Advanced SEO | | fthead90 -
Is there a development solution for AJAX-based sites and indexing in Bing/Yahoo?
Hi. I have outlined a solution for an AJAX-based site in order to rank preserve indexing and rank in Google using the hashbang. I'm curious if anyone has some insight for doing the same for Bing/Yahoo! (a development question)
Intermediate & Advanced SEO | | OveritMedia0