Should a sitemap include all site links or just the ones we want indexed?
-
Got a quick sitemap question. We have a client's site built in OpenCart and are getting ready to submit the sitemap. The default sitemap setting generates URLs right off of the root, for example site.com/product. These URLs are also accessible through the site itself. We prefer to give the site some depth and have structured the products so the URLs are site.com/category/product. All of the product pages have canonicals that include the category, so we should not have to worry about duplicate content on the /product page vs. the /category/product page. My question: both types of product URLs are included in the sitemap at the moment. Since we don't want Google to index the /product URLs, should we leave them off the sitemap even though they are readily accessible from the frontend (though not linked)? Or just leave them in and let the canonical tag direct Google as to which URLs to index? Thanks in advance.
-
Hi again JS,
I think it's great that you continue to evaluate your platform from all perspectives and weigh its strengths and weaknesses. Many times, a platform can do a lot of the basics well, but fall short on the details that differentiate us from our competition. For example, OpenCart may do the basic SEO requirements well, but not include ecommerce microdata (schema.org), which has a high impact on our search listings.
You can do a lot of harm or good with the robots.txt file, like deindexing an entire website (probably not a good thing) or blocking certain directories (your /product issue). I would gain some deeper knowledge about what you can do with the robots.txt file and how you need it to perform for your business.
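A minimal sketch of how a robots.txt directory rule behaves, using Python's standard `urllib.robotparser`. The rules and URLs here are made up for illustration (the /checkout/ directory and the product slug are assumptions, not taken from the site in question). It shows that a `Disallow` rule matches by path prefix, so it blocks a directory cleanly but does not touch root-level URLs that share no common prefix.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; /checkout/ is an assumed example directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A URL inside the disallowed directory is blocked for all crawlers.
checkout_allowed = parser.can_fetch("*", "https://site.com/checkout/cart")

# A root-level product URL (hypothetical slug) has no matching prefix,
# so the directory rule leaves it crawlable.
product_allowed = parser.can_fetch("*", "https://site.com/blue-widget")

print(checkout_allowed, product_allowed)
```

Note that this standard-library parser only does plain prefix matching, and that a robots.txt rule controls crawling; whether it is the right tool for a given indexing problem depends on the URL structure.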
-
Hey Raymond,
Thanks for the response. I feel like I'm overthinking this a bit, as usually we just leave our OpenCart setups as is, other than a few minor tweaks. Lately I've really been scrutinizing OpenCart's SEO setup and how to improve it, since it seems there are a lot of gaps in the way it handles this.
I thought robots.txt would have been a good way to block the pages, but the issue is that I would need to block every single product page, since OpenCart automatically creates a page for every product at site.com/product. Since we are adding lots of products, there should be a better way to handle this. After I posted, I came across this tidbit from a 6-year-old Google Webmaster Central blog post. Basically it states: 'While we can't guarantee that our algorithms will display that particular URL in search results, it's still helpful for you to indicate your preference by including that URL in your Sitemap.' I think going this route along with the canonical should do the trick.
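The approach described above (list only the preferred URLs in the sitemap) could be sketched roughly like this in Python. This is not how OpenCart generates its sitemap; the product list, slugs, and the `build_sitemap` helper are all hypothetical, and only the sitemap namespace comes from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) containing one <url><loc> per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical catalogue: (category slug, product slug) pairs.
products = [("widgets", "blue-widget"), ("gadgets", "red-gadget")]

# Only the /category/product form is listed; the root-level /product
# duplicates are deliberately left out.
canonical_urls = [f"https://site.com/{cat}/{slug}" for cat, slug in products]

sitemap_xml = build_sitemap(canonical_urls)
print(sitemap_xml)
```

The design choice is simply to keep the sitemap consistent with the canonical tags, so both signals point at the same URL for each product.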
-
Hi JStrong,
Great question to be asking and an important topic to be doing your due diligence on, especially when dealing with an eCommerce related website.
Google uses a sitemap as a guideline for crawling your site. So, just because you put a URL in your sitemap doesn't mean that the URL will actually be indexed. You can see those stats in your Google Webmaster Tools account, under the Sitemap area. It will display how many URLs are in the sitemap and how many of those URLs are indexed.
If you do not want certain pages to be indexed by Google, then you would need to adjust your robots.txt file to give Google those instructions.
As long as you have the correct Canonical configurations, you should avoid any duplicate content issues from the URLs you've described above.
Good luck!
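One way to spot-check the canonical configuration mentioned above is to read the `<link rel="canonical">` tag out of a page's HTML. A minimal sketch using Python's standard `html.parser`; the sample HTML and URL are invented, and `CanonicalFinder` / `find_canonical` are hypothetical helper names.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the last <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Invented sample: what a root-level /product page might serve.
page = (
    "<html><head>"
    '<link rel="canonical" href="https://site.com/widgets/blue-widget">'
    "</head><body></body></html>"
)

print(find_canonical(page))
```

Running this against both the /product and /category/product versions of a page should return the same /category/product URL if the canonicals are set up as described.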
Related Questions
-
Site Structure question?
Hey guys, Sorry for posting this again but the last thread got a bit too wayward. I'll sum it up better here. We're producing a WordPress theme every 3-6 months. Each targets a different niche (e.g. ecommerce, restaurant, magazine, etc.). Which option is better for our products going forward, including the ones we've yet to launch (i.e. which method will get future projects more "trust juice" from Google)? A: Create a subfolder for each theme, e.g. http://bigbangthemes.net/TicketLab_WP/wordpress-ticket-system & http://bigbangthemes.net/Showoff_WP/landing-page/ (This is currently what we're doing.) B: Have them all under bigbangthemes.net/wordpress-themes/, e.g. bigbangthemes.net/wordpress-themes/wordpress-ticket-system & bigbangthemes.net/wordpress-themes/showoff-startup-agency-theme Thanks for the help!
On-Page Optimization | | andy.bigbangthemes0 -
SEO for E-Commerce Sites
Hi Everybody, I have two e-commerce sites that just launched without much content at the moment, just user login pages for clients to access the service. I don't think management is interested in adding much content. At most they will put up 5 pages of content in total. Any practical tips on how to optimize such sites, especially when there is not much content? Best
On-Page Optimization | | Sequelmed0 -
Why are my pages de-indexed?
Hello all, I am very new to SEO. For some reason many of the pages on my site were de-indexed, specifically the ones linked from this page: http://www.lawyerconnection.ca/practice-areas/car-accident-injury-lawyers/ However, the pages linked from these pages were not de-indexed: http://www.lawyerconnection.ca/practice-areas/slip-and-fall-lawyers/ http://www.lawyerconnection.ca/podcastresources/ That first page itself was not de-indexed, just the pages that it links to. It just happened today, so maybe I am jumping the gun, but I doubt it. When I enter one of the child pages into Google Webmaster Tools again and press fetch, it re-indexes. What could be the problem here? I had someone re-write the content for every city, but I have a feeling that there are fewer differences between the car accident pages. Is this considered duplicate content, do you think? Am I making some other mistake I can't think of? Is it just a one-day blip (I doubt it)? Let me know, thanks.
On-Page Optimization | | RafeTLouis0 -
Use External Links
Hey 🙂 I noticed when analysing my pages that Moz gives the following advice about adding external links to my articles: "On any page specifically targeting a keyword, link externally to at least one (and possibly more than one) relevant, trusted resources as a best practice." As a small business I work pretty damn hard to get visitors to my website, so why on earth would I want to go to all that trouble just to send them away again to a trusted resource? Secondly, what exactly is a "trusted resource"? Can I simply use search and pick the top competitor, for example Moz or Wikipedia, and does the anchor need to be an exact match or will a partial suffice? I ask because I already have the top spot for my longtail, so an exact match would be pointless. And lastly, I notice that pretty much all quality sites set external links to open in the same window, i.e. not target=_blank. I never thought of it before today, but now that I'm considering using external linking in my articles I guess it's important to know the answer: is this a best practice, and does it give any SEO benefit? Cheers, Lee :)
On-Page Optimization | | LeeC0 -
Too many page links warning... but each link has canonical back to main page? Is my page OK?
The Moz crawl warns me many of my pages have too many links, like this page http://www.webjobz.com/jobs/industry/Accounting ...... has 269 links but many of the links are like this /jobs/jobtitles/Accounting?k=&w=3&hiddenLocationID=463170&depth=2 and are used to refine search criteria.... when you click on those links they all have a canonical link back to http://www.webjobz.com/jobs/industry/Accounting Is my page being punished for this? Do I have to put "no follow" tags on every link I do not want the bots to follow and if I do so is Roger (moz bot) not going to count this as a link?
On-Page Optimization | | Webjobz0 -
23000 Links are not found- Should I redirect them?
Hi, I have been deleting product links from my website but never redirected them. Google Webmaster Tools shows a total of 23000 products as not found. Should I redirect them all back to the home page? For the pages with a soft 404 response, should I also redirect those original URLs back to the home page? Thanks
On-Page Optimization | | ilovebodykits0 -
Site structure question
I'm currently working on a very awkward custom WP setup, in which I can't maintain the present drop-down navigation menu without having those pages under a parent or without completely recoding everything. I have two requirements. For SEO purposes, I'm looking for the following structure for each targeted landing page: www.example.com/landing-page as opposed to www.example.com/sub/landing-page. Of course, with my landing pages as children, I get the latter of the two. For navigational purposes, they need to fall under a specific category in a drop-down menu. With any other theme or setup this is an easy fix, but not here. What I have now is that the landing pages are placed under a parent category page, but they have custom permalinks. The permalinks are set up as follows: www.example.com/landing-page. But technically the exact structure is still www.example.com/sub/landing-page, which then redirects to the custom permalink. So, my question is: in an attempt to get my most important landing pages close to the root for better PR and crawlability, do I still get the same benefit with my current setup? Is the structure I have better, worse, or no different? Thanks.
On-Page Optimization | | JayAdams320 -
Too many on-page links
I manually counted the links on my website http://www.commensus.com, which came to around 50, but SEOmoz says I have over 100 and Google isn't seeing them all.
On-Page Optimization | | jawl44630