Should the sitemap include all site links or just the ones we want indexed?
-
Got a quick sitemap question. We have a client's site built in OpenCart and are getting ready to submit the sitemap. The default sitemap setting generates URLs right off of the root, for example site.com/product. These URLs are also accessible through the site itself. We prefer to give the site some depth and have structured the products so the URLs are site.com/category/product. All of the product pages have canonicals that include the category, so we should not have to worry about duplicate content on the /product page vs. the /category/product page. My question: both types of product pages are included in the sitemap at the moment. Since we don't want Google to index the /product URLs, should we leave them off the sitemap even though they are readily accessible from the frontend (though not linked)? Or should we just leave them in and let the canonical tag direct Google as to which URLs to index? Thanks in advance.
-
Hi again JS,
I think it's great that you continue to evaluate your platform from all perspectives and weigh its strengths and weaknesses. Many times, a platform can do a lot of the basics well but fall short on the details that differentiate us from our competition. For example, OpenCart may handle the basic SEO requirements well but not include ecommerce microdata (schema.org), which has a high impact on our search listings.
You can do a lot of harm or good with the robots.txt file, like blocking an entire website from search engines (probably not a good thing) or blocking certain directories (your /product issue). I would gain some deeper knowledge about what you can do with the robots.txt file and how you need it to perform for your business.
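To make the "harm/good" point concrete, here is a minimal Python sketch that uses the standard library's `urllib.robotparser` to check what a given robots.txt would allow. The two robots.txt bodies, the `/admin/` directory, and the site.com URLs are made-up examples, not anything from the actual site.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt that blocks the whole site (usually a mistake).
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that blocks a single directory only.
BLOCK_ONE_DIRECTORY = """\
User-agent: *
Disallow: /admin/
"""

if __name__ == "__main__":
    for name, rules in [("block everything", BLOCK_EVERYTHING),
                        ("block one directory", BLOCK_ONE_DIRECTORY)]:
        for url in ["https://site.com/category/product",
                    "https://site.com/admin/login"]:
            print(name, url, is_allowed(rules, url))
```

Running a check like this against a draft robots.txt before uploading it is a cheap way to catch a rule that blocks more than intended.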
-
Hey Raymond,
Thanks for the response. I feel like I'm overthinking this a bit, as usually we just leave our OpenCart setups as is, other than a few minor tweaks. Lately I've really been scrutinizing OpenCart's SEO setup and how to improve it, since it seems there are a lot of gaps in the way it handles this.
I thought the robots.txt would have been a good way to block the pages, but the issue is that I would need to block every single product page, since OpenCart automatically creates a site.com/product page for every product. Since we are adding lots of products, there should be a better way to handle this. After I posted, I came across this tidbit from a six-year-old Google Webmaster Central blog post. Basically it states that 'While we can't guarantee that our algorithms will display that particular URL in search results, it's still helpful for you to indicate your preference by including that URL in your Sitemap.' I think going this route along with the canonical should do the trick.
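For anyone who wants to script this approach, below is a rough Python sketch of a sitemap that lists only the preferred URLs. It assumes you already have a mapping from each page URL to the canonical URL that page declares; the `build_sitemap` helper and the sample URLs are hypothetical and are not part of OpenCart.

```python
import xml.etree.ElementTree as ET
from typing import Dict

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_of: Dict[str, str]) -> str:
    """Build sitemap XML containing only URLs that are their own canonical.

    `canonical_of` maps each page URL to the canonical URL it declares
    (an assumed input; how you collect it depends on your setup).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, canonical in canonical_of.items():
        if url == canonical:  # skip duplicates that point elsewhere
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Made-up example: the root-level URL points at the category URL.
    pages = {
        "https://site.com/widget": "https://site.com/gadgets/widget",
        "https://site.com/gadgets/widget": "https://site.com/gadgets/widget",
    }
    print(build_sitemap(pages))
```

The root-level duplicates stay reachable on the site, but the sitemap only states a preference for the /category/product versions, which matches what the canonical tags say.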
-
Hi JStrong,
Great question to be asking and an important topic to be doing your due diligence on, especially when dealing with an eCommerce related website.
Google uses a sitemap as a guideline for crawling your site. So, just because you put a URL in your sitemap doesn't mean that the URL will actually be indexed. You can see those stats in your Google Webmaster Tools account, under the Sitemap area. It will display how many URLs are in the sitemap and how many of those URLs are indexed.
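If you do not want Google to crawl certain pages, you can adjust your robots.txt file to give Google those instructions. Keep in mind that robots.txt controls crawling rather than indexing, and Google cannot read the canonical tag on a page it is blocked from crawling. A noindex robots meta tag is the more direct way to keep a page out of the index.
As long as you have the correct canonical configurations, you should avoid any duplicate content issues from the URLs you've described above.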
Good luck!
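If you want to double-check the canonical configuration mentioned above, a small Python sketch like this one can pull the canonical URL out of a page's HTML using only the standard library. The `find_canonical` helper and the sample markup are illustrative assumptions; fetching the real pages is left out.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    # Made-up markup standing in for a root-level /product page.
    sample = (
        '<html><head>'
        '<link rel="canonical" href="https://site.com/gadgets/widget" />'
        '</head><body>Widget</body></html>'
    )
    print(find_canonical(sample))
```

Both the /product and the /category/product version of a page should report the same /category/product URL; if either reports something else, the canonical setup needs another look.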