Can anyone help me diagnose an indexing/sitemap issue on a large e-commerce site?
-
Hey guys. Wondering if someone can help diagnose a problem for me.
Here's our site: https://www.flagandbanner.com/
We have a fairly large e-commerce site--roughly 23,000 URLs according to crawls using both Moz and Screaming Frog. I have created an XML sitemap (using SF) and uploaded it to Webmaster Tools. WMT is only showing about 2,500 URLs indexed. Further, WMT is showing that Google is indexing only about 1/2 (approx. 11,000) of the URLs. Finally (to add even more confusion), when doing a site search on Google (site:) it's only showing about 5,400 URLs found. The numbers are all over the place!
Here's the robots.txt file:
User-agent: *
Allow: /
Disallow: /aspnet_client/
Disallow: /httperrors/
Disallow: /HTTPErrors/
Disallow: /temp/
Disallow: /test/
Disallow: /i_i_email_friend_request
Disallow: /i_i_narrow_your_search
Disallow: /shopping_cart
Disallow: /add_product_to_favorites
Disallow: /email_friend_request
Disallow: /searchformaction
Disallow: /search_keyword
Disallow: /page=
Disallow: /hid=
Disallow: /fab/*
Sitemap: https://www.flagandbanner.com/images/sitemap.xml
Anyone have any thoughts as to what our problems are??
Mike
-
A site running ASP should be perfectly fine. I bet you will see substantial increases in a lot of positive metrics by just paring down that navigation.
-
Thanks so much for your response, Russ.
You're confirming one of the many issues we have identified (too many internal links) but I had not connected it to indexing or site speed. When I use the Google Page Speed Tool, many of our pages are not even registering. It seems like it's taking too long to load them so it times out. Could the crazy amount of links have to do with this, too?
Moreover, our mobile speed is especially poor. This could be an even bigger problem in mobile, no?
Are you familiar with .asp sites, in particular, having indexing issues...or is that a false assumption?
Mike
-
Thanks for the question!
First, it is very common to get inconsistent answers from GSC, site:, sitemap and crawl results. Don't worry too much about that.
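If you want to see where the numbers diverge rather than just that they do, one option is to diff the sitemap against a crawl export. Here is a rough Python sketch of that idea. The sample sitemap, the example.com URLs and the helper names (sitemap_urls, compare) are all made up for illustration; in practice you would feed in your real sitemap XML and the URL list exported from your crawler.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare(sitemap, crawled):
    """Split URLs into: sitemap only, crawl only, and present in both."""
    return {
        "only_in_sitemap": sorted(sitemap - crawled),
        "only_in_crawl": sorted(crawled - sitemap),
        "in_both": sorted(sitemap & crawled),
    }


# Hypothetical sample data, standing in for a real sitemap and crawl export.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/products/a.asp</loc></url>
  <url><loc>https://www.example.com/products/b.asp</loc></url>
</urlset>"""

SAMPLE_CRAWL = {
    "https://www.example.com/products/b.asp",
    "https://www.example.com/products/c.asp",
}

if __name__ == "__main__":
    report = compare(sitemap_urls(SAMPLE_SITEMAP), SAMPLE_CRAWL)
    for bucket, urls in report.items():
        print(bucket, len(urls))
```

URLs that show up only in the crawl are pages the sitemap is missing; URLs only in the sitemap are pages the crawler never reached through links, which is usually the more interesting list.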
Your goal is to get as many of your pages indexed as possible, and that is a function of links pointing to your site and internal link structure. While it is an imperfect analogy, we often refer to this as "crawl budget". There are essentially 2 solutions to this...
1. Get more/better backlinks to a diversity of pages on your site.
2. Improve your internal link architecture so that Googlebot finds your pages more quickly.
I think the problem in your case is that the site inundates bots with generic navigational links. For example, this page...
http://www.flagandbanner.com/products/chrome-air-force-lt-general-flag-kit.asp
has 1400 internal links! That is crazy!
This page has 1500!
https://www.flagandbanner.com/products/citizenship-gifts.asp
You need to reel this back in dramatically. Your navigation should link to top-level categories or maybe a handful of subcategories. Once in a category, you can reveal deeper categories. This will increase the likelihood that the related and "also buy" links on product pages will get found and followed by Googlebot.
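To check link counts like these on any template yourself, something along these lines would do as a first pass. It is only a sketch using Python's standard library: the LinkCounter class, the count_links helper and the sample HTML are hypothetical, and it counts raw anchor tags with an href, so it will not match every tool's definition of an "internal link" exactly.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCounter(HTMLParser):
    """Counts <a href> links, split into same-host and other-host."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc.lower()
        self.internal = 0
        self.external = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL before comparing hosts.
        target = urlparse(urljoin(self.page_url, href))
        if target.scheme not in ("http", "https"):
            return  # skip mailto:, javascript:, tel: and similar
        if target.netloc.lower() == self.host:
            self.internal += 1
        else:
            self.external += 1


def count_links(page_url, html):
    """Return (internal, external) link counts for one page's HTML."""
    parser = LinkCounter(page_url)
    parser.feed(html)
    parser.close()
    return parser.internal, parser.external


# Hypothetical page, standing in for HTML fetched from a real product page.
SAMPLE_HTML = """
<a href="/products/a.asp">A</a>
<a href="b.asp">B</a>
<a href="https://other.example.org/">Other</a>
<a href="mailto:sales@example.com">Mail</a>
<a name="top">No href</a>
"""

if __name__ == "__main__":
    print(count_links("https://www.example.com/products/index.asp", SAMPLE_HTML))
```

Run it over one product page and one category page before and after trimming the navigation, and you get a simple before/after number to track.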
Finally, on a different note, you need to make sure you standardize the casing of URLs (i.e., pick either /Products/ or /products/). I noticed that you have links, both internal and external, that do not take this into account, causing unnecessary duplicate content.
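A quick way to find the casing duplicates is to group a crawl export by lowercased URL and list any group with more than one spelling. The sketch below assumes the server treats paths case-insensitively (typical for IIS, but worth confirming for your setup); the case_variants helper and the example.com URLs are invented for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def case_variants(urls):
    """Group URLs that differ only by letter case in the host or path.

    Returns {lowercased_url: [spellings...]} for groups with 2+ spellings.
    The query string is left untouched and fragments are dropped.
    """
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        key = urlunsplit(
            (parts.scheme.lower(), parts.netloc.lower(),
             parts.path.lower(), parts.query, "")
        )
        groups[key].add(url)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


# Hypothetical crawl export.
SAMPLE_URLS = [
    "https://www.example.com/Products/flag.asp",
    "https://www.example.com/products/flag.asp",
    "https://www.example.com/products/banner.asp",
]

if __name__ == "__main__":
    for canonical, spellings in case_variants(SAMPLE_URLS).items():
        print(canonical, "<-", spellings)
```

Each group it reports is a candidate for a single canonical spelling, with the other variants redirected to it and internal links updated to match.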