Where do these URLs come from?! (Indexation issues)
-
We have an international webshop with languages in the URLs. Our URLs are now set up as follows:
http://thermalunderwear.eu/eng/category/product
Now, we know that there's some kind of strange redirect problem affecting our indexation; this is a technical issue that should be fixed soon. But whether it is also the cause of some other strange problems, I do not know. I'd be happy with any help/advice/tips.
1. The SEOmoz site crawler starts at http://thermalunderwear.eu. This does not yet redirect to http://thermalunderwear.eu/eng like we want it to, but all the links on the page do include the default language code, so they all look like http://thermalunderwear.eu/eng/category etc. However, apart from those URLs, the site crawler finds many URLs in the form http://thermalunderwear.eu/category/product etc., without the language code. Where it gets these I do not know. Since these URLs don't exist, the webshop simply shows the homepage for each of them, so they all get flagged with 50+ duplicate titles/content. Why oh why?
2. If I do a Google search for indexed URLs with English as the language, I get many results formatted like this:
Coldpruf Enthusiast mens thermal shirt - Thermal wear for men ...
thermalunderwear.eu/eng/men/coldpruf-enthusiast-mens-thermal-shirt 170+ items – Fine-ribbed longsleeve thermal shirt men from Enthusiast ... {$SCRIPT_NAME} eng/men/coldpruf-enthusiast-mens-the {$ajax_url} http://thermalunderwear.eu/ajax
What are those variables doing there? It looks like Google is taking something from our Smarty debug console, which is hidden but still active in the source code, and also the ajax URL, which is in a completely different location. What is Google trying to show here?
-
Google sees it as a list; it's like rich snippets. It's a huge amount of your content, and Google thinks it is the main content.
See these results: the "40+" is a list I have on my page, and the snippet shows a few samples from it.
-
I guess that is the only solution then. I don't quite understand why Google picks that information to show in the SERP text (as well as the 170+ items) but we'll try disabling the Smarty debugging when we're not actively using it. I hope it helps!
-
I looked in the source code of this page
http://thermalunderwear.eu/eng/men/devold-alpine-knee-thermal-socks-electric-blue
and I found {$SCRIPT_NAME} eng/men/coldpruf-enthusiast-mens-the
Your debug code is in the source code. You need to get rid of it, disable it or something. I have not used Smarty debug, so I can't help much.
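If you want to check other pages for the same leftover, something like the sketch below might help. It is not specific to Smarty's debug console; it only assumes the leaked text looks like the simple `{$NAME}` placeholders quoted above, and the function name `find_template_placeholders` is made up for this example. Placeholders with modifiers or array access would need a broader pattern.

```python
import re

# Matches simple Smarty-style variables such as {$SCRIPT_NAME} or {$ajax_url}.
# Assumption: only plain names are of interest, not modifiers like {$x|escape}.
PLACEHOLDER_RE = re.compile(r"\{\$[A-Za-z_][A-Za-z0-9_]*\}")


def find_template_placeholders(html):
    """Return the distinct placeholders found in html, in order of first appearance."""
    matches = PLACEHOLDER_RE.findall(html)
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(matches))


if __name__ == "__main__":
    sample = (
        "{$SCRIPT_NAME} eng/men/coldpruf-enthusiast-mens-the "
        "{$ajax_url} http://thermalunderwear.eu/ajax"
    )
    for placeholder in find_template_placeholders(sample):
        print(placeholder)
```

Run it against the saved page source of a few product pages; an empty list suggests the debug output is no longer in the HTML that crawlers see.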
-
Ah thanks Alan! It looks like there is a problem in the code that generates the breadcrumb URLs. We will get that fixed asap, which should lower the number of duplicate content warnings considerably.
-
Your first problem:
Look at this page,
http://thermalunderwear.eu/eng/kids-thermal-underwear/coldpruf-enthusiast-kids-thermal-shirt
You will see a link to http://thermalunderwear.eu/kids-thermal-underwear/coldpruf-enthusiast-kids-thermal-shirt
I will look at your other problem in a few minutes.
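To find every page that carries a link like that, a rough sketch along these lines could be run over saved page source. It assumes `eng` is the only language code (add the others to `LANG_CODES`), and the names `LinkCollector` and `links_missing_language` are invented for this example; it only inspects `<a href>` links on the same host.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: the set of valid language codes; only "eng" appears in this thread.
LANG_CODES = {"eng"}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_missing_language(html, page_url, lang_codes=LANG_CODES):
    """Return same-host links whose first path segment is not a language code."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc
    missing = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https") or parts.netloc != page_host:
            continue  # skip mailto:, javascript:, and external links
        segments = [s for s in parts.path.split("/") if s]
        # A bare "/" has no segments and is skipped; the homepage is a separate issue.
        if segments and segments[0] not in lang_codes:
            missing.append(absolute)
    return missing
```

Calling `links_missing_language(html, page_url)` with the source of the kids' thermal shirt page should list the breadcrumb link without `/eng/`, provided that link is present in the static HTML rather than added by JavaScript.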