How to detect where Google gets its indexed URLs
-
Google is somehow indexing links that create duplicate content. We don't understand how these links are created, so we would like to detect where Google's robots find them.
We tried:
- Moz Crawl Diagnostics, but it shows an Internal Link Count of 0 for these links.
- Google Analytics (site content - all content), hoping to find a trace from the visitors' side. There wasn't one.
- Webmaster Tools, under Internal Links and HTML Improvements, but we didn't find any trace.
- Some search commands. Is there a good one to use for this?
- Searching for the URLs in source code with https://search.nerdydata.com.
-
It really isn't possible for an outsider to know why your website is generating those URLs in error; you would have to talk to your developer about that.
As for canonicals: if your problem is that page.com is being duplicated by added parameters (page.com/?id=1, page.com/?id=2, page.com/?id=3, etc.), then as long as you have the canonical tag on page.com, all of the parameter pages will carry the correct canonical as well. (But you are right, you should track down the source; your developer will know.)
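To illustrate the canonical point, here is a minimal Python sketch of a canonical tag that ignores added parameters. It assumes the query string never changes the page content, which may not hold for every site; the function names `canonical_url` and `canonical_tag` are made up for this example and are not part of any Moz or Google tool.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL with its query string and fragment removed.

    Assumption: parameters such as ?id=1 do not change the content,
    so every variant should point at the parameter-free page.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path or "/", "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the requested URL."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


if __name__ == "__main__":
    for requested in ("http://page.com/?id=1", "http://page.com/?id=2"):
        print(requested, "->", canonical_tag(requested))
```

Because the tag is derived from the path alone, any new parameter variant that appears later gets the same canonical without further work.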
-
Thank you for your answer. Yes, I know that these are generated by our site. The problem is that I can add a canonical tag for the ones that are indexed right now, but new ones will be created later in some way. The root of the problem isn't that we don't know how to use canonical tags; it's finding out where Google finds, detects, and indexes these URLs.
These URLs have been there for months, so we can't just hope that they will somehow be dropped. We need to detect the real problem and find a solution.
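One place that might show where these URLs come from is the web server's access log, since each request usually records the page that referred it. The following is only a sketch: it assumes logs in the common "combined" format, the parameter names `id` and `productId` are examples, and `referers_for_parameter_urls` is a hypothetical helper. Crawlers often send no referer ("-"), so the useful rows are more likely to come from ordinary visitors who followed the same links.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Matches the request, status, size and referer fields of a
# combined-format access log line (an assumption about the log layout).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)


def referers_for_parameter_urls(log_lines, suspect_params=("id", "productId")):
    """Count (requested path, referer) pairs for URLs carrying suspect parameters."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path = match.group("path")
        query = parse_qs(urlsplit(path).query, keep_blank_values=True)
        if any(param in query for param in suspect_params):
            counts[(path, match.group("referer"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python find_referers.py access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for (path, referer), hits in referers_for_parameter_urls(handle).most_common(20):
            print(hits, path, referer)
```

A referer that keeps appearing next to the parameter URLs is a good candidate for the page or template that generates the links.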
-
If you found those URLs by doing a site: search, then those parameters are being generated by your site. (I am surprised that Google is even indexing them; I assume that pretty soon all but one will be dropped.) Here is an article that explains more about those types of duplicate pages: http://moz.com/blog/which-page-is-canonical
You can fix this by using a canonical tag on your homepage that points to the version without the parameter.
-
Our front page has almost 50 duplicate versions. They show up when we search for site:oursite.com: there are /et?id=xx, /et?productId=xx, etc., where xx stands for different numbers.
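A way to check whether the site's own pages link to such URLs is to scan their HTML for hrefs that carry these parameters. The sketch below uses only the Python standard library; `LinkCollector` and `find_parameter_links` are hypothetical names, the parameter list is taken from the examples above, and it only sees links present in the raw HTML, not ones added later by JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect href values from <a>, <area> and <link> elements."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "area", "link"):
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_parameter_links(html, base_url, suspect_params=("id", "productId")):
    """Return absolute URLs found in the HTML that carry a suspect parameter."""
    collector = LinkCollector()
    collector.feed(html)
    hits = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        query = parse_qs(urlsplit(absolute).query, keep_blank_values=True)
        if any(param in query for param in suspect_params):
            hits.append(absolute)
    return hits
```

Running this over the saved HTML of each template (front page, category pages, product pages, sitemap) should narrow down which one emits the /et?id=xx style links, if any of them do.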
-
Where are you seeing these duplicate content links? Does Webmaster Tools say that they are duplicate content? Or does this show up in your Moz crawl? What do these URLs look like?