Strange Behavior - Dupe Content Via Query String URLs?
-
Hey y'all, could use community help with some strange behavior I'm seeing with a particular ranking.
A week ago, a high-volume keyword ranking above the fold dropped off the map. I immediately thought it must be an algorithmic Penguin penalty (there is no manual action message) or a Panda / dupe content issue. I think it's dupe content at this point because I found my former ranking page in the omitted results section for the keyword we used to rank for.
The strange thing is that, without us making any changes, Google would momentarily show our domain ranking high on page one again, but with a strange query string URL. At first it was just domain.com/page/? (the old ranking was held by domain.com/page/), but now I see several long query string URLs floating around that the engines don't seem to know what to do with. Canonical tags are in place to canonicalize any query string URL back to the top-level page, and I have now designated query string URLs as unimportant in Search Console's parameter filtering, but these URLs persist.
I ended up deduplicating our content against a page on another domain we own (I think that was the original problem) and there seemed to be a positive effect, but now we are at the top of page 2 with a much longer query string URL as the ranking page. It seems Google wants to rank everything but the former ranking URL, even though it's the most authoritative by far, has canonical signals in place, and is no longer duplicate content. A content checker tool showed 60% similarity to the other piece, which is a ratio I've never known to cause dupe content issues.
We traced the query string URLs to an external site that links to us. The site is buggy: using the filters on its page appends its query string to our URL, so Google can find these URLs and thinks they're significant.
Long question short, has anyone had trouble like this, getting weird parameter / query URLs out of the index in favor of the non-parameter folder? Is it possible the main folder page got hit with Penguin and is "banned"? Still, I don't know why Google would go out of its way to rank query string copies of the page in its place if that were the case. Any help greatly appreciated.
An example of the URL looks like this:
domain.com/page/?CustomerSubscriptionTrack1PageSize=1&CustomerSubscriptionTrack1Order=Sorter_ID&CustomerSubscriptionTrack1Dir=ASC&CustomerSubscriptionTrack1Page=3&WorkOrder_TBLOrder=Sorter_AssetID&WorkOrder_TBLDir=ASC&ID=106
-
Hey James, sorry to hear you're getting blasted by negative links and appreciate your responses here.
I actually sorted this one out (fingers crossed it stays that way) by having the dev team implement a redirect rule that 301 redirects any query string back to the folder we want ranking. Similar signal to what the canonical tag would send but in my opinion a stronger signal since there is no longer a way to reach those weird query string URLs with a 200 response.
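For anyone wanting to reason about a rule like this, here is a rough Python sketch of the decision it makes. This is not the rule the dev team actually deployed (that isn't shown in this thread), and the function name `redirect_target` is made up for illustration. It assumes, as in our case, that no query string is ever needed on the site.

```python
from typing import Optional


def redirect_target(requested_url: str) -> Optional[str]:
    """Return the clean URL to 301 to, or None if no redirect is needed.

    Hypothetical helper: anything from the first "?" onward is dropped,
    so a bare trailing "?" (as in domain.com/page/?) is caught as well.
    """
    if "?" not in requested_url:
        return None  # already the clean URL, serve it with a 200
    return requested_url.split("?", 1)[0]


if __name__ == "__main__":
    for url in (
        "https://www.domain.com/page/",
        "https://www.domain.com/page/?",
        "https://www.domain.com/page/?WorkOrder_TBLDir=ASC&ID=106",
    ):
        print(url, "->", redirect_target(url))
```

In practice this check would live in the web server or application config rather than in a standalone script; the point is only that every parameter variant ends up at a single URL that answers with a 200.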
Once that was implemented the appropriate page was right back to its old high ranking position and the query strings are hardly to be seen in the index and are no longer preferred to the old ranking page - so looks like all is right with the world again.
We also disavowed the domain that was the source of many of the query string URLs. I don't think it was a case of negative SEO, just bad coding on their side. I'm not sure exactly what did the trick, but I strongly suspect the 301 redirects are what solidified the index, given the strong correlation of that change with the ranking recovery.
Maybe you can employ a similar solution whereby you can disavow domains where these links originate or set up server side handling to manage URLs of a specific pattern - for example, any URL containing "pornsite.com" if not any query string altogether (in our case we don't have any use for query strings in our URLs so just bagged them all).
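If you go the disavow route, a small script can turn a list of offending link URLs into disavow entries. The sketch below is only an illustration: the function name `build_disavow_lines` and the example URLs are made up, and it assumes the `domain:` line format that Google's disavow file uses. Check the resulting list by hand before uploading anything.

```python
from urllib.parse import urlsplit


def build_disavow_lines(referring_urls):
    """Collapse a list of linking URLs into sorted, unique domain: lines.

    Hypothetical helper. Entries without a parsable hostname are skipped.
    """
    domains = set()
    for url in referring_urls:
        host = urlsplit(url).hostname  # None when there is no scheme/host
        if host:
            domains.add(host.lower())
    return [f"domain:{domain}" for domain in sorted(domains)]


if __name__ == "__main__":
    links = [
        "https://buggy-site.example/list?Order=Sorter_ID",
        "http://buggy-site.example/other",
    ]
    print("\n".join(build_disavow_lines(links)))
```

Note that the hostname is used exactly as it appears in the link, so a www. prefix or other subdomain stays in the entry; trim it yourself if you mean to cover the whole domain.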
Thanks again,
Matt
-
Thanks for the response, James. The odd thing is that canonical tags are implemented correctly as far as I can tell. In the <head> of each variation you can find the following code:
<link rel="canonical" href="https://www.domain.com/page/" />
(still using my example so as to keep the site anonymous)
And this code had been in place well before the issue arose. So yes, we are sending that signal to Google to apply canonical back to the top in every case, without query string.
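A quick way to double-check what each variation actually declares is to pull the canonical href out of the served HTML. The following is a minimal sketch using only Python's standard library; the class and function names are invented for this example, and fetching the page is left out, so it just parses an HTML string you already have.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    page = (
        '<html><head><link rel="canonical" '
        'href="https://www.domain.com/page/" /></head><body></body></html>'
    )
    print(find_canonical(page))
```

Running this against the HTML of a few query string variations and comparing the results with the clean URL would confirm that every variant points back to the same place.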
Not sure what you're confused by in Search Console. The platform provides a tool to deal with parameter URLs just like the ones I'm seeing. I used it to mark all parameter URLs as not changing page content, which should tell the engines to exclude them from the index.