Strange Behavior - Dupe Content Via Query String URLs?
-
Hey y'all, could use community help with some strange behavior I'm seeing with a particular ranking.
A week ago a high-volume keyword that we ranked for above the fold dropped off the map. I immediately thought it must be an algorithmic Penguin penalty (there is no manual action message) or a Panda / dupe content issue. At this point I think it's dupe content, because I found my former ranking page in the omitted results section for the keyword we used to rank for.
The strange thing is that, without us making any changes, Google would momentarily show our domain ranking high on page one again, but with a strange query string URL. At first it was just domain.com/page/? whereas the old ranking was held by domain.com/page/, but now I see several long query string URLs floating around that the engines don't seem to know what to do with. Canonical tags are in place to canonicalize any query string URL back to the top-level URL, and I have now designated query string URLs as unimportant in Search Console parameter filtering, but these URLs persist.
I ended up deduplicating content to a page on another domain we own (think that was the original problem) and there seemed to be a positive effect but now we are top of page 2 with a much longer query string URL as the ranking page. It seems Google wants to rank everything but the former ranking URL even though it's the most authoritative by far, has canonical signals in place, and is now no longer duplicate content. Content checker tool showed 60% similarity to the other piece, which is a ratio I've never known to cause dupe content.
We found that the query string URLs come from an external site that links to us. It's a buggy site: filtering on their page appends the string to our URL, so Google can find these URLs and thinks they're significant.
Long question short, has anyone had trouble like this? Getting weird parameter / query URLs to get out of the index in favor of the non-parameter folder? Is it possible the main folder page got hit with Penguin and is "banned"? Still, I don't know why Google would go out of its way to rank query string copy pages in its place if that were the case. Any help greatly appreciated.
An example of the URL looks like this:
domain.com/page/?CustomerSubscriptionTrack1PageSize=1&CustomerSubscriptionTrack1Order=Sorter_ID&CustomerSubscriptionTrack1Dir=ASC&CustomerSubscriptionTrack1Page=3&WorkOrder_TBLOrder=Sorter_AssetID&WorkOrder_TBLDir=ASC&ID=106
-
Hey James, sorry to hear you're getting blasted by negative links and appreciate your responses here.
I actually sorted this one out (fingers crossed it stays that way) by having the dev team implement a redirect rule that 301 redirects any query string back to the folder we want ranking. Similar signal to what the canonical tag would send but in my opinion a stronger signal since there is no longer a way to reach those weird query string URLs with a 200 response.
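A rough sketch of that kind of redirect rule, written in Python purely for illustration. The actual rule the dev team implemented isn't shown in this thread, and the function name `redirect_target` is hypothetical. It assumes, as in this case, that the site has no legitimate use for query strings, so every one of them gets stripped:

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url):
    """Return the query-free URL to 301 to, or None if no redirect is needed.

    Assumption: the site never needs query strings, so all of them are dropped.
    """
    # Look for a literal "?" so that a bare trailing "?" (empty query) is
    # caught too; urlsplit reports an empty query for "/page/" and "/page/?".
    if "?" not in url:
        return None
    parts = urlsplit(url)
    # Rebuild the URL from scheme, host and path only.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# The short variant and a long parameter variant both map to the folder URL.
print(redirect_target("https://www.domain.com/page/?"))
print(redirect_target("https://www.domain.com/page/?WorkOrder_TBLDir=ASC&ID=106"))
print(redirect_target("https://www.domain.com/page/"))
```

The server would answer with a 301 and a `Location` header whenever the function returns a URL, and serve the page normally when it returns `None`. In practice this logic usually lives in the web server or framework's rewrite configuration rather than in application code.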
Once that was implemented the appropriate page was right back to its old high ranking position and the query strings are hardly to be seen in the index and are no longer preferred to the old ranking page - so looks like all is right with the world again.
We also disavowed the domain that was the source of many of the query string URLs. I don't think it was a case of negative SEO - just bad coding on their side. I'm not sure what exactly did the trick, but I strongly suspect that the 301 redirects are what solidified the index, due to the strong correlation of that change with the ranking recovery.
Maybe you can employ a similar solution whereby you can disavow domains where these links originate or set up server side handling to manage URLs of a specific pattern - for example, any URL containing "pornsite.com" if not any query string altogether (in our case we don't have any use for query strings in our URLs so just bagged them all).
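For the disavow route, a small sketch of building the file contents from a list of referring URLs. The helper name `disavow_lines` and the domain `buggy-site.example` are made up for illustration; the output uses the plain-text disavow format of one `domain:` entry per line, with `#` for comments:

```python
from urllib.parse import urlsplit


def disavow_lines(referring_urls):
    """Build disavow-file lines, one "domain:" entry per unique referring host."""
    hosts = set()
    for url in referring_urls:
        host = urlsplit(url).hostname  # lowercased host, or None if missing
        if host:
            hosts.add(host)
    lines = ["# Domains generating stray query string links"]
    lines.extend(f"domain:{host}" for host in sorted(hosts))
    return lines


# Hypothetical referring pages, both on the same buggy site.
links = [
    "http://buggy-site.example/list?filter=1",
    "http://buggy-site.example/other",
]
print("\n".join(disavow_lines(links)))
```

Note that this keeps hosts exactly as they appear, so `www.` and non-`www.` versions of a site would be listed separately.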
Thanks again,
Matt
-
Thanks for the response, James. The odd thing is that canonical tags are implemented correctly as far as I can tell. In the <head> of each variation you can find the following code:
<link rel="canonical" href="https://www.domain.com/page/" />
(still using my example so as to keep the site anonymous)
And this code had been in place well before the issue arose. So yes, we are sending that signal to Google to apply canonical back to the top in every case, without query string.
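One way to double-check that signal is to pull the canonical out of each variation's HTML and compare it to the folder URL. This is only a sketch using Python's standard-library `html.parser`; the names `CanonicalFinder` and `find_canonicals` are hypothetical, and fetching the pages is left out:

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return all canonical hrefs found in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = '<html><head><link rel="canonical" href="https://www.domain.com/page/" /></head><body></body></html>'
print(find_canonicals(page))
```

Each query string variation should yield exactly one canonical, and it should be the query-free URL. Zero results or several conflicting ones would point to a problem with the tag.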
Not sure what you're confused by in Search Console - the platform provides a tool to deal with parameter URLs just like the ones I'm seeing. I used it to mark all parameter URLs as not changing content, which should signal to the engines to exclude them from the index.