Removing duplicate &var=1 etc var name urls from google
-
Hi I had a huge drop in traffic around the 11th of july over 50% down with no recovery as yet... ~5000 organic visits per day down to barley over 2500.
I fixed up a problem that one script was introducing that had caused high bounce rates.
Now i have identified that google has indexed the entire news section 4 times, same content but with var=0 var=1 2 3 etc around 40,000 urls in total.
Now this would have to be causing problems.
I have fixed the problem and those url's 404 now, no need for 301's as they are not linked to from anywhere.
How can I get them out of the index? I cant do it one by one with the url removal request.. I cant remove a directory from url removal tool as the reuglar content is still there..
If I ban it in robots.txt those urls, wont it never try to index them again and thus not ever discover they are 404ing?
These urls are no longer linked to from anywhere, so how can google ever reach them by crawling to find them 404ing?
-
yes
-
Hi thanks, so if it cant find a page and finds no more links to a page. does that mean that it should drop out of the index within a month?
-
The definition of a 404 page is a page which cannot be found. So in that sense, no Google can't find the page.
Google's crawlers follow links. If there is not a link to the page, then there is no issue. If Google locates a link, they will attempt to follow that link.
-
Hi Thanks, so if a page is 404'ing but not linked to from anywhere google will still find it?
-
Hi Adam.
The preferred method to handle this issue would have been to only offer one version of the URL. Once you realized the other versions were active, you have a couple options to deal with the problem:
Use a 301 to redirect all the versions of the page to the main URL. This method would have allowed your existing Google links to work. Users would still find the correct page. Google would have noticed the 301 and adjusted their links.
Another option to consider IF the pages were helpful would be to keep them and use the canonical tag to indicate the URL of the primary page. This method would offer the same advantages mentioned above.
By removing the pages and allowing them to 404, everyone loses for the next month. Users who click on a search result will be taken to a 404 page rather then finding the content they seek. Google wont be offering the search results users are seeking. You will experience a high bounce rate as many users do not like 404 pages, and it will take a month for an average site to be fully crawled and the issue corrected.
If you block the pages in robots.txt, then Google wont attempt to crawl the links. In general, your robots.txt should not be used in this manner.
My recommendation is to fix this issue either with the proper 301s. If that is not an option, be sure your 404 page is helpful and as user friendly as possible. Include a site search option along with your main navigation. Google will crawl a small percent of your site each day. You will notice the number of 404 links diminish over time.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google keeps marking different pages as duplicates
My website has many pages like this: mywebsite/company1/valuation mywebsite/company2/valuation mywebsite/company3/valuation mywebsite/company4/valuation ... These pages describe the valuation of each company. These pages were never identical but initially, I included a few generic paragraphs like what is valuation, what is a valuation model, etc... in all the pages so some parts of these pages' content were identical. Google marked many of these pages as duplicated (in Google Search Console) so I modified the content of these pages: I removed those generic paragraphs and added other information that is unique to each company. As a result, these pages are extremely different from each other now and have little similarities. Although it has been more than 1 month since I made the modification, Google still marks the majority of these pages as duplicates, even though Google has already crawled their new modified version. I wonder whether there is anything else I can do in this situation? Thanks
Technical SEO | | TuanDo96270 -
Link to AMP VS AMP Google Cache VS Standard page?
Hi guys, During the link building strategy, which version should i prefer as a destination between: to the normal version (php page) to the Amp page of the Website to the Amp page of Google Cache The main doubt is between AMP of the website or standard Version. Does the canonical meta equals the situation or there is a better solution? Thank you so mutch!
Technical SEO | | Dante_Alighieri0 -
Staging & Development areas should be not indexable (i.e. no followed/no index in meta robots etc)
Hi I take it if theres a staging or development area on a subdomain for a site, who's content is hence usually duplicate then this should not be indexable i.e. (no-indexed & nofollowed in metarobots) ? In order to prevent dupe content probs as well as non project related people seeing work in progress or finding accidentally in search engine listings ? Also if theres no such info in meta robots is there any other way it may have been made non-indexable, or at least dupe content prob removed by canonicalising the page to the equivalent page on the live site ? In the case in question i am finding it listed in serps when i search for the staging/dev area url, so i presume this needs urgent attention ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
2 websites 1 Google account?
quick question, I have set up my second website purely for seasonal stock so getting it online early ready for when the time is right. But i only have a single web master tools & ad words account, would there be any problem with the single accounts having the details for both websites? or would it be wise to have separate accounts for each site?
Technical SEO | | GarethEJones0 -
Removing a URL from Search Results
I recently renamed a small photography company, and so I transferred the content to the new website, put a 301-redirect on the old website URL, and turned off hosting for that website. But when I search for certain terms that the old URL used to rank highly for (branded terms) the old URL still shows up. The old URL is "www.willmarlowphotography.com" and when you type in "Will Marlow" it often appears in 8th and 9th place on a SERP. So, I have two questions: First, since the URL no longer has a hosting account associated with it, shouldn't it just disappear from SERPs? Second, is there anything else I should have done to make the transition smoother to the new URL? Thanks for any insights you can share.
Technical SEO | | williammarlow0 -
Duplicate Content Issue: Google/Moz Crawler recognize Chinese?
Hi! I am using Wordpress multisite and my Chinese version of the website is in www.mysite.com/cn Problem: I keep getting duplicate content errors within www.mysite.com/cn (NOT between www.mysite.com and www.mysite.com/cn) I have downloaded and checked the SEOmoz report and duplicate_page_content list in CSV file. I have no idea why it says they have the same content., they have nothing in common in content . www.mysite.com is the English version of the website,and the structure is the same for www.mysite.com/cn *I don't have any duplicate content issues within www.mysite.com Question: Does google Crawler properly recognizes chinese content??
Technical SEO | | joony20080 -
Duplicate content error - same URL
Hi, One of my sites is reporting a duplicate content and page title error. But it is the same page? And the home page at that. The only difference in the error report is a trailing slash. www.{mysite}.co.uk www.{mysite}.co.uk/ Is this an easy htaccess fix? Many thanks TT
Technical SEO | | TheTub1 -
Keywords in file names vs folder names
We understand the value of a keyword phrase included in the URL. Is there more value to having that phrase in the folder name of the URL or the file name or does it matter? Example: http://www.biztoolsone.com/website-design.php or http://www.biztoolsone.com/website-design/ Which is best? Thanks, Wick Smith
Technical SEO | | wcksmith0