Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Duplicate content issue: staging urls has been indexed and need to know how to remove it from the serps
-
duplicate content issue: staging url has been indexed by google ( many pages) and need to know how to remove them from the serps.
Bing sees the staging url as moved permanently
Google sees the staging urls (240 results) and redirects to the correct url Should I be concerned about duplicate content and request Google to remove the staging url removed
Thanks Guys
-
Thanks for helping Malika! To clarify for other readers, blocking in robots.txt after the pages have been indexed will actually prevent them from being removed from the index with a meta noindex tag, since Google won't be able to crawl the pages to see the noindex tag.
If staging URLs have been indexed already (and assuming they still need to exist), here's the steps I would take:
- Add meta noindex tags to every staging URLs
- If urgent, also do a URL removal request in Webmaster Tools (but this is usually not needed)
- Wait until the staging URLs are noindexed - you can check periodically by doing site: searches in Google.
- Only after they are noindexed, block Search Engines from crawling them with the robots.txt file.
-
Generally you'll want to hide your staging site from search engines and as Malika mentioned, the best way to do this is via robots.txt.
That lets you essentially set a rule stating that no crawlers are to access anything on that domain. Beyond that, nothing else is really relevant; if crawlers can't see your site, it doesn't matter what you do with it! You don't even need to worry about 301 redirects once this is done.
Once you apply that change in robots.txt, you may still see your staging site indexed for a little while (anywhere from hours to a couple of months) but this is normal and it will drop away soon enough.
Search engines are pretty good at determining which is the real site these days anyway!
-
Thanks for your suggestions Peter and Malika,
By the wayt The staging site had it's own url..
I think I need help with the canonical stuff, as I am not really sure how to use it.
-
Quick way to remove staging url is sending HTTP error 410 as result.
Other is to use in SearchConsole Remove URLs function https://www.google.com/webmasters/tools/url-removalAbout duplicate content - you must see actual canonical. If on stage URL there is canonical point to normal site then you shouldn't hesitating. But if staging and normal point to different URLs then you can see some algo filter.
-
I am assuming that these pages don't hold any authority or backlinks at all. You can simply delete these pages (if the purposes of these pages has been solved.
Or if you still need these pages live, use Robots.txt file to make these pages (or the whole subdomain/directory they are sitting as disallowed, no-index)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Page Indexing without content
Hello. I have a problem of page indexing without content. I have website in 3 different languages and 2 of the pages are indexing just fine, but one language page (the most important one) is indexing without content. When searching using site: page comes up, but when searching unique keywords for which I should rank 100% nothing comes up. This page was indexing just fine and the problem arose couple of days ago after google update finished. Looking further, the problem is language related and every page in the given language that is newly indexed has this problem, while pages that were last crawled around one week ago are just fine. Has anyone ran into this type of problem?
Technical SEO | | AtuliSulava1 -
URLs dropping from index (Crawled, currently not indexed)
I've noticed that some of our URLs have recently dropped completely out of Google's index. When carrying out a URL inspection in GSC, it comes up with 'Crawled, currently not indexed'. Strangely, I've also noticed that under referring page it says 'None detected', which is definitely not the case. I wonder if it could be something to do with the following? https://www.seroundtable.com/google-ranking-index-drop-30192.html - It seems to be a bug affecting quite a few people. Here are a few examples of the URLs that have gone missing: https://www.ihasco.co.uk/courses/detail/sexual-harassment-awareness-training https://www.ihasco.co.uk/courses/detail/conflict-resolution-training https://www.ihasco.co.uk/courses/detail/prevent-duty-training Any help here would be massively appreciated!
Technical SEO | | iHasco0 -
Sudden Indexation of "Index of /wp-content/uploads/"
Hi all, I have suddenly noticed a massive jump in indexed pages. After performing a "site:" search, it was revealed that the sudden jump was due to the indexation of many pages beginning with the serp title "Index of /wp-content/uploads/" for many uploaded pieces of content & plugins. This has appeared approximately one month after switching to https. I have also noticed a decline in Bing rankings. Does anyone know what is causing/how to fix this? To be clear, these pages are **not **normal /wp-content/uploads/ but rather "index of" pages, being included in Google. Thank you.
Technical SEO | | Tom3_150 -
Duplicate Content Issues with Pagination
Hi Moz Community, We're an eCommerce site so we have a lot of pagination issues but we were able to fix them using the rel=next and rel=prev tags. However, our pages have an option to view 60 items or 180 items at a time. This is now causing duplicate content problems when for example page 2 of the 180 item view is the same as page 4 of the 60 item view. (URL examples below) Wondering if we should just add a canonical tag going to the the main view all page to every page in the paginated series to get ride of this issue. https://www.example.com/gifts/for-the-couple?view=all&n=180&p=2 https://www.example.com/gifts/for-the-couple?view=all&n=60&p=4 Thoughts, ideas or suggestions are welcome. Thanks
Technical SEO | | znotes0 -
Old URLs Appearing in SERPs
Thirteen months ago we removed a large number of non-corporate URLs from our web server. We created 301 redirects and in some cases, we simply removed the content as there was no place to redirect to. Unfortunately, all these pages still appear in Google's SERPs (not Bings) for both the 301'd pages and the pages we removed without redirecting. When you click on the pages in the SERPs that have been redirected - you do get redirected - so we have ruled out any problems with the 301s. We have already resubmitted our XML sitemap and when we run a crawl using Screaming Frog we do not see any of these old pages being linked to at our domain. We have a few different approaches we're considering to get Google to remove these pages from the SERPs and would welcome your input. Remove the 301 redirect entirely so that visits to those pages return a 404 (much easier) or a 410 (would require some setup/configuration via Wordpress). This of course means that anyone visiting those URLs won't be forwarded along, but Google may not drop those redirects from the SERPs otherwise. Request that Google temporarily block those pages (done via GWMT), which lasts for 90 days. Update robots.txt to block access to the redirecting directories. Thank you. Rosemary One year ago I removed a whole lot of junk that was on my web server but it is still appearing in the SERPs.
Technical SEO | | RosemaryB3 -
How to Remove /feed URLs from Google's Index
Hey everyone, I have an issue with RSS /feed URLs being indexed by Google for some of our Wordpress sites. Have a look at this Google query, and click to show omitted search results. You'll see we have 500+ /feed URLs indexed by Google, for our many category pages/etc. Here is one of the example URLs: http://www.howdesign.com/design-creativity/fonts-typography/letterforms/attachment/gilhelveticatrade/feed/. Based on this content/code of the XML page, it looks like Wordpress is generating these: <generator>http://wordpress.org/?v=3.5.2</generator> Any idea how to get them out of Google's index without 301 redirecting them? We need the Wordpress-generated RSS feeds to work for various uses. My first two thoughts are trying to work with our Development team to see if we can get a "noindex" meta robots tag on the pages, by they are dynamically-generated pages...so I'm not sure if that will be possible. Or, perhaps we can add a "feed" paramater to GWT "URL Parameters" section...but I don't want to limit Google from crawling these again...I figure I need Google to crawl them and see some code that says to get the pages out of their index...and THEN not crawl the pages anymore. I don't think the "Remove URL" feature in GWT will work, since that tool only removes URLs from the search results, not the actual Google index. FWIW, this site is using the Yoast plugin. We set every page type to "noindex" except for the homepage, Posts, Pages and Categories. We have other sites on Yoast that do not have any /feed URLs indexed by Google at all. Side note, the /robots.txt file was previously blocking crawling of the /feed URLs on this site, which is why you'll see that note in the Google SERPs when you click on the query link given in the first paragraph.
Technical SEO | | M_D_Golden_Peak0 -
Removing URL Parentheses in HTACCESS
Im reworking a website for a client, and their current URLs have parentheses. I'd like to get rid of these, but individual 301 redirects in htaccess is not practical, since the parentheses are located in many URLs. Does anyone know an HTACCESS rule that will simply remove URL parantheses as a 301 redirect?
Technical SEO | | JaredMumford0 -
Duplicate canonical URLs in WordPress
Hi everyone, I'm driving myself insane trying to figure this one out and am hoping someone has more technical chops than I do. Here's the situation... I'm getting duplicate canonical tags on my pages and posts, one is inside of the WordPress SEO (plugin) commented section, and the other is elsewhere in the header. I am running the latest version of WordPress 3.1.3 and the Genesis framework. After doing some testing and adding the following filters to my functions.php: <code>remove_action('wp_head', 'genesis_canonical'); remove_action('wp_head', 'rel_canonical');</code> ... what I get is this: With the plugin active + NO "remove action" - duplicate canonical tags
Technical SEO | | robertdempsey
With the plugin disabled + NO "remove action" - a single canonical tag
With the plugin disabled + A "remove action" - no canonical tag I have tried using only one of these remove_actions at a time, and then combining them both. Regardless, as long as I have the plugin active I get duplicate canonical tags. Is this a bug in the plugin, perhaps somehow enabling the canonical functionality of WordPress? Thanks for your help everyone. Robert Dempsey0