
    Moz Q&A is closed.

    After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

    Problems preventing Wordpress attachment pages from being indexed and from being seen as duplicate content.

    Web Design
    • SEOguy1

      Hi

      According to a Moz crawl, the WordPress attachment pages for all image uploads are being indexed and flagged as duplicate content. Or is the Yoast sitemap causing it? I see two options in Yoast SEO:

      1. Redirect attachment URLs to parent post URL.
      2. Media...Meta Robots: noindex, follow

      I set it to (1) initially, which didn't resolve the problem. Then I switched to option (2), so that the images won't be indexed but search engines would still associate them with their relevant posts and pages.

      I understand what options (1) and (2) mean, but because I chose option (2), does that mean none of the images on the website stand a chance of being indexed in search engines, Google Images, etc.?

      As far as duplicate content goes, search engines can get confused when there are two ways to reach the same page content. When, e.g., Google makes the wrong choice, a portion of traffic drops off (is lost, hence the errors), which leaves the searcher frustrated and hurts the SEO and ranking of the site, and this worsens with time.

      My goal is for all of the site's images to be indexed by Google and for none of the image attachment pages to be indexed at all. (Moz shows the image attachment pages as duplicates, and the referring source is the sitemap URL that Yoast creates.) That sitemap URL has already been submitted to the search engines, and I will resubmit it once I have resolved the attachment page issue.
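One way to check progress toward that goal is to look at the robots meta tag each page actually serves. The following is a minimal Python sketch, not part of Yoast or Moz: the class and function names are hypothetical, and it assumes the page HTML is already available as a string. Keep in mind that a robots meta tag lives in the HTML of the attachment page that carries it; the image file itself has no meta tags of its own.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives found in the given HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html):
    """True unless the page asks not to be indexed ("noindex" or "none")."""
    directives = robots_directives(html)
    return "noindex" not in directives and "none" not in directives


# Hypothetical attachment page as it might look with option (2) enabled.
attachment_page = (
    '<html><head><meta name="robots" content="noindex, follow"></head></html>'
)
print(sorted(robots_directives(attachment_page)))
print(is_indexable(attachment_page))
```

Running the same check against a parent post and against one of its attachment pages shows whether the two are being treated differently, which is what the goal above requires.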

      Please can you advise.

      Thanks.

      • SEOguy1

        Hi Kate,

        Here is an update as to what is happening so far. Please excuse the length of this message.

        • According to the host the database is fine (please see below), but WordPress is still calling https:

          • In the WP database wp-actions, http is definitely being called
          • All certificates are OK and SSL is not active
          • The WordPress database is returning properly
          • The WP database mechanics are OK
          • The WP config file is not doing https returns; it is calling http correctly

        • They said that the only other possibility could be one of the plugins causing the problem. But how can a plugin cause https problems? I can see 50 different https pages indexed in Google. Bing has been checked and there are no https pages indexed there. All internal URLs have always been http only, and that is still the case.

        • I used Google Fetch on the website pages. Of the 50 https pages, most are images, which I think probably came from the Yoast sitemap originally submitted to the search engines. More recently I have taken all media image URLs out of the Yoast sitemap and put noindex, follow on all image attachment files (the pages and the images on the pages will still be crawled and indexed by Google and other search engines; it just means the image URLs themselves won't be). What will happen to those unwanted https files, though? If I place rel canonical links on the pages that matter, will the https pages eventually drop out of the index? I just wish I could find what is causing it (analogy: best to fix a hole in the roof rather than use a bowl to catch the water each time it rains).

        • I looked at analytics today and saw something really interesting (see attached image): there are 5 instances of the home page with a trailing slash, and to my knowledge there should only be 1 for a website. The Moz crawl shows just 1 home domain, http://example.co.uk/, so I am somewhat confused. Google search results showed 256 results for https URL references, and 50 were available to click on. So perhaps there are 50 https pages being referenced for each trailing slash. Could there be 4 other trailing-slash duplicate pages indexed, and how would I fix that if so? This might sound naive, but I don't have the skillset to fix this at this time, so any help and advice would be appreciated.

        • Would the Search and Replace plugin help at all, or would it be a waste of time since the WordPress database mechanics seem to be OK?

        • I can't place any https-to-http 301 redirects for the 50 https URLs that are indexed in Google, and I can't add any https rewrite rules in htaccess, since that type of redirect will only work if an SSL is active. I already tried several redirect rules in htaccess and, as expected, they didn't work, which again probably means that SSL is not active for the site.

        • When https is entered instead of http, it should automatically resolve to http without my having to worry about it, but I tried again and the https version appears instead, shown with a red diagonal line through the https. Once a visitor lands on that page they stay on https (visually, the main nav bar contents stretch across the page and the images and videos don't appear), so the traffic drops off. That means a bad experience for the user, dropped traffic, decreasing income, and harm to SEO (split page juice, decreased rankings). There are no crawl errors in Google Search Console, and Analytics shows Google Fetch completed for all pages, but when I request fetch and render for the home page it shows as partial instead of completed.

        • I don't want to request any https url removals through Google and search engines - it's not recommended because Google states that http version could be removed as well as https.

        • I did look at this last week:

        http://www.screamingfrog.co.uk/5-easy-steps-to-fix-secure-page-https-duplicate-content/

        • Do you think that the https URLs are indexed because links pointing to the site are using https? Perhaps most of the backlinks are https, but the preferred setting in Webmaster Tools / Search Console is already set to the non-www version instead of the www version; there has never been an https version of the site.

        • This was one possibility re duplicate content. Here are two pages and the listed duplicates:

        • The first Moz crawl I ever requested came back with hundreds of duplicate errors and I have resolved this. Google crawl had not picked this up previously (so I figured everything had been ok) and it was only realised after that Moz crawl. So https links were seen to be indexed and so the goals are to stop the root cause of the problem and to fix the damage so that any https url's can drop off out of the serps and the index.

        • I considered that the duplicate links in question might not be true duplicates as such. The duplicate pages (attachment pages created by WordPress for each image uploaded to the site) have no real content, so the template elements outweighed the unique content elements, which flagged them as duplicates in the Moz tool. So I thought these were unlikely to hurt, as they were not duplicates as such but indexed thin content. I did a content audit and tidied things up as much as I could (blank pages and weak ones), hence the recent new sitemap submission and fetch to Google.

        • I have already redirected all attachments to the parent page in Yoast, and removed all attachments from the Yoast sitemap and set all media content (in Yoast) to 'noindex, follow'.

        • Naturally it's really important to eliminate the https problem before external backlinks link back to any of the unwanted https pages that are currently indexed. Luckily I haven't started any backlinking work yet, and any links I have posted in search land have all been http version.  As I understand it, most server configurations should redirect by default to http when https isn’t configured, so I am confused as to where to take this especially as the host has given the WP database the all clear.

        • It could be taxonomies related to the theme or a slider plugin, as I have learned these past few weeks. Disallowing and deindexing those unwanted https URLs would be amazing, since I have already spent weeks trying to get to the bottom of the problem.

        • From what I have learned in previous weeks, these two things would be very important:

        (1) 301 redirects from https to http (the host in this case cannot enable this directly through their servers, and I can only add these redirects in the htaccess file if there is an active SSL in place).

        (2) A canonical URL using http for both the http and https variations.

        Either of those solutions might work on its own. If the 301 redirect can't work with the host, will the canonical fix it? I saw that I could just set a canonical with a fixed transport protocol of http:// and Google will then sort out the rest. Not preferred from a crawl perspective, but would it suffice? (Even so, I don't know how to put that in place.)

        • There are around 180 W3C validation errors. Do you know whether getting these fixed would help with the problem? The homepage renders with critical errors and a couple of warnings.

        • The 907 Theme scores well for its concept and functionality but its SEO reviews aren't that great.

        • Duplicate problems are not related to the W3 Total Cache plugin which is one of the plugins in place.

        • Regarding addons (trailing slash): for example, http://domain.co.uk/events redirects to http://domain.co.uk/events/. The addon must only do this on active URLs. Even if it didn't, there were no reports of trailing-slash duplicate errors in the Moz crawl, so I would think it's a different issue that would need looking at separately.

        • At the bottom of each duplicate page there is an option for noindex. The home page is made up of page sections and parallax sections, and each has to be published to become a live part of the home page. I understand this isn't great for SEO: because only the top page section is registered in Yoast as the home page, the other sections are not crawled as part of the home page but as separate page sections. Is it OK to index those page sections? Would setting them to noindex, follow be good practice here? The theme does not automatically block the page sections from appearing in search engines.

        • Can noindex only be put on whole pages and not the specific page sections? I just want to make sure that the content on all the pages (media and text) and page sections are crawlable.

        • To ultimately fix the https problem re indexed pages out there could this eventually be a case of having to add SSL to the site just because there is no better way - just so the https to http redirect rule can be added to the htaccess file? If so, I don't think that would fix the root cause of the problem, but the root cause could be one of the plugins? Confused.

        • With canonical URLs, does that mean the https links that don't have canonicals will eventually deindex? Are the https links giving a 404? (I'm worried because normally 404s need 301s, as you know, and I can't put a 301 on an https URL in this situation.) Do I have to set a canonical for every single page on the website because of the extent of the problem?

        • Nearly all of the traffic is being dropped after visiting the home page, and I can't for the life of me see why. Is it because of all these https pages? Once canonicals are in place how long will it take for everything to return to how it should be? Is it worthwhile starting a ppc campaign or should I wait until everything has calmed down on the site?

        • Is this a case of setting the canonical URL and then the rest will sort itself out? (please see the screenshot attached regarding the 5 home pages that each have a trailing slash).

        • This is the entire current situation. I understand this might not be so straightforward, but I would really appreciate help as the site continues to drop traffic and income. Others will be able to learn from this string of questions and responses too. Thank you for reading this far and have a nice day. Kind Regards,
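The canonical idea raised in this post (one fixed http:// URL for both the http and https variations, with a consistent trailing slash) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, example.co.uk is the placeholder domain used in the post, and the rule "add a trailing slash unless the last path segment looks like a file" is an assumption about how the site's URLs are meant to look, not something WordPress or Yoast guarantees.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_http_url(url):
    """Map http/https and slash variants of a URL onto one http:// form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Collapse accidental repeated slashes such as "//".
    while "//" in path:
        path = path.replace("//", "/")
    # Assumption: page URLs end in "/", file URLs (photo.jpg) do not.
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment and "." not in last_segment:
        path += "/"
    # Force the http scheme, lowercase the host, drop any #fragment.
    return urlunsplit(("http", parts.netloc.lower(), path, parts.query, ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for a page's <head>."""
    return '<link rel="canonical" href="%s" />' % escape(canonical_http_url(url))


def group_duplicates(urls):
    """Group crawled URLs that share the same canonical form."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_http_url(url), []).append(url)
    return groups


crawled = [
    "http://example.co.uk/",
    "https://example.co.uk/",
    "https://example.co.uk",
    "http://example.co.uk/events",
    "https://example.co.uk/events/",
]
for canonical, variants in group_duplicates(crawled).items():
    print(canonical, "<-", variants)
print(canonical_link_tag("https://example.co.uk/events"))
```

Grouping a crawl export this way makes it easier to see which indexed https or slash variants are really the same page. A canonical tag is a hint to search engines rather than a redirect, so it does not stop visitors from landing on the https version.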

        • SEOguy1

          Hi Paul

          I did (1), which did not resolve the problem, so I then set media to noindex, follow.

          I have already excluded attachment URLs from the sitemap.

          When you say "When adding media, make certain the Link to box does NOT point to the attachment page", do you mean I should edit the link settings for the current images, or only for future image uploads? Or both?

          Thanks

          • ThompsonPaul

            To accomplish your goal, set up Yoast SEO to:

            1. redirect attachment URLs to parent post
            2. exclude attachment URLs from sitemap (it's a checkbox under the Post Types tab in the XML Sitemaps section of Yoast SEO Settings)
            3. leave all media indexed and followed.
            4. When adding media, make certain the Link to box does NOT point to the attachment page.

            What this accomplishes is to allow the actual image file to still be indexed and hence show up in Image search. It also ensures that the pointless image attachment pages don't waste crawl budget and don't show up to the search crawlers as thin/dupe content. Win!
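To confirm that step 2 took effect, the sitemap itself can be checked for leftover attachment entries. Below is a minimal Python sketch, not an official Yoast tool: the function names are hypothetical, it assumes the sitemap XML has already been downloaded into a string, and it assumes attachment entries can be recognised by the word "attachment" in their URL (for example a child sitemap named attachment-sitemap.xml or an ?attachment_id= parameter). Attachment pages using pretty permalinks would not be caught by this marker.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locations(xml_text):
    """Return every <loc> URL in a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def attachment_entries(xml_text, marker="attachment"):
    """Return the <loc> URLs whose text contains the marker (an assumption)."""
    return [url for url in sitemap_locations(xml_text) if marker in url.lower()]


# Hypothetical sitemap index, using the placeholder domain from this thread.
sitemap_index = (
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<sitemap><loc>http://example.co.uk/post-sitemap.xml</loc></sitemap>"
    "<sitemap><loc>http://example.co.uk/attachment-sitemap.xml</loc></sitemap>"
    "</sitemapindex>"
)
print(attachment_entries(sitemap_index))
```

An empty result suggests the attachment sitemap is no longer being advertised; any remaining entries point to where the setting may not have been applied yet.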

            Hope that helps?

            Paul
