Duplicate Content being reported
-
Hi
I have a new client whose first MA crawl report is showing lots of duplicate content.
The main batch of these are all the HP URL with an 'attachment' parameter at the end, such as:
www.domain.com/?attachment_id=4176
As far as I can tell it's some sort of slideshow, just showing a different image in the main frame of each page, with no other content. Each one does have a unique meta title & H1 though.
What's the best thing to do here?
-
Not a problem and leave as is
-
Use the parameter handling tool in GWT
-
Canonicalise, referencing the HP
or other solution ?
Many Thanks
Dan
-
-
Hi Dan,
Actually it looks like Ctrl+L will do it (you are creating an Excel table). You usually need to delete the first few rows of the export so that the column headers are in row 1. Then select all and create the table, checking 'My table has headers', so that you can filter using the headers.
-
Sorry Lynn, but what is the 'windows' bit in control-windows-L? I can't see it on my keyboard. Could it have a different icon/symbol etc?
-
Great stuff, thanks Lynn!! I'll tell their dev to do that
many many thanks
All Best
Dan
-
cool cheers Don
-
Hi Dan,
The robots must be getting the URLs from somewhere, so it is worth finding out where. If you download the Moz report as a CSV and open it in Excel, you can press control-windows-L to get a filterable list. If you filter for duplicates and find these URLs on the left, then the far right column should reference where they are being linked from. I suspect you will find pages in the site that have these images in them and are linking to the attachment_id URLs (often it is from gallery pages).
Once you have found the pages, try applying the Yoast redirects and see if they work as expected (i.e. they redirect the attachment_id links to the relevant gallery page, for example). Ideally you would get rid of the links completely from the code. This will probably need a bit of dev work on the template but should be pretty straightforward, since you are likely just removing the A tag from around the images.
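If Excel is awkward, the same filtering step can be sketched in Python. This is only a rough sketch: the column names "URL" and "Referrer" are assumptions (check the header row of your own export), and the sample rows, including the gallery page and the second attachment ID, are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def attachment_referrers(csv_text, url_col="URL", ref_col="Referrer"):
    """Map each referring page to the attachment_id URLs it links to.

    url_col and ref_col are assumed column names, not confirmed ones.
    """
    by_referrer = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = row.get(url_col) or ""
        if "attachment_id=" in url:
            by_referrer[row.get(ref_col) or "(unknown)"].append(url)
    return dict(by_referrer)


# Made-up sample rows standing in for the crawl export.
sample = """URL,Referrer
http://www.domain.com/?attachment_id=4176,http://www.domain.com/gallery/
http://www.domain.com/?attachment_id=4177,http://www.domain.com/gallery/
http://www.domain.com/about/,http://www.domain.com/
"""

for referrer, urls in attachment_referrers(sample).items():
    print(referrer, len(urls))
```

Grouping by referrer rather than by URL means the pages that need a template fix come out at the top level, which is usually the list you want to hand to a developer.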
-
Gotcha, definitely don't want to nix the pages then. I would imagine Lynn's response is more appropriate; it is likely that he is using a plugin whose newer versions follow better SEO practices, and he hasn't updated it yet.
-
Many thanks Don
I'll ask the client, but I don't think so (I doubt any links are pointing to them). Given the varying keyword-rich meta titles and H1s, I think the client may have implemented this for some SEO reason (he's very SEO savvy but a bit old school) and is probably not aware that a page needs more content than a pic, some meta and an H1.
On a side note, do you think these could be dragging the site's rankings down (there are 350 of them)?
All Best
Dan
-
Thanks Lynn
Yes, it is WP I think
If I click on the image, it loads the page with the next image in the series (another duplicate)
I'm not sure what the normal page is, since I can only find these via the crawl reports; they don't seem to be linked to in any site nav etc
Does that sound to you like the best solution is via the Yoast redirects etc?
On a side note, do you think these could be dragging the site's rankings down (there are 350 of them)?
Cheers
Dan
-
Hi Dan,
If these pages have no SEO value then you can just stop them from being crawled, thus preventing any duplicate content penalties. If you see some backlinks (SEO value) to any of these, then I would use a canonical instead.
robots.txt
User-agent: *
Disallow: /*attachment_id
Hope it helps,
Don
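To sanity-check which URLs a rule like Disallow: /*attachment_id would catch, here is a rough Python sketch. It assumes the wildcard behaviour Google documents for robots.txt (* matches any run of characters, a trailing $ anchors the end of the URL, and rules are matched from the start of the path plus query string). The helper names rule_to_regex and is_blocked are made up for this example, and other crawlers may interpret wildcards differently.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule):
    """Translate a robots.txt path rule using * and $ into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each * match any run of characters.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url, rule="/*attachment_id"):
    """Return True if the path and query of an absolute URL match the rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return rule_to_regex(rule).match(target) is not None


print(is_blocked("http://www.domain.com/?attachment_id=4176"))
print(is_blocked("http://www.domain.com/about/"))
```

The URLs need to include the scheme (http://), otherwise urlsplit treats the hostname as part of the path and nothing will match.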
-
Hi Dan,
Is the site running WordPress? If so, it sounds like it may be a badly coded template that is outputting links to the attachments somewhere in the code (if you click on the image in its normal page, does it take you to the duplicate URL you mention?). It would be best to find out where the linking is happening and correct it so the links are removed, if at all possible. The Yoast plugin also has a setting where you can redirect attachment IDs to their related post (it's in the permalinks settings of the Yoast plugin), which might help solve the problem.
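One way to find where the linking is happening is to scan the HTML of a suspect page for anchors that point at attachment_id URLs. A minimal Python sketch follows, using only the standard library; the sample markup is invented for illustration and the real template will look different.

```python
from html.parser import HTMLParser


class AttachmentLinkFinder(HTMLParser):
    """Collect hrefs of <a> tags that point at attachment_id URLs."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if "attachment_id=" in href:
                self.links.append(href)


def find_attachment_links(html):
    """Return every attachment_id link found in an HTML string."""
    finder = AttachmentLinkFinder()
    finder.feed(html)
    return finder.links


# Invented markup: an image wrapped in an attachment link, plus a normal link.
sample_html = (
    '<div><a href="/?attachment_id=4176"><img src="/a.jpg"></a>'
    '<a href="/contact/">Contact</a></div>'
)
print(find_attachment_links(sample_html))
```

Any page that returns a non-empty list is a candidate for the template fix, i.e. removing the A tag from around the image.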