Mapping Internal Links (Which are causing duplicate content)
-
I'm working on a site that is throwing off a -lot- of duplicate content for its size. A lot of it appears to be coming from bad links within the site itself, which were caused when it was ported over from static HTML to Expression Engine (by someone else).
I'm finding EE an incredibly frustrating platform to work with, as it appears to be directing 404's on sub-pages to the page directly above that subpage, without actually providing a 404 response. It's very weird.
Does anyone have any recommendations on software to clearly map out a site's internal link structure so that I can find what bad links are pointing to the wrong pages?
-
You could try: http://validator.w3.org/checklink
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate content - working with CMS constraints
Hi, We use an industry-specific CMS and I'm struggling to figure out how we can fix duplicate content issues. Thankfully, the vendor has agreed to work on 301 vs 302 redirects. However, they aren't currently able to give us the ability to add rel=canonical tags to page headers (we've put it in their "suggestion box" which tends to take a long time, if ever, to materialize). My understanding is that the tag will not be recognized if it's in the body code, correct? (aka the part of the page we can edit from the CMS) Is there anything else I can do?
Technical SEO | | combska0 -
Https Duplicate Content
My previous host was using shared SSL, and my site was also working with https which I didn’t notice previously. Now I am moved to a new server, where I don’t have any SSL and my websites are not working with https version. Problem is that I have found Google have indexed one of my blog http://www.codefear.com with https version too. My blog traffic is continuously dropping I think due to these duplicate content. Now there are two results one with http version and another with https version. I searched over the internet and found 3 possible solutions. 1 No-Index https version
Technical SEO | | RaviAhuja
2 Use rel=canonical
3 Redirect https versions with 301 redirection Now I don’t know which solution is best for me as now https version is not working. One more thing I don’t know how to implement any of the solution. My blog is running on WordPress. Please help me to overcome from this problem, and after solving this duplicate issue, do I need Reconsideration request to Google. Thank you0 -
Caps in URL creating duplicate content
Im getting a bunch of duplicate content errors where the crawl is saying www.url.com/abc has duplicate at www.url.com/ABC The content is in magento and the url settings are lowercase, and I cant figure out why it thinks there is duplicate consent. These are pages with a decent number of inbound links.
Technical SEO | | JohnBerger0 -
Duplicate Content and URL Capitalization
I have multiple URLs that SEOMoz is reporting as duplicate content. The reason is that there are characters in the URL that may, or may not, be capitalized depending on user input. A couple examples are: www.househitz.com/Pennsylvania/Houses-for-sale www.househitz.com/Pennsylvania/houses-for-sale www.househitz.com/Pennsylvania/Houses-for-rent www.househitz.com/Pennsylvania/houses-for-rent There are currently thousands of instances of this on the site. Is this something I should spend effort to try and resolve (may not be minor effort), or should I just ignore it and move on?
Technical SEO | | Jom0 -
404's and duplicate content.
I have real estate based websites that add new pages when new listings are added to the market and then deletes pages when the property is sold. My concern is that there are a significant amount of 404's created and the listing pages that are added are going to be the same as others in my market who use the same IDX provider. I can go with a different IDX provider that uses IFrame which doesn't create new pages but I used a IFrame before and my time on site was 3min w/ 2.5 pgs per visit and now it's 7.5 pg/visit with 6+min on the site. The new pages create new content daily so is fresh content and better on site metrics (with the 404's) better or less 404's, no dup content and shorter onsite metrics better? Any thoughts on this issue? Any advice would be appreciated
Technical SEO | | AnthonyLasVegas0 -
Duplicate content, how to solve?
I have about 400 errors about duplicate content on my seomoz dashboard. However I have no idea how to solve this, I have 2 main scenarios of duplication in my site: Scenario 1: http://www.theprinterdepo.com/catalogsearch/advanced/result/?name=64MB+SDRAM+DIMM+MEMORY+MODULE&sku=&price%5Bfrom%5D=&price%5Bto%5D=&category= 3 products with the same title, but different product models, as you can note is has the same price as well. Some printers use a different memory product module. So I just cant delete 2 products. Scenario 2: toners http://www.theprinterdepo.com/brother-high-capacity-black-toner-cartridge-compatible-73 http://www.theprinterdepo.com/brother-high-capacity-black-toner-cartridge-compatible-75 In this scenario, products have a different title but the same price. Again, in this scenario the 2 products are different. Thank you
Technical SEO | | levalencia10 -
Internal Linking: Site-wide VS Content Links
I just watched this video in which Matt Cutts talks about the ancient 100 links per page limit. I often encounter websites which have massive navigation (elaborate main menu, side bar, footer, superfooter...etc) in addition to content area based links. My question is do you think Google passes votes (PageRank and anchor text) differently from template links such as navigation to the ones in the content area, if so have you done any testing to confirm?
Technical SEO | | Dan-Petrovic0 -
Duplicate Content Home Page
Hello, I am getting Duplicate Content warning from SEOMoz for my home page: http://www.teacherprose.com http://www.teacherprose.com/index html I tried code below in .htaccess: redirect 301 /index.html http://www.teacherprose.com This caused error "too many re-directs" in browser Any thoughts? Thank You, Eric
Technical SEO | | monthelie10