I have 404 errors but can't find where these links are?
-
The 4xx report had 0 errors, and then on the recent crawl it found over 200. They are all variations on real URLs e.g.:
Real URL:
http://www.bullseyeuk.com/10-up-deluxe-literature-holder.html
404 Error URL:
http://www.bullseyeuk.com/10-up-deluxe-literature-holder.html ��
None of them are linked to the root domain and I can't find where they are coming from.
Any ideas?
Thanks
Jack
-
I have found out where they're from! I exported the crawl report and saw, under the referring column, where the links come from. They're in a directory which I haven't blocked in robots.txt. That directory is in the process of being changed, so hopefully when the website is next crawled it won't find these URLs in the first place.
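For anyone else hunting down the same thing, here is a rough sketch of how the exported crawl report could be grouped by referring page. The column names (`URL`, `Status Code`, `Referrer`) and the sample rows are assumptions made up for illustration, so check them against the headers in your own export before relying on it.

```python
import csv
import io
from collections import defaultdict


def group_404s_by_referrer(csv_text, url_col="URL",
                           status_col="Status Code",
                           referrer_col="Referrer"):
    """Map each referring page to the 404 URLs it links to.

    The default column names are assumptions, not the real export headers.
    """
    by_referrer = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep only rows whose status code is exactly 404.
        if row[status_col].strip() == "404":
            by_referrer[row[referrer_col].strip()].append(row[url_col].strip())
    return dict(by_referrer)


# Made-up sample data standing in for an exported crawl report.
SAMPLE = """URL,Status Code,Referrer
http://www.example.com/a.html,200,http://www.example.com/
http://www.example.com/a.html%20,404,http://www.example.com/old/index.html
http://www.example.com/b.html%20,404,http://www.example.com/old/index.html
"""

if __name__ == "__main__":
    for referrer, urls in group_404s_by_referrer(SAMPLE).items():
        print(referrer, len(urls))
```

If most of the 404s share one referrer or one directory, that is probably the place to fix or block.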
Related Questions
-
Finding a specific link - Duplicating my own content
Hi Mozzers, This may be a bit of a n00b question and I feel I should know the answer but alas, here I am asking. I have a page www.website.co.uk/page/ and I'm getting a duplicate page report of www.website.co.uk/Page/. I know this is because somewhere on my website a link exists using the capitalised version. I have tried everything I can think of to find it but with no luck. Any little tricks? I could always rewrite the URLs to lowercase, but I also have downloadable software etc. on the website that I don't want to take the capitals out of. So the best solution seems to be finding the link and removing it. Most link checkers I use treat the capitalised and non-capitalised versions as the same thing, so they really aren't helping lol.
Technical SEO | | ATP0 -
Site's IP showing WMT 'Links to My Site'
I have been going through, disavowing spam links in WMT and one of my biggest referral sources is our own IP address. Site: Covers.com IP: 208.68.0.72 We have recently fixed a number of 302 redirects, but the number of links actually seems to be increasing. Is this something I should ignore / disavow / fix using a redirect?
Technical SEO | | evansluke0 -
Best way to handle pages with iframes that I don't want indexed? Noindex in the header?
I am doing a bit of SEO work for a friend, and the situation is the following: The site is a place to discuss articles on the web. When clicking on a link that has been posted, it sends the user to a URL on the main site that is URL.com/article/view. This page has a large iframe that contains the article itself, and a small bar at the top containing the article with various links to get back to the original site. I'd like to make sure that the comment pages (URL.com/article) are indexed instead of all of the URL.com/article/view pages, which won't really do much for SEO. However, all of these pages are indexed. What would be the best approach to make sure the iframe pages aren't indexed? My intuition is to just have a "noindex" in the header of those pages, and just make sure that the conversation pages themselves are properly linked throughout the site, so that they get indexed properly. Does this seem right? Thanks for the help...
Technical SEO | | jim_shook0 -
Strange 404 Error(Answered)
Hi everyone! I recently took over a new account and I was running an initial crawl on the site and a weird 404 error popped up.
http://www.directcolors.com/products/liquid-colored-antique/top
http://www.directcolors.com/applications/concrete-antiquing/top
http://www.directcolors.com/applications/concrete-countertops/top
I understand that the **top** could be referring to an actual link that brings users to the top of a page, but on these pages there is no such link. Am I missing something?
Technical SEO | | rblake1 -
Unfindable 404's
So I have noticed that my site has some really strange 404's that are only being linked to from internal links from the site. When I go to the pages that Webmaster Tools suggests, I can't actually find the link which is pointing to the 404. In that instance what do you do? Any help would be much appreciated 🙂
Technical SEO | | Adamshowbiz0 -
404 Errors & Redirection
Hi, I'm working with someone who recently had two websites redesigned. The old permalink structure consisted of domain/year/month/date/post-name. Their developer changed the new permalink structure to domain/post-name, but apparently he didn't redirect the old URLs to the new ones, so we're finding that links from external sites result in 404 errors (once I remove the date in the URL, the links work fine). Each site has 3-4 years' worth of blog posts, so there are quite a few that would need to be changed. I was thinking of using the Redirection plugin. Would that be the best way to fix this sitewide on both sites? Any suggestions would be appreciated. Thanks, Carolina
Technical SEO | | csmm0 -
Unnatural Link Warning Removed - WMT's
Hi, just a quick one. We had an unnatural link warning for one of our test sites; the message appeared on the WMT dashboard. The message is no longer there. Has it simply expired, or could this mean that Google no longer sees an unnatural backlink profile? Hoping it's the latter, but doubtful as we haven't tried to remove any links. As I say, it's just a test site. Thanks in advance!
Technical SEO | | Webpresence0 -
How do I use the Robots.txt "disallow" command properly for folders I don't want indexed?
Today's sitemap webinar made me think about the disallow feature, seems opposite of sitemaps, but it also seems both are kind of ignored in varying ways by the engines. I don't need help semantically, I got that part. I just can't seem to find a contemporary answer about what should be blocked using the robots.txt file. For example, I have folders containing site comps for clients that I really don't want showing up in the SERPS. Is it better to not have these folders on the domain at all? There are also security issues I've heard of that make sense, simply look at a site's robots file to see what they are hiding. It makes it easier to hunt for files when they know the directory the files are contained in. Do I concern myself with this? Another example is a folder I have for my xml sitemap generator. I imagine google isn't going to try to index this or count it as content, so do I need to add folders like this to the disallow list?
Technical SEO | | SpringMountain0