Crawl issues / .htaccess issues
-
My site is getting crawl errors in Google Webmaster Tools. Google believes a lot of my links point to index.html when they really do not. That is not the real problem, though; the problem is that Google can't give credit for those links to any of my pages. I know I need to create a rule in the .htaccess, but the last time I did it I got an error. I need some assistance on how to go about doing this; I really don't want to lose the weight of my links.
Thanks
-
WordPress does it automatically if you've got your permalinks set up.
WordPress .htaccess should look like this:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
And it should be .php not .html anyway in WP
Is WMT finding links to .html pages from pages on your site or from external links?
-
My site was built in WordPress, so do I need to do anything differently? Also, will this code redirect http://www.mysite.com/index.html to http://www.mysite.com?
Thanks a lot
-
Are you asking for the code to redirect index.html to / ?
This should work (put the whole thing in your .htaccess and replace example with your site)
Options +FollowSymLinks
RewriteEngine On
# Redirect the bare domain to the www version
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
# Redirect any request for index.html to its directory
RewriteCond %{THE_REQUEST} ^.*/index\.html
RewriteRule ^(.*)index\.html$ http://www.example.com/$1 [R=301,L]
However, you should also change your internal links to point to the redirected version (/) and not /index.html.
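If you want to sanity-check the patterns before touching the live .htaccess, here is a rough Python sketch that mimics the intent of the two redirects above. It only approximates what Apache does; `canonical_redirect` is a made-up helper name and example.com is a placeholder, so treat it as an illustration rather than a replacement for testing on the server.

```python
import re
from typing import Optional

CANONICAL_HOST = "www.example.com"  # placeholder domain, swap in your own


def canonical_redirect(host: str, path: str) -> Optional[str]:
    """Return the 301 target for a request, or None if no redirect applies.

    `path` is the URL path with its leading slash and no query string.
    """
    # Rule 1: bare domain goes to the www host, keeping the path.
    if re.match(r"^example\.com", host, re.IGNORECASE):
        return f"http://{CANONICAL_HOST}{path}"

    # Rule 2: a request for .../index.html goes to the directory itself.
    # In .htaccess the pattern sees the path without its leading slash.
    relative = path.lstrip("/")
    if re.search(r"/index\.html", path):
        match = re.match(r"^(.*)index\.html$", relative)
        if match:
            return f"http://{CANONICAL_HOST}/{match.group(1)}"

    return None
```

For example, `canonical_redirect("www.example.com", "/blog/index.html")` gives the `/blog/` URL, while a request that is already canonical returns `None`.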
-
Sean,
Here are some resources that I have for you
http://www.webforgers.net/mod-rewrite/mod-rewrite-syntax.php
http://roshanbh.com.np/2008/03/url-rewriting-examples-htaccess.html
Hope they help you in understanding how to go about .htaccess.
As far as I understand, you are facing this issue because your internal links point to your index.html page rather than to your absolute URLs.
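To track down those internal links, something along these lines might help. It is a minimal sketch using only the Python standard library; `IndexLinkFinder` and `find_index_links` are names invented for this example, and you would need to feed it the HTML of each page yourself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class IndexLinkFinder(HTMLParser):
    """Collect hrefs that resolve to .../index.html on the page's own host."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL before checking them.
        target = urlparse(urljoin(self.page_url, href))
        if target.netloc == self.host and target.path.endswith("/index.html"):
            self.hits.append(href)


def find_index_links(page_url: str, html: str) -> list:
    """Return the raw href values on the page that point at index.html."""
    finder = IndexLinkFinder(page_url)
    finder.feed(html)
    return finder.hits
```

Any href it reports is a link you would want to change to the directory URL (for example `/` instead of `/index.html`).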
-
Could you please give some more details?