Slash at end of URL causing Google crawler problems
-
Hello,
We are having some problems with a few of our pages being crawled by Google and it looks like the slash at the end of the URL is causing the problem. Would appreciate any pointers on this.
We have a redirect in place that sends the "no slash" URL to the "slash" URL for all pages. The obvious solution would be to turn this off, but we're unable to figure out where the redirect is coming from. There doesn't appear to be an instruction in our .htaccess file doing this, and we've also tried "DirectorySlash Off" in the .htaccess file, but that doesn't work either. (If it makes a difference, it is a 302 redirect doing this, not a 301.)
If we can't get the above to work, then the other solution would be to somehow reconfigure the page so that it is recognizable with the slash at the end by Google. However, we're not sure how this would be done.
I think the quickest solution would be to turn off the "add slash" redirect. Any ideas on where this command might be hiding and how to turn it off would be greatly appreciated. Any tips or workarounds from people who have had similar crawl problems with Google would be great too!
Thanks!
-
Satchmo does this automatically - http://www.satchmoproject.com/docs/dev/configuration.html?highlight=trailing slash - however, as far as I can see from the documentation and forums, there's no way to disable it.
I'm unfamiliar with Satchmo, though, so I'd suggest asking on the Google Group - http://groups.google.com/group/satchmo-users/topics
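One thing that may be worth checking: Satchmo is built on Django, and Django has an APPEND_SLASH setting that, together with its CommonMiddleware, redirects a slash-less URL to the slash version when only the slash version matches a URL pattern. Whether this setting is what produces the redirect on this particular site is an assumption, not something confirmed in the Satchmo docs. As far as I know, Django's own redirect here is a permanent one, so a 302 may well be coming from somewhere else. A minimal sketch of the line to try in the project's settings file:

```python
# Django project settings file (which file Satchmo reads on your install
# is an assumption; look for the one that defines MIDDLEWARE_CLASSES).

# Django defaults this to True. With False, CommonMiddleware no longer
# redirects "/page" to "/page/"; a slash-less URL that matches no URL
# pattern then returns a 404 instead of a redirect.
APPEND_SLASH = False
```

If the redirect is still there after a server restart, it is probably not coming from this setting.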
-
Thanks, Ryan -- we're taking a look into this right now, and will let you know how it goes!
-
I think we should rule out the possibility that your CMS, an SEO extension, or another add-on for your CMS is adjusting your URLs.
Can you add a page to your site at your root that is not part of your CMS? Drop in a test.html file and see what happens.
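To see what the server does with the test file versus a CMS page, it can help to request each URL once without following redirects and look at the status code and Location header. Below is a rough sketch using only the Python standard library; the function names and the example.com URLs are made up for illustration.

```python
import http.client
from urllib.parse import urlsplit, urljoin


def fetch_once(url, timeout=10):
    """Send one HEAD request and return (status, headers) without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target)
        resp = conn.getresponse()
        # Lower-case the header names so lookups are predictable.
        return resp.status, {k.lower(): v for k, v in resp.getheaders()}
    finally:
        conn.close()


def is_slash_redirect(requested_url, status, location):
    """True if the response redirects to the same host and path plus a trailing slash."""
    if status not in (301, 302, 303, 307, 308) or not location:
        return False
    asked = urlsplit(requested_url)
    # Location may be relative ("/shop/") or absolute; resolve it first.
    target = urlsplit(urljoin(requested_url, location))
    return target.netloc == asked.netloc and target.path == asked.path + "/"


# Example use (hypothetical URLs, needs network access):
#   for url in ("http://example.com/test.html", "http://example.com/shop"):
#       status, headers = fetch_once(url)
#       print(url, status, headers.get("location"), headers.get("server"),
#             is_slash_redirect(url, status, headers.get("location")))
```

If test.html comes back with a 200 while the CMS pages come back with a 302 to the slash version, that points at the CMS (or something in front of it) rather than at Apache's directory handling.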
-
Hi Ryan -- thanks for your help.
We're hosted on a VPS, running Linux/Apache. We use Satchmo as our CMS/shopping engine. As far as I know, we haven't put explicit redirect instructions into the CMS. Do you think the CMS may be adding the slash?
-
What type of server is your site hosted on? Is it Windows or Apache? Is it shared hosting, VPS or dedicated?
What type of site do you have? Is there a CMS or other software which may modify or rewrite URLs?