Why are old URL formats still being crawled by Rogerbot?
-
Hi,
In the early days of my blog, I used permalinks with the following format:
http://www.mysitesamp.com/2009/02/04/heidi-cortez-photo-shoot/
I then decided to change this format using .htaccess to this format:
http://www.mysitesamp.com//heidi-cortez-photo-shoot/
My question is, why does Rogerbot still crawl my old URL format, since these URLs no longer exist on my website or blog?
-
Thanks Alan,
That solved my problem...
-
-
Hi Alan,
After disallowing the directory in robots.txt, Rogerbot still includes the non-existent URLs. Here is a sample URL that is being reported by Rogerbot:
www.lugaluda.com/2009/08/05/chase-online-banking-chase-checking-bonus/
-
If you give me the URL, I can crawl it for you if you like.
-
Thanks Alan, I really appreciate your help. That gave me an idea: since all the old URLs are coming from a virtual 2009 directory, I tried adding a Disallow statement for that directory in robots.txt. Hopefully this will help solve the problem.
I will let you know the results after rogerbot finishes recrawling my site...
Thanks Dude....
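For anyone trying the same thing, here is a minimal sketch in Python of checking that a Disallow rule for the 2009 directory actually matches the old URLs. The rule text, the "rogerbot" user-agent string, and the single-slash form of the new URL are assumptions for illustration. The check only confirms that the rule matches the path; it says nothing about what Rogerbot will report afterwards.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: block everything under the virtual /2009/ directory.
robots_lines = [
    "User-agent: *",
    "Disallow: /2009/",
]

parser = RobotFileParser()
parser.parse(robots_lines)

# Old date-based permalink from the question, and the new format
# (written here with a single slash, which is an assumption).
old_url = "http://www.mysitesamp.com/2009/02/04/heidi-cortez-photo-shoot/"
new_url = "http://www.mysitesamp.com/heidi-cortez-photo-shoot/"

# can_fetch returns True when the given user agent may crawl the URL.
old_allowed = parser.can_fetch("rogerbot", old_url)
new_allowed = parser.can_fetch("rogerbot", new_url)

print("old URL crawlable:", old_allowed)
print("new URL crawlable:", new_allowed)
```

Note that a Disallow rule only asks crawlers to skip those paths. Any links to the old URLs that are still in the site's pages remain there until they are edited out.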
-
You need to search your site. Bots start on a page and follow the links; if they report them, then they must have found them. Bots like Googlebot or Bingbot can find them on other sites, but Rogerbot only crawls within your site.
-
How will I know if they still exist on my site? If I try to access the specific URLs, they are no longer active.
-
The old format must still exist in your site somewhere; bots follow links from your home page through your site.
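One way to look for those leftover links is to scan each page's HTML for anchors that still use the date-based format. The following is a rough sketch in Python, not a full crawler: the function name, the sample HTML, and the assumption that old permalinks always contain a /YYYY/MM/DD/ segment are all illustrative.

```python
import re
from html.parser import HTMLParser

# Assumed shape of the old permalinks: a /YYYY/MM/DD/ segment such as /2009/02/04/.
OLD_FORMAT = re.compile(r"/\d{4}/\d{2}/\d{2}/")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_old_format_links(html):
    """Return the hrefs in `html` that still use the date-based format."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.hrefs if OLD_FORMAT.search(href)]


# Hypothetical page source containing one current link and one stale link.
sample_html = """
<a href="/about/">About</a>
<a href="http://www.mysitesamp.com/2009/02/04/heidi-cortez-photo-shoot/">Old post link</a>
"""

print(find_old_format_links(sample_html))
```

Running something like this over the saved HTML of the home page, archive pages, and sidebar templates should show which pages still link to the old URLs, so those links can be updated at the source.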