Ben,
I doubt that crawlers are going to access the robots.txt file for each request, but they still have to validate any URL they find against the list of blocked ones.
Glad to help,
Don
Hi Bob,
About nofollow vs. blocked: in the end I suppose you get the same result, but in practice it works a little differently. When you nofollow a link, it tells the crawler, as soon as it encounters the link, not to request or follow that link path. When you block it via robots.txt, the crawler still attempts to access the URL only to find it is not accessible.
Imagine if I said go to the parking lot and collect all the loose change in all the unlocked cars. Now imagine how much easier that task would be if all the locked cars had a sign in the window that said "Locked": you could easily ignore the locked cars and go directly to the unlocked ones. Without the sign you would have to physically check each car to see if it will open.
About link juice: if you have a link, juice will be passed regardless of the type of link. (You used to be able to use nofollow to preserve link juice, but no longer.) This is a bit unfortunate for sites that use search filters, because those filters are such a valuable tool for users.
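For reference, here is roughly what each approach looks like; the "/filter/" path is just a placeholder for illustration, not your actual URL structure.

A nofollowed internal link:
<a href="/filter/red" rel="nofollow">Red products</a>

Blocking the same path in robots.txt (the crawler checks this list before requesting the URL):
User-agent: *
Disallow: /filter/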
Don
Hi Bob,
You can "suggest" a crawl rate to Google by logging into your webmasters tools on Google and adjusting it there.
As for indexing pages: I looked at your robots.txt and site. It really looks like you need to employ some nofollow on some of your internal linking, specifically on the product page filters; that alone could reduce the total number of URLs the crawler even attempts to look at.
Additionally, your sitemap http://premium-hookahs.nl/sitemap.xml shows a change frequency of daily, and it probably should be broken out between pages and images, so you end up with two sitemaps, one for images and one for pages. You may also want to review what is in there; using ScreamingFrog (free), the sitemap I made (link) only shows about 100 URLs.
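As a rough sketch, you could then point crawlers at both from your robots.txt; the filenames below are just examples, not your actual files:

Sitemap: http://premium-hookahs.nl/sitemap-pages.xml
Sitemap: http://premium-hookahs.nl/sitemap-images.xml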
Hope it helps,
Don
Hi Will,
I'm of two minds when it comes to directories. My general advice would be to ignore them altogether, unless there are some very industry-specific ones that make sense. I say general advice because the vast majority of industries I have researched have only one known good directory (Dmoz.org); the rest are, at best, relic sites that have basically run their course in usefulness and give little to no value in terms of traffic or link juice. Why? Because it is atypical for somebody to use anything other than Google / Yahoo / Bing / Baidu to find anything on the internet.
That being said, I do place some value on directories for some specific industries and for lead generation. For example, in my current industry there is a site that has been around since the '90s, and many people, before the rise of search engine dominance, found it a great resource for finding business-to-business partnerships. Many of the people who got acclimated to the site are still working today and use it as their go-to source for specific project requirements. In other words, they have used it for so long, and it has worked for so long, that they never found the need to branch out and rely on search engines. And in all honesty, even Google would have a hard time returning pertinent results for, let's say, a rubber manufacturer with experience overmolding FDA-approved Buna-N rubber to an aluminum substrate. But the good directory sites can list those sorts of capabilities.
Because this is a public question I had to give both of my opinions on directory sites. Again, I wouldn't seek them out as any form of link building, but I also wouldn't ignore ones that seem capable of delivering either traffic or leads. I will say that, with the exception of Dmoz.org, any of the good directory sites I have run across are very industry-specific, and they are certainly not free.
Hope that helps
Don
Hello Bob,
Here is some food for thought. If you disallow a page in robots.txt, Google, for example, will not crawl that page. That does not, however, mean they will remove it from the index if it had previously been crawled. They simply treat it as inaccessible and move on. It will take some time, months even, before Google finally says "we have no fresh crawls of page X, it's time to remove it from the index."
On the other hand, if you specifically allow Google to crawl those pages and serve a noindex tag on them, Google now has a new directive it can act upon immediately.
So my evaluation of the situation would be to do one of two things:
1. Remove the disallow from robots.txt and allow Google to crawl the pages again. However, this time use noindex, nofollow tags.
2. Remove the disallow from robots.txt and allow Google to crawl the pages again, but use canonical tags pointing to the main "filter" page to prevent further indexing of the specific filter pages.
Which option is best depends on the number of URLs being indexed: for a few thousand, canonical tags would be my choice; for a few hundred thousand, noindex would make more sense.
Whichever option you choose, you will have to ensure Google re-crawls, and then allow them time to re-index appropriately. Not a quick fix, but a fix nonetheless.
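For reference, the tags themselves are simple; the URL below is just a placeholder, not your actual filter page:

Option 1, in the <head> of each filter page:
<meta name="robots" content="noindex, nofollow">

Option 2, in the <head> of each filter page, pointing at the main filter page:
<link rel="canonical" href="http://www.example.com/main-filter-page/">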
My thoughts and I hope it makes sense,
Don
Hi James,
For page load, network, and speed tests I have used Pingdom.com in the past. They recently went more pay-to-use, but it is a nice set of tools for basic tests. There is still some free stuff you can use at tools.pingdom.com.
For a quick SEO pass, man, I love ScreamingFrog! You can quickly identify header errors, long titles and descriptions, lack of H1 tags, and so much more. When I do general quick audits for troubleshooting problems posted on this board, it's my go-to.
Hope this helps,
Don
Hi,
You may also want to check the domain variations. http://penn-criminallawyers.com/ and http://www.penn-criminallawyers.com/
If you look at the www version you will see a spam score of 4/10. You'll need to make sure you set the filter to "this root domain"
It could be that at some point the website used the www version but has since switched to non-www. There are some inconsistencies like that with the tool. Technically, "www" and "non-www" are two different hostnames; in practice we use them interchangeably.
Hope this helps,
Don
Hi Netkernz_ag,
It is just good practice to have those types of pages available. While I wouldn't say it is an absolute requirement, it should be something you do for your users. The page you pointed to is a general checklist of things to do, and not to do, for your users. Creating a site index may be a bit dated, but I still tend to do them as they are fairly easy to create (example).
Hope this helps,
Don
Hi James,
An interesting question. My initial thought is that the Yoast article is wrong, or at the very least kind of wrong, in the sense that no one strategy is going to work for every site.
Do you have a link?
Interested to hear other opinions as well.
Don
Hello,
Yes, multiple sitemaps are okay, and sometimes even advised!
You can read Google's official response here: "...it's fine for multiple Sitemaps to live in the same directory (as many as you want!)..."
And you can see a case study here on Moz showing how multiple sitemaps have helped traffic.
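If you want to keep things organized, you can also tie them together with a sitemap index file. This is a minimal sketch; the domain and filenames are just placeholders:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap-pages.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap-images.xml</loc>
  </sitemap>
</sitemapindex>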
Hope this helps,
Don
The basic question (correct me if I'm wrong) is how to rank better for Image searches.
The answer has little to do with the options you listed.
The best way to achieve this
Hope this helps
Hi Malika,
Blocking image directories or images themselves in robots.txt only prevents the image from being added to image search results. You will still get the full benefit of the alt text on the page; the image just won't appear in the image results.
How this actually works is: the crawler will crawl the site and index all the text and weighting (h1, h2, alt, etc.); then, when the crawler moves to add the image to the image search cache, it finds it can't access it due to robots.txt, so it simply ignores it and moves on. This leaves your original text indexed as a search result, and nothing in image results.
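For what it's worth, that sort of robots.txt block typically looks something like this; the directory name is just an example, not your actual setup:

User-agent: *
Disallow: /images/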
If you are using Apache you may want to not use robots.txt as the method of blocking images. I would recommend using the .htaccess file with a code like this...
<filesmatch ".(bmp|gif|jpg|png|tif)$"="">Header set X-Robots-Tag "noindex"</filesmatch>
This is a blanket declaration and would prevent indexing of any images with the noted extensions on your site. This is particularly useful if you have multiple image directories. Furthermore, if there are a few images you do want indexed, you could pick a particular extension, like .jpeg for example (note jpeg, not jpg), then just convert those few images and know they will be indexed since they are not in the exclusion list.
Another benefit of handling it this way is that if you already have images that are indexed, using the noindex tag will get them out of the image results much faster than blocking them. The reason is you are giving Google a new directive, "noindex"; otherwise they will just treat the images as inaccessible and move on, leaving any cached version to appear in image results for some time.
Hope that makes sense and helps,
Don
A dash is considered a word separator. Similar topics can be found here regarding dashes vs. apostrophes, though those mostly deal with dashes in a URL.
The problem you're up against is one that I've personally dealt with on my site: o-ring vs o ring vs oring vs o'ring.
Do you use them all? Do you target just one? How does Google treat each one?
Well, there is NO good answer I have seen. If you target more than one variant and they are treated as the same keyword, then you may be hit for spamming; but if you don't, you may lose potential traffic and sales.
The only thing I can offer is more of a suggestion than anything. Go to Google and plug the phrases into the keyword analysis tool found in the AdWords account tools. In my case I found that three of the four of my keywords had the exact same monthly searches. This told me that those three words, at least, are being treated the same. So we picked the correct US English spelling and targeted that word.
In your case, I would assume AK-47 and AK 47 are treated the same, so you are really left with targeting just the two variants you have: AK-47 and AK47.
Great response Egol.
I'm always impressed with people who can think outside the box. Our company was recently purchased by a much larger corporation, and instead of trying to roll us together they recognized that, kept separate but with their additional resources, we are more valuable.
My personal experience has been up to six months.
May not be the best of news for you, but I certainly wouldn't change it back. That could further complicate things.
For now I would continue link building where you can and give it some more time.
Hope this helps,
Don
Hi Satish,
Google will treat your title and description as suggestions. The reason they do this is that they want to provide the best possible description to their users. In such cases they may generate their own description based on on-page words, alt text, your meta description, or a combination of all three.
This is nothing new; Google does this to everybody. You can view a video by Matt Cutts on this exact situation here.
https://www.youtube.com/watch?v=L3HX_8BAhB4
Hope this helps,
Don
Hi Satish,
Then my first response is appropriate. Google will use whatever title / description it feels is best; again, I recommend you watch the YouTube video from Matt Cutts (former Google webspam lead) about why that is.
In your particular case it could be that "Doctor Deepu Chandru" is driving more traffic to the site than your focus keywords, so as long as people keep engaging with Google's suggestion, Google will keep using it. Or perhaps you have made changes and not allowed Google enough time to re-index the site.
My suggestion: remove the H1 tags from the doctors' names; you are using multiple H1 tags there, which could be a bit confusing for Google. Change them to H2 tags. Then under your logo, write the page title "Liposuction Surgery and Laser Hair Removal" in an H1. This will tell Google the most important thing on this page is those specific keywords. Give it one to three weeks for the site to get re-crawled and for Google to update its index, and you should see the difference.
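Roughly, the change would look like this; the markup is just a sketch of the idea, not your actual template:

Before:
<h1>Doctor Deepu Chandru</h1>

After:
<h1>Liposuction Surgery and Laser Hair Removal</h1>
<h2>Doctor Deepu Chandru</h2>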
Hope this helps,
Don
I really appreciate all of you taking the time to tell me about your experiences.
I am only sorry I could not mark more than three of you as good answers. All your input was welcome and helpful to me in one way or another.
I hope to continue to see you on the boards, maybe just not as much.
Don
To follow up on what Keszi has said, it is not uncommon for this value to fluctuate from update to update. The reason is that there are so many things that factor into this score. This is further complicated by the fact that the crawl will not cover every domain every time. This means you may see fewer or more "linking domains" and "linking C-blocks," which are factors in DA and PA.
I know when you see a 10% fluctuation you may get a little worried. However, when DA is in the 20s the fluctuation will be more prominent than if you were up in the 40s. The reason is that it is exponentially harder to reach higher domain authority, which means there will be a larger pool of linking domains that is less likely to change much if a few are missed on each crawl.
Hope this helps,
Don