DMOZ help
-
So yesterday I got a DMOZ editor account. I would like to know if Google indexes the editor profile pages on DMOZ:
http://www.dmoz.org/public/profile?editor=
here are some examples
http://www.dmoz.org/public/profile?editor=thehelper
http://www.dmoz.org/public/profile?editor=raph3988
http://www.dmoz.org/public/profile?editor=skasselea
I would like to know if it is worthwhile building up this page so it will pass link juice. Also, can anyone tell me how frequently Google crawls for new editors (if that's even possible to know)?
-
Hello,
I wouldn't bet on it, but there's no harm in trying.
-
You can confirm this yourself.
First, do a Google search for site:www.dmoz.org/public/profile
You'll see the meta descriptions aren't shown in the results. Instead, Google puts a default message, with a link to this page: https://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449 - check that out. Note the paragraph:
"While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results."
So whilst they may appear in Google's index (and indeed in OSE) because of the links pointing to them, the content isn't crawled at all - at least not by any spiders that obey robots.txt.
-
Oh yes, he is correct - good call, Neil. I had no idea that the robots.txt would be publicly accessible; I've actually never seen a site with its robots.txt visible. I guess it's the "open source"...
-
Can anyone confirm this?
-
Take a look at their robots.txt - http://www.dmoz.org/robots.txt
They disallow the /public and /editors subfolders. The editor pages, whilst indexed by Google, aren't crawled - so whilst the location of the pages themselves is indexed (because of links to those pages), the contents of the pages aren't. That obviously includes any links on those pages too.
For this reason, I don't agree with Reload Media. For me, there's no point expending any effort promoting the page for link equity benefit.
The fact they show good authority on OSE is something of an anomaly. They can accrue authority (and indeed Google PR) from their inbound links; however, they are a bit of a dead end, due to the fact that no actual content is indexed.
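As a quick sanity check, robots.txt rules like these can be tested programmatically. Here's a minimal sketch using Python's standard-library robotparser, with the two Disallow rules described above hard-coded as a simplified excerpt (rather than fetching the live file):

```python
from urllib import robotparser

# Simplified excerpt of the rules discussed above - not the full live file.
rules = """\
User-agent: *
Disallow: /public
Disallow: /editors
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Obedient crawlers may not fetch the editor profile pages...
print(rp.can_fetch("*", "http://www.dmoz.org/public/profile?editor=thehelper"))  # False
# ...but the rest of the site is fair game.
print(rp.can_fetch("*", "http://www.dmoz.org/Arts/"))  # True
```

Bear in mind this only tells you what a compliant crawler will fetch; per the Google support page quoted earlier, blocked URLs can still end up in the index via external links, just without their content.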
-
Hi Raphael,
Well done on getting an editor account. Remember, with great power comes great responsibility.
Yes, they do get indexed. The way to check this is to Google the URL in quotes, i.e. "http://www.dmoz.org/public/profile?editor=thehelper"
Some of those editor pages have great Authority. http://www.opensiteexplorer.org/links?site=www.dmoz.org%2Fpublic%2Fprofile%3Feditor%3Dthehelper
If it's related to your niche, then it would be worth pursuing.
Hope that helps
Iain - Reload Media
-
Using http://pro.seomoz.org/tools/on-page-keyword-optimization you can check individual pages. In the keyword field I put "the helper" and in the URL field I put http://www.dmoz.org/public/profile?editor=thehelper... so it seems like it does get indexed : )