Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Why are bit.ly links being indexed and ranked by Google?
-
I did a quick search for "site:bit.ly" and it returns more than 10 million results.
Given that bit.ly links are 301 redirects, why are they being indexed in Google and ranked according to their destination?
I'm working on a similar project to bit.ly and I want to make sure I don't run into the same problem.
-
Given that Chrome and most header checkers (even older ones) are processing the 301s, I don't think a minor header difference would throw off Google's crawlers. They have to handle a lot.
I suspect it's more likely that either:
(a) There was a technical problem the last time they crawled (which would be impossible to see now, if it had been fixed).
(b) Some other signal is overwhelming or negating the 301 - such as massive direct links, canonicals, social, etc. That can be hard to measure.
I don't think it's worth getting hung up on the particulars of Bit.ly's index. I suspect many of these issues are unique to them. I also expect problems will expand with scale. What works for hundreds of pages may not work for millions, and Google isn't always great at massive-scale redirects.
-
Here's something more interesting.
Bitly vs tiny.cc
I used http://web-sniffer.net/ to grab the headers of both. With bitly links, I see an HTTP response header of 301 followed by a "Content" section (a response body), but with tiny.cc links I see only the redirect header.
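The same kind of check can be run without a web tool. Below is a minimal Python sketch, assuming only the standard library, that requests a short link once without following the redirect and reports the status code, the Location header, and whether a body came back with it. The names `describe_response` and `fetch_without_following` and the User-Agent string are made up for illustration; they are not part of web-sniffer or any Bitly API.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def describe_response(status, location, body):
    """Summarize one HTTP response: status, redirect target, and body size."""
    is_redirect = status in REDIRECT_CODES
    return {
        "status": status,
        "is_redirect": is_redirect,
        # Only report Location when the status is actually a redirect.
        "location": location if is_redirect else None,
        "body_bytes": len(body),
        "has_body": len(body) > 0,
    }


def fetch_without_following(url, timeout=10):
    """Request `url` once; http.client never follows redirects on its own."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # Hypothetical User-Agent; some servers answer differently per agent.
        conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
        resp = conn.getresponse()
        body = resp.read()
        return describe_response(resp.status, resp.getheader("Location"), body)
    finally:
        conn.close()
```

Calling `fetch_without_following("http://bit.ly/M5onJO")`, for example, would return a dictionary for that single hop. A 301 with `has_body` set to true would match what I see for Bitly, and a 301 with an empty body would match tiny.cc.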
Two links I'm testing:
Bitly response:
Content (0.11 KiB)
<title>bit.ly</title> <a href="https://twitter.com/KPLU">moved here</a>
-
I was getting 301->403 on SEO Book's header checker (http://tools.seobook.com/server-header-checker/), but I'm not seeing it on some other tools. Not worth getting hung up on, since it's 1 in 70M.
-
I wonder why you're seeing a 403; I still see a 200.
http://www.wlns.com/story/24958963/police-id-adrian-woman-killed-in-us-127-crash
200: HTTP/1.1 200 OK
- Server IP Address: 192.80.13.72
- ntCoent-Length: 60250
- Content-Type: text/html; charset=utf-8
- Server: Microsoft-IIS/6.0
- WN: IIS27
- P3P: CP="CAO ADMa DEVa TAIa CONi OUR OTRi IND PHY ONL UNI COM NAV INT DEM PRE"
- X-Powered-By: ASP.NET
- X-AspNet-Version: 4.0.30319
- wn_vars: CACHE_DB
- Content-Encoding: gzip
- Content-Length: 13213
- Cache-Control: private, max-age=264
- Expires: Wed, 19 Mar 2014 21:38:36 GMT
- Date: Wed, 19 Mar 2014 21:34:12 GMT
- Connection: keep-alive
- Vary: Accept-Encoding
-
I show the second one (bit.ly/O6QkSI) redirecting to a 403.
Unfortunately, these are only anecdotes, and there's almost no way we could analyze the pattern across 70M indexed pages without a massive audit (and Bitly's cooperation). I don't see anything inherently wrong with their setup, and if you noticed that big of a jump (10M - 70M), it's definitely possible that something temporarily went wrong. In that case, it could take months for Google to clear out the index.
-
I looked at all 3 redirects and they all showed a single 301 redirect to a 200 destination for me. Do you recall which one was a 403?
Looking at my original comment in the question, last month bit.ly had 10M results and now I'm seeing 70M results, which means there was a [relatively] huge increase in indexed shortlinks.
I also see 1000+ results for "mz.cm", which doesn't seem very strange, since mz.cm is just a CNAME to the bitly platform.
I found another URL shortener which has activity, http://scr.im/, and I only saw the correct pages being indexed by Google, not the short links. I wonder if the indexing is particular to bitly and/or the IP subnet behind bitly links.
I looked at another one, bit.do, and their shortlinks are being indexed. Back to square 1.
-
One of those 301s to a 403, which is probably thwarting Google, but the other two seem like standard pages. Honestly, it's tough to do anything but speculate. It may be that so many people are linking to or sharing the short version that Google is choosing to ignore the redirect for ranking purposes (they don't honor signals as often as we like to think). It could simply be that some of them are fairly freshly created and haven't been processed correctly yet. It could be that these URLs got indexed when the target page was having problems (bad headers, down-time, etc.), and Google hasn't recrawled and refreshed those URLs.
I noticed that a lot of our "mz.cm" URLs (Moz's Bitly-powered short domain) seem to be indexed. In our case, it looks like we're chaining two 301s (because we made the domain move last year). It may be that something as small as that chain could throw off the crawlers, especially for links that aren't recrawled very often. I suspect that shortener URLs often get a big burst of activity and crawls early on (since that's the nature of social sharing) but then don't get refreshed very often.
Ultimately, on the scale of Bit.ly, a lot can happen. It may be that 70M URLs is barely a drop in the bucket for Bit.ly as well.
-
I spot checked a few and I noticed some are only single 301 redirects.
And looking at the results for site:bit.ly, some even have breadcrumbs ironically enough.
Here are a few examples:
bit.ly/M5onJO
None of these should be indexed, but for some reason they are.
Presently I see 70M pages indexed for "bit.ly".
I see almost 600,000 results for "bitly.com".
-
It looks like bit.ly is chaining two 301s: the first one goes to feedproxy.google.com (FeedProxy is like AdSense for feeds, I think), and then the second 301 goes to the destination site. I suspect this intermediary may be part of the problem.
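To check how many hops a given short link goes through, the chain can be walked one response at a time. This is a rough Python sketch using only the standard library, not a description of how any particular header checker works. The names `fetch_hop`, `trace_chain`, and `summarize`, the User-Agent string, and the 10-hop limit are all assumptions chosen for illustration.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_hop(url, timeout=10):
    """Fetch `url` once without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # Hypothetical User-Agent; results may vary by agent and by tool.
        conn.request("GET", path, headers={"User-Agent": "redirect-chain-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_chain(url, fetch=fetch_hop, max_hops=10):
    """Follow redirects hop by hop; return a list of (url, status) pairs."""
    chain = []
    for _ in range(max_hops):  # the cap guards against redirect loops
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be a relative URL
    return chain


def summarize(chain):
    """Render a chain as a string such as '301 -> 301 -> 200'."""
    return " -> ".join(str(status) for _, status in chain)
```

Running `summarize(trace_chain(short_url))` on a short link would show a pattern such as a single 301 to a 200, two chained 301s, or a 301 ending in a 403. The `fetch` parameter is injectable so the tracing logic can be tried against canned responses without touching the network.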
-
I wasn't sure on this one, but found this on readwrite.com.
"Bit.ly serves up links to Calais and gets back a list of the keywords and concepts that the linked-to pages are actually about. Think of it as machine-performed auto tagging with subject keywords. This structured data is much more interesting than the mere presence of search terms in a full text search."
Perhaps this structured data is submitted to Google? Any other ideas?