External Links from own domain
-
Hi all,
I have a very weird question about external links to our site from our own domain.
According to GWMT we have 603,404,378 links from our own domain to our domain (see screen 1). When we drilled down, we noticed that these come from disabled sub-domains like m.jump.co.za.
In the past we used to redirect all traffic from sub-domains to our primary www domain. But it seems that for some time Google was able to crawl some of our sub-domains. In December 2010 we fixed this so that all sub-domain traffic redirects (301) to our primary domain. Example: http://m.jump.co.za/search/ipod/ redirected to http://www.jump.co.za/search/ipod/
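To make the redirect rule concrete, here is a minimal sketch of the mapping described above: every sub-domain request keeps its path and query and only the host is swapped for the primary one. The function name `canonical_url` and the constant `CANONICAL_HOST` are made up for illustration; the real rule lived in our Apache config, not in Python.

```python
from urllib.parse import urlsplit, urlunsplit

# Primary host named in the post; everything else should 301 to it.
CANONICAL_HOST = "www.jump.co.za"


def canonical_url(url: str) -> str:
    """Return the URL a sub-domain request should be 301-redirected to.

    Hypothetical helper: keeps scheme, path, query and fragment,
    and replaces only the host with the primary www host.
    """
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(canonical_url("http://m.jump.co.za/search/ipod/"))
```

Run against the example from the post, this should give back the www version of the same path.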
The weird part is that the number of external links kept on growing and is now sitting on a massive number.
On 8 April 2011 we took a different approach and we created a landing page for m.jump.co.za and all other requests generated 404 errors. We added all the directories to the robots.txt and we also manually removed all the directories from GWMT.
Now, 3 weeks later, the number of external links just keeps on growing. Here are some stats:
11-Apr-11 - 543 747 534
12-Apr-11 - 554 066 716
13-Apr-11 - 554 066 716
14-Apr-11 - 554 066 716
15-Apr-11 - 521 528 014
16-Apr-11 - 515 098 895
17-Apr-11 - 515 098 895
18-Apr-11 - 515 098 895
19-Apr-11 - 520 404 181
20-Apr-11 - 520 404 181
21-Apr-11 - 520 404 181
26-Apr-11 - 520 404 181
27-Apr-11 - 520 404 181
28-Apr-11 - 603 404 378
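For anyone who wants to see the movement rather than the raw totals, here is a rough sketch that parses the readings above and works out the change between consecutive readings. The variable names are illustrative only, and note that the readings are not strictly daily (there is a gap between 21-Apr and 26-Apr).

```python
# Readings copied from the post: "date - link count" with spaces as
# thousands separators.
RAW = """\
11-Apr-11 - 543 747 534
12-Apr-11 - 554 066 716
13-Apr-11 - 554 066 716
14-Apr-11 - 554 066 716
15-Apr-11 - 521 528 014
16-Apr-11 - 515 098 895
17-Apr-11 - 515 098 895
18-Apr-11 - 515 098 895
19-Apr-11 - 520 404 181
20-Apr-11 - 520 404 181
21-Apr-11 - 520 404 181
26-Apr-11 - 520 404 181
27-Apr-11 - 520 404 181
28-Apr-11 - 603 404 378
"""


def parse_readings(raw: str) -> list[tuple[str, int]]:
    """Turn each 'date - count' line into a (date, count) pair."""
    rows = []
    for line in raw.strip().splitlines():
        date_text, count_text = line.split(" - ")
        rows.append((date_text, int(count_text.replace(" ", ""))))
    return rows


rows = parse_readings(RAW)

# Change between each reading and the one before it.
changes = [
    (later_date, later_count - earlier_count)
    for (_, earlier_count), (later_date, later_count) in zip(rows, rows[1:])
]

# Net change over the whole period and the single largest move.
net_change = rows[-1][1] - rows[0][1]
biggest_move = max(changes, key=lambda change: abs(change[1]))

if __name__ == "__main__":
    for date_text, delta in changes:
        print(f"{date_text}: {delta:+,}")
    print(f"net change: {net_change:+,}")
    print(f"biggest move: {biggest_move[0]} ({biggest_move[1]:+,})")
```

The count actually dropped in mid-April before the jump on 28-Apr, so the growth is not steady; most of the net increase comes from that one reading.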
I am now thinking of cleaning the robots.txt and re-including all the excluded directories in GWMT, to see if Google will then be able to get rid of all these links.
What do you think is the best solution to get rid of all these invalid pages?
-
We had 301s for about 6 months, and the old URLs did not disappear from Google. That's why we decided to change them to 404s, thinking that Google might remove them quicker. But the number of links from sub-domains just keeps on growing.
I am worried that having these problem URLs listed in robots.txt actually prevents Google from crawling them and seeing that they return a 404 and should be removed.
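The worry can be illustrated with Python's standard robots.txt parser. This is only a sketch: the `Disallow: /search/` rule is an assumption based on the example URL earlier in the thread (our real robots.txt lists more directories), and it shows how a rule-following crawler reads the file, not how Google actually behaves internally.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for m.jump.co.za; the real file lists more directories.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(ASSUMED_ROBOTS_TXT.splitlines())

# A blocked URL is never requested by a compliant crawler,
# so the 404 status behind it is never observed.
blocked_url_allowed = parser.can_fetch("Googlebot", "http://m.jump.co.za/search/ipod/")

# The landing page is not covered by the rule and may still be fetched.
landing_page_allowed = parser.can_fetch("Googlebot", "http://m.jump.co.za/")

if __name__ == "__main__":
    print("may fetch /search/ipod/:", blocked_url_allowed)
    print("may fetch /:", landing_page_allowed)
```

If the crawler is not allowed to request the URL, it has no way of learning that the page now returns a 404, which is exactly the concern with combining the two approaches.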
-
Instead of trying to manage a massive 301 list, can you just customize your 404 page to redirect?
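<?php
// In the custom 404 handler: redirect requests that arrive on a non-primary host.
// The host test below is an assumed stand-in for "{script to test page URL}".
if ($_SERVER['HTTP_HOST'] !== 'www.YourSite.com') {
    $location = "http://www.YourSite.com/";
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: {$location}");
    exit;
}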
-
Update:
There are 2 things that still puzzle me with this:
If you go to http://www.google.co.za/search?q=site:jump.co.za+-www&hl=en&rlz=1C1GPCK_enZA426ZA426&prmd=ivns&filter=0&biw=1920&bih=979 you notice all sorts of weird sub-domains, and all of these are invalid and have been removed from GWMT.
If you manage the domain m.jump.co.za in GWMT, you also notice that it still reports on keywords, queries and all sorts of data, although the site is disabled and all the URLs generate 404 errors.
There are only a few of these weird sub-domains that are causing the problems:
0www.
iiiiiwww.
iwww.
m.
wtfwww.
www.www.
wwww.
All these domains feel very familiar to me, and I am almost 100% sure that they are domains we used for testing when we found the problem on Apache, meaning Google took the data from toolbar queries and probably started indexing these sub-domains. But now I can't get rid of them, and Google seems to be out of control with these.
So the main question is probably: should we just return 404s, or should we add the URLs to robots.txt as well?