External Links from own domain
-
Hi all,
I have a very weird question about external links to our site from our own domain.
According to GWMT we have 603,404,378 links from our own domain to our own domain (see screen 1). When we drilled down, we noticed that these come from disabled sub-domains like m.jump.co.za.
In the past we used to redirect all traffic from sub-domains to our primary www domain, but it seems that for some time Google had access to crawl some of our sub-domains. In December 2010 we fixed this so that all sub-domain traffic redirects (301) to our primary domain. For example, http://m.jump.co.za/search/ipod/ redirected to http://www.jump.co.za/search/ipod/.
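A host-wide 301 like that can be done with a couple of mod_rewrite lines in the Apache config or .htaccess serving the sub-domains — a minimal sketch, assuming a standard mod_rewrite setup (the host names are ours; adapt the pattern to your own domain):

```apache
# 301 any request on a non-www host to the same path on the primary domain
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.jump\.co\.za$ [NC]
RewriteRule ^(.*)$ http://www.jump.co.za/$1 [R=301,L]
```

In .htaccess context the captured path has no leading slash, hence the slash in the substitution.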
The weird part is that the number of external links kept on growing and is now sitting at a massive number.
On 8 April 2011 we took a different approach: we created a landing page for m.jump.co.za, and all other requests generated 404 errors. We added all the directories to the robots.txt, and we also manually removed all the directories from GWMT.
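For reference, the robots.txt on the disabled sub-domains ended up looking roughly like this (the directory names here are illustrative, not our exact list):

```
User-agent: *
Disallow: /search/
Disallow: /category/
Disallow: /products/
```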
Now, 3 weeks later, the number of external links just keeps on growing. Here are some stats:
11-Apr-11 - 543 747 534
12-Apr-11 - 554 066 716
13-Apr-11 - 554 066 716
14-Apr-11 - 554 066 716
15-Apr-11 - 521 528 014
16-Apr-11 - 515 098 895
17-Apr-11 - 515 098 895
18-Apr-11 - 515 098 895
19-Apr-11 - 520 404 181
20-Apr-11 - 520 404 181
21-Apr-11 - 520 404 181
26-Apr-11 - 520 404 181
27-Apr-11 - 520 404 181
28-Apr-11 - 603 404 378
I am now thinking of cleaning out the robots.txt and re-including all the excluded directories in GWMT, to see if Google will then be able to get rid of all these links.
What do you think is the best solution to get rid of all these invalid pages?
-
We had 301s in place for about 6 months, and the old URLs did not disappear from Google. That's why we decided to change them to 404s, thinking that Google might remove them quicker. But the number of links from sub-domains just keeps on growing.
I am worried that having these problem URLs listed in the robots.txt actually prevents Google from crawling them, so it never sees that they return a 404 and should be removed.
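That worry is well-founded: a crawler that honors robots.txt will not request a disallowed URL at all, so it can never observe the 404. A quick sketch with Python's standard-library robots.txt parser (the Disallow path is just an example, not our exact file):

```python
from urllib import robotparser

# Parse a robots.txt like the one placed on the disabled sub-domains
rp = robotparser.RobotFileParser()
rp.parse("""
User-agent: *
Disallow: /search/
""".splitlines())

# Googlebot is not allowed to fetch the blocked URL at all...
print(rp.can_fetch("Googlebot", "http://m.jump.co.za/search/ipod/"))  # False
# ...while unblocked URLs can still be crawled (and their 404 seen)
print(rp.can_fetch("Googlebot", "http://m.jump.co.za/"))  # True
```

So a URL that is both blocked in robots.txt and returning 404 tends to stay in the index as a URL-only entry, because the 404 is never observed.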
-
Instead of trying to manage a massive 301 list, can you just customize your 404 page to redirect?
<?php
// Custom 404 handler: test the requested host, and instead of serving a
// 404, send stray requests to the primary domain with a 301.
// (The host check below is one example of such a test.)
if ($_SERVER['HTTP_HOST'] !== 'www.YourSite.com') {
    $location = "http://www.YourSite.com/";
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: {$location}");
    exit;
}
-
Update:
There are 2 things that still puzzle me about this:
If you go to http://www.google.co.za/search?q=site:jump.co.za+-www&hl=en&rlz=1C1GPCK_enZA426ZA426&prmd=ivns&filter=0&biw=1920&bih=979 you will notice all sorts of weird sub-domains, all of which are invalid and have been removed from GWMT.
If you manage the domain m.jump.co.za in GWMT you will also notice that it still reports keywords, queries, and all sorts of data, even though the site is disabled and all its URLs generate 404 errors.
There are only a few of these weird sub-domains causing the problems:
0www.
iiiiiwww.
iwww.
m.
wtfwww.
www.www.
wwww.
All these domains feel very familiar to me, and I am almost 100% sure that they are the domains we used for testing when we found the problem on Apache, meaning Google took the data from the toolbar queries and probably started indexing these sub-domains. But now I can't get rid of them, and Google seems to be out of control with these.
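The Apache side of this can be closed off with a default (first-listed) VirtualHost that catches any Host header you do not explicitly serve — a sketch, assuming a standard name-based vhost setup (the ServerName here is a placeholder; it just needs to be a name no real traffic uses):

```apache
# Apache routes any unmatched Host header to the first-listed VirtualHost,
# so made-up hosts like 0www. or wtfwww. land here instead of serving the site
<VirtualHost *:80>
    ServerName catchall.jump.co.za
    RewriteEngine On
    RewriteRule ^(.*)$ http://www.jump.co.za$1 [R=301,L]
</VirtualHost>
```

In VirtualHost context the matched path already includes its leading slash, so the substitution appends `$1` directly to the domain.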
So the main question is probably: should we just serve 404s, or should we add them to the robots.txt as well?