How Google Carwler Cached Orphan pages and directory?
-
I have website www.test.com
I have made some changes in live website and upload it to "demo" directory (which is recently created) for client approval.
Now, my demo link will be www.test.com/demo/
I am not doing any type of link building or any activity which pass referral link to www.test.com/demo/
Then how Google crawler find it and cached some pages or entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). Didn't exclude the test site in robots because "we all know we won't link to it so it'll be ok". Site got indexed, and it was because a person at the company was having problems with the implementation of the test site, went to the help forum (which person didn't think would be indexed) and posted the URL to the test site.
I found the above by just putting in the URL of the test site into Google, and I saw the post in the help desk. You might try the same to see if somehow there is a rogue link.
-
Is google crawling our mails?
Is it possible?
-
Yup, correct.
I was certain I'd replied to this
Anyway, you ever notice how the ads in gmail are always relevant to the content of your emails? Google are totally reading them
-
The <conspiracy hat="">side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could have possibly pulled your link to the demo directory from that.</conspiracy>
-
Hi Barry,
Yes, We were used Gmail for reporting.
Is it make any sense??
-
<conspiracy-hat></conspiracy-hat>
Did either you or your client use gmail when you sent him the demo link?
Regardless, Dan's advice to noindex and block the directory from spiders is the future when doing development work.
-
Hi JoelHit,
NO, There is not any single refferal link to "Demo" directory from entire website and also from third party websites.
I am aware about Google Crawling and Indexing Systems.
Thanks.
-
Hi Thetjo,
I know about it.
My question is that how Google Crawl it without any referral link?
Thanks.
-
Hi Dan,
No, i am not exclude "demo" directory from robots.txt for any search engine.
I am not using wordpress its simple stattic HTML website (Not using any type of CMS).
-
Did this actually happen or are we talking about a hypothetical situation here? It could be that there is a link to the demo directory you've overlooked? Has the /demo folder perhaps been used in the past and there were still old links to it?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a .htpasswd login to the area used for client approval.
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try and ensure they don't get crawled. Also, are you using wordpress? If so, wordpress automatically pings search engines when you add a post and if you use the common sitemap plugin, when it creates the sitemap it submits it automatically to Google, so that's another way Google could have found it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Prioritise a page in Google/why is a well-optimised page not ranking
Hello I'm new to Moz Forums and was wondering if anyone out there could help with a query. My client has an ecommerce site selling a range of pet products, most of which have multiple items in the range for difference size animals i.e. [Product name] for small dog
Intermediate & Advanced SEO | | LauraSorrelle
[Product name] for medium dog
[Product name] for large dog
[Product name] for extra large dog I've got some really great rankings (top 3) for many keyword searches such as
'[product name] for dogs'
'[product name]' But these rankings are for individual product pages, meaning the user is taken to a small dog product page when they might have a large dog or visa versa. I felt it would be better for the users (and for conversions and bounce rates), if there was a group page which showed all products in the range which I could target keywords '[product name]', '[product name] for dogs'. The page would link through the the individual product pages. I created some group pages in autumn last year to trial this and, although they are well-optimised (score of 98 on Moz's optimisation tool), they are not ranking well. They are indexed, but way down the SERPs. The same group page format has been used for the PPC campaign and the difference to the retention/conversion of visitors is significant. Why are my group pages not ranking? Is it because my client's site already has good rankings for the target term and Google does not want to show another page of the site and muddy results?
Is there a way to prioritise the group page in Google's eyes? Or bring it to Google's attention? Any suggestions/advice welcome. Thanks in advance Laura0 -
Duplicate page content on numerical blog pages?
Hello everyone, I'm still relatively new at SEO and am still trying my best to learn. However, I have this persistent issue. My site is on WordPress and all of my blog pages e.g page one, page two etc are all coming up as duplicate content. Here are some URL examples of what I mean: http://3mil.co.uk/insights-web-design-blog/page/3/ http://3mil.co.uk/insights-web-design-blog/page/4/ Does anyone have any ideas? I have already no indexed categories and tags so it is not them. Any help would be appreciated. Thanks.
Intermediate & Advanced SEO | | 3mil0 -
Substantial difference between Number of Indexed Pages and Sitemap Pages
Hey there, I am doing a website audit at the moment. I've notices substantial differences in the number of pages indexed (search console), the number of pages in the sitemap and the number I am getting when I crawl the page with screamingfrog (see below). Would those discrepancies concern you? The website and its rankings seems fine otherwise. Total indexed: 2,360 (Search Consule)
Intermediate & Advanced SEO | | Online-Marketing-Guy
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screemingfrog Spider: 1,352 URLs Cheers,
Jochen0 -
Does Google only look at LSI per page or context of the Site?
From what I have read i should optimise each page for a keyword/phrase, however, I read recently that google may also look at the context of the site to see if there are other similar words. For example i have different pages optimised for Funeral Planning, funeral plans, funeral plan costs, compare funeral plans, why buy a funeral plan, paying for a funeral, prepaid funeral plans. Is this the best strategy when the words/phrases are so close or should i go for longer pages with the variations on one page or at least less pages? Thanks Ash
Intermediate & Advanced SEO | | AshShep10 -
Why is my XML sitemap ranking on the first page of google for 100s of key words versus the actual relevant page?
I still need this question answerd and I know it's something I must have changed. But google is ranking my sitemap for 100s of key terms versus the actual page. It's great to be on the first page but not my site map...... Geeeez.....
Intermediate & Advanced SEO | | ursalesguru0 -
Multiple Google+ Local (Google Place) under one email address
As a automotive dealership group, we have 15+ business listings set up under one Google+ local account. Google+ Local (Google Places) offers the ability to upload a data file for 10+ listings, so we've kept all listings under one login for efficiency. Is there any specific local SEO benefit or any general benefit at all to having each business listing set up under their own separate email address?
Intermediate & Advanced SEO | | autoczar0 -
Forced Trailing Slash on PHP Pages In Google!
Hi, have never seen this before so am hoping soemone can shed some light! Google seems to be forcing a trailing slash at the end of our php pages within the search results. For example, Google is indexing http://www.kingstoncabinets.co.uk/heat-convection.php/ (trailing slash) rather than http://www.kingstoncabinets.co.uk/heat-convection.php (without slash). The trailing slash page doesn't display properly and has no page authority. I've ran the site through Screaming Frog and there are no internal links the the trailing slash URL's, nor does there seem to be any external links to these pages. Why is Google forcing the slash? Am confused 😞
Intermediate & Advanced SEO | | Webpresence0 -
Any idea why I can't add a Panoramio image link to my Google Places page?
Hey guys & gals! Last week, I watched one of the Pro Webinars on here related to Google Places. Since then, I have begun to help one of my friends with his GP page to get my feet wet. One of the tips from the webinar was to geotag images in Panoramio to use for your images on the Places page. However, when I try to do this, I just get an error that says they can't upload it at this time. I tried searching online for answers, but the G support pages that I have found where someone asks the same question, there is no resolution. Can anyone help? PS - I would prefer not to post publicly the business name, URL, etc. So, if that info is needed, I can PM. Thanks a lot!
Intermediate & Advanced SEO | | strong11