URLs are not indexed
-
My website has 0.5 million pages with urls like this- **http://www.mycity4kids.com/Delhi-NCR/collage-painting-classes-%3cnear%3e-shalimar-bagh ****, **none of these urls are indexed.
Question 1- What can be the possible reason for this issue? Users see this url as : http://www.mycity4kids.com/Delhi-NCR/collage-painting-classes-<near>-shalimar-bagh</near>
The symbol "<" and ">" get converted into "%3c" and "%3e" respectively, is this the reason for these urls not getting indexed? -
Hi Prashant,
If these URLs are only created when a user searches for a term, but are not linked to anywhere in HTML either on your site or elsewhere on the Internet, this is a very good reason why they are not indexed.
Google does not usually perform queries on a site (e.g. fill in forms) to "discover" what content might be displayed when those forms are filled in. It's tried and tested method of crawling is just that - crawling links and text in HTML. It has become more adventurous with different technology and sometimes finds things that it wouldn't have previously, but linking is still the primary way to ensure something gets crawled.
In many cases, you wouldn't want Google finding content that it has to perform queries / fill in forms or searches to get to: this is how some sites create massive amounts of duplicate content by accident. So in a way, Google is doing everyone a favour by not indexing URLs like this.
We have submitted these urls on google webmaster using a "sitemap" for indexing still none of them are indexed.
I tend to think of sitemaps like road maps: they're a guide. The site itself is the road. If a map tells me that I can drive across a river but when I get there, this is no bridge, I'm not going to drive across the river. Maybe I will if I have a huge four-wheel-drive car with a snorkel, but possibly not. Maybe Google will index URLs it can't find on the web itself, but possibly not.
If you want the URLs to be found, link to them
Cheers,
Jane
-
Just Google the URL it can't be simpler, if there is a result it's indexed.
-
Thanks Martijn
Could you please tell how many ways are there to check whether a page is indexed or not?
Someone on a different forum told me that we can check "indexed status" of a url by directly googling that url, there is no need to use "site:" operator.
-
Hi Prashant,
Then that's definitely your issue, I wouldn't index any pages to be honest that I couldn't find in any navigation besides self finding the pages. I would link from any related pages to the pages that you want to have indexed.
-
Thanks Martijn,
Yes, if a user enter a search term then these URL get formed. for example if you visit : www.mycity4kids.com and enter Collage Painting Classes in 1st search box and Shalimar Bagh in 2nd search box then you will see this:** http://www.mycity4kids.com/Delhi-NCR/collage-painting-classes-<near>-shalimar-bagh</near>** url in address bar.
We have submitted these urls on google webmaster using a "sitemap" for indexing still none of them are indexed.
-
Hi Prashant,
Can these pages be found via the navigation or do you have to enter a search term to get to them? I can't find any links to the first page you've listed in your question. If that's the case than it's quite normal that Google didn't index the page as they just aren't able to find the page in the first place.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Product Pages not indexed by Google
We built a website for a jewelry company some years ago, and they've recently asked for a meeting and one of the points on the agenda will be why their products pages have not been indexed. Example: http://rocks.ie/details/Infinity-Ring/7170/ I've taken a look but I can't see anything obvious that is stopping pages like the above from being indexed. It has a an 'index, follow all' tag along with a canonical tag. Am I missing something obvious here or is there any clear reason why product pages are not being indexed at all by Google? Any advice would be greatly appreciated. Update I was told 'that each of the product pages on the full site have corresponding page on mobile. They are referred to each other via cannonical / alternate tags...could be an angle as to why product pages are not being indexed.'
Intermediate & Advanced SEO | | RobbieD910 -
How to de-index old URLs after redesigning the website?
Thank you for reading. After redesigning my website (5 months ago) in my crawl reports (Moz, Search Console) I still get tons of 404 pages which all seems to be the URLs from my previous website (same root domain). It would be nonsense to 301 redirect them as there are to many URLs. (or would it be nonsense?) What is the best way to deal with this issue?
Intermediate & Advanced SEO | | Chemometec0 -
Google and PDF indexing
It was recently brought to my attention that one of the PDFs on our site wasn't showing up when looking for a particular phrase within the document. The user was trying to search only within our site. Once I removed the site restriction - I noticed that there was another site using the exact same PDF. It appears Google is indexing that PDF but not ours. The name, title, and content are the same. Is there any way to get around this? I find it interesting as we use GSA and within GSA it shows up for the phrase. I have to imagine Google is saying that it already has the PDF and therefore is ignoring our PDF. Any tricks to get around this? BTW - both sites rightfully should have the PDF. One is a client site and they are allowed to host the PDFs created for them. However, I'd like Mathematica to also be listed. Query: no site restriction (notice: Teach for america comes up #1 and Mathematica is not listed). https://www.google.com/search?as_q=&as_epq=HSAC_final_rpt_9_2013.pdf&as_oq=&as_eq=&as_nlo=&as_nhi=&lr=&cr=&as_qdr=all&as_sitesearch=&as_occt=any&safe=images&tbs=&as_filetype=pdf&as_rights=&gws_rd=ssl#q=HSAC_final_rpt_9_2013.pdf+"Teach+charlotte"+filetype:pdf&as_qdr=all&filter=0 Query: site restriction (notice that it doesn't find the phrase and redirects to any of the words) https://www.google.com/search?as_q=&as_epq=HSAC_final_rpt_9_2013.pdf&as_oq=&as_eq=&as_nlo=&as_nhi=&lr=&cr=&as_qdr=all&as_sitesearch=&as_occt=any&safe=images&tbs=&as_filetype=pdf&as_rights=&gws_rd=ssl#as_qdr=all&q="Teach+charlotte"+site:www.mathematica-mpr.com+filetype:pdf
Intermediate & Advanced SEO | | jpfleiderer0 -
URL rewrite traffic drop
Hello, A while ago (Sep. 19 2013) we had a new url structure upgrade for products pages within our website (with all the needed 301 redirects in place,internal links & sitemaps updates), but our new urls lost the serps of the old ones and with that we experienced a big traffic drop (and since September I can't see any sign of recovery).
Intermediate & Advanced SEO | | Silviu
Here are just 3 examples of old and coresponding new urls: http://www.nobelcom.com/phone-cards/calling-Mexico-from-United-States-1-182.html
http://www.nobelcom.com/Mexico-phone-cards-182.html http://www.nobelcom.com/es/phone-cards/calling-Mexico-from-United-States-1-182.html
http://www.nobelcom.com/es/Mexico-tarjetas-telefonicas-182.html http://www.nobelcom.com/phone-cards/calling-Angola-Cell-from-Canada-55-407.html
http://www.nobelcom.com/Angola-Cell-phone-cards/from-Canada-55-407.html We followed every seo/usability rule and have no clue why this happened. Any ideea? Cheers,
S.0 -
How to deal with old, indexed hashbang URLs?
I inherited a site that used to be in Flash and used hashbang URLs (i.e. www.example.com/#!page-name-here). We're now off of Flash and have a "normal" URL structure that looks something like this: www.example.com/page-name-here Here's the problem: Google still has thousands of the old hashbang (#!) URLs in its index. These URLs still work because the web server doesn't actually read anything that comes after the hash. So, when the web server sees this URL www.example.com/#!page-name-here, it basically renders this page www.example.com/# while keeping the full URL structure intact (www.example.com/#!page-name-here). Hopefully, that makes sense. So, in Google you'll see this URL indexed (www.example.com/#!page-name-here), but if you click it you essentially are taken to our homepage content (even though the URL isn't exactly the canonical homepage URL...which s/b www.example.com/). My big fear here is a duplicate content penalty for our homepage. Essentially, I'm afraid that Google is seeing thousands of versions of our homepage. Even though the hashbang URLs are different, the content (ie. title, meta descrip, page content) is exactly the same for all of them. Obviously, this is a typical SEO no-no. And, I've recently seen the homepage drop like a rock for a search of our brand name which has ranked #1 for months. Now, admittedly we've made a bunch of changes during this whole site migration, but this #! URL problem just bothers me. I think it could be a major cause of our homepage tanking for brand queries. So, why not just 301 redirect all of the #! URLs? Well, the server won't accept traditional 301s for the #! URLs because the # seems to screw everything up (server doesn't acknowledge what comes after the #). I "think" our only option here is to try and add some 301 redirects via Javascript. Yeah, I know that spiders have a love/hate (well, mostly hate) relationship w/ Javascript, but I think that's our only resort.....unless, someone here has a better way? If you've dealt with hashbang URLs before, I'd LOVE to hear your advice on how to deal w/ this issue. Best, -G
Intermediate & Advanced SEO | | Celts180 -
Changing URL Structure
We are going to be relaunching our website with a new URL structure. My question is, how is it best to deal with the migration process in terms of old URLS appearing whilst we launch the new ones. How best should we launch the new structure, considering we've in the region of 10,000 pages currently indexed in Google.
Intermediate & Advanced SEO | | NeilTompkins0 -
Removing a Page From Google index
We accidentally generated some pages on our site that ended up getting indexed by google. We have corrected the issue on the site and we 404 all of those pages. Should we manually delete the extra pages from Google's index or should we just let Google figure out that they are 404'd? What the best practice here?
Intermediate & Advanced SEO | | dbuckles0 -
URL Length or Exact Breadcrumb Navigation URL? What's More Important
Basically my question is as follows, what's better: www.romancingdiamonds.com/gemstone-rings/amethyst-rings/purple-amethyst-ring-14k-white-gold (this would fully match the breadcrumbs). or www.romancingdiamonds.com/amethyst-rings/purple-amethyst-ring-14k-white-gold (cutting out the first level folder to keep the url shorter and the important keywords are closer to the root domain). In this question http://www.seomoz.org/qa/discuss/37982/url-length-vs-url-keywords I was consulted to drop a folder in my url because it may be to long. That's why I'm hesitant to keep the bradcrumb structure the same. To the best of your knowldege do you think it's best to drop a folder in the URL to keep it shorter and sweeter, or to have a longer URL and have it match the breadcrumb structure? Please advise, Shawn
Intermediate & Advanced SEO | | Romancing0