Historic issue with incomplete indexing
-
Hi there
We run quite a big site in the UK in the commercial real-estate space.
Historically we have always had a challenge getting our "primary" landing pages indexed, which are location-based property result pages.
e.g. https://realla.co/to-rent/commercial-property/oxford
For example, for the "towns" category we have 8,549 URLs submitted in our XML sitemap, with only 3,171 indexed. This is a general issue across all our sitemaps: 120k submitted, 80k indexed. Our pages are linked through breadcrumbs and nearby links.
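A submitted-vs-indexed gap like this is easier to track per sitemap file if you count the URLs in each one yourself. A minimal sketch, assuming standard sitemap XML (the sample URLs are just the location pages mentioned in this thread):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def count_urls(sitemap_xml: str) -> int:
    """Count <loc> entries in a standard sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return len(root.findall(f"{SITEMAP_NS}url/{SITEMAP_NS}loc"))

sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://realla.co/to-rent/commercial-property/oxford</loc></url>
  <url><loc>https://realla.co/to-rent/commercial-property/newbury</loc></url>
</urlset>"""

print(count_urls(sample))  # 2
```

Comparing these per-sitemap counts against the indexed counts Search Console reports for each sitemap shows which categories are falling behind.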
In the new search console these pages are reported as "crawled - currently not indexed"
These all sit under the folders:
site:https://realla.co/to-rent/commercial-property/*
site:https://realla.co/to-rent/office/*
We have done extensive work to optimise performance, including AMP pages.
Each location page has many details pages for individual properties e.g.
https://realla.co/to-rent/details/0ffbbd0a1a1147edb8847c5ce6179509
One action we have remaining is to nest the details under the locations pages, which may help. These details pages are indexed fully.
Any feedback much appreciated
-
Hi Ian,
The details URL should ideally have keywords in it; including the property name in the details-page URL would be a great help, e.g. https://realla.co/to-rent/details/Office-to-let-John-Eccles-House-Robert-Robinson-Avenue-Oxford-Science-Park-Oxford-OX4-4GP
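A keyword slug like the one above can be generated from the listing title. A minimal sketch (the `slugify` helper is hypothetical, not part of any site code discussed here):

```python
import re

def slugify(title: str) -> str:
    """Turn a listing title into a hyphenated URL slug,
    collapsing runs of non-alphanumeric characters."""
    return re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-")

print(slugify("Office to let, John Eccles House, Robert Robinson Avenue, Oxford OX4 4GP"))
# Office-to-let-John-Eccles-House-Robert-Robinson-Avenue-Oxford-OX4-4GP
```

In practice you would also want to keep the property ID somewhere in the URL (or in a lookup table) so slugs stay stable when titles are edited.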
About the category pages (locations, in your case): you are submitting too many of them. Your URL structure needs to be restructured, there is work to be done there, and the sitemap should be updated accordingly. For example:
https://realla.co/to-rent/commercial-property/
can be changed to
https://realla.co/commercial-property-to-rent/
I hope this helps; let me know if you have further queries.
Regards,
Vijay
-
Thanks for your reply
We are just about to nest the "details" pages under the results path, e.g. /to-rent/commercial-property/newbury/details/1294321739712973129, so each one sits under the right location.
I think this is in line with your recommendation.
We have a lot of individual sitemap files; should these be consolidated?
-
Hi Ian,
I have analyzed the website in detail. The problem seems to be that you are not giving search engine bots any differentiation between important category/sub-category pages (in your case, the different location pages) and product pages (in your case, the property details pages). The location pages' URL structure and their sitemap submission strategy can be reworked to get the desired results.
Another area for improvement is the URL structure of the property details pages. For example, https://realla.co/to-rent/details/0ffbbd0a1a1147edb8847c5ce6179509 could become https://realla.co/to-rent/details/Office-to-let-John-Eccles-House-Robert-Robinson-Avenue-Oxford-Science-Park-Oxford-OX4-4GP
Your site structure is huge, and links are presumably being generated and removed dynamically, so you need to be careful with the site structure and with how often you submit your sitemaps.
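On consolidating the many individual sitemap files: the usual approach is to keep one sitemap file per category and reference them all from a single sitemap index, so only the index needs to be submitted. A minimal sketch of rendering such an index (the sitemap file URLs are hypothetical examples):

```python
from xml.sax.saxutils import escape

def sitemap_index(sitemap_urls):
    """Render a sitemap index that points at per-category sitemap files."""
    entries = "\n".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</sitemapindex>"
    )

print(sitemap_index([
    "https://realla.co/sitemaps/towns.xml",
    "https://realla.co/sitemaps/offices.xml",
]))
```

Per-category files also make the submitted-vs-indexed gap visible per category in Search Console's sitemap report, which helps isolate where the indexing problem is concentrated.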
I hope this helps. Let me know if you have further queries, I will be happy to help.
Regards,
Vijay