Which pages should I index or have in my XML sitemap?
-
Hi there,
My website is ConcertHotels.com - a site that helps users find hotels close to concert venues. I have a hotel listing page for every concert venue on the site - about 12,000 of them, I think (and the same again for nearby restaurants).
e.g.
https://www.concerthotels.com/venue-hotels/madison-square-garden-hotels/304484
Each of these pages lists the hotels near that concert venue. Users who click on an individual hotel are taken through to a hotel (product) page, e.g.
https://www.concerthotels.com/hotel/the-new-yorker-a-wyndham-hotel/136818
I made a decision years ago to noindex all of the /hotel/ pages, since they don't have a huge amount of unique content and aren't the pages I'd like my users to land on. The primary pages on my site are the /venue-hotels/ listing pages.
I have similar pages for nearby restaurants, so there are approximately 12,000 venue-restaurants pages, again, one listing page for each concert venue.
However, while all of these pages are potential money-earners, in reality the vast majority of hotel bookings have come from a fraction of the 12,000 venues. I would say 2,000 venues are key money-earning pages, a further 6,000 have generated a low level of income, and 4,000 are yet to generate any income.
I have a few related questions:
- Although there is potential for any of these pages to generate revenue, should I be brutal and simply delete a venue page if it hasn't generated revenue within a given time period - accepting that, while it "could" be useful, it hasn't proven to be and isn't worth the link equity? Or should I noindex these "poorly performing" pages?
- Should all 12,000 pages be listed in my XML sitemap? Or only the ones that are generating revenue - or perhaps just the ones that have generated significant revenue in the past and have proved most important to my business?
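For what it's worth, filtering the sitemap would be trivial to script on my side. Here's a rough sketch of the kind of thing I have in mind (the sample rows, revenue figures and cutoff are all made up - the real data lives in my database):

```python
# Rough sketch: build a sitemap containing only the venue pages that have
# earned revenue above some cutoff. The rows and the cutoff are placeholders.
from xml.sax.saxutils import escape

venue_pages = [
    # (url, lifetime_revenue) - hypothetical sample rows
    ("https://www.concerthotels.com/venue-hotels/madison-square-garden-hotels/304484", 15200.00),
    ("https://www.concerthotels.com/venue-hotels/example-small-venue-hotels/000000", 0.00),
]

REVENUE_CUTOFF = 1.00  # include any venue that has ever generated revenue

def build_sitemap(rows, cutoff):
    urls = [url for url, revenue in rows if revenue >= cutoff]
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + entries + "\n</urlset>\n"
    )

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write(build_sitemap(venue_pages, REVENUE_CUTOFF))
```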
Thanks
Mike
-
-
Hi Chris,
Thank you very much for your help and suggestions - it's much appreciated. I'll de-noindex a handful of my biggest artist pages and see whether they attract much interest from users.
As for the /venues/ pages, these have been fairly neglected to date, so perhaps I need to really focus my attention on them, as you say, and bring in some cross-referencing.
I have also wondered whether allowing companies to create pages dedicated to their events would be a good route to take - it could be done with ease, so perhaps I should investigate further.
Again, thanks very much, and hopefully I can report back with good news at some point.
Best wishes
Mike
-
I think they should be indexed, but keyword research should shed light on this topic for you. It will let you know whether your audience is searching for those things, and in what numbers. Even as they are, though, they might make sufficient landing pages for Google. You could de-noindex a group of those pages at a time, starting with the ones most likely to be popular, and see how Google treats them. I think I'd go that route rather than release them into the wild all at once.
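Just to sketch what I mean by releasing them in groups - something like the following, where the sample pages, the traffic numbers and the set_indexable() hook are all placeholders for whatever your CMS actually exposes:

```python
# Sketch of a staged de-noindex: release the most popular artist pages first,
# in manageable batches, and watch how Google treats them before releasing more.
artist_pages = [
    # (url, monthly_pageviews) - hypothetical numbers
    ("https://www.concerthotels.com/artist/hotels-near-guns-n-roses-events/1227", 4200),
    ("https://www.concerthotels.com/artist/hotels-near-example-smaller-act-events/0000", 35),
]

BATCH_SIZE = 100  # one manageable group at a time, not all pages at once

def next_batch(pages, batch_size):
    """Most popular pages first - the ones most likely to earn impressions."""
    ranked = sorted(pages, key=lambda p: p[1], reverse=True)
    return [url for url, _ in ranked[:batch_size]]

def set_indexable(url):
    # Placeholder: in reality this would flip the robots meta in your CMS.
    print("removing noindex from", url)

for url in next_batch(artist_pages, BATCH_SIZE):
    set_indexable(url)
```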
To me, the pages with the most interesting potential are the /venues/ pages, like /venues/md-concert-venues/a, for example. I think the potential lies in populating them with venue grouping, upcoming artist grouping, and state. How hard would it be to populate an area above the black line with all or some of the upcoming artists playing near the hotels that show on that page? That three-way cross-referencing would make those pages fairly unique on the web and unique on your site, and would give Google a number of good reasons to send traffic there. They'd probably be good pages to publish advertising on, too.
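To illustrate the kind of cross-referencing I mean, here's a rough sketch - every record, venue name and field in it is invented, since I have no idea how your data is actually organized:

```python
# Rough sketch of the three-way cross-reference for a state-level /venues/ page:
# group venues by state, then attach upcoming artists and nearby hotels to each.
from collections import defaultdict
from datetime import date

venues = [
    {"name": "Example Venue A", "state": "MD"},
    {"name": "Example Venue B", "state": "MD"},
]
events = [
    {"venue": "Example Venue A", "artist": "Example Artist 1", "date": date(2030, 7, 12)},
    {"venue": "Example Venue B", "artist": "Example Artist 2", "date": date(2030, 8, 3)},
]
hotels = [
    {"venue": "Example Venue A", "hotel": "Example Hotel X"},
    {"venue": "Example Venue B", "hotel": "Example Hotel Y"},
]

def state_page_data(state_code, today=None):
    """Venue -> upcoming artists and nearby hotels, for one state page."""
    today = today or date.today()
    page = defaultdict(lambda: {"artists": [], "hotels": []})
    in_state = {v["name"] for v in venues if v["state"] == state_code}
    for name in in_state:
        page[name]  # make sure every in-state venue appears, even with empty lists
    for e in events:
        if e["venue"] in in_state and e["date"] >= today:
            page[e["venue"]]["artists"].append((e["artist"], e["date"].isoformat()))
    for h in hotels:
        if h["venue"] in in_state:
            page[h["venue"]]["hotels"].append(h["hotel"])
    return dict(page)

print(state_page_data("MD"))
```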
I'm also wondering whether there is such a thing as "licensing" dedicated pages out to companies/hotels that are putting on non-musical events like conferences, so they can link to a kind of pre-fab hotels-close-by page for their attendees.
-
Thanks Chris,
I appreciate your comments. Google is indexing a high percentage of the key pages and does not have any noindex pages indexed. Pages are loading at a decent speed, and only indexed pages are in the sitemap. So perhaps the non-performing pages are not something I should be particularly concerned about, especially as they don't take up much of my time. I guess if I start to run into issues with overall site speed, that would be the time to consider whether they should continue to be listed. So perhaps you're right - it's more of a business decision than an SEO one.
I have a further question if you don't mind, which is related but I think is an SEO one. I have a large number of /artist/ pages - these are pages that list which venues a particular artist is performing at, and allow a user to then check hotel availability for the specific venue and date they will be attending. At the minute the pages are fairly light on content - they just list venues and dates - although I'm planning to start introducing more content in the near future. An example page can be seen here:
https://www.concerthotels.com/artist/hotels-near-guns-n-roses-events/1227
At the minute, I've noindexed every artist page on the site because I was worried Google would see them as thin pages. But I actually think they are potentially very useful to users, and powerful landing pages for quickly taking a user to the correct venue page with the correct dates for the concert. I also think that not all users will search for "Hotels near MetLife Stadium" - they might instead search for "Hotels for Guns n roses in NJ" and so on, so perhaps I can pick up some long-tail searches with these additional landing pages.
The question is, should I index these pages?
If the answer to that question is yes: obviously, artists and bands do a tour and then generally disappear into a recording studio for a year or two. As a result, there will be many /artist/ pages that, for a while, have lots of useful event dates and venues listed, but at the end of the tour those pages will simply be empty and no longer useful, at least until the next tour. Would you recommend that such pages are indexed while there are events listed, but set to noindex once no future events remain?
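To make that concrete, the template logic I'm imagining is something like this sketch - the function and field names are made up, and the event data would come from my own database:

```python
# Sketch of the conditional robots tag I have in mind for /artist/ pages:
# indexable while future events exist, noindex once the tour is over.
from datetime import date

def robots_meta_for_artist(upcoming_events, today=None):
    """Return the robots meta tag an artist page would emit."""
    today = today or date.today()
    has_future_events = any(e["date"] >= today for e in upcoming_events)
    if has_future_events:
        return '<meta name="robots" content="index, follow">'
    # No tour dates listed: drop the page back out of the index for now.
    return '<meta name="robots" content="noindex, follow">'

# Example: an artist with one future date vs. one with none
on_tour = [{"venue": "MetLife Stadium", "date": date(2030, 6, 1)}]
print(robots_meta_for_artist(on_tour))   # index, follow
print(robots_meta_for_artist([]))        # noindex, follow
```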
Many thanks
Mike
-
Mike,
I'm wondering... is that an SEO question? It sounds like a business decision to me. From what you've said, I don't see any reason for Google to ding you on anything. My only questions would be: is Google indexing all the pages you want it to, and has it kept your noindexed pages out of the index? Any bad links coming in? Are pages loading at a decent speed? Oh, and I don't see a reason to have your noindexed pages in the sitemap.
Other than that, if those non-performing pages are taking up time that you could be spending on more productive pages or on exploring more productive opportunities, then, again, it's time to put on your CEO cap.
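If it helps, that last check is easy to script. Here's a rough sketch using Python and the requests library - the sitemap location is just a guess, so point it at whatever file you actually submit:

```python
# Rough sketch: flag any sitemap URL that serves a noindex signal,
# either via an X-Robots-Tag header or a robots meta tag in the HTML.
import re
import xml.etree.ElementTree as ET
import requests

SITEMAP_URL = "https://www.concerthotels.com/sitemap.xml"  # example location only
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(sitemap_url):
    root = ET.fromstring(requests.get(sitemap_url, timeout=30).content)
    return [loc.text for loc in root.findall(".//sm:loc", NS)]

def has_noindex(url):
    resp = requests.get(url, timeout=30)
    if "noindex" in resp.headers.get("X-Robots-Tag", "").lower():
        return True
    # Naive meta-tag check - attribute order can vary in real markup.
    return bool(re.search(
        r'<meta[^>]+name=["\']robots["\'][^>]+content=["\'][^"\']*noindex',
        resp.text, re.IGNORECASE))

for url in sitemap_urls(SITEMAP_URL):
    if has_noindex(url):
        print("noindex page found in sitemap:", url)
```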