Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should pages of old news articles be indexed?
-
My website published about 3 news articles a day and is set up so that old news articles can be accessed through a "back" button with articles going to page 2 then page 3 then page 4, etc... as new articles push them down. The pages include a link to the article and a short snippet.
I was thinking I would want Google to index the first 3 pages of articles, but after that the pages are not worthwhile. Could these pages harm me and should they be noindexed and/or added as a canonical URL to the main news page - or is leaving them as is fine because they are so deep into the site that Google won't see them, but I also won't be penalized for having week content?
Thanks for the help!
-
Ah I'm sorry I misinterpreted you - so it's essentially about pagination? Rel Next/Rel Previous is probably the best way to go - the first page will be given the equity and the pages won't have to compete with each other for ranking. Google have a pretty comprehensive guide: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1663744
-
Thanks Alice, but my question is about the page where the article is linked from not the actual article itself ( which 100% is staying indexed )
-
Hi Sara,
If the articles are time sensitive but high quality, I wouldn't noindex them. They could still have value in the future (for example, if a related story comes up, you can link back to the old article). You might also find ways to refresh or recycle them, such as adding a follow up, updating the information, or promoting a really great post "From Our Archives". They could also be a good longtail source of traffic for people looking for information on past news/events.
Google will be able to index old and outdated articles, but it's smart enough to know that these posts are old and outdated and therefore won't assign big chunks of page rank to them.
However if the articles are low quality, I would take action to improve the good content/poor content ratio. The ideal situation would be to improve the articles themselves, but that might not be a feasible solution if you've been publishing three per day for an extended period of time. I would conduct a thorough audit to see what content could be saved/improved and what content should be deleted. I wouldn't bother with no index or canonicals - if it's good content leave it up and let it be indexed, and if it's bad content that can't be saved, remove it.
Finally if you are redirecting old articles, I would be careful about where they redirect to. Ideally you'd want to redirect from a low quality article to a high quality article on the same subject. A big increase in URLs pointing to the main news page could raise a red flag, and could force readers to look for information unnecessarily.
Good luck!
-
The news articles themselves are not thin content, but the general pages are relatively thin because they only consist of the link + snippet.
-
Are they all thin content? If not, then I don't think it's necessary to NOINDEX them. If you think some of them don't have any real value, you could specifically NOINDEX them(and not all together). Google will crawl those pages no matter how deep they are, as long as they are accessible.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Magento: Should we disable old URL's or delete the page altogether
Our developer tells us that we have a lot of 404 pages that are being included in our sitemap and the reason for this is because we have put 301 redirects on the old pages to new pages. We're using Magento and our current process is to simply disable, which then makes it a a 404. We then redirect this page using a 301 redirect to a new relevant page. The reason for redirecting these pages is because the old pages are still being indexed in Google. I understand 404 pages will eventually drop out of Google's index, but was wondering if we were somehow preventing them dropping out of the index by redirecting the URL's, causing the 404 pages to be added to the sitemap. My questions are: 1. Could we simply delete the entire unwanted page, so that it returns a 404 and drops out of Google's index altogether? 2. Because the 404 pages are in the sitemap, does this mean they will continue to be indexed by Google?
Intermediate & Advanced SEO | | andyheath0 -
My blog is indexing only the archive and category pages
Hi there MOZ community. I am new to the QandA and have a question. I have a blog Its been live for months - but I can not get the posts to rank in the serps. Oddly only the categories rank. The posts are crawled it seems - but seen as less important for a reason I don't understand. Can anyone here help with this? See here for what i mean. I have had several wp sites rank well in the serps - and the posts do much better. Than the categories or archives - super odd. Thanks to all for help!
Intermediate & Advanced SEO | | walletapp0 -
Pages are Indexed but not Cached by Google. Why?
Here's an example: I get a 404 error for this: http://webcache.googleusercontent.com/search?q=cache:http://www.qjamba.com/restaurants-coupons/ferguson/mo/all But a search for qjamba restaurant coupons gives a clear result as does this: site:http://www.qjamba.com/restaurants-coupons/ferguson/mo/all What is going on? How can this page be indexed but not in the Google cache? I should make clear that the page is not showing up with any kind of error in webmaster tools, and Google has been crawling pages just fine. This particular page was fetched by Google yesterday with no problems, and even crawled again twice today by Google Yet, no cache.
Intermediate & Advanced SEO | | friendoffood2 -
Date of page first indexed or age of a page?
Hi does anyone know any ways, tools to find when a page was first indexed/cached by Google? I remember a while back, around 2009 i had a firefox plugin which could check this, and gave you a exact date. Maybe this has changed since. I don't remember the plugin. Or any recommendations on finding the age of a page (not domain) for a website? This is for competitor research not my own website. Cheers, Paul
Intermediate & Advanced SEO | | MBASydney0 -
Article Marketing / Article Posting
I am working on the SEO on a few different websites and I have built out an article marketing campaign so that I can get high quality backlinks for my website. I have been writing the content myself and I have been manually building out the top Web 2.0, Article Directory, and Doc Sharing sites. today I was creating an account on squidoo and I wondered if it mattered if I had the username be one of two things: my keyword as a user name, like: [keyword+geotag] example: roofinghouston just my first and last name as the username (or just a username I always use) (The reason behind #1 would be to have the optimized keyword and location I am trying to rank for, inside of the username. The reason for #2 would be that I don't want to get into trouble by having "too much" optimization.) I know a bit about optimization and that getting your keyword out there is great in a lot of areas, but I am not sure if it looks "suspicious" if I have my username be the keyword+geotag. I am just worried that all of this hard work will be torn down if I look like I'm trying too hard to be optimized, etc etc. There is no one answer, I am mainly looking for shared experiences. If you do have a definite answer, then I would like that too 🙂 Thanks SEOMoz!
Intermediate & Advanced SEO | | SEOWizards0 -
How to deal with old, indexed hashbang URLs?
I inherited a site that used to be in Flash and used hashbang URLs (i.e. www.example.com/#!page-name-here). We're now off of Flash and have a "normal" URL structure that looks something like this: www.example.com/page-name-here Here's the problem: Google still has thousands of the old hashbang (#!) URLs in its index. These URLs still work because the web server doesn't actually read anything that comes after the hash. So, when the web server sees this URL www.example.com/#!page-name-here, it basically renders this page www.example.com/# while keeping the full URL structure intact (www.example.com/#!page-name-here). Hopefully, that makes sense. So, in Google you'll see this URL indexed (www.example.com/#!page-name-here), but if you click it you essentially are taken to our homepage content (even though the URL isn't exactly the canonical homepage URL...which s/b www.example.com/). My big fear here is a duplicate content penalty for our homepage. Essentially, I'm afraid that Google is seeing thousands of versions of our homepage. Even though the hashbang URLs are different, the content (ie. title, meta descrip, page content) is exactly the same for all of them. Obviously, this is a typical SEO no-no. And, I've recently seen the homepage drop like a rock for a search of our brand name which has ranked #1 for months. Now, admittedly we've made a bunch of changes during this whole site migration, but this #! URL problem just bothers me. I think it could be a major cause of our homepage tanking for brand queries. So, why not just 301 redirect all of the #! URLs? Well, the server won't accept traditional 301s for the #! URLs because the # seems to screw everything up (server doesn't acknowledge what comes after the #). I "think" our only option here is to try and add some 301 redirects via Javascript. Yeah, I know that spiders have a love/hate (well, mostly hate) relationship w/ Javascript, but I think that's our only resort.....unless, someone here has a better way? If you've dealt with hashbang URLs before, I'd LOVE to hear your advice on how to deal w/ this issue. Best, -G
Intermediate & Advanced SEO | | Celts180 -
NOINDEX listing pages: Page 2, Page 3... etc?
Would it be beneficial to NOINDEX category listing pages except for the first page. For example on this site: http://flyawaysimulation.com/downloads/101/fsx-missions/ Has lots of pages such as Page 2, Page 3, Page 4... etc: http://www.google.com/search?q=site%3Aflyawaysimulation.com+fsx+missions Would there be any SEO benefit of NOINDEX on these pages? Of course, FOLLOW is default, so links would still be followed and juice applied. Your thoughts and suggestions are much appreciated.
Intermediate & Advanced SEO | | Peter2640 -
Can a XML sitemap index point to other sitemaps indexes?
We have a massive site that is having some issue being fully crawled due to some of our site architecture and linking. Is it possible to have a XML sitemap index point to other sitemap indexes rather than standalone XML sitemaps? Has anyone done this successfully? Based upon the description here: http://sitemaps.org/protocol.php#index it seems like it should be possible. Thanks in advance for your help!
Intermediate & Advanced SEO | | CareerBliss0