Export list of URLs in Google's index?
-
Is there a way to export an exact list of the URLs found in Google's index?
-
Hmm, I actually need a near-complete list if possible.
-
A good place to start is Google Webmaster Tools. If you haven't set it up, I'd highly recommend it. Expand the "Your site on the web" section and click the "Search queries" report. The default tab is "Top queries"; click the tab next to it, "Top pages". This will show your pages along with some information for each one, such as impressions, clicks, CTR, and average position in Google's SERPs.
I'm not sure if this is a complete list, but it's what I go off of.
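If the "Top pages" report can be downloaded as a CSV file, a short script can turn it into a plain list of URLs. This is only a rough sketch: the column name `Page` and the file name `top_pages.csv` are assumptions, not something Webmaster Tools is confirmed to use, so adjust them to match the header of your own export.

```python
import csv
import io
from urllib.parse import urlsplit


def extract_urls(csv_text, column="Page"):
    """Return unique, well-formed URLs from one CSV column, in first-seen order.

    `column` is a hypothetical header name; change it to match your export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    urls = []
    for row in reader:
        url = (row.get(column) or "").strip()
        parts = urlsplit(url)
        # Keep only absolute http(s) URLs and skip repeats.
        if parts.scheme in ("http", "https") and parts.netloc and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    # "top_pages.csv" is a placeholder path for the downloaded report.
    with open("top_pages.csv", newline="", encoding="utf-8") as handle:
        for url in extract_urls(handle.read()):
            print(url)
```

The result is only as complete as the report itself, so treat it as a starting list rather than the full index.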
Related Questions
-
Mass Removal Request from Google Index
Hi, I am trying to cleanse a news website. When this website was first made, the people who set it up copied in all kinds of articles they had as a newspaper, including tests, internal communication, and drafts. The site has lots of junk, but this kind of junk was all in the initial backup, i.e. before 1st-June-2012. So, by removing all mixed content prior to that date, we can have pure articles starting June 1st, 2012! Therefore: my dynamic sitemap now contains only articles with a release date between 1st-June-2012 and now. Any article that has a release date prior to 1st-June-2012 returns a custom 404 page with a "noindex" metatag, instead of the actual content of the article.
The question is how I can remove from the Google index, as fast as possible, all this junk that is no longer on the site but still appears in Google results. I know that for individual URLs I need to request removal from this link: https://www.google.com/webmasters/tools/removals The problem is doing this in bulk, as there are tens of thousands of URLs I want to remove.
Should I put the articles back in the sitemap so the search engines crawl the sitemap and see all the 404s? I believe this is very wrong. As far as I know this will cause problems, because search engines will try to access non-existent content that is declared as existent by the sitemap, and return errors in Webmaster Tools.
Should I submit a DELETED ITEMS SITEMAP using the <expires> tag? I think this is for custom search engines only, and not for the generic Google search engine. https://developers.google.com/custom-search/docs/indexing#on-demand-indexing
The site unfortunately doesn't use any kind of "folder" hierarchy in its URLs, but instead the ugly GET params, and a folder-based pattern is impossible since all articles (removed junk and actual articles) are of the form: http://www.example.com/docid=123456
So, how can I bulk remove all the junk from the Google index... relatively fast?
Intermediate & Advanced SEO | | ioannisa0 -
'?q=:new&sort=new' URL parameters help...
Hey guys, I have these types of URLs being crawled and picked up by Moz, but they are not visible to my users. The URLs are all 'hidden' from users, as they are basically category pages that have no stock. However, Moz is crawling them and I don't understand how they are getting picked up as 'duplicate content'. Anyone have any info on this? http://www.example.ch/de/example/marken/brand/make-up/c/Cat_Perso_Brand_3?q=:new&sort=new If I understood the technicality behind it, I could try to fix it if need be. Thanks Guys Kay
Intermediate & Advanced SEO | | eLab_London0 -
When does Google index a fetched page?
I have seen it index one of my pages within 5 minutes of fetching, but have also read that it can take a day. I'm on day #2 and it appears that it has still not re-indexed 15 pages that I fetched. I changed the meta description in all of them, and added content to nearly all of them, but none of those changes show up when I do a site:www.site/page search. I'm trying to test changes in this manner, so it is important for me to know WHEN a fetched page has been indexed, or at least IF it has. How can I tell what is going on?
Intermediate & Advanced SEO | | friendoffood0 -
Tool that can retrieve mysite URL's
Hi, I need a tool that can retrieve my site's URLs. I am not talking about href, open explorer, Majestic, etc. I have a list of 1000 URLs where my site name is mentioned, and next to each URL I query I want the exact URL of my site that appears on it. For example, http://moz.com/community is a URL I have, and if this page contains my site name then I need the complete URL captured. Is there any software or tool that can do this? I used one that got me this info, but now I don't remember which it was. Thanks
Intermediate & Advanced SEO | | mtthompsons0 -
Google suddenly indexing and displaying URLs that haven't existed for years?
We recently noticed Google is showing approx 23,000 indexed .jsp URLs for our site. These are ancient pages that haven't existed in years and have long been 301 redirected to valid URLs. I'm talking 6 years. Checking the SERPs the other day (and our current SEOMoz Pro campaign), I see that a few of these URLs are now replacing our correct ones in the SERPs for important, competitive phrases. What the heck is going on here? Is Google suddenly ignoring rewrite rules and redirects? Here's an example of the rewrite rules that we've used for 6+ years:
RewriteRule ^(.*)/xref_interlux_antifoulingoutboards&keels.jsp$ $1/userportal/search_subCategory.do?categoryName=Bottom%20Paint&categoryId=35&refine=1&page=GRID [R=301]
Now, this 'bottom paint' URL has been incredibly stable in the SERPs for over half a decade. All of a sudden, a Google search for 'bottom paint' (no quotes) brings up the jsp page at position 2-3. This is just one example of something very bizarre happening. Has anyone else had something similar happen lately? Thank You
Intermediate & Advanced SEO | | jamestown0 -
How would I be able to make sure Google lists search results as a combined listing as opposed to a single listing?
I am slightly confused about how to let Google know to index our site as a complex listing as opposed to an individual page listing. Our site is well established and has over 3500 indexed pages. Does this have to do with the sitemap, or is it something else? Is there a way to get Google to list our site like the example below? Thank you for your help!
Whole Foods Market
www.wholefoodsmarket.com/
Owns and operates chain of natural foods supermarkets which sell meat and poultry free of growth hormones and antibiotics, unprocessed grains and cereals, ...
Stores: Hours: Open 8am to 10pm Seven Days a Week. Note Holiday ...
Online Ordering: welcome find a store healthy eating about our products ...
Coupons: Here they are: printable coupons from the latest issue of our in ...
Recipes: This boldly flavored casserole is an excellent way to use leftover ...
Careers: Hiring Process - Job Fairs and Events - Career Paths - ...
Lamar: Located just blocks from where Whole Foods Market began as ...
More results from wholefoodsmarket.com »
Intermediate & Advanced SEO | | olive130 -
How to Handle Fashion Jewelry Stores' Listings?
Hi! I'm working with a few brands that hardly ever make the same designs again, which is a good thing for my buyers. But I'm worried about how I can handle this with SEO in mind. Should I delete pages after a product is sold out, or is there some better way to handle this? Thanks, Vinku
Intermediate & Advanced SEO | | vinku0 -
How to check a website's architecture?
Hello everyone, I am an SEO analyst - a good one - but I am weak in technical aspects. I do not know any programming and only a little HTML. I know this is a major weakness for an SEO, so my first request to you all is to guide me on how to learn HTML and some basic PHP programming. Secondly... about the topic of this particular question - I know that a website should have a flat architecture... but I do not know how to find out whether a website's architecture is flat or not, good or bad. Please help me out on this... I would be obliged. Eagerly awaiting your responses, Best Regards, Talha
Intermediate & Advanced SEO | | MTalhaImtiaz0