Removing URLs in bulk when directory exclusion isn't an option?
-
I had a bunch of URLs on my site that followed the form:
http://www.example.com/abcdefg?q=&site_id=0000000048zfkf&l=
There were several million pages, each associated with a different site_id. They weren't very useful, so we've removed them entirely and now return a 404.The problem is, they're still stuck in Google's index. I'd like to remove them manually, but how? There's no proper directory (i.e. /abcdefg/) to remove, since there's no trailing /, and removing them one by one isn't an option. Is there any other way to approach the problem or specify URLs in bulk?
Any insights are much appreciated.
Kurus
-
I'd go into Google Webmaster Tools and their parameter settings and tell them to ignore this parameter.
I would need to look up the exact syntax, but Google does accept some dynamic exclusions and parameters in robots.txt, and you may be able to put that into robots and then use the URL removal tools.
-
There are no links to these pages, so no juice. There are also no 'new' replacement pages. We just want them out of the index ASAP by any means necessary.
-
You should have 301 your most important pages to the new urls, so that you would keep your juice.
-
Thanks, but the goal is to expedite the removal process via the URL removal tool. We've already 404'd the pages, so they'll be removed from the index. It's a question of timing, since the pages in question are low quality and hurting us in the context of Panda.
-
try 301 redirect for most important links. http://www.seomoz.org/learn-seo/redirection
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google has discovered a URL but won't index it?
Hey all, have a really strange situation I've never encountered before. I launched a new website about 2 months ago. It took an awfully long time to get index, probably 3 weeks. When it did, only the homepage was indexed. I completed the site, all it's pages, made and submitted a sitemap...all about a month ago. The coverage report shows that Google has discovered the URL's but not indexed them. Weirdly, 3 of the pages ARE indexed, but the rest are not. So I have 42 URL's in the coverage report listed as "Excluded" and 39 say "Discovered- currently not indexed." When I inspect any of these URL's, it says "this page is not in the index, but not because of an error." They are listed as crawled - currently not indexed or discovered - currently not indexed. But 3 of them are, and I updated those pages, and now those changes are reflected in Google's index. I have no idea how those 3 made it in while others didn't, or why the crawler came back and indexed the changes but continues to leave the others out. Has anyone seen this before and know what to do?
Intermediate & Advanced SEO | | DanDeceuster0 -
All urls seem to exist (no 404 errors) but they don't.
Hello I am doing a SEO auditing for a website which only has a few pages. I have no cPanel credentials, no FTP no Wordpress admin account, just watching it from the outside. The site works, the Moz crawler didn't report any problem, I can reach every page from the menu. The problem is that - except for the few actual pages - no matter what you type after the domain name, you always reach the home page and don't get any 404 error. I.E. Http://domain.com/oiuxyxyzbpoyob/ (there is no such a page, but i don't get 404 error, the home is displayed and the url in the browser remains Http://domain.com/oiubpoyob/, so it's not a 301 redirect). Http://domain.com/WhatEverYouType/ (same) Could this be an important SEO issue (i.e. resulting in infinite amount of duplicate content pages )? Do you think I should require the owner to prevent this from happening? Should I look into the .htaccess file to fix it ? Thank you Mozers!
Intermediate & Advanced SEO | | DoMiSoL0 -
Migrating From Parameter-Driven URL's to 'SEO Friendly URL's (Slugs)
Hi all, hope you're all good and having a wonderful Friday morning. At the moment we have over 20,000+ live products on our ecomms site, however, all of the products are using non-seo friendly URL's (/product?p=1738 etc) and we're looking at deploying SEO friendly url's such as (/product/this-is-product-one) etc. As you could imagine, making such a change on a big ecomms site will be a difficult task and we will have to take on A LOT of content changes, href-lang changes, affiliate link tests and a big 301 task. I'm trying to get some analysis together to pitch the Tech guys, but it's difficult, I do understand that this change has it's benefits for SEO, usability and CTR - but I need some more info. Keywords in the slugs - what is it's actual SEO weight? Has anyone here recently converted from using parameter based URL's to keyword-based slugs and seen results? Also, what are the best ways of deploying this? Add a canonical and 301? All comments greatly appreciated! Brett
Intermediate & Advanced SEO | | Brett-S0 -
Site Structure: How do I deal with a great user experience that's not the best for Google's spiders?
We have ~3,000 photos that have all been tagged. We have a wonderful AJAXy interface for users where they can toggle all of these tags to find the exact set of photos they're looking for very quickly. We've also optimized a site structure for Google's benefit that gives each category a page. Each category page links to applicable album pages. Each album page links to individual photo pages. All pages have a good chunk of unique text. Now, for Google, the domain.com/photos index page should be a directory of sorts that links to each category page. Alternatively, the user would probably prefer the AJAXy interface. What is the best way to execute this?
Intermediate & Advanced SEO | | tatermarketing0 -
Should I remove 404 urls in webmaster tools?
I've recently removed a lot of category pages so should I remove the urls in webmaster tools or let them drop out of the index naturally?
Intermediate & Advanced SEO | | SamCUK0 -
Overly-Dynamic URLs & Changing URL Structure w Web Redesign
I have a client that has multiple apartment complexes in different states and metro areas. They get good traffic and pretty good conversions but the site needs a lot of updating, including the architecture, to implement SEO standards. Right now they rank for " <brand_name>apartments" on every place but not " <city_name>apartments".</city_name></brand_name> There current architecture displays their URLs like: http://www.<client_apartments>.com/index.php?mainLevelCurrent=communities&communityID=28&secLevelCurrent=overview</client_apartments> http://www.<client_apartments>.com/index.php?mainLevelCurrent=communities&communityID=28&secLevelCurrent=floorplans&floorPlanID=121</client_apartments> I know it is said to never change the URL structure but what about this site? I see this URL structure being bad for SEO, bad for users, and basically forces us to keep the current architecture. They don't have many links built to their community pages so will creating a new URL structure and doing 301 redirects to the new URLs drastically drop rankings? Is this something that we should bite the bullet on now for future rankings, traffic, and a better architecture?
Intermediate & Advanced SEO | | JaredDetroit0 -
Sitemap - % of URL's in Google Index?
What is the average % of links from a sitemap that are included in the Google index? Obviously want to aim for 100% of the sitemap urls to be indexed, is this realistic?
Intermediate & Advanced SEO | | stats440 -
Directories...
So, I have a website, and I have a few pages in directories, and the rest are normal with extensions (i.e. example.com/blah.html instead of example.com/blah/) Now, the directory page isnt ranking yet for a targeting keyword (although I am still in the process of link building to the page w/ anchor text), however, could it be because it is the odd man out being one of the only pages within a directory? Also, I would really like to move all my pages into directories, however some of the internal pages are ranking really well and I do not want to lose that once switching. Has anyone has experiences with using 301s to redirect so sub directories without loosing rankings?
Intermediate & Advanced SEO | | Aftermath_SEO0