Attack of the dummy urls -- what to do?
-
It occurs to me that a malicious program could set up thousands of links to dummy pages on a website:
www.mysite.com/dynamicpage/dummy123
www.mysite.com/dynamicpage/dummy456
etc..
How is this normally handled? Does a developer have to look at all the parameters to see if they are valid and if not, automatically create a 301 redirect or 404 not found? This requires a table lookup of acceptable url parameters for all new visitors.
I was thinking that bad url names would be rare so it would be ok to just stop the program with a message, until I realized someone could intentionally set up links to non existent pages on a site.
-
Hello,
I am also having this issue with hundreds of dummy urls that never existed as a part of our website's blog. Do I go into parameters and specify each of the dummy urls to avoid this?
Thanks in advance for any help!!!! (and sorry to piggyback this question Theodore-hope you don't mind!)
-
Thanks Ray. Appreciate the advice!
-
It's great that you've identified issues like this. I also suggest that if you know certain parameters are generated often and not necessary to index, that you go into your Google Webmaster Tools account > Crawl > URL Parameters and proactively set the crawl rate to 'No URLs' is appropriate. I do this with certain custom parameters for sites that are prone to having these extra URLs indexed mistakenly.
-
Hi Ray-pp,
Thanks for your answer. I'm not getting anything significant, but occasionally a bot will come with extra stuff added to the parameter names, so it got me to thinking a malicious program or nasty competitor might want to do that to cause havoc. My understanding is that 404s don't hurt SEO ranking from Google, but I was thinking that the way things are set up now no-one would get a 404 and in fact Google would index the 'bad' pages, so maybe I needed to do something proactively to 404 or 301 such pages so they would never get put into an index at all.
Since my site has lots of dynamically generated pages, I've had my share of surprises, and am just trying to avoid any new ones!
-
Hi Theodore - You pose an interesting problem, are you currently experiencing this issue? I don't see why someone would create a bunch of random non-existent links to your site, but if they did (and the pages were receiving low quality traffic) then I would proactively disavow those domains that created the links. That would be enough to prevent any penalties you're afraid of receiving.
If, however, you're noticing that specific 404 pages are receiving quality traffic (maybe an old page was removed but good traffic is still sent to the page) then you would want to 301 that page to its closest relative page that deserves the traffic and authority.
Does that help? Maybe a little more information around you specific problem would allow me to tailor the advice better.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Only Indexing Canonical Root URL Instead of Specified URL Parameters
We just launched a website about 1 month ago and noticed that Google was indexing, but not displaying, URLs with "?location=" parameters such as: http://www.castlemap.com/local-house-values/?location=great-falls-virginia and http://www.castlemap.com/local-house-values/?location=mclean-virginia. Instead, Google has only been displaying our root URL http://www.castlemap.com/local-house-values/ in its search results -- which we don't want as the URLs with specific locations are more important and each has its own unique list of houses for sale. We have Yoast setup with all of these ?location values added in our sitemap that has successfully been submitted to Google's Sitemaps: http://www.castlemap.com/buy-location-sitemap.xml I also tried going into the old Google Search Console and setting the "location" URL Parameter to Crawl Every URL with the Specifies Effect enabled... and I even see the two URLs I mentioned above in Google's list of Parameter Samples... but the pages are still not being added to Google. Even after Requesting Indexing again after making all of these changes a few days ago, these URLs are still displaying as Allowing Indexing, but Not On Google in the Search Console and not showing up on Google when I manually search for the entire URL. Why are these pages not showing up on Google and how can we get them to display? Only solution I can think of would be to set our main /local-house-values/ page to noindex in order to have Google favor all of our other URL parameter versions... but I'm guessing that's probably not a good solution for multiple reasons.
Intermediate & Advanced SEO | | Nitruc0 -
Trailing Slashes on URLs
Hi we currently have a site on Wordpress which has two version of each URL trailing slash on URLs and one without it. Example: www.domain.com/page (preferred version - based on link data) www.domain.com/page**/** The non-slash version of the URL has most of the external links pointing to them, so we are going to pick that as the preferred version. However, currently, each version of every URL has rel canonical tag pointing to the non-preferred version. E.g. www.domain.com/page the rel canonical tag is: www.domain.com/page/ What would be the best way to clean up this setup? Cheers.
Intermediate & Advanced SEO | | cathywix0 -
Demoting a URL (not-WMT Related)
I have a pharmaceutical brand that treats two diseases, but wants to primarily promote one. We want searches for "brand dosing" to go to Side A, but currently "brand dosing" goes to Side B. BUT, I want "brand dosing Side B" to still show up in organic search, so a noindex on Side B, or canonicalization of Side A, won't work. Essentially, I want any searches that are not specific to a disease treatment to go to Side A, and then specific Side B related searches, go to Side B. Because this is a client paying me to optimize their site, I obviously want to optimize their whole site, so only optimizing Side A, or unoptimizing Side B, aren't solutions I want to employ. I don't think a solution exists, but I figured my fellow Mozers would know best. Thanks in advance,
Intermediate & Advanced SEO | | GTO_Pharma_SEO0 -
URL mapping for site migration
Hi all! I'm currently working on a migration for a large e-commerce site. The old one has around 2.5k urls, the new one 7.5k. I now need to sort out the redirects from one to the other. This is proving pretty tricky, as the URL structure has changed site wide. There doesn't seem to be any consistent rules either so using regex doesn't really work. By and large, the copy appears to be the same though. Does anybody know of a tool I can crawl the sites with that will export the crawled url and related copy into a spreadsheet? That way I can crawl both sites and compare the copy to match them up. Thanks!
Intermediate & Advanced SEO | | Blink-SEO0 -
Spaces in URL line
Hi Gurus, I recently made the mistake of putting a space into a URL line between two words that make up my primary key word. Think www.example.com/Jelly Donuts/mmmNice.php instead of www.example.com/JellyDonuts/mmmNice.php This mistake now needed fixing to www.example.com/Jelly Donuts/mmmNice.php to pass W3, but has been in place for a while but most articles/documents under 'Jelly Donuts' are not ranking well (which is probably the obvious outcome of the mistake). I am wondering whether the best solution from an SEO ranking viewpoint is to: 1. Change the article directory immediately to www.example.com/JellyDonuts/mmmNice.php and rel=canonical each article to the new correct URL. Take out the 'trash' using robots.txt or to 301 www.example.com/Jelly Donut to the www.example.com/JellyDonut directory? or perhaps something else? Thanks in advance for your help with this sticky (but tasty) conundrum, Brad
Intermediate & Advanced SEO | | BM70 -
Effect of 301 redirect to a relative url to homepage?
One of our new clients recently encountered a site-wide ranking drop for many keywords and I'm pretty confident regarding their link profile as to being 98% legit. Background: 1. Client full site is https, and all http pages are 301 redirected to their https counterpart 2. Client has ~50 links partners (all legitimate sites + schools etc) links to client with urls such as www.example.com/portal/123.aspx that redirects to www.example.com. 3. Client homepage 301 redirects from www.example.com to www.example.com/default.aspx and then 301 redirects to the relative url "/Home.aspx". 4. Client launched some testing with Google website optimizer tool. ~1-2 months ago. Symptoms: 1. Rankings dropped for basically many/all 30-40+ keywords by ~15 positions 2. Seomoz reports close to a double of existing pages + (600+) duplicate content in the same date range. Webmasters only report 80 duplicate titles though. 3. Domain authority by seomoz reduced a bit + backlinks recorded by seomoz to the website nearly halved in the past 2 months. I'm not sure if I narrowed this towards the right direction, and it isn't clear when the relative url 301 redirect was implemented: 1. The 301 redirect to the relative page (www.example.com/default.aspx to "/home.aspx") is accounting for the loss of links recorded by seomoz. 2. The ~50 links the client currently use (www.example.com/portal.123.aspx 301 redirecting to www.example.com, also relative) as a tracking tool is being considered 301 redirect abuse. 3. Maybe something went wrong with the usage of google optimizer tool for SEO purposes? Visitor traffic to each of the tested pages looked fine. I would greatly appreciate any advice/insights on what I might be missing in terms of direction / factors. Thanks! Alex
Intermediate & Advanced SEO | | sixspokemedia0 -
Duplicate URL home page
I just got a duplicate URL error on by SEOMOZ report - and I wonder if I should worry about it Assume my site is named www.widgets.com I'm getting duplicate url from http://www.widgets.com & http://www.widgets.com/ Do the search engines really see this as different on the home page? The general drift on the web is that You site should look like Home page = http://www.widgets.com And subpages http://www.widgets.com/widget1/ Of course it seems as though the IIS7 slash tool will rewrite everything Including the home page to a slash.
Intermediate & Advanced SEO | | ThomasErb0 -
Why do old URL format are still being crawled by Rogerbot?
Hi, In the early days of my blog, I used permalinks with the following format: http://www.mysitesamp.com/2009/02/04/heidi-cortez-photo-shoot/ I then decided to change this format using .htaccess to this format: http://www.mysitesamp.com//heidi-cortez-photo-shoot/ My question is, why do rogerbot still crawls my old URL format since these urls' no longer exists in my website or blog.
Intermediate & Advanced SEO | | Trigun0