Webmaster Tools - Clarification of what the top directory is in a calendar URL
-
Hi all,
I had an issue where it turned out a calendar was used on my site historically (a couple of years ago), but the pages were still present, crawled and indexed by Google to this day.
I now want to remove them from the index, as they really cloud my analysis. As I have been trying to clean things up (e.g. by turning modules off), Webmaster Tools has been throwing up more and more errors because of these pages.
Below is an example of the url of one of the pages:
The closest question I have found on the topic on SEOmoz is:
http://www.seomoz.org/q/duplicate-content-issue-6
I want to remove all these pages from the index by targeting their top level folder. From the historic question above would I be right in saying that it is:
http://www.example.co.uk/index.php?mact=Calendar
I want to be certain before I do a directory level removal request in case it actually targets index.php instead and deindexes my whole site (or homepage at the very least).
Thanks
-
Unfortunately, "index.php?mact=Calendar" is not a folder, it's a page+parameter. If you tried to block that as a folder in GWT, it would mostly just not work. If it went really wrong, you'd block anything driven from index.php (including your home-page).
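To see the difference, it can help to split the URL into its parts. Here is a minimal Python sketch using only the standard library. The `year=2084` parameter is an assumption added for illustration, since the real calendar URLs are not shown in this thread:

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical calendar URL; only "index.php?mact=Calendar" comes from the question.
url = "http://www.example.co.uk/index.php?mact=Calendar&year=2084"

parts = urlsplit(url)
params = parse_qs(parts.query)

print(parts.path)   # the page: /index.php
print(parts.query)  # the parameters: mact=Calendar&year=2084

# A directory is the part of the path up to and including the last "/".
# For this URL that is just the site root.
directory = parts.path.rsplit("/", 1)[0] + "/"
print(directory)    # /
```

Everything after the "?" is a parameter on index.php rather than a deeper folder, so the only directory this URL sits in is the root of the site.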
A couple of options:
(1) Programmatically META NOINDEX anything that calls the calendar parameters. This would have to be done selectively in the index.php header with code, so that ONLY the calendar pages were affected.
(2) Block "mact=" or "year=" with parameter handling in GWT, under "Configuration" > "URL Parameters". ONLY do this if these parameters drive the calendar and no other pages. You can basically tell Google to ignore pages with "year=" in them.
You can also block parameters in robots.txt, but honestly, once the pages have been indexed, it doesn't work very well.
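The selective check in option (1) could look roughly like the sketch below. It is written in Python purely to show the logic; on a PHP site the same test would live in the index.php header. It assumes the calendar pages can be recognised by a `mact` value that starts with "Calendar", and the function names are made up for this example:

```python
from urllib.parse import urlsplit, parse_qs

def is_calendar_url(url: str) -> bool:
    """Return True if the 'mact' query parameter starts with 'Calendar' (assumed marker)."""
    query = parse_qs(urlsplit(url).query)
    return any(value.startswith("Calendar") for value in query.get("mact", []))

def robots_meta_tag(url: str) -> str:
    """Return a NOINDEX meta tag for calendar URLs and an empty string for everything else."""
    if is_calendar_url(url):
        return '<meta name="robots" content="noindex">'
    return ""

# Hypothetical URLs for illustration only.
print(robots_meta_tag("http://www.example.co.uk/index.php?mact=Calendar&year=2084"))
print(repr(robots_meta_tag("http://www.example.co.uk/index.php")))
```

The important part is the default: any URL that does not clearly match the calendar pattern gets no tag, so the home page and other pages served from index.php stay indexable.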
-
Thanks Thomas, I have uploaded a new sitemap to GWMT and hopefully that will cause Google to ignore those disappeared pages.
Best,
Mitty
-
I would not use Google's disavow or link removal tools lightly at all.
In my opinion it would be easier to fix the problems you're talking about internally on the site than to ask Google to ignore or disavow them. They can basically penalize you because you have essentially admitted you've done something wrong just by using these types of tool. I don't mean to scare you, and I don't think you've done anything wrong. If I were you, I would let Google know that what you have done is simply try to fix your website up to the best of your abilities for the end user's experience, not to hide any malicious actions in the past.
Sorry to be so alarmed by this, but you really do want to stay on top of what you tell Google and what they perceive you're telling them.
I hope this has been of help. The reason I gave the link for Treehouse, which is available under Pro Perks at the bottom of the page, is that they teach everything you need to fix the problems you have without using Google.
Sincerely,
Thomas
-
Thanks for the advice and the links Thomas.
I've already gotten rid of the pages from my site and they are not malware inducing so not to worry.
My question is only concerned with webmaster tools. I can manually enter each link into the removal tool but that will take days.
I am aware that there is an option in GWMT to remove directories as well as individual URLs, i.e. if I had a site that had the following pages: www.example.com/plants/tulips & www.example.com/plants/roses
I could either enter both URLs into the removal tool, or simply put in www.example.com/plants/ and designate it a directory, and both pages would be removed.
My question is to confirm, if I have the following pages which have virtually identical paths but for the dates 2084 and 3000:
Could I simply use http://www.example.co.uk/index.php?mact=Calendar as a directory, saving me having to write out the full paths for both pages?
-
I just remembered where you can learn how to do this. And it's free.
Pro Perks at the bottom of the page will give you one month of free information from Treehouse. It is a mecca of videos and information on everything from code to SEO to hosting, everything you ever wanted to know. Honestly, take advantage of this tool.
-
Well, I believe removing the URLs you're talking about, whether the calendar took up the entire page or only part of it, could harm other content on the page, if there is any. Is there any?
Run your website through http://sitecheck.sucuri.net/scanner/
It will tell you immediately if you have any malware running on your website. If you do, I strongly suggest purchasing Sucuri and cleaning up. Hopefully that's not the case, though, and you simply need some tweaking done to your website. Unfortunately, I am not gifted with the knowledge of code, but I know there is an option out there that is extremely inexpensive and very high-quality called tweak a five. I will try to find the URL right now that for less than $100 I
http://www.webdesignerdepot.com/category/code/
One of the better ones can be found by asking the guys at webdevStudios.com. They are geniuses and they will lead you in the right direction. I don't want to give you any advice that's wrong advice. Sincerely, Tom
-
Thanks Thomas.
It was a calendar module in my CMS, CMS Made Simple. It seems to have generated thousands of pages which all linked to each page of my site, so Webmaster Tools had listed my less-than-100-page site (or so I thought) as having over 40,000 internal links pointing to each page.
I have deactivated it and added a sitemap to Webmaster Tools (GWMT), and that seems to have generated thousands of errors in GWMT.
There is a list of the top 1,000 URLs, which are pretty much all calendar pages, and they are now returning 404 errors (as I have switched off the module, they are effectively deleted). I want to have them deindexed so that I can see if there is anything else hidden in the background.
I'm not completely sure with what you've sent me below. Are you concurring that if I add the below URL to the removal tool and select directory removal it will just target all those 404'd pages?
-
-
http://www.w3schools.com/php/php_ref_filesystem.asp
http://www.w3schools.com/php/php_ref_filter.asp
&
http://www.w3schools.com/php/php_ref_calendar.asp
Tell me if you need any help removing that calendar after this and I will give it a whirl. Sincerely, Tom
-
You are wise to be cautious. To be sure: is your site 100 percent PHP? Do you know what the calendar is made from, meaning is it third-party software? Or is it something you built, or had someone build for you a while back?
Try running your site through builtwith.com and it will give you the components of your website. If the calendar shows up, you can then Google a way to remove it.
If you don't feel comfortable sharing your URL here, send it to me in a private message. I'd be happy to have a look at it and give you my best opinion. I am not going to tell you that I am the king of coding, but I can pick up on these types of things sometimes and I'd be happy to lend another set of eyes.
Sincerely,
Thomas