Webmaster Tools - Clarification of what the top directory is in a calendar URL
-
Hi all,
I had an issue where it turned out a calendar was used on my site historically (a couple of years ago), but the pages were still present, crawled and indexed by Google to this day.
I want to remove them from the index now, as they really cloud my analysis. I have also been trying to clean things up (e.g. by turning modules off), and Webmaster Tools is throwing up more and more errors due to these pages.
Below is an example of the URL of one of the pages:
The closest question I have found on the topic on SEOmoz is:
http://www.seomoz.org/q/duplicate-content-issue-6
I want to remove all these pages from the index by targeting their top level folder. From the historic question above would I be right in saying that it is:
http://www.example.co.uk/index.php?mact=Calendar
I want to be certain before I do a directory level removal request in case it actually targets index.php instead and deindexes my whole site (or homepage at the very least).
Thanks
-
Unfortunately, "index.php?mact=Calendar" is not a folder, it's a page+parameter. If you tried to block that as a folder in GWT, it would mostly just not work. If it went really wrong, you'd block anything driven from index.php (including your home-page).
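To make the page-versus-folder distinction concrete, here is a minimal Python sketch that splits the URL from the question into its path and its query parameters. It only illustrates how a URL is structured; it is not how GWT itself processes removal requests.

```python
from urllib.parse import urlsplit, parse_qs

def split_url(url):
    """Return the path and the parsed query parameters of a URL."""
    parts = urlsplit(url)
    return parts.path, parse_qs(parts.query)

# The URL from the question: the "folder-like" part is really a query string.
path, params = split_url("http://www.example.co.uk/index.php?mact=Calendar")
print(path)    # /index.php
print(params)  # {'mact': ['Calendar']}

# A plain request to index.php has exactly the same path as the calendar URL.
home_path, _ = split_url("http://www.example.co.uk/index.php")
print(home_path == path)  # True
```

Because both URLs share the path /index.php, anything that targets the path rather than the parameter cannot tell the calendar pages apart from the rest of the site.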
A couple of options:
(1) Programmatically META NOINDEX anything that calls the calendar parameters. This would have to be done selectively in the index.php header with code, so that ONLY the calendar pages were affected.
(2) Block "mact=" or "year=" with parameter handling in GWT, under "Configuration" > "URL Parameters". ONLY do this if these parameters drive the calendar and no other pages. You can basically tell Google to ignore pages with "year=" in them.
You can also block parameters in robots.txt, but honestly, once the pages have been indexed, it doesn't work very well.
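The selective logic in option (1) could look roughly like the sketch below. It is written in Python purely for illustration (the real change would live in the site's own index.php template), and the function names are hypothetical. It also assumes the "mact" value may carry extra comma-separated parts after the module name, which may not match your setup.

```python
from urllib.parse import urlsplit, parse_qs

CALENDAR_MODULE = "Calendar"  # assumption: module name as it appears in "mact="

def is_calendar_request(url):
    """True only when the request's mact parameter names the calendar module."""
    query = parse_qs(urlsplit(url).query)
    # Assumption: the module name is the first comma-separated part of the value.
    return any(v.split(",")[0] == CALENDAR_MODULE for v in query.get("mact", []))

def robots_meta_tag(url):
    """Return a noindex tag for calendar pages and nothing for any other page."""
    if is_calendar_request(url):
        return '<meta name="robots" content="noindex">'
    return ""

# Hypothetical URLs: only the first one should receive the tag.
print(robots_meta_tag("http://www.example.co.uk/index.php?mact=Calendar&year=2084"))
print(repr(robots_meta_tag("http://www.example.co.uk/index.php?page=contact")))
```

Keep in mind that Google has to be able to crawl a page to see a noindex tag, so this approach and a robots.txt block on the same URLs would work against each other.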
-
Thanks Thomas, I have uploaded a new sitemap to GWMT and hopefully that will cause Google to ignore those disappeared pages.
Best,
Mitty
-
I would not use Google's disavow or removal tools lightly at all.
In my opinion it would be easier to fix the problems you're talking about internally on the site than to ask Google to ignore or disavow them. Google can basically penalize you, because you have essentially admitted you've done something wrong just by using that type of tool. I don't mean to scare you, and I don't think you've done anything wrong. If I were you, I would let Google know that you have simply tried to fix your website up to the best of your abilities for the end user's experience, not to hide any malicious actions in the past.
Sorry to be so alarmed by this, but you really do want to stay on top of what you tell Google and what they perceive you're telling them.
I hope this has been of help. The reason I mentioned Treehouse, which is available under Pro Perks at the bottom of the page, is that they teach everything you need to fix the problems you have without using Google.
Sincerely,
Thomas
-
Thanks for the advice and the links Thomas.
I've already gotten rid of the pages from my site and they are not malware inducing so not to worry.
My question is only concerned with webmaster tools. I can manually enter each link into the removal tool but that will take days.
I am aware that there is an option in GWMT to remove directories as well as individual URLs, i.e. if I had a site that had the following pages: www.example.com/plants/tulips & www.example.com/plants/roses
I could either enter both URLs into the removal tool or simply put www.example.com/plants/ and designate it a directory, and both pages would be removed.
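As I understand it, a directory removal behaves like a prefix match on the URL path. The Python sketch below models that under this assumption; the helper name and the third URL are made up for illustration, and I have added "http://" to the example URLs so they parse cleanly.

```python
from urllib.parse import urlsplit

def under_directory(url, directory):
    """True if the URL's path sits beneath the given directory path."""
    if not directory.endswith("/"):
        directory += "/"
    return urlsplit(url).path.startswith(directory)

pages = [
    "http://www.example.com/plants/tulips",
    "http://www.example.com/plants/roses",
    "http://www.example.com/about",  # hypothetical page outside the directory
]

# Only the two pages beneath /plants/ match the directory.
matched = [p for p in pages if under_directory(p, "/plants/")]
print(matched)
```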
My question is to confirm: if I have the following pages, which have virtually identical paths except for the dates 2084 and 3000:
Could I simply use http://www.example.co.uk/index.php?mact=Calendar as a directory, saving me from having to write out the full paths for both pages?
-
I just remembered where you can learn how to do this. And it's free.
Pro Perks at the bottom of the page will give you one month of free access to Treehouse. It is a mecca of videos and information on everything from code to SEO to hosting, everything you ever wanted to know. Honestly, take advantage of this tool.
-
Well, I believe it depends on whether the calendar took up the entire page or only part of it. Removing the URLs you're talking about could harm other content on the page, if there is any. Is there any?
Run your website through http://sitecheck.sucuri.net/scanner/
It will tell you immediately if you have any malware running on your website. If you do, I strongly suggest purchasing Sucuri and cleaning up. However, hopefully that's not the case and you simply need some tweaking done to your website. Unfortunately, I am not gifted with the knowledge of code, but I know there is an option out there that is extremely inexpensive and very high quality called tweak a five. I will try to find the URL right now that for less than $100 I
http://www.webdesignerdepot.com/category/code/
One of the better ones can be found by asking the guys at webdevStudios.com. They are geniuses and will lead you in the right direction. I don't want to give you any advice that's wrong advice. Sincerely, Tom
-
Thanks Thomas.
It was a calendar module in my CMS, CMS Made Simple. It seems to have generated thousands of pages, all of which linked to each page of my site, so Webmaster Tools had listed my site of fewer than 100 pages (or so I thought) as having over 40,000 internal links pointing to each page.
I have deactivated it and added a sitemap to Webmaster Tools (GWMT), and that seems to have generated thousands of errors in GWMT.
There is a list of the top 1,000 URLs, which are pretty much all calendar pages, and they are now returning 404 errors (as I have switched off the module, they are effectively deleted), but I want to have them deindexed so I can see if there is anything else hidden in the background.
I'm not completely sure about what you've sent me below. Are you concurring that if I add the below URL to the removal tool and select directory removal, it will just target all those 404'd pages?
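In the meantime, one way to check whether anything else is hidden among those errors is to split the list of error URLs into calendar and non-calendar ones. This is only a rough Python sketch: it assumes the URLs have been copied out of GWMT into a plain list, the sample URLs are made up, and the function name is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

def split_calendar_urls(urls):
    """Separate URLs whose mact parameter starts with 'Calendar' from the rest."""
    calendar, other = [], []
    for url in urls:
        query = parse_qs(urlsplit(url).query)
        is_calendar = any(v.startswith("Calendar") for v in query.get("mact", []))
        (calendar if is_calendar else other).append(url)
    return calendar, other

# Hypothetical sample of URLs from the crawl errors list.
error_urls = [
    "http://www.example.co.uk/index.php?mact=Calendar&year=2084",
    "http://www.example.co.uk/index.php?mact=Calendar&year=3000",
    "http://www.example.co.uk/old-page.html",
]

calendar, other = split_calendar_urls(error_urls)
print(len(calendar), "calendar URLs")
print("Worth a closer look:", other)
```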
-
-
http://www.w3schools.com/php/php_ref_filesystem.asp
http://www.w3schools.com/php/php_ref_filter.asp
&
http://www.w3schools.com/php/php_ref_calendar.asp
Tell me if you need any help removing that calendar after this, and I will give it a whirl. Sincerely, Tom
-
You are wise to be cautious. To be sure: is your site one hundred percent PHP? Do you know what the calendar is made from, meaning is it third-party software? Or is it something you built or had someone build for you a while back?
Try running your site through builtwith.com and it will give you the components of your website. If the calendar shows up, you can then Google a way to remove it.
If you don't feel comfortable sharing your URL here, send it to me in a private message. I'd be happy to have a look at it and give you my best opinion. I am not going to tell you that I am the king of coding, but I can pick up on these types of things sometimes and I'd be happy to lend another set of eyes.
Sincerely,
Thomas