WordPress - How to stop both http:// and https:// pages being indexed?
-
I published a static page two days ago on a WordPress site and noticed that Google has indexed both the http:// and https:// URLs. Usually only the http:// version gets indexed.
Could anyone please explain why this may have happened and how I can fix it? Thanks!
-
Just one adjustment to this, although I think David's right that the canonical tag can be a good solution. Google can index https: fine; the issue is whether you're creating duplicates. If you have duplicates, then it's possible that the https: version could be the one you want as canonical. In this case, it doesn't sound like it, but I just wanted to point that out.
Of course, long-term, you should sort out why these are being created. A desktop crawler like Xenu or Screaming Frog may be the best bet, but I'd hit the WordPress forums, too. Odds are it's a common issue. Typically, it happens when some deeper page (like a shopping cart) on a site is secure, and then the links are all relative ("/about.php", for example). Then, those links get crawled as both secure and non-secure.
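To make the relative-link scenario concrete, here is a minimal sketch of how a crawler resolves the same relative link from a non-secure page and from a secure one. The domain and the cart page URL are placeholders chosen for illustration, not taken from the site in question.

```python
from urllib.parse import urljoin

# Hypothetical relative link, as in the "/about.php" example above.
RELATIVE_LINK = "/about.php"


def resolve(page_url: str, href: str) -> str:
    """Resolve a link relative to the page it was found on, as a crawler would."""
    return urljoin(page_url, href)


# The same relative link found on a non-secure page and on a secure page
# (example.com and cart.php are placeholders).
from_http = resolve("http://www.example.com/", RELATIVE_LINK)
from_https = resolve("https://www.example.com/cart.php", RELATIVE_LINK)

print(from_http)   # http://www.example.com/about.php
print(from_https)  # https://www.example.com/about.php
```

Because the relative link inherits the scheme of the page it sits on, one secure page is enough to expose an https:// copy of every page it links to.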
Unfortunately, I'm not a WordPress expert, so I can only speak in generalities.
-
Thanks David, I feel like going out to buy some Swedish Fish for some reason now.
-
I actually did a good deal of research on this topic a few days ago. Without going into the nitty-gritty details: if https is site-wide, Google recommends a rel="canonical" link element (http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394) pointing to the non-secure http version. Google claims it can index https fine, but Matt Cutts said he would "lean towards pointing the canonical to the http version." Also, on the rel="canonical" page Google says:
If you publish content on both http://www.example.com/product.php?item=swedish-fish and https://www.example.com/product.php?item=swedish-fish, you can specify the canonical version of the page. Create the <link> element:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish"/>
Add this link to the <head> section of https://www.example.com/product.php?item=swedish-fish.
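As a rough sketch of how such an element could be generated for any page, the snippet below rewrites a URL to the preferred scheme and wraps it in a canonical link element. The function name `canonical_link_tag` and the default of `http` are assumptions for illustration; this is not how WordPress or any particular plugin does it.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url: str, preferred_scheme: str = "http") -> str:
    """Return a canonical <link> element for `url` using the preferred scheme.

    Hypothetical helper: the preferred scheme is an assumption and should be
    whichever version you want indexed.
    """
    parts = urlsplit(url)
    # Swap only the scheme; host, path, and query string stay untouched.
    canonical = urlunsplit(parts._replace(scheme=preferred_scheme))
    # Escape the URL so characters such as "&" are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}"/>'


tag = canonical_link_tag("https://www.example.com/product.php?item=swedish-fish")
print(tag)
```

If the https version is the one you want indexed instead, passing `preferred_scheme="https"` flips the direction.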
Make sure the canonical is on every page of your site.
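One way to spot-check that a page actually carries the canonical is to parse its HTML and look for the element. The sketch below works on an HTML string you have already fetched; the class and function names are hypothetical, and the sample markup is made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_text: str):
    """Return the canonical URL declared in `html_text`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Made-up sample pages: one with a canonical, one without.
with_tag = (
    '<html><head><link rel="canonical" '
    'href="http://www.example.com/product.php?item=swedish-fish"/>'
    "</head><body></body></html>"
)
without_tag = "<html><head></head><body></body></html>"

print(find_canonical(with_tag))
print(find_canonical(without_tag))
```

Running this over the URLs a crawler reports would flag any page where the result is missing or points at the wrong scheme.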
Not sure why this may have happened, but it is creating duplicate content, which is why the canonical is necessary.
Hope that helps!
Thanks