Drupal, http/https, canonicals and Google Search Console
-
I’m fairly new in an in-house role and am currently rooting around our Drupal website to improve it as a whole. Right now on my radar is our use of http / https, canonicals, and our use of Google Search Console. Initial issues noticed:
- We serve http and https versions of all our pages
- Our canonical tags just refer back to the URL they sit on (apparently a default Drupal thing, which is not much use)
- We don’t actually have https properties added in Search Console/GA
I’ve spoken with the IT agency that migrated our old site to the current one, and they have recommended forcing all pages to https and setting canonicals to the https versions, which is fine in theory, but I don’t think it’s as simple as this, right? An old Moz post I found talked about running into issues with images/CSS/JavaScript referencing http – is there anything else to consider, especially from an SEO perspective?
I’m assuming that the appropriate certificates are in place, as the secure version of the site works perfectly well.
And on the last point – am I safe to assume we have just never tracked any traffic for the secure version of the site?
Thanks
John
-
OK, I gotcha now. You can submit the sitemap in all versions of Search Console; it won't hurt anything to have it referenced in multiple profiles of SC.
Another thing you can do to make sure crawlers find your XML is add this line to your robots.txt file:
Sitemap: http://yoursite.com/sitemap.xml
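If you want to double-check that the directive is readable, here is a minimal sketch using Python's standard `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later). The robots.txt content and the `listed_sitemaps` helper are made up for illustration, and yoursite.com is a placeholder:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; yoursite.com is a placeholder domain.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://yoursite.com/sitemap.xml
"""


def listed_sitemaps(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(listed_sitemaps(ROBOTS_TXT))
```

An empty list would mean crawlers reading robots.txt get no sitemap hint from it.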
-
Thanks so much, this is so helpful!
About the search console question, I may have confused you. This is what I mean: I have a www and non-www property of the website in Search Console (from before my time), which looks like this:
Property | Sitemap
www | http://www.mysite.com/sitemap.xml
non-www | NO SITEMAP LINKED
(apologies that has not formatted well, I hope you can decipher!)
With a sitemap linked to the www version and nothing to the non-www version. The sitemap is located on the non-www version of the site, so I was just wondering if the above scenario has essentially meant we've had no sitemap submissions to date (that said, the sitemap appears to be pulling through despite being the "wrong" address, so I can only think there are either 2 separate sitemap files, OR the redirect we have set from www to non-www is having an effect?)
-
Hi John, always glad to help!
For your Search Console question: When you get the redirects set up and have committed to your site being all HTTPS, you'll want to move the location of your XML sitemap to https://yoursite.com/sitemap.xml. As Cyrus mentions in that article, don't update the URLs in the sitemap yet; let search engines hit them as non-secure for a while (I think he recommends 30 days) to give them a chance to learn your new protocol and to hit your redirects multiple times.
For your www question: There's no difference in SEO value whether you choose www or non-www; it's simply a preference. The only thing that matters here is that you pick one and stick with it.
For your GA question: That is correct, you are seeing traffic from both in GA. GA will collect and report on any page/URL/website that your UA-ID is on. If someone scraped your site and took the GA script with it, you'd start seeing their traffic in your reporting view (that's why appending hostname is always a good idea). You can specify in the View Settings of GA what your protocol is.
-
Hi Logan,
Thanks for your quick response, that’s very helpful and the article you provided is great.
I hadn’t thought of the purpose of self-referring canonicals, thanks for clarifying.
Re: Search Console: I’ve just noticed we only have a sitemap linked for the http://www property. Currently, all www. traffic is redirected to the non-www version of any given page (forgetting https for a second). Is this an issue in terms of pagerank?
And my last question, I promise! If our UA tag is firing on both http and https versions of the site, should we be seeing traffic from both in GA if the property/view default URL is set to http://? By my understanding, that setting is just a vanity thing for reporting purposes, but I’m not sure where, if anywhere, I need to specify in a particular view that http:// and https:// traffic should be treated as the same thing.
-
Hi John,
For the most part, your IT partner is correct: two of the most important things are to 301 all HTTP requests to HTTPS and to update canonicals. I often refer people with questions about HTTPS to this post written by Cyrus Shepard, which covers all the bases needed for an SEO-friendly secure migration: https://moz.com/blog/seo-tips-https-ssl.
Regarding your specific comments:
- We serve http and https versions of all our pages - A 301 redirect rule will correct this
- Our canonical tags just refer back to the URL they sit on (apparently a default Drupal thing, which is not much use) - Self-referring canonicals like this serve plenty of purpose; they just need to match your preferred version (www/non-www, http/https, etc.). Self-referring canonicals help prevent duplicates caused by parameters, case-sensitive URLs, and the aforementioned HTTP/S and www/non-www.
- We don’t actually have https properties added in Search Console/GA - You should add another profile for HTTPS; verification should be simple since you've already proven you're the site owner. You want to have both profiles in GSC so you can monitor the shift of indexed URLs from HTTP to HTTPS. It's also good for future troubleshooting should you see an issue with indexing of HTTP for some reason.
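To make "preferred version" concrete, here is a rough sketch of the rule that both the 301 redirect and the canonical tag should agree on. It assumes HTTPS plus the non-www host is the chosen version; mysite.com is a placeholder, and the helper names (`preferred_url`, `canonical_is_preferred`) are invented for illustration. This is not Drupal or server configuration, just a way to spot-check pages:

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit

PREFERRED_SCHEME = "https"     # assumption: the site commits to HTTPS
PREFERRED_HOST = "mysite.com"  # assumption: non-www is the preferred host


def preferred_url(url):
    """Return the URL a 301 should send `url` to: same path and query,
    but with the preferred scheme and host."""
    parts = urlsplit(url)
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, "")
    )


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_canonical = (attr.get("rel") or "").lower() == "canonical"
        if tag == "link" and is_canonical and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_is_preferred(html):
    """True if the page has a canonical tag already in the preferred form."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return False
    return finder.canonical == preferred_url(finder.canonical)


print(preferred_url("http://www.mysite.com/about?x=1"))
```

Running `canonical_is_preferred` over the HTML of a handful of pages after the migration is a quick way to confirm the canonicals moved to HTTPS along with the redirects.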