My automated build system is creating a duplicate website
-
Because of the tools my company uses for CI/CD (a CI/CD pipeline helps you automate steps in your software delivery process, such as initiating code builds, running automated tests, and deploying to a staging or production environment), an extra URL is generated. The canonical for the generated site is that of our main website, but other than that it is the same website.
- Could this new URL compete with our website?
- Will Google count it against us, since it is the same content but with a canonical (it is not noindexed)?
- Does it matter?
- Surely others are using this method?
Answers/thoughts will be greatly appreciated. Thank you.
-
Do you have any control over the CI/CD pipeline URL?
If you control the domain well enough to verify it in Search Console, then by all means do so. But it does not seem like you have the ability to control the domain.
Am I correct?
https://support.google.com/webmasters/answer/7440203?hl=en
If the domain is a third-party domain, then you must trust the third party. Or, if you control the pages on which the links or third-party URLs are embedded, you can add noindex, nofollow.
https://www.deepcrawl.com/blog/best-practice/noindex-disallow-nofollow/
I hope that helps,
Tom
-
Unfortunately, since the URL is generated from the original site, I cannot change the robots.txt. It uses the same one as the main site. That rules out adding a noindex meta tag as well. Any other ideas?
Is there a way to add the duplicate URL to Search Console and tell Google not to crawl it?
Thank you.
-
I understand, using CI is cool.
I agree: get the bad content being made by CI blocked ASAP.
“an extra URL is generated. The canonical for the generated site is that of our main website, but other than that it is the same website.”
But it's not the identical content being generated that will hurt you, unless you're pointing the canonicals to a similar page (get the automated content off your domain).
Remember to add self-pointing canonicals on the good pages you want to be indexed by Google or other search engines.
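As a rough way to confirm where the canonicals on the generated site actually point, here is a minimal sketch using only the Python standard library. The function names and the `www.example.com` host are hypothetical placeholders, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def canonical_host(html):
    """Return the hostname of the page's canonical URL, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    return urlparse(finder.canonical).hostname


def points_to_main_site(html, main_host):
    """True when the canonical points at the main site's hostname."""
    return canonical_host(html) == main_host


# Hypothetical page from the CI/CD-generated site.
page = '<html><head><link rel="canonical" href="https://www.example.com/page"></head></html>'
print(points_to_main_site(page, "www.example.com"))
```

Running this against a handful of pages from both the main site and the generated URL would show whether every page carries a canonical and whether it resolves to the host you expect.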
Hope this is of help,
Tom
-
To answer your questions:
- Technically it could compete with your current site, as it's on its own domain. In reality, it's unlikely, because you're canonicalizing the pages back to the originals, which makes sure the content is attributed to your original site.
- What I would recommend is excluding the CI/CD site from the engines, through a robots.txt or a similar technique. That way you're making sure that the staging site itself isn't being crawled at all. In the end, I'd say there's very little upside to leaving it crawlable as it is now.
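Since the generated site shares its files with the main site, one possible approach is to vary the response by the requested hostname rather than by file. The sketch below is only an illustration under assumptions: `MAIN_HOST`, the function names, and the example CI hostname are hypothetical, and it presumes the serving layer lets you run a small piece of logic per request. `X-Robots-Tag: noindex` is a real HTTP response header that search engines honor.

```python
# Hypothetical hostname of the production site; everything else is
# treated as a CI/CD-generated copy.
MAIN_HOST = "www.example.com"


def normalize_host(host_header):
    """Lowercase the Host header value and drop any port suffix."""
    return host_header.split(":", 1)[0].strip().lower()


def robots_txt_for(host_header):
    """Option A: serve a blocking robots.txt on every non-production host."""
    if normalize_host(host_header) == MAIN_HOST:
        return "User-agent: *\nDisallow:\n"   # allow everything
    return "User-agent: *\nDisallow: /\n"     # block all crawling


def extra_headers_for(host_header):
    """Option B: add a noindex header on every non-production host."""
    if normalize_host(host_header) == MAIN_HOST:
        return []
    return [("X-Robots-Tag", "noindex")]


# Example with a hypothetical CI hostname.
print(robots_txt_for("build-123.ci.example.net"))
print(extra_headers_for("build-123.ci.example.net"))
```

The two options are alternatives rather than a pair: if robots.txt disallows crawling, the crawler never fetches the pages and so never sees the noindex header. Blocking via robots.txt stops crawling, while the header lets pages be crawled but keeps them out of the index.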