Is Google able to determine duplicate content every day/month?
-
A while ago I talked to somebody who worked in MSN's engineering department a couple of years ago. We talked about a recent dip we had with one of our sites. We argued this could be caused by the large amount of duplicate content we have on this particular website (80%+ of our site).
Then he said, and I quote: "Google seems only to be able to determine every couple of months instead of every day if the content is actually duplicate content". I don't doubt that duplicate content is a ranking factor, but I would like to know your opinions on whether Google can only determine this every couple of months instead of every day.
Have you seen or heard something similar?
-
Sorting out Google's timelines is tricky these days, because they aren't the same for every process and every site. In the early days, the "Google dance" happened about once a month, and that was the whole mess (index, algo updates, etc.). Over time, index updates have gotten a lot faster, and ranking and indexation are more real-time (especially since the "Caffeine" update), but that varies wildly across sites and pages.
I think you also have to separate a couple of impacts of duplicate content. When it comes to filtering - Google excluding a piece of duplicate content from rankings (but not necessarily penalizing the site), I don't see any evidence that this takes a couple of months. It can take Google days or weeks to re-cache any given page, and to detect a duplicate they would have to re-cache both copies, so that may take a month in some cases, realistically. I strongly suspect, though, that the filter itself happens in real-time. There's no good way to store a filter for every scenario, and some filters are query-specific. Computationally, some filters almost have to happen on the fly.
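To make the "on the fly" idea a bit more concrete, here is a minimal sketch of what filtering near-duplicates out of a ranked result list could look like. To be clear, this is not Google's method, which isn't public. It uses word shingles and Jaccard similarity, a common textbook approach to near-duplicate detection. The function names, the 0.8 threshold, and the example URLs are all made up for illustration.

```python
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_duplicates(ranked_pages, threshold=0.8):
    """Keep a page only if it is not a near-duplicate of a page already kept.

    ranked_pages is a list of (url, text) tuples, best-ranked first, so the
    highest-ranked copy of any duplicated content is the one that survives.
    The threshold is an arbitrary illustrative value, not a known constant.
    """
    kept = []  # list of (url, shingle_set) for pages that passed the filter
    for url, text in ranked_pages:
        page_shingles = shingles(text)
        if all(jaccard(page_shingles, other) < threshold for _, other in kept):
            kept.append((url, page_shingles))
    return [url for url, _ in kept]


# Hypothetical result list for a single query, best-ranked first.
results = [
    ("https://example.com/original", "Google updates its index far more often than it used to"),
    ("https://example.org/copy", "Google updates its index far more often than it used to"),
    ("https://example.net/other", "Panda data was refreshed separately from the main algorithm"),
]
print(filter_duplicates(results))
```

Nothing about the filter is stored ahead of time in this sketch. The comparison runs only against the pages that came back for this one query, which is the sense in which a query-specific filter can happen at query time once both copies are in the index.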
On the other hand, you have updates like Panda, where duplicate content can cause something close to a penalty. Panda data was originally updated outside of the main algorithm, to the best of our knowledge, and probably about once a month. In the year-plus since Panda 1.0 rolled out, though, it seems that this timeline has accelerated. I don't think it's real-time, but it may be closer to 2 weeks (that's speculation, I admit).
So, the short answer is "It's complicated". I don't have any evidence to suggest that filtering duplicates takes Google months (and I actually have anecdotal evidence that it can happen much faster). It is possible that it could take weeks or months to see the impact of duplicates on some sites and in some situations, though.
-
Hi Donnie,
Thanks for your reply, but I was already aware of the fact that Google had/has a sandbox. I should have mentioned this in my question. I'm looking more for an answer about whether, and on what basis, Google is able to determine that pages are duplicates.
I ask because I saw dozens of cases where our content was indexed whether or not we linked back to the 'original' source.
I also want to make clear, just to be sure, that in all of these cases the duplicate content was in agreement with the original sources.
-
In the past Google had a sandbox period before any page (content) would rank. However, now everything is instant. (just learned this today @seomoz)
If you release something, Google will index it as fast as possible. If that info gets duplicated Google will only count the first one indexed. Everyone else loses brownie points unless they trackback/link back to the main article (first indexed).