Site: Query Question
-
Hi All,
I have a question about the site: query you can run on Google. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was also using it to try to get a high-level view of how many product pages were indexed vs. the total number of pages.
What is interesting is that when I do a site: query for, say, www.newark.com, I get ~748,000 results returned.
When I do a query for www.newark.com "/dp/" I get ~845,000 results returned.
Either I am doing something stupid or these numbers are completely backwards?
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine. (I've used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that number, but the underlying principle is the same.)
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
-
Best bet is to make sure all your URLs are in your sitemap, and then you get an exact count of how many are indexed.
I've found it handy to use multiple sitemaps, one for each subfolder (e.g. /news/ or /profiles/), to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful for finding errors in a specific section or when you are working on indexing a certain type of page.
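For anyone who wants to try the per-subfolder approach, here is a rough Python sketch of the idea, not a definitive implementation. It assumes you already have a flat list of your site's URLs. The function names (group_by_section, build_sitemap, indexed_percent), the example.com URLs, and the submitted/indexed numbers are all made up for illustration; the real counts would have to come from whatever your search engine's webmaster console reports per sitemap.

```python
from collections import defaultdict
from urllib.parse import urlparse
from xml.sax.saxutils import escape


def group_by_section(urls):
    """Group URLs by their first path segment (e.g. /news/ or /profiles/)."""
    sections = defaultdict(list)
    for url in urls:
        path = urlparse(url).path
        segments = [s for s in path.split("/") if s]
        # A URL belongs to a folder if it has more than one segment,
        # or if its single segment ends with a slash (e.g. /news/).
        in_folder = len(segments) > 1 or (bool(segments) and path.endswith("/"))
        section = segments[0] if in_folder else "root"
        sections[section].append(url)
    return dict(sections)


def build_sitemap(urls):
    """Return a minimal sitemap XML document for the given URLs."""
    entries = "\n".join(
        f"  <url><loc>{escape(u)}</loc></url>" for u in sorted(urls)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


def indexed_percent(submitted, indexed):
    """Percentage of submitted URLs reported as indexed (0.0 if none submitted)."""
    return 100.0 * indexed / submitted if submitted else 0.0


# Hypothetical URLs, for illustration only.
example_urls = [
    "https://www.example.com/",
    "https://www.example.com/news/",
    "https://www.example.com/news/launch-day",
    "https://www.example.com/profiles/alice",
]

for section, section_urls in group_by_section(example_urls).items():
    # One sitemap per section, e.g. sitemap-news.xml, sitemap-profiles.xml.
    print(f"sitemap-{section}.xml")
    print(build_sitemap(section_urls))

# Hypothetical counts: 200 URLs submitted in one sitemap, 150 reported indexed.
print(indexed_percent(200, 150))
```

Naming each file after its section is what makes the per-section numbers readable at a glance: a low percentage on one sitemap points you straight at the part of the site that has the problem.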
S
-
What I've found is that the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have mass duplicate content issues. When I do a generic site: with the domain, Google shows 50,000-60,000 pages. If I do an inurl: with a specific URL param, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: Google will try its best to provide the content within the site that it shows the world based on "most relevant" content. When you do a refined check, it's naturally going to look for the content that really is most relevant - closest match to that actual parameter.
So if you're seeing more results with the refined process, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. So all those extra pages that come up in your refined check - many of them are most likely then evaluated as less than highly valuable / high quality or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center perhaps - what about if you add in the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.