Site: Query Question
-
Hi All,
This is a question about the site: query you can run on Google, for example. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was also using it to try to get a high-level view of how many product pages were indexed vs. the total number of pages.
What is interesting is when I do a site: query for say www.newark.com I get ~748,000 results returned.
When I do a query for www.newark.com "/dp/" I get ~845,000 results returned.
Either I am doing something stupid or these numbers are completely backwards?
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine. (I've used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that number, but the underlying principle is the same.)
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
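For anyone wanting to see how big the parameter problem is, here is a rough Python sketch that splits a list of URLs into clean ones (the kind a sitemap would contain) and parameter-laden ones. The sample URLs and the function names are made up for illustration; it assumes you already have a list of URLs from a crawl or log export.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def split_by_parameters(urls):
    """Partition URLs into (clean, parameterized) lists."""
    clean, parameterized = [], []
    for url in urls:
        if urlsplit(url).query:
            parameterized.append(url)
        else:
            clean.append(url)
    return clean, parameterized


# Hypothetical sample data, not taken from any real site.
sample = [
    "http://www.example.com/widget/dp/123",
    "http://www.example.com/widget/dp/123?sort=price&page=2",
    "http://www.example.com/news/launch",
]

clean, parameterized = split_by_parameters(sample)

# The clean URLs that the parameterized variants collapse to.
canonical_targets = sorted({strip_query(u) for u in parameterized})
```

The `canonical_targets` list is one way to see which clean URLs the parameterized variants duplicate; how you then remove the variants from the index is a separate decision.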
-
Best bet is to make sure all your URLs are in your sitemap, and then you get an exact count.
I've found it handy to use multiple sitemaps, one for each subfolder (e.g. /news/ or /profiles/), to be able to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful in finding errors in a specific section or when you are working on indexing of a certain type of page.
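A minimal Python sketch of the per-subfolder idea, assuming you have a flat list of URLs and read the submitted/indexed counts for each sitemap from your webmaster reports. The sitemap file names, sample URLs, and counts below are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def section_of(url: str) -> str:
    """First path segment, e.g. 'news' for /news/story-1; 'root' if none."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return segments[0] if segments else "root"


def group_by_section(urls):
    """Map each section name to the URLs that belong in its sitemap."""
    groups = defaultdict(list)
    for url in urls:
        groups[section_of(url)].append(url)
    return dict(groups)


def percent_indexed(submitted: int, indexed: int) -> float:
    """Share of submitted URLs reported as indexed, as a percentage."""
    return round(100.0 * indexed / submitted, 1) if submitted else 0.0


# Hypothetical sample data.
urls = [
    "http://www.example.com/news/a",
    "http://www.example.com/news/b",
    "http://www.example.com/profiles/x",
    "http://www.example.com/",
]

groups = group_by_section(urls)

# One sitemap file per section (naming scheme is an assumption).
sitemap_names = {section: f"sitemap-{section}.xml" for section in groups}

# Made-up counts: 200 URLs submitted in the news sitemap, 150 indexed.
news_pct = percent_indexed(200, 150)
```

A section whose percentage is well below the others is the one to check for errors first.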
S
-
From what I've found, the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have massive duplicate content issues. When I do a generic site: with the domain, Google shows 50,000-60,000 pages. If I do an inurl: with a specific URL param, I get either 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: query, Google will try its best to show the content within the site that it presents to the world as "most relevant". When you do a refined check, it's naturally going to look for the content that is the closest match to that actual parameter.
So if you're seeing more results with the refined query, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. Many of the extra pages that come up in your refined check are most likely evaluated as less than highly valuable, high quality, or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center perhaps - what about if you add in the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.