Site: Query Question
-
Hi All,
I have a question about the site: query you can execute on Google. I know it has lots of inaccuracies, but I like to keep a high-level view of it over time.
I was also using it to try to get a high-level view of how many product pages were indexed vs. the total number of pages.
What is interesting is when I do a site: query for say www.newark.com I get ~748,000 results returned.
When I do a query for site:www.newark.com "/dp/" I get ~845,000 results returned.
Either I am doing something stupid, or these numbers are completely backwards.
Any thoughts?
Thanks,
Ben
-
Barry Schwartz posted some great information about this in November of 2010, quoting a couple of different Google sources. In short, more specific queries can cause Google to dig deeper and give more accurate estimates.
-
Yup. Get rid of parameter-laden URLs and it's easy enough. If they hang around the index for a few months before disappearing, that's no big deal; as long as you have done the right thing, it will work out fine.
Also, you're not interested in the chaff, just the bits you want to make sure are indexed. So make sure those are in sensibly titled sitemaps and it's fine (I've used this on sites with 50 million and 100 million product pages. It gets a bit more complex at that scale, but the underlying principle is the same).
-
But then on a big site (talking 4m+ products) it's usually the case that you have URLs indexed that wouldn't be generated in a sitemap because they include additional parameters.
Ideally, of course, you rid the index of parameter-filled URLs, but it's pretty tough to do that.
-
Your best bet is to make sure all your URLs are in your sitemap; then you get an exact count.
I've found it handy to use a separate sitemap for each subfolder, i.e. /news/ or /profiles/, to be able to quickly see exactly what % of URLs are indexed from each section of my site. This is super helpful in finding errors in a specific section, or when you are working on the indexing of a certain type of page.
S
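For anyone who wants to automate the per-section sitemap idea above, here's a minimal sketch, assuming Python and hypothetical example URLs. In practice you'd submit each generated file in Search Console and read the per-sitemap indexed counts from there.

```python
from collections import defaultdict
from xml.sax.saxutils import escape

def build_section_sitemaps(urls):
    """Group URLs by their first path segment so each site section
    (e.g. /news/, /profiles/) gets its own sitemap file."""
    sections = defaultdict(list)
    for url in urls:
        # Strip the scheme and host, then take the first path segment.
        path = url.split("://", 1)[-1]
        path = path.split("/", 1)[1] if "/" in path else ""
        section = path.split("/", 1)[0] or "root"
        sections[section].append(url)

    sitemaps = {}
    for section, section_urls in sections.items():
        entries = "\n".join(
            f"  <url><loc>{escape(u)}</loc></url>" for u in section_urls
        )
        sitemaps[f"sitemap-{section}.xml"] = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}\n"
            "</urlset>"
        )
    return sitemaps
```

Each generated file then maps one-to-one to a site section, so the indexed-vs-submitted numbers in Search Console tell you exactly which section is underperforming.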
-
What I've found is that the reason for this comes down to how the Google system works. Case in point: a client site I have with 25,000 actual pages. They have mass duplicate content issues. When I do a generic site: query with the domain, Google shows 50-60,000 pages. If I do an inurl: query with a specific URL parameter, I either get 500,000 or over a million.
Though that's not your exact situation, it can help explain what's happening.
Essentially, if you do a normal site: query, Google will try its best to show the content within the site that it presents to the world, based on the "most relevant" content. When you do a refined check, it's naturally going to look for the content that really is most relevant - the closest match to that actual parameter.
So if you're seeing more results with the refined process, it means that on any given day, at any given time, when someone does a general search, the Google system will filter out a lot of content that isn't seen as highly valuable for that particular search. So all those extra pages that come up in your refined check - many of them are most likely then evaluated as less than highly valuable / high quality or relevant to most searches.
Even if many are great pages, their system has multiple algorithms that have to be run to assign value. What you are seeing is those processes struggling to sort it all out.
-
about 839,000 results.
-
Different data center, perhaps - what happens if you add the "dp" query to the string?
-
I actually see 'about 897,000 results' for the search 'site:www.newark.com'.
-
Thanks Adrian,
I understand those areas of inaccuracy, but I didn't expect to see a refined search produce more results than the original search. That just seems a little bizarre to me, which is why I was wondering if there was a clear explanation or if I was executing my query incorrectly.
Ben
-
This is an expected 'oddity' of the site: operator. Here is a video of Matt Cutts explaining the imprecise nature of the site: operator.