Sitemap and crawl impact
-
If my sitemap lists only two pages (for example, page1.html and page2.html) but the website contains more (page1.html, page2.html, and page3.html), does Google take this as a signal not to crawl the other pages?
In other words, will Google index page3.html?
Assume that every page is accessible.
-
Even though page3.html is not listed in the sitemap, bots will still crawl and index it.
Whenever a bot comes to crawl a site, it first looks for the sitemap, because Google's guidelines recommend creating a sitemap and submitting it to the search engines. In effect, the sitemap provides information about the whole website in one file, which is part of making a site search-engine friendly.
However, if bots encounter a problem while crawling page3.html, they won't index it.
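To see which pages are left out of a sitemap, here is a minimal Python sketch. The example.com URLs and the function names (`build_sitemap`, `sitemap_urls`, `unlisted_pages`) are assumptions made up for illustration; only the sitemap namespace comes from the sitemaps.org protocol. It builds a sitemap listing two pages and then reports the page that is missing from it. Being unlisted does not block that page from being crawled; it only means the sitemap gives bots no hint about it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap XML string listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def sitemap_urls(xml_text):
    """Return the set of URLs found in the <loc> elements of a sitemap."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(f"{{{SITEMAP_NS}}}loc")}


def unlisted_pages(site_pages, xml_text):
    """Return the pages that exist on the site but are not in the sitemap."""
    return sorted(set(site_pages) - sitemap_urls(xml_text))


# Hypothetical site with three pages, of which only two are in the sitemap.
all_pages = [
    "https://www.example.com/page1.html",
    "https://www.example.com/page2.html",
    "https://www.example.com/page3.html",
]
sitemap_xml = build_sitemap(all_pages[:2])
missing = unlisted_pages(all_pages, sitemap_xml)
print(missing)  # ['https://www.example.com/page3.html']
```

If you want page3.html to be discovered faster, pass it to `build_sitemap` along with the other two pages and resubmit the file.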
Thank you..
Related Questions
-
Is sitemap required on my robots.txt?
Hi, I know that linking your sitemap from your robots.txt file is good practice. OK, but... may I just submit my sitemap to Search Console and forget about adding it to my robots.txt? Here's my situation: one multilingual platform, which means two sets of pages, one for each language, of course. But my CMS (Magento) only allows me to have one robots.txt file. So, again: may I have a robots.txt file with no sitemap and not suffer any potential SEO loss? Thanks in advance, Juan Vicente Mañanas Abad
Technical SEO | | Webicultors0 -
Problems Indexing All URLs in Sitemap
Hi all again! Thanks in advance! My client's site is having problems getting all its pages indexed. I even bought the full extension of XML Sitemaps and the number of URLs increased, but we still have problems getting all of them indexed. What are the reasons? The robots.txt is open to all robots; we only prohibit users and spiders from entering our intranet. I've read that duplicate content and 404s can be the reason. Anything else?
Technical SEO | | Tintanus0 -
Should Sitemaps be placed in the sub folder they reference?
I have a sitemap-index.xml file in the root. I then have several sitemaps linked to from the index in example.com/sitemaps/sitemap1.xml, example.com/sitemaps/sitemap2.xml, etc. I have seen on other sites that for example a sitemap containing blogs where the blogs are located at example.com/blog/blog1/ would be located at example.com/blog/sitemap.xml. Is it necessary to have the sitemap located in the same folder like this? I would like to have all sitemaps in a single sitemap folder for convenience but not if it will confuse search engines. My index count for URLs in some sitemaps has dropped dramatically in Google Webmaster Tools over the past month or so and I'm not sure if this is having an effect. If it matters, I have all sitemap files, including the index, listed in the robots.txt file.
Technical SEO | | Giovatto0 -
I want to resubmit my sitemap
I am making major changes to my website. There are some old URLs that I don't want indexed or included in the sitemap, other old pages that I want to keep, and some new pages. Can anyone give me hints on what I should do? Also, I have thousands of pages on my website and I don't want to submit all of them; I want to submit only my best pages to Google in the sitemap. That is why I want to resubmit new sitemaps.
Technical SEO | | Jamalon0 -
Sitemap.xml - autogenerated by CMS is full of crud
Hi all, hope you can help. The Magento ecommerce system I'm working with autogenerates sitemap.xml. It's well formed, with priority and frequency parameters. However, it has generated lots of URLs pointing to broken pages returning fatal errors, duplicate URLs (not canonicals), 404s, etc. I'm thinking of hand-creating sitemap.xml: the site has around 50 main pages, including products and categories, and I can get the main page URLs listed by Screaming Frog or Xenu. Then I'll have to get into hand-editing the crud pages with noindex, and the useful duplicates with canonicals. Is this the way to go, or is there another solution? Thanks in advance for any advice.
Technical SEO | | k3nn3dy30 -
Syndicated post extracts on WordPress and impact on SERPs
On our main site (http://www.deeperblue.com) we've been syndicating posts (not the full posts, just a link and a short extract) from a trusted partner of ours. These posts are listed as Diverwire Staff and point directly back to the original website. What I'm concerned about is the impact on SERPs: we don't want to be penalised by any of the search engines.
Technical SEO | | StephanWhelan0 -
Crawling and indexing content
If a page element (a div, for example) is initially hidden and shown only by a hover descriptor or a JavaScript call, will Google crawl and index its content?
Technical SEO | | Mont0 -
False Negative Warnings with Crawl Diagnostic Test
OK... I will try to explain as clearly as possible. This issue concerns close to 5000 'Warnings' from our most recent SEOmoz Pro crawl diagnostic test. The top three warnings have about 6000 instances among them: 1. Duplicate Page Title 2. Duplicate Page Content 3. 302 (Temporary Redirect)
We understand that duplicate titles and content are "no-nos" and have made it a top priority to avoid duplication on any level. Here is where the issue lies: we are using the Volusion eCommerce solution, and it has a variety of value-add shopping features such as "Email A Friend" and "Email Me When Back In-Stock" on each product page. If one of these options is clicked, you are directed to the appropriate page. Each page has a different URL, with the individual product code as the sole variable. But because this is part of Volusion's ingrained functionality, the META title is the same for each page. It is taken from the title of our store homepage. Example below:
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34
The same goes for the duplicate content warnings. If you click on one of these features, it directs you to a page with pretty much the same content except for a different product. Basically, each page has both duplicate content and a duplicate title. The SEOmoz descriptions are:
Duplicate Page Content: Content that is identical (or nearly identical) to content on other pages of your site forces your pages to unnecessarily compete with each other for rankings.
Duplicate Page Title: You should use unique titles for your different pages to ensure that they describe each page uniquely and don't compete with each other for keyword relevance.
Because I know SEO is not an exact science, the question here is: does Google recognize that although these pages are duplicates, they are generated by a feature that makes us even more of a legitimate eCommerce site? Or, going by the SEOmoz description, if duplication is bad only because you do not want your pages competing with each other, should I not worry, because I couldn't care less if these pages don't get traffic? Or does it affect my domain authority as a whole?
Then as for a solution: I am still trying to work out with Volusion how we can change the META title of the pages. It's highly unlikely, but we'll see. As for the duplicate content, there is no way to change one of these pages. It's hard coded. So if it is bad (even though it shouldn't be), would it be worth it to disable these features? I hope not. Wouldn't that defeat the purpose of Google trying to provide the most legitimate, value-add sites to searchers?
As for the 302 (Temporary Redirect) warning, this appears only on our shopping cart pages. As with the "Email A Friend" feature, there is a page for every product. For example:
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8040
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8050
The description SEOmoz provides is: 302 (Temporary Redirect): Using a 302 redirect will cause search engine crawlers to treat the redirect as temporary and not pass any link juice (ranking power). We highly recommend that you replace 302 redirects with 301 redirects.
So, the probable solution: I do have the ability to change to a 301 redirect, but do I want to do this for my shopping cart? Does Google realize the dead end is legitimate? Or does it matter if link juice is passed through my shopping cart? And again, does it impact my site as a whole? It would be greatly appreciated if anyone could help me out with this stuff 🙂 Thank you
Technical SEO | | anthonyjamesent1