Crawl Tool Producing Random URL's
-
For some reason SEOmoz's crawl tool is returning duplicate content URL's that don't exist on my website. It is returning pages like "mydomain.com/pages/pages/pages/pages/pages/pricing" Nothing like that exists as a URL on my website. Has anyone experienced something similar to this, know what's causing it, or know how I can fix it?
-
The same thing is happening for one of my campaigns, specifically for a 302 redirect to the homepage. My guess is I need to update it to a 301, but I'm not 100% sure if that would solve the issue?
-
Well we have our website setup to where if you type something after mydomain.com/ that is not a valid URL it will take you to the site map. For example if I typed in mydomain.com/ljlksdfsdfkjsdlfjsflj it would take me to the site map. The same holds true for mydomain.com/pages/pages/pages/pages/pages/pricing. So to answer your question, no it does not take you to the correct page, but it doesn't give you a 404 error either. It takes you to the site map.
-
I had this weird issue too but it was down to how our developers created the website. Does "mydomain.com/pages/pages/pages/pages/pages/pricing" still take you to the correct page or does it show you a 404 error?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
No: 'noindex' detected in 'robots' meta tag
I'm getting an error in Search Console that pages on my site show No: 'noindex' detected in 'robots' meta tag. However, when I inspect the pages html, it does not show noindex. In fact, it shows index, follow. Majority of pages show the error and are not indexed by Google...Not sure why this is happening. Unfortunately I can't post images on here but I've linked some url's below. The page below in search console shows the error above... https://mixeddigitaleduconsulting.com/ As does this one. https://mixeddigitaleduconsulting.com/independent-school-marketing-communications/ However, this page does not have the error and is indexed by Google. The meta robots tag looks identical. https://mixeddigitaleduconsulting.com/blog/leadership-team/jill-goodman/ Any and all help is appreciated.
Technical SEO | | Sean_White_Consult0 -
Geo ip filtering / Subdomain can't be crawled
My client has "load balancing" site traffic in the following way: domain: www.example.com traffic from US IP redirected to usa.example.com traffic from non-US IP redirected to www2.example.com The reason for doing this is that site contents on the www2 contains herbal medicine info banned by FDA."usa.example.com" is a "cleaned" site. Using HK IP, when I google an Eng keyword, I can see that www.example.com is indexed. When googling a Chi keyword, nothing is indexed - neither the domain or www2 subdomain. From Google Search Console, it shows a Dell Sonicwall geo ip filtering alert for www2 (Connection initiated from country: United States). GSC data also confirms that www2 has never been indexed by Google. Questions: Is geo ip filtering the very reason why www2 isn't indexed? What should I do in order to get www2 to be indexed? Thanks guys!
Technical SEO | | irene7890 -
Webmaster tools Hentry showing pages that don't exist
In Webmaster Tools I have a ton of pages listed under Structured Data >> Hentry. These pages are not on my website and I don't know where they are coming from. I redid the site for someone and perhaps they are from the old site. How do I find and delete these? Thank you Rena
Technical SEO | | renalynd270 -
What's wrong with this robots.txt
Hi. really struggling with the robots.txt file
Technical SEO | | Leonie-Kramer
this is it: User-agent: *
Disallow: /product/ #old sitemap
Disallow: /media/name.xml When testing in w3c.org everything looks good, testing is okay, but when uploading it to the server, Google webmaster tools gives 3 errors. Checked it with my collegue we both don't know what's wrong. Can someone take a look at this and give me the solution.
Thanx in advance! Leonie1 -
Client's site dropped completely for all keywords, but not brand name - not manual penalty... help!
We just picked up a new search client a few weeks ago. They've been a customer (we're an automotive dealer website provider) since October of 2011. Their content was very generic (came from the previous provider), so we did a quick once-over as soon as he signed up. Beefed up his page content, made it more unique and relevant... tweaked title tags... wrote meta descriptions (he had none). In just over a week, he went from ranking on page 4 or 5 for his terms to ranking on page 2 or 3. My team was working on getting his social media set up, set up his blog, started competitor research... And then this last weekend, something happened and he dropped completely from the rankings... He still shows up if you do a site: search, or if you search his exact business name, but for everything else, he's nowhere to be found. His URL is www.ohioautowarehouse.com, business name is "Ohio Auto Warehouse" We filed a reconsideration request on Monday, and just got a reply today that there was no manual penalty. They suggested we check our content, but we know we didn't do anything spammy or blackhat. We hadn't even fully optimized his site yet - we were just finishing up his competitor research and were planning on a full site optimization next week... so we're at a complete loss as to what happened. Also, he's not ranking for any of the vehicles in his inventory. Our vehicle pages always rank on page 1 or 2, depending on how big the city is... you can always search "year make model city" and see our customers' sites (whether they're doing SEO or not). This guy's cars aren't showing up... so we know something is going on... Any help would be a lifesaver. We've been doing this for quite some time now, and we've never had a site get penalized. Since the reconsideration request didn't help, we're not sure what to do...
Technical SEO | | Greg_Gifford0 -
Does Google pass link juice a page receives if the URL parameter specifies content and has the Crawl setting in Webmaster Tools set to NO?
The page in question receives a lot of quality traffic but is only relevant to a small percent of my users. I want to keep the link juice received from this page but I do not want it to appear in the SERPs.
Technical SEO | | surveygizmo0 -
We have a decent keyword rich URL domain that's not being used - what to do with it?
We're an ecommerce site and we have a second, older domain with a better keyword match URL than our main domain (I know, you may be wondering why we didn't use it, but that's beside the point now). It currently ranks fairly poorly as there's very few links pointing to it. However, the exact match URL means it has some value, if we were to build a few links to it. What would you do with it: 301 product/category pages to current site's equivalent page Link product/category pages to current site's equivalent page Not bother using it at all Something else
Technical SEO | | seanmccauley0 -
Url's don't want to show up in google. Please help?
Hi Mozfans 🙂 I'm doing a sitescan for a new client. http://www.vacatures.tuinbouw.nl/ It's a dutch jobsite. Now the problem is here: The url http://www.vacatures.tuinbouw.nl/vacatures/ is in google.
Technical SEO | | MaartenvandenBos
On the same page there are jobs (scroll down) with a followed link.
To a url like this: http://www.vacatures.tuinbouw.nl/vacatures/722/productie+medewerker+paprika+teelt/ The problem is that the second url don't show up in google. When i try to make a sitemap with Gsitecrawler the second url isn't in de sitemap.. :S What am i doing wrong? Thanks!0