SEO-Friendly Method to Load XML Content onto Page
-
I have a client who has about 100 portfolio entries, each with its own HTML page.
Those pages aren't getting indexed because of the way the main portfolio menu page works: It uses javascript to load the list of portfolio entries from an XML file along with metadata about each entry. Because it uses javascript, crawlers aren't seeing anything on the portfolio menu page.
Here's a sample of the JavaScript used; this is only a fragment of many more lines of code:
// load project xml
try {
    var req = new Request({
        method: 'get',
        url: '/data/projects.xml',
Normally I'd have them just manually add entries to the portfolio menu page, but part of the metadata that's getting loaded is project characteristics that are used to filter which portfolio entries are shown on the page, such as client type (government, education, industrial, residential, etc.) and project type (depending on the type of service that was provided). It's similar to the filtering you'd see on an e-commerce site. This has to stay, so the page needs to remain dynamic.
I'm trying to summarize the alternate methods they could use to load that content onto the page instead of javascript (I assume that server side solutions are the only ones I'd want, unless there's another option I'm unaware of). I'm aware that PHP could probably load all of their portfolio entries in the XML file on the server side. I'd like to get some recommendations on other possible solutions. Please feel free to ask any clarifying questions.
Thanks!
-
As a response to my own question, I received some other good suggestions for this issue via Twitter:
- @__jasonmulligan__ suggested XSLT
- @__KevinMSpence__ suggested "...easiest solution would be to use simplexml --it's a PHP parser for lightweight XML" & "Just keep in mind that simplexml loads the doc into memory, so there can be performance issues with large docs."
- Someone suggested creating a feed from the XML, but I don't think that adds a ton of benefit aside from another step, since you'd still need a way to pull that content on to the page.
- There were also a few suggestions for ways to convert the XML feed to another solution like JSON on the page, but those were really outside the scope of what we were looking to do.
My final recommendation to the client was to just add text links manually beneath all of the JavaScript content, since they were only adding a few portfolio entries per year, and it would look good in the theme. A hack, perhaps, but much faster and more cost-effective. Otherwise, I would have recommended they go with PHP plus the simplexml recommendation from above.
-
I think you need to find a developer who understands progressive enhancement so that the page degrades gracefully. You'll need to deliver the page using something server-side (PHP?) and then add the bells and whistles later.
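To make the progressive enhancement idea concrete, here is a minimal sketch of the server-side half, written in JavaScript only because that is what the site already uses (the same logic would translate to PHP). The entry fields, the `portfolio-list` id, and the function names are all assumptions for illustration; the real structure of /data/projects.xml isn't shown in this thread.

```javascript
// Hypothetical entries, standing in for data already read from the projects XML file.
const entries = [
  { title: "City Hall Renovation", url: "/portfolio/city-hall.html", clientType: "government", projectType: "renovation" },
  { title: "Campus Library", url: "/portfolio/campus-library.html", clientType: "education", projectType: "new-build" },
];

// Escape text so it is safe inside HTML content and attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a plain list of links that a crawler can follow without running any script.
// The data-* attributes carry the filter metadata so a script can enhance the list later.
function renderPortfolioList(items) {
  const rows = items.map((item) =>
    `  <li data-client-type="${escapeHtml(item.clientType)}" data-project-type="${escapeHtml(item.projectType)}">` +
    `<a href="${escapeHtml(item.url)}">${escapeHtml(item.title)}</a></li>`
  );
  return `<ul id="portfolio-list">\n${rows.join("\n")}\n</ul>`;
}

console.log(renderPortfolioList(entries));
```

The point of the design is that the links exist in the delivered HTML, and the filtering script only shows or hides list items that are already there.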
I'm guessing the budget won't cover moving the entire site/content onto a database/cms platform.
How does the page look in Google Webmaster Tools (Labs, Instant Preview)? It might give you a nice visual way to explain the problem to the client.
-
The site was done a year or two ago by a branding agency. To their credit, they produced clean and reasonably well-documented code, and they do excellent design work. However, they relied too heavily on Flash and JavaScript to load content throughout the site, and the site has suffered as a result.
Site is entirely HTML, CSS, & Javascript and uses Dreamweaver template files to produce the portfolio entry pages, which then propagate into the XML files, which then get loaded by the rest of the site.
I wouldn't call it AJAX - I think it loads all of the XML file and then uses the filters to display appropriate content, so there are no subsequent calls to the server for more data.
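As a rough illustration of that load-once, filter-in-the-browser behavior, something like the following would do it. This is a sketch under assumptions, not the agency's actual code: the `projects` array, the facet names `clientType` and `projectType`, and the `filterProjects` function are all hypothetical.

```javascript
// Hypothetical data, standing in for entries already parsed from the XML file.
const projects = [
  { title: "City Hall Renovation", clientType: "government", projectType: "renovation" },
  { title: "Campus Library", clientType: "education", projectType: "new-build" },
  { title: "County Courthouse", clientType: "government", projectType: "new-build" },
];

// Return the entries that match every facet set in `filters`.
// A facet left empty or undefined matches everything, so no extra server call is needed.
function filterProjects(items, filters) {
  return items.filter((item) =>
    Object.entries(filters).every(([facet, wanted]) => !wanted || item[facet] === wanted)
  );
}

// Usage: narrow by one facet, then by two.
console.log(filterProjects(projects, { clientType: "government" }).map((p) => p.title));
console.log(filterProjects(projects, { clientType: "government", projectType: "new-build" }).map((p) => p.title));
```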
User interface is great, and makes it easy to filter and sort by relevant portfolio items. It's just not indexable.
-
What's the reason it was implemented this way in the first place? Is the data being exported from another system in a particular way?
What's the site running on - is there a CMS platform?
Is it JavaScript because it's doing some funky Ajax-driven "experience", or are they just using JavaScript and the XML file to enable you to filter/sort based on different facets?
Final silly question - how's the visitor expected to interact with them?
-
Try creating an XML sitemap with all the entries, spin that into an HTML sitemap version and also a portfolio page with a list of entries by type. It's a bit of work, but will probably work best.
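For the XML sitemap part, a minimal sketch might look like this. The base URL and page paths are placeholders, and `buildSitemap` is a hypothetical helper; the only fixed parts are the `urlset`, `url`, and `loc` elements and the namespace from the sitemaps.org protocol.

```javascript
// Escape characters that are not allowed unescaped inside XML text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap document from a base URL and a list of page paths.
// Paths are resolved against the base so every <loc> is an absolute URL.
function buildSitemap(baseUrl, paths) {
  const urls = paths.map(
    (path) => `  <url><loc>${escapeXml(new URL(path, baseUrl).href)}</loc></url>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}

// Usage with placeholder values.
console.log(buildSitemap("https://www.example.com", [
  "/portfolio/city-hall.html",
  "/portfolio/campus-library.html",
]));
```

The same list of paths could also feed an HTML sitemap page, which gives crawlers a second set of plain links to the entries.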
-
Thanks Doug,
I forgot to mention it above, but I am definitely mentioning other workaround methods of getting the content indexed, specifically:
- XML Sitemap
- Cross-linking - there are plenty of other opportunities to link throughout the site that haven't been done yet - so that's high on the list.
- Off-site deep link opportunities are also large and will be addressed.
- The projects aren't totally linear, so we can't use next/previous in this example, but that's a good idea as well.
Those aside, there is a fundamental issue with the way the data is working now and I want to address the ideal solution, since it's within the client's budget to have that content redesigned properly.
-
While helpfully not answering the question, could you generate an XML sitemap (I take it the portfolio data is being generated from something?) to help Google find and index the pages?
Is there any cross linking between the individual portfolio pages or at least a next/previous?
(My first thought would have been the php route.)