An idea for a new HTML attribute... opinions needed...



ClarkeyBoy
02 Oct 2010, 11:20 PM
Hey,

I was browsing on Google earlier and found a site whose content had changed since it was indexed - the page was no longer relevant to the search term. I know some sites, such as eBay, have areas on some pages which change on every page load - for example a catalogue with a New Items section, or a Featured Items section in which 5 random items are selected from a catalogue of 500 each time the page loads. I come across this kind of thing quite regularly, which is what made me think of this idea.

My idea is to have a "changeFreq" attribute for which the developer can specify any of the following:


PageLoad[:URL]
Daily[:hh:mm][:URL]
Weekly[Mon|Tue|Wed|Thu|Fri|Sat|Sun][:hh:mm][:URL]
Monthly[:dd][:hh:mm][:URL]
Yearly[:mm[/dd]][:URL]


So what does all this mean? For PageLoad, search engines simply would not index the element, since its content changes on every load. Where [:hh:mm] appears, the time is optional and defaults to midnight. For Weekly, the day of the week is optional and defaults to Monday. For Monthly, [:dd] is (yes, you guessed it!) the day of the month: if it is greater than the number of days in the month it falls back to the last day of that month, and if it is omitted it defaults to the 1st. For Yearly, [:mm[/dd]] is (fairly obviously) the month and day: the same rules apply to the day, and if the month is given on its own the day defaults to the 1st.
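To make those defaults concrete, here is a rough Python sketch of how a crawler or browser might parse a changeFreq value. Nothing here is an existing API: `ChangeFreq` and `parse_change_freq` are names made up for illustration. It also rests on a few assumptions the proposal does not settle: the weekday may be written with or without a leading colon, a Yearly value with no month means 1 January, and a URL may not begin with a digit (otherwise "Monthly:06:30" could be read as either a time or a day plus a URL).

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeFreq:
    """Parsed form of a hypothetical changeFreq attribute value."""
    kind: str
    weekday: Optional[str] = None
    month: Optional[int] = None
    day: Optional[int] = None
    hour: int = 0      # time defaults to midnight
    minute: int = 0
    url: Optional[str] = None


# Optional ":hh:mm" and optional ":URL" (assumed not to start with a digit).
_TIME = r"(?::(?P<hour>\d{2}):(?P<minute>\d{2}))?"
_URL = r"(?::(?P<url>\D.*))?"

_PATTERNS = {
    "PageLoad": re.compile(r"PageLoad" + _URL),
    "Daily": re.compile(r"Daily" + _TIME + _URL),
    "Weekly": re.compile(
        r"Weekly(?::?(?P<weekday>Mon|Tue|Wed|Thu|Fri|Sat|Sun))?" + _TIME + _URL
    ),
    "Monthly": re.compile(r"Monthly(?::(?P<day>\d{1,2}))?" + _TIME + _URL),
    "Yearly": re.compile(
        r"Yearly(?::(?P<month>\d{1,2})(?:/(?P<day>\d{1,2}))?)?" + _URL
    ),
}


def parse_change_freq(value: str) -> ChangeFreq:
    """Parse a changeFreq string, applying the defaults described above."""
    for kind, pattern in _PATTERNS.items():
        match = pattern.fullmatch(value.strip())
        if match is None:
            continue
        groups = match.groupdict()
        result = ChangeFreq(kind=kind, url=groups.get("url"))
        if groups.get("hour") is not None:
            result.hour = int(groups["hour"])
            result.minute = int(groups["minute"])
        if kind == "Weekly":
            result.weekday = groups["weekday"] or "Mon"
        elif kind == "Monthly":
            result.day = int(groups["day"] or 1)
        elif kind == "Yearly":
            result.month = int(groups["month"] or 1)  # assumption: January
            result.day = int(groups["day"] or 1)
        # Reject values that cannot be a real time or date component.
        if not (0 <= result.hour <= 23 and 0 <= result.minute <= 59):
            raise ValueError(f"invalid time in {value!r}")
        if result.month is not None and not 1 <= result.month <= 12:
            raise ValueError(f"invalid month in {value!r}")
        if result.day is not None and not 1 <= result.day <= 31:
            raise ValueError(f"invalid day in {value!r}")
        return result
    raise ValueError(f"unrecognised changeFreq value: {value!r}")
```

For example, `parse_change_freq("Daily:06:30:/fragments/featured")` gives a Daily rule at 06:30 with the URL `/fragments/featured` (a made-up path), and `parse_change_freq("Weekly")` falls back to Monday at midnight. Clamping a day such as 31 to the last day of a shorter month is left to whatever code works out the actual date.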

Now for the clever bit - the [:URL]. This specifies the URL whose content should be loaded into the element once it is out of date. It serves two purposes: more relevant search results and better caching. I can hear you asking what difference this would make... it would basically allow search engines to re-index just part of a page, and browsers to load most of a page from the cache while reloading only the out-dated parts. The URL is optional; if it is not specified on an out-dated element, the whole page is re-indexed or reloaded.
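As a rough illustration of what "out of date" could mean for a cache or crawler, here is a small Python sketch for the Monthly case only. The function names `last_monthly_change` and `is_stale` are invented for this example, and it assumes the simplest reading of the idea: a stored copy is stale if it was fetched before the most recent scheduled change. It also shows the rule that a day beyond the end of the month falls back to the last day.

```python
import calendar
from datetime import datetime


def last_monthly_change(now: datetime, day: int = 1,
                        hour: int = 0, minute: int = 0) -> datetime:
    """Most recent scheduled Monthly change at or before `now`."""

    def scheduled(year: int, month: int) -> datetime:
        # Clamp the day to the last day of the month (e.g. 31 -> 30 in September).
        last_day = calendar.monthrange(year, month)[1]
        return datetime(year, month, min(day, last_day), hour, minute)

    candidate = scheduled(now.year, now.month)
    if candidate > now:
        # This month's change has not happened yet, so use last month's.
        if now.month == 1:
            candidate = scheduled(now.year - 1, 12)
        else:
            candidate = scheduled(now.year, now.month - 1)
    return candidate


def is_stale(fetched_at: datetime, now: datetime, day: int = 1,
             hour: int = 0, minute: int = 0) -> bool:
    """True if the stored copy predates the most recent scheduled change."""
    return fetched_at < last_monthly_change(now, day, hour, minute)
```

A browser or search engine following this reading would re-fetch only the element's URL when `is_stale` returns True, and fall back to reloading or re-indexing the whole page when no URL was given.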

Oh, and if a URL is specified, this attribute could also be used by jQuery and similar frameworks to reload any out-dated parts of the page... further options, such as Hourly, Minutely and Secondly, would have to be available for this.

So what do people think of this idea? Are there any major flaws that I haven't thought of? I am not really up on how caching and indexing work - I know the basics, but I don't know enough to tell whether this is actually a good idea... Any opinions would be greatly appreciated before I even think about approaching the W3C or anyone like that. And on that note, does anyone know exactly who (either firm or person) I would need to contact?

Thanks in advance.

Regards,

Richard