Archive

Posts Tagged ‘duplicate content’

File extension not needed in URL

October 30th, 2009

You do not need the .html or .htm in the URL to reach the page

Not sure if I have just stumbled upon something very clever or I am being ignorant here… I’ll start at the beginning:

Whilst updating pages for my current employer I managed to create an error in the navigation: instead of the link being www.bayplastics.co.uk/pvdf.htm, I missed the .htm off, so the URL read www.bayplastics.co.uk/pvdf.

However, this didn’t create an error in Google Webmaster Tools, nor did it stop the link working. I haven’t done anything to make the pages redirect, so it can’t be that…

I tried fetching both versions of the page as Googlebot (an extremely good feature added to Google Webmaster Tools recently, highly recommended!), and the only difference between the two responses is these three header lines:

Content-Location: pvdf.htm
Vary: negotiate
TCN: choice
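
The comparison above can be repeated with a small script. This is a minimal Python sketch, not the method used in the post: the function name header_diff, the sample Content-Type value, and the assumption that the three extra lines came back for the extensionless URL are all illustrative. For what it's worth, Content-Location, Vary: negotiate and TCN: choice are the headers Apache sends when its content negotiation (the MultiViews option) picks a file for a request, which would explain the behaviour, although the server setup is not confirmed here.

```python
def header_diff(first, second):
    """Return the headers that differ between two responses.

    Header names are compared case-insensitively. The result maps each
    differing header name (lowercased) to a (first_value, second_value)
    pair, using None when a header is missing from one response.
    """
    a = {name.lower(): value for name, value in first.items()}
    b = {name.lower(): value for name, value in second.items()}
    diff = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            diff[name] = (a.get(name), b.get(name))
    return diff


# Illustrative header sets. Only the three negotiation headers come from
# the post; the Content-Type value is a made-up placeholder.
pvdf_htm = {"Content-Type": "text/html"}
pvdf = {
    "Content-Type": "text/html",
    "Content-Location": "pvdf.htm",
    "Vary": "negotiate",
    "TCN": "choice",
}

for name, (with_ext, without_ext) in header_diff(pvdf_htm, pvdf).items():
    print(f"{name}: {with_ext!r} -> {without_ext!r}")
```

In practice the two dictionaries would be filled from the headers shown by the Fetch as Googlebot output for each URL.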

I’ll have to look into this, but my first impression is that URLs without the .html or .htm on the end would be better for search engines and usability: the URL is shorter, so the keywords make up a larger share of it. Because both addresses serve the same page, using the canonical tag will tell the search engines which address to index, and should help rankings by avoiding duplicate content.
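
As a rough illustration of the canonical tag idea, here is a small Python sketch that builds the tag for a page. The function name canonical_tag, the keep_extension flag and the http:// scheme are assumptions made for the example; which of the two addresses should be treated as canonical is a decision for the site owner, and the finished tag belongs in the head of the page.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(url, keep_extension=True):
    """Build a <link rel="canonical"> tag for the given page URL.

    With keep_extension=False a trailing .html or .htm is dropped from
    the path, so both address variants point at the extensionless URL.
    Query strings and fragments are discarded.
    """
    parts = urlsplit(url)
    path = parts.path
    if not keep_extension:
        for ext in (".html", ".htm"):
            if path.endswith(ext):
                path = path[: -len(ext)]
                break
    canonical = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


# The scheme is assumed; the post only gives the host and path.
print(canonical_tag("http://www.bayplastics.co.uk/pvdf.htm", keep_extension=False))
```

Whichever address is chosen, the same tag would go on the page regardless of which URL was used to reach it.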

Of course I could be wrong about this! I’ll do some research and post my findings! Watch this space.

wayne Google, SEO Tips

What do I do if a website steals my content?

July 11th, 2009

Report stolen content to Google

I recently discovered a website that had stolen the content from the website project I am working on. I found the duplicate content culprits using Google, which I took to mean that Google had indexed their page and was not indexing mine because the search engine saw my content as the duplicate.

This has been a real problem for me in my current role, as Google Webmaster Tools is telling me that Google has indexed only 2 of the website’s 116 URLs. This is a big problem for SEO: I need to find out why Google isn’t indexing the pages and what to do about the stolen content.

After studying the website that copied the content, and the process Google requires in order to remove their page from the index, I decided not to go through Google. The process seemed a bit drawn out to me, and it involves correspondence not only with Google but also with the site containing the stolen content.

Here is the process to report stolen content to Google

Basically, there is a US copyright law that covers online content, the Digital Millennium Copyright Act. As I understand it, this act protects content owners publishing content on the web, provided they published the content first. You have to write to Google in the US to report the stolen content, then to the website that stole it, and somewhere along the line the matter is hopefully resolved.

I do not have the luxury of time and, as we all know, time is money! So I decided not to contact Google in the US, as I am in the UK and the offending website is in Nigeria. Instead, I decided to rewrite the pages they copied. This new content should draw the search engines back to the website and have them index the pages.

Looks like the underhanded criminals have won this battle, but with much more SEO-friendly text and fewer spelling mistakes in the new content, they will not win the war!

wayne Google