SEO Duplication Penalty

I am not exactly sure how to phrase this question. I brought it up once before but cannot find the discussion.

I am in the process of building a new website in RapidWeaver to replace my old one that was created with iWeb. The plan is to build a parallel site using many of the same images and much of the same text. When I am satisfied with the new site I will take down the old one and replace it with the new one.

Somewhere in my reading I came across a discussion about how search engines penalize you for having duplicate material anywhere else on the web. As I recall there was a workaround for this that keeps your new site from being indexed by search engines. This theoretically keeps the duplication from occurring.

How do you do this within RapidWeaver?

You will only have duplicate content if you have the same content on two different pages on the site. Since you are replacing the old site with the new site, you should not end up with both new and old pages active, and therefore should not have duplicate content.

I would make a list of all the page names and folders (basically the URL after the domain name) on the old site before I begin on the new site. I would then decide if I’m happy with those full URLs. If I was, I would build the new site and keep the same content in the same location. If I was not happy with the page names/folders, I would create the new pages with new names/folders. After publishing, I would then redirect every old page whose name I did not re-use to its new page. You need to track the old and new URLs in order to set up the redirects. This is done with redirect commands on the web server. Search for “htaccess redirects” for more info.

This will:

  • Keep your page rank when you switch to the new site with new page names
  • Ensure links from other sites end up on the new page
  • Ensure that users who bookmarked a page on the old site end up in the correct place on the new site
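To make the tracking step concrete, here is a minimal sketch that turns an old-to-new list into lines you could paste into an .htaccess file. It assumes an Apache server with mod_alias (where the `Redirect 301` directive comes from); the paths in `REDIRECTS` and the function name are made up for illustration, so swap in your own list.

```python
# Hypothetical old -> new paths; replace with the list you tracked.
REDIRECTS = {
    "/old-page.html": "/new-section/new-page/",
    "/another-old-page.html": "/another-new-page/",
    "/kept-page.html": "/kept-page.html",  # same URL, nothing to do
}


def htaccess_lines(redirects):
    """Return one Apache mod_alias 'Redirect 301' line per changed path."""
    lines = []
    for old, new in sorted(redirects.items()):
        if old == new:
            continue  # URL was re-used, so no redirect is needed
        lines.append(f"Redirect 301 {old} {new}")
    return lines


if __name__ == "__main__":
    print("\n".join(htaccess_lines(REDIRECTS)))
```

Pages that keep their URL are skipped, so the output only contains the pages that actually moved.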

Thank you so much Don. That is really helpful information.

I have two questions still:

You said I would “only have duplicate content if I have the same content on different pages…”

Does “same” content mean that EVERY element (text & images) is identical on each page?
If, for example, I use the same image on two different pages with different contextual text does the image itself constitute duplication?

I am going to replace the old site with the new site eventually. There will likely be an interim gestation period where both sites are simultaneously live and would contain identical pages. Would I want to keep the new (pending) site non-indexed so that search engines could not find it?

Would this mitigate duplication?

If so, how would I accomplish this?

Lastly: This campaign is going to take some time to accomplish. In the meanwhile I will probably have some great pages that I would want to be able to specifically point my customers to. (You would have to know the secret handshake to find them) Could I link my customers to these specific URLs without alerting search engines to them?

You’d have to ask Google! :wink: I don’t know precisely how their algorithm works, but I’d expect you’d need identical or almost entirely identical pages to trigger as duplicate content. Reusing images is common and should not cause issues. Just make sure you have unique page titles, page descriptions, and also page content (meaning substantially different text).

You can tell search engines to not index a page. In RW, in the page inspector on the Meta Tags tab, just untick the “index this page” under the Robots heading. Since the search sites are not indexing the new page, you would not have duplicate content.
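If you want to double-check a published page, a rough sketch like the one below looks for a robots meta tag containing `noindex` in the page's HTML. I'm assuming the RW checkbox results in the standard `<meta name="robots" content="noindex">` style tag in the page head; I have not confirmed the exact markup RW writes, and the class and function names here are my own.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def has_noindex(html_text):
    """True if the HTML contains a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives
```

You would feed it the page source (for example, saved from your browser's "view source") and expect `True` for the pages you unticked.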

You will likely want to have the new site in a subfolder of your existing site until you’re ready to switch to it. This way you can build out the page names correctly. Just add a folder to the server and include that as your publishing location directory. Something like “new”. This way your url would be domain.com/new/… When finalized, just republish to your root folder.

Yes, just email them a link to the page. Accessing the page does not affect whether search engines index a page.

Thanks Don. I think you have answered all of my questions.

Yes, I can’t imagine that Google would penalise you for having some duplication between pages. I certainly hope not. I have a book site and each category has a different page, but some books fall into more than one category so need to appear on more than one page.

From Google on duplicate content:
https://support.google.com/webmasters/answer/66359?hl=en

It’s best practice to avoid duplicate content either on the same domain or different domains.

If I understand the original question, it’s about “rebuilding” an old site and having two different URLs during the rebuilding process.

The simplest approach would be to add a robots.txt file to the new URL telling search engines not to crawl it.

Once the new site is ready to replace the old site, remove or change the robots.txt file. Then add the appropriate 301 redirects, pointing each retired page to its surviving URL.
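As a sketch of what such a robots.txt might contain and how to sanity-check it, the snippet below uses Python's standard `urllib.robotparser`. It assumes the new site sits in a `/new/` subfolder as suggested earlier in the thread; `www.example.com` and the function name are placeholders. Keep in mind that robots.txt governs crawling, and the file has to live at the root of the host it applies to.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content; the /new/ folder is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /new/
"""


def is_crawlable(robots_txt, url, agent="*"):
    """True if the given robots.txt rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("https://www.example.com/new/gallery/",
                "https://www.example.com/gallery/"):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

When the new site goes live at the root, deleting the `Disallow` line (or the whole file) opens it back up to crawlers.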

I’ve done regional pages for years and years: similar content with different town names, and they have shown up on the first page for at least 10 years. I’m sure the bigger the coverage you require, the tighter Google becomes. It depends on the search results you are trying to achieve, I think. (The pages are a little more eloquent now, but they used to be full-on spamming, to be honest! :slight_smile:)

I have some more questions. Again not exactly sure how to phrase them.

The 10 year old site I am re-creating is for my custom cabinet shop. For a lot of the pages I am satisfied with the descriptive text but not satisfied with the supporting images. I am a self-taught photographer and my skill set has improved considerably since the early days. Fortunately I was smart enough in the beginning to capture everything in RAW format so can greatly improve these images in post processing (which I also did not really understand back then).

For some of images I only need to tweak the white balance or cropping. On these pages everything would be the same except for the image. Would this constitute a need to re-direct old pages to new pages?

What happens if I re-use the original URL page names but make significant changes to image content and/or text? Would this constitute a need to re-direct?

I still have to learn more about the re-direct process. I understand that I need to do this with redirect commands on the web server. If I accomplish this correctly will the re-direct be seamless? I.e., would the customers who come to the website have to make a choice to continue to the re-direct?

When you are replacing a site, if a page keeps (generally) the same subject and the same URL, there is no need for a redirect, even if you change all the images and all the text.

In your example above, if the old url was www.mysite.com/cabinets.html, you could replace all the content (refresh images and re-write the text) and not need a redirect as long as you felt the content on that page was relevant to the visitors of the old page. If you decide to change the url to www.mysite.com/my-work/cabinets then you would need a redirect.

You only put the redirects in place after the new site is in place.

Let’s take another example. Say you had a page on the old site called www.mysite.com/contact.html. You decide that in the new site it fits better under an About Me section, so you have it at www.mysite.com/about-me/contact-info. Once you replace the old site, if a search engine indexed the old page and it’s now gone, you will lose some of the page ranking associated with it. If you set up a redirect to the new page, you will transfer your ranking to the new page. Also, without a redirect, anyone who bookmarked the old contact.html will get an error about the page not being found. With the redirect set up, they will just land on the new page.

Redirects are only needed if the url for the same content (subject) is changed. If you keep the url, no redirect is needed. They are also important for any page urls that will not be used in the new site.
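Put as a rule of thumb in code: any old path that does not exist on the new site needs a redirect, and any path that survives unchanged does not. This is only a sketch; the function name is my own and the paths are the examples from this thread.

```python
def urls_needing_redirect(old_paths, new_paths):
    """Return the old paths that are missing from the new site."""
    kept = set(new_paths)
    return sorted(path for path in old_paths if path not in kept)


if __name__ == "__main__":
    old_site = ["/cabinets.html", "/contact.html"]
    new_site = ["/cabinets.html", "/about-me/contact-info/"]
    # /cabinets.html keeps its URL, so only /contact.html is listed.
    print(urls_needing_redirect(old_site, new_site))
```

Deciding where each listed path should point is still a manual job, since only you know which new page covers the same subject.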

A lot of times when people redo a site, they tend to adjust things, and they often change page names and website structure, since they now have a better idea of how they want their site to be navigated. They don’t keep track of the old page names and don’t set up redirects, and then have issues with less organic search traffic.

Also, it used to be common to have a flatter structure with each page ending in .html, usually right off the site root folder. Now it’s much more common to have each page in its own folder with the page named index.html. This way the URL is simpler: www.mysite.com/contact.html can become www.mysite.com/contact-me. This is usually referred to as cruft-less links.
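To illustrate the folder-per-page idea, this small sketch maps a flat page path to the file you would publish and the shorter URL visitors would see. The function name is hypothetical, and it keeps the original page name rather than renaming it.

```python
from pathlib import PurePosixPath


def cruftless(flat_path):
    """Map '/contact.html' to (file on the server, public URL)."""
    page = PurePosixPath(flat_path)
    if page.name == "index.html":
        # Already folder style: the URL is just the containing folder.
        url = str(page.parent).rstrip("/") + "/"
        return str(page), url
    folder = page.with_suffix("")  # '/contact.html' -> '/contact'
    return str(folder / "index.html"), str(folder) + "/"


if __name__ == "__main__":
    print(cruftless("/contact.html"))
    print(cruftless("/gallery/index.html"))
```

The web server serves index.html automatically when the folder URL is requested, which is why the file name never shows up in the address.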

What are the actual mechanics of naming a page?
I am guessing this happens each page, one page at a time.

How does naming work with respect to customer navigation in a nav bar?
If the URL for a specific page is XYZ.com/Gallery.html how do I get it to show up in the navigation bar as simply “Gallery”?

For what it is worth I am currently working with the Foundry framework. I will be looking at others as well but right now I am just getting my sea legs under me.

The name of the page shown in the navigation is taken from the page name in the left-hand side bar.

The actual url is set in the general settings tab (first tab) of the right-hand sidebar. To get the url you listed, you would enter “/” for the folder and “gallery.html” for the page name.

Personally, I’d recommend you use the url xyz.com/gallery. Set the folder name to “gallery” and filename to “index.html”.

Btw, most web servers are running Linux and the file system is case-sensitive. You can have “Gallery.html” and “gallery.html” as two separate files in the same folder. To save yourself possible issues down the road, I’d recommend using lowercase for both the folder and the file names.

The actual page name (in the left sidebar) can use any capitalization you’d like.
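If you already have a list of published paths, a quick check like this sketch can flag names that differ only by capitalization, which is where the case-sensitivity trouble tends to show up. The function name and sample paths are just for illustration.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group paths that are identical except for upper/lower case."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.lower()].add(path)
    return {low: sorted(names)
            for low, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    published = ["/Gallery.html", "/gallery.html", "/about/index.html"]
    print(case_collisions(published))
```

An empty result means no two paths clash; sticking to lowercase everywhere avoids the problem in the first place.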
