Htaccess file creation


(Andreas Belivanakis) #1

Here is an idea for a cool feature:

We do not create 2 versions of our websites. That should be abundantly clear to everyone, and it is, except to the Googlebot, apparently, which insists, in an abysmally obtuse fashion, on treating your www website as having duplicate content with the non-www version, and penalizing you for it in SEO. Jesus F. Christ!

Google should know which website is the primary one. It is abysmally stupid on Google’s part to consider your one website as having duplicate content with the non-www version of the same website.

Common syntax of the htaccess file:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

This is basically a permanent 301 redirect (the only type of redirect Her Majesty Google accepts) from the non-www version to the www version, so that Her Royal Stupidness won’t consider your website as 2 websites, and won’t penalize you for duplicate content as a result.

So, Realmac could have Rapidweaver automatically create an htaccess file at the root of the website’s hosted domain, specifying which of the two versions (according to Google), the www one or the one without www, is the primary, or active, one. This would spare us from having to create that htaccess file manually, which is, quite frankly, a royal pain in the butt to deal with: it is a hidden file, its filename begins with a period, it has to be uploaded first and then renamed, and heaven forbid you have to edit it later.

Of course, this feature must not be destructive. That is, it must not override existing content of the htaccess file, but complement it. This feature would be a productivity-enhancing feature for a change, as opposed to the productivity-destroying “Multiple Index Files Found” feature introduced in 7.3.1.
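
To illustrate what "non-destructive" could mean in practice, here is a rough sketch of an .htaccess file in which a tool only ever rewrites the lines between its own marker comments and leaves everything else alone. WordPress manages its rules in a similar way, between "# BEGIN WordPress" and "# END WordPress" comments. The RapidWeaver marker names and the pre-existing caching rule are made up for illustration; this is not an existing RapidWeaver feature.

```apache
# Existing rules written by the user or the host stay untouched
<IfModule mod_expires.c>
    ExpiresActive On
</IfModule>

# BEGIN RapidWeaver redirect (hypothetical marker; only this block would be rewritten)
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
    RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
</IfModule>
# END RapidWeaver redirect
```

On republishing, the tool would replace only what sits between the two markers, or append the block if the markers are not found.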

Andreas


(Jochen Abitz) #2

I don’t think this would really work, because the htaccess file is so much more than the example you mentioned: redirecting non-https to https, enabling gzip compression, browser caching, redirecting old links… You have to learn to use this file. Once you’ve got it, it is very, very easy to handle in the ftp app of your choice. But I think it would not work automatically in RW.
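
For a sense of what else typically ends up in the file, here is a sketch covering the items listed above. It assumes the host has mod_rewrite, mod_deflate and mod_expires enabled (on Apache 2.4, AddOutputFilterByType also needs mod_filter) and uses www.example.com as a placeholder. The old-page path, the MIME types and the cache lifetimes are arbitrary examples, not recommendations.

```apache
# Redirect http and non-www requests to https://www.example.com (placeholder domain)
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTPS} off [OR]
    RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
    RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]

    # Redirect an old link to its new location (both paths are made up)
    RewriteRule ^old-page\.html$ /new-page/ [R=301,L]
</IfModule>

# gzip compression for text-based responses
<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/css application/javascript
</IfModule>

# Browser caching
<IfModule mod_expires.c>
    ExpiresActive On
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType text/css "access plus 1 week"
</IfModule>
```

One caveat: if the host terminates SSL on a proxy or load balancer, %{HTTPS} may always read "off" and the first rule can loop, which is one more reason such a file is hard to generate blindly.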


(Christopher Watson) #3

I’m only guessing here… But perhaps it’s time to get the wallet out and register whatever ftp app that is so you CAN edit the htaccess file.


(Andreas Belivanakis) #4

Jochen,

You are correct. That’s why it should be a very carefully-implemented optional feature, and as I mentioned, I am only concerned with the redirect to satisfy Google’s insanity.

Anyone in the know can make the adjustments you mentioned in the htaccess file, but I’d like to take out the drudgery of having to create an htaccess file solely for the purpose of redirecting the non-www version to the www one just to satisfy the Googlebot.

An experienced user may not need this, but even a power user would appreciate Rapidweaver’s automatic placement of these 3 redirect lines without disrupting any other content that might already be there, or may be added in the future, manually, by the user.


(Andreas Belivanakis) #5

Christopher, not a problem here. I use Cyberduck and can do this fine. The point is not having to do it, and letting Rapidweaver do it instead for me. It will be both a time and drudgery saver, as I create or edit an htaccess file solely for the redirect feature I am describing, 99% of the time.


(Andreas Belivanakis) #6

…and besides me, a somewhat experienced user, imagine how helpful this is going to be for the myriad of novice users who have no idea what an htaccess file is or how to create/edit hidden files, and do not understand why Google buries their website in SEO (penalizing it for duplicate content) because it considers it… two sites!


(Christopher Watson) #7

I’m not sure you have thought it through too well…
What about secure sites (https)?
Does it publish this file once? How would RW know this?
Some hosts have a htaccess file by default. Should it overwrite it and subsequently break something?

Surely uploading one file through ftp isn’t that far out of your workflow…?


(Andreas Belivanakis) #8

I do not claim to have all the answers to these technical issues. But Realmac does, and I am sure they can handle this. If Rapidweaver can publish via FTP, it sure can create and publish an htaccess file, or edit an already existing one.

Now, you may argue it is too much work that can’t be justified, or it may cost too much to implement. It is beyond my knowledge level, and I cannot answer these issues. Only Realmac can. I only know it would be a great feature and a significant time-saver to me, as well as thousands of users.


(Rob D) #9

Hey, Christopher,

I think Andreas is onto something good.

If you are advanced enough to have a secure site, surely you are advanced enough to EDIT the existing .htaccess file created for you by RW. Once RW has created that file, it should not create it again when republishing.


(Christopher Watson) #10

I get the idea of it… that wasn’t the point I was making though.
How is that going to help? You’re advanced enough to edit a file that is created by RW, which was supposed to save you time in the first place?


(Doug Bennett) #11

I don’t think RapidWeaver could or should ever try handling htaccess file redirects. Here is why:
Not every hosting company supports .htaccess files.
.htaccess files (as they are known) are local Apache directive files. The name .htaccess is simply the default name Apache first gave them. Unless otherwise configured, they are complete configuration files that can change a lot of how the web server behaves. Many hosting companies restrict or rename these files since they alter the configuration of the web server.
Some hosting companies will remove them if they find them, or they may set up the main configuration file (usually called httpd.conf) to ignore them. IIS (aka Windows Server) has an entirely different format, with a file called web.config, so the .htaccess redirects would need to be converted to the Windows format.

The other issue is the 301 redirect itself. Yes, Google will not honor 302 redirects, but an HTTP 301 return code is a permanent redirect.
When they call it permanent, they mean it. It is considered long-term cacheable, meaning things like proxy servers, search engines, and browsers will cache a 301 redirect with no expiry date. That is, it will remain cached for as long as the cache can accommodate it.

A beginner could easily make a mistake or change their mind. Guess what happens if you have the 301 redirect going to one place and a corporate user hits the redirect: that corporation’s proxy server caches it. Now you change the redirect to go someplace else, and everyone using that proxy will never go to the new location.
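
One common way to limit that risk is to test the rule as a temporary redirect first and only switch it to a 301 once it is known to be right. A sketch, reusing the domain.com placeholder from the first post:

```apache
RewriteEngine On
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
# R=302 while testing: a temporary redirect, so clients are not expected to cache it indefinitely
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=302,L]
# Once verified, change R=302 to R=301 to make the redirect permanent
```
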
RapidWeaver could offer, as an option, a canonical URL, which should take care of the duplicate content issue and would be supported by all hosting companies. Canonicals are just a link in the head section of your document. Google, Bing, Yahoo, Facebook (Open Graph), and others all try to honor the canonical URL. Although it is not a sure thing like the 301, it is a lot easier to implement and a lot more forgiving.

https://support.google.com/webmasters/answer/139066?hl=en


(Andreas Belivanakis) #12

This is very interesting.

So, if I am not to use htaccess but a canonical link instead, how would I proceed manually using Rapidweaver? What kind of link shall I use to direct domain.com to www.domain.com permanently and where shall I place it in my Rapidweaver document?


(Tomas Jakobs) #13

I am putting my 301 redirects in Apache’s config files only. IMHO that is the only place where this should be done. If they are spread across .htaccess files all around your server (which can be overwritten or deleted), you’ll lose control, and it causes headaches when dealing with a lot of websites.


(Markus Frieauff ) #14

same question here…


(Doug Bennett) #15

The canonical URL is not at all a redirect. It won’t change your URL to remove WWW or add it. It won’t change http:// to https://; those still need to happen via a redirect.

Simply put, RW can’t do the redirects. There are too many variables, as redirects occur at a different level than HTML/CSS. Depending on your server type and access levels, redirects occur in different ways and places.

Tomas (@jakobssystems) is correct: in Apache, the best place to put them is in the server config files. The Apache documentation recommends that. The problem is that most RW users are on shared hosting plans and don’t have access to the server config files. That is why htaccess is used by so many users.
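
For those who do have access to the server config, a sketch of that approach could look like the following. It assumes name-based virtual hosts on port 80 and uses example.com and the /var/www/example path as placeholders.

```apache
# Catch the bare domain and redirect everything to the www host
<VirtualHost *:80>
    ServerName example.com
    Redirect permanent / http://www.example.com/
</VirtualHost>

# The real site is served only under the www host
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /var/www/example
</VirtualHost>
```

This uses the plain Redirect directive from mod_alias, so no rewrite rules are needed at all.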

The rel=canonical URL is simply a request: no matter how you got to this page, please treat it as if you got to it from this URL.
It goes in the <head> section of each page of your document.
Format is simple:
to request NOT to use WWW:
<link rel="canonical" href="https://example.com">
for an aboutus page:
<link rel="canonical" href="https://example.com/aboutus">

To request to treat as WWW
<link rel="canonical" href="https://www.example.com">
for an aboutus page:
<link rel="canonical" href="https://www.example.com/aboutus">

The rel="canonical" link and an HTTP redirect can both be used.