I would love some suggestions on how you geniuses clean up your server, especially on sites with a few hundred pages, so that you’re not having to republish the whole site. I’m thinking especially of situations with Stacks pages. I may use a stack that requires php; it then gets replaced and the new stack page is html. If I publish and update, I end up with both php and html pages. Does anyone have a workflow that incorporates this cleanup so the server doesn’t become a complete mess?
One thought is not to use publish, but to export and use a sync service that will sync a local folder with the server folder and remove as well as add content.
I don’t have a few hundred pages, but the way I approached it was to change the default page extension in RW to .php in Advanced settings, so all subsequent pages are php by default.
I made sure all my existing project pages were .php, then I deleted all the files that were uploaded onto my server by RW and republished.
But with a few hundred pages, it may be quicker to run through your published server content and delete by hand the duplicate pages (.php or .html).
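If you would rather not hunt for the duplicates by eye, something like the following might help. It is only a rough sketch, and it assumes you have first downloaded a copy of the server folder to your Mac; the function name `find_duplicate_pages` is made up for this example. It lists pages that exist as both .php and .html at the same location and deletes nothing.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_pages(root, extensions=(".php", ".html")):
    """Return {page path without extension: sorted extensions found}
    for every page stored under more than one of the given extensions."""
    root = Path(root)
    found = defaultdict(set)
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            # Same folder + same base name = same page, e.g. about/index
            key = path.relative_to(root).with_suffix("").as_posix()
            found[key].add(path.suffix.lower())
    return {key: sorted(exts) for key, exts in found.items() if len(exts) > 1}
```

Calling `find_duplicate_pages("/path/to/downloaded/copy")` gives you a list to work through in your FTP app. Which of the two files is the stale one is still your call.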
Perhaps someone has another strategy.
By far the quickest and easiest solution is to just make a ZIP backup copy of your public_html or htdocs directory on your hosting account. You can normally do this in your FTP software or web hosting control panel. Download this backup and store it somewhere safe.
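If your FTP software or control panel cannot make the ZIP for you, one alternative is to download the directory first and zip the local copy. A minimal sketch, assuming the site has already been downloaded to a local folder; `zip_site_copy` and the timestamped file name are just illustrative choices.

```python
import shutil
from datetime import datetime
from pathlib import Path


def zip_site_copy(site_dir, backup_dir):
    """Zip a local copy of the site into backup_dir and return the archive path.

    backup_dir should live outside site_dir so the archive
    does not end up inside the folder being zipped.
    """
    site_dir = Path(site_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Timestamp the name so older backups are never overwritten
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base = backup_dir / f"{site_dir.name}-{stamp}"
    return Path(shutil.make_archive(str(base), "zip", root_dir=str(site_dir)))
```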
Delete everything inside of the public_html or htdocs directories. Do not delete these directories themselves!
Then have RapidWeaver ready to republish your whole website. If you are using SFTP and have the connections setting at ‘6’, then your website should be back online again without much delay. You could schedule this delete and republish for a quieter time of day or night.
As a final step, use software like Integrity to scan your website for any broken links.
If you have comment stacks, search stacks or gallery stacks that build their own databases, you may need to copy those back onto the hosting server or run through their setup tasks again.
As @anugyan says, setting your default extension to php could help avoid the problem of duplicate html and php pages at the same location. It may help to be consistent and have everything set to php. The claim some people make that PHP pages are slower or worse for SEO is completely untrue.
Security depends on what’s on the webpage, not on the extension. HTML, PHP, ASP, and whatever else are all equally secure.
The purpose of the PHP extension is to loosely tell the web hosting server that there might be PHP code in the page to process, such as a search box, blog comments, privacy popup, contact form or database.
Most of the better web hosting companies support PHP. Some cheaper companies (like GoDaddy) may run odd-ball configurations of PHP or require you to pay more to have it enabled.
Compatibility with different devices is equal. Most PHP runs server-side, before the completed webpage is served to the client.
I have a copy of the Forklift app and am also going to try syncing folders over FTP. That would mean only changed files would be uploaded (great if there are a large number of files), plus it would delete files on the server that are not in the sync folder.
Please be aware that there may be files you should not delete: htaccess files, log files (if you want to retain the info), validation files, files that were manually added (I have some php scripts manually added), etc. You should know what every file on your server is there for. Not every file may have been put there by RW.
For example, I maintain my own htaccess file via ftp. I don’t do it with RW. I have files that I uploaded manually and reference with links. This is not done with RW resources. So, if you know that you are using RW for each and every file on your server, it would be OK to delete and re-upload. But just make sure you know what you are doing.
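One way to see what a sync or cleanup would remove before anything is touched is to build the delete list first and leave out the files you look after by hand. This is a sketch under assumptions, not a tested workflow: it compares a RW export folder against a downloaded copy of the server, and both `plan_deletions` and the `PROTECTED` patterns are hypothetical names to adapt to your own files.

```python
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical keep-list: add validation files, manual scripts, etc.
PROTECTED = (".htaccess", "*.log")


def plan_deletions(export_dir, mirror_dir, protected=PROTECTED):
    """List files present in mirror_dir but missing from export_dir,
    skipping file names that match a protected pattern. Deletes nothing."""
    export_dir, mirror_dir = Path(export_dir), Path(mirror_dir)
    exported = {p.relative_to(export_dir).as_posix()
                for p in export_dir.rglob("*") if p.is_file()}
    stale = []
    for p in mirror_dir.rglob("*"):
        if not p.is_file():
            continue
        rel = p.relative_to(mirror_dir).as_posix()
        if rel in exported:
            continue  # still part of the current site
        if any(fnmatch(p.name, pattern) for pattern in protected):
            continue  # maintained by hand, leave it alone
        stale.append(rel)
    return sorted(stale)
```

The result is only a list of relative paths to review. Doing the actual deleting by hand in your FTP app keeps a mistake in the keep-list from costing you a file.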
I “maintain” my site day by day. If I upload a test file, I make sure I remove it when done. If I change an html page to php, I delete the html page right then and there. Use a good FTP/SFTP app for this.
@1611mac That’s precisely why I emphasised making a backup first and downloading this backup to a safe location. Then if you need to resurrect any files or other things afterwards, you can easily do so.
A backup is also useful to keep if you want to see what was previously on a website. Although tools like Wayback Machine exist, they don’t always “catch” everything. I have backups of websites made 10+ years ago, which are interesting to browse.
I think the OP was trying to clean a server that hadn’t been maintained for a long time. He’s mentioning hundreds of pages, which implies potentially thousands of files. That is probably too many to realistically go through manually and decide whether each one needs to be kept or deleted.
You would only need to do a “big” delete and republish like this periodically, if things had got out of hand. For day-to-day publishing, it’s not really needed.