Google Sitemap Issue - Bot can't Recrawl...?!

Here is a sharable link of what I have set up now on Made with Love:

If you want, you can post a link to what you have on the Made with Love site:



Have a look at this; it has the two redirects (I changed one of them to the correct URL).
Since I can’t see what you have in the htaccess file, I don’t know why it wouldn’t be working for you.

If you want to share multi-line text here on the forum, you can do:

  1. Put a couple blank lines in
  2. then on a line by itself enter three back-ticks ``` (back-tick ` above the tab, below esc on most keyboards)
  3. Paste the code (the contents of the htaccess file)
  4. on another blank line right after the pasted code put the back-ticks ```



Hi again,

As I expected, I wasn’t using the .htaccess website properly…duh!

Here is what is in my .htaccess file at the moment. Having just put the old URL directly into the search bar, it has clicked through to the correct page for the first time - woop woop! Thanks so much for persisting with me! I think I’ll put a few more 302 redirects in using the same formula and see if they continue to work. You see the old URL with the % character in it, do I just copy it as it is? Thanks, again D


Options -Indexes

RewriteEngine On
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} ^www\. [NC]
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
RewriteRule ^ https://%1%{REQUEST_URI} [L,NE,R=301]

ErrorDocument 404 /error/404.html

RewriteRule ^painting/OIls\.html$ https://deborahgriceartist.uk/portfolio/paintings/? [NC,R=302,L]

@teefers
Hello again, so here is the code in my .htaccess file now: they all worked on Made with Love. However, I’ve just uploaded the .htaccess file and the old links only connect to my home page, not through to the new pages…please can you advise me what I’ve done wrong? It’s FAR from the end of the world though as it’s a much better destination than an error page. Thanks so much.

 
 
Options -Indexes

RewriteEngine On
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} ^www\. [NC]
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
RewriteRule ^ https://%1%{REQUEST_URI} [L,NE,R=301]

ErrorDocument 404 /error/404.html

RewriteRule ^painting/OIls\.html$ https://deborahgriceartist.uk/portfolio/paintings/? [NC,R=302,L]
RewriteRule ^page|.html$ https://deborahgriceartist.uk/ [NC,R=302,L]
RewriteRule ^Limited%20Edition/prints|.html$ https://deborahgriceartist.uk/portfolio/printmaking [NC,R=302,L]
RewriteRule ^paintings/Paintings/miniatures|.html$ https://deborahgriceartist.uk/portfolio/printmaking [NC,R=302,L]
RewriteRule ^/page/biography|.html$ https://deborahgriceartist.uk/cv/ [NC,R=302,L]
RewriteRule ^/paintings/paintings|.html$ https://deborahgriceartist.uk/portfolio/paintings/ [NC,R=302,L]
RewriteRule ^/paintings/Oils|.html$ https://deborahgriceartist.uk/portfolio/paintings/ [NC,R=302,L]
RewriteRule ^/oilpaintings/oncanvas|.html$ https://deborahgriceartist.uk/portfolio/paintings/ [NC,R=302,L]
RewriteRule ^//subscribe/form|.html$ https://deborahgriceartist.uk/contact/ [NC,R=302,L]
RewriteRule ^//contact/hello|.html$ https://deborahgriceartist.uk/contact/ [NC,R=302,L]
RewriteRule ^about/biography\.html$ https://deborahgriceartist.uk/about/cv.html? [NC,R=302,L]
RewriteRule ^onpaper/mixedmedia\.html$ https://deborahgriceartist.uk/portfolio/drawings/? [NC,R=302,L]
RewriteRule ^drawings/drawing\.html$ https://deborahgriceartist.uk/portfolio/drawings/? [NC,R=302,L]
RewriteRule ^painting/drawingsonpaper/mixedmedia\.html$ https://deborahgriceartist.uk/portfolio/drawings/? [NC,R=302,L]
RewriteRule ^contact/hello|.php$ https://deborahgriceartist.uk/contact/? [NC,R=302,L]
RewriteRule ^hello|.php$ https://deborahgriceartist.uk/contact/? [NC,R=302,L]

Also, you mentioned I would need to do this for http and www. addresses, is that correct? If so, please can you tell me how to do it?


It’s not really a %. It’s %20, which is the URL encoding for a space in the folder name. The encoded form won’t match in an Apache htaccess pattern, because Apache decodes it back to a space before matching. That’s why we try to tell everyone: never use spaces, always use lowercase letters and a - or _ in folder names.
To get around this, you replace the %20 with an escaped s, \s (RegEx for a whitespace character).


RewriteRule ^Limited\sEdition/prints\.html$ https://deborahgriceartist.uk/portfolio/printmaking/ [NC,R=302,L]

Now for the rest of the htaccess file. This is code, so every character counts. You have somehow replaced the \.html with |.html on some lines, while on others you left it as \.html. On the ends of the new addresses, some have the trailing /?, others have just the /, and others don’t have either. Some old URLs start with a //, some start with a /, and some don’t have anything.

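And the second RewriteRule:
RewriteRule ^page|.html$ https://deborahgriceartist.uk/ [NC,R=302,L]
I’m not sure what that one is meant to do, or what the old URL was. With |.html instead of \.html it won’t do what you intended: the | is RegEx for “or”, so ^page|.html$ matches anything that starts with page or ends in html. That is most likely why your old links all land on the home page.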

So let’s explain a little here what this means. RewriteRules use Regular Expressions (RegEx), so all these special characters have meanings.

  • The ^ says the string being matched must start with the following pattern.
  • The \ is an escape character: the character directly after it normally has a special meaning in RegEx, and the \ says to ignore that meaning and treat the character as is, like the . in \.html
  • The $ says the string being matched must end with the preceding pattern.

There are books of information on RegEx, but the point is that each character means something.
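
To tie those three characters together, here is a small labelled sketch. The old path (old-folder/old-page.html) and the destination (/new-page/) are made up for illustration and are not pages on the real site:

```apache
# Hypothetical example: "old-folder/old-page.html" and "/new-page/" are
# made-up paths, used only to label the parts of the pattern.
#
#   ^                     the requested path must start here
#   old-folder/old-page   plain text, matched as written
#   \.                    an escaped dot: a real "." rather than "any character"
#   html                  plain text
#   $                     the requested path must end here
RewriteEngine On
RewriteRule ^old-folder/old-page\.html$ https://deborahgriceartist.uk/new-page/ [NC,R=302,L]
```

Without the ^ and $ the pattern could match in the middle of a longer path, which is why every rule in this thread has both.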
    

So I went in and cleaned up the Rewrite rules. I think I caught most of the errors. I left them as 302s, so I’d just replace the entire htaccess file with this:

# Disable directory browsing (security) 
Options -Indexes
#  Force HTTPS and Remove WWW (this line is a comment)
RewriteEngine On
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} ^www\. [NC]
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
RewriteRule ^ https://%1%{REQUEST_URI} [L,NE,R=301]
#  404 Error page assignment 
ErrorDocument 404 /Error/404.html
# set page redirects
RewriteRule ^painting/OIls\.html$ https://deborahgriceartist.uk/paintings/? [NC,R=302,L]
RewriteRule ^about/biography\.html$ https://deborahgriceartist.uk/about/? [NC,R=302,L]
RewriteRule ^Limited\sEdition/prints\.html$ https://deborahgriceartist.uk/portfolio/printmaking/ [NC,R=302,L]
RewriteRule ^paintings/Paintings/miniatures\.html$ https://deborahgriceartist.uk/portfolio/printmaking/ [NC,R=302,L]
RewriteRule ^page/biography\.html$ https://deborahgriceartist.uk/cv/? [R=302,L]
RewriteRule ^paintings/paintings\.html$ https://deborahgriceartist.uk/portfolio/paintings/? [R=302,L]
RewriteRule ^oilpaintings/oncanvas\.html$ https://deborahgriceartist.uk/portfolio/paintings/? [NC,R=302,L]
RewriteRule ^subscribe/form\.html$ https://deborahgriceartist.uk/contact/? [NC,R=302,L]
RewriteRule ^contact/hello\.html$ https://deborahgriceartist.uk/contact/? [NC,R=302,L]
RewriteRule ^about/biography\.html$ https://deborahgriceartist.uk/about/? [NC,R=302,L]
RewriteRule ^onpaper/mixedmedia\.html$ https://deborahgriceartist.uk/portfolio/drawings/? [NC,R=302,L]
RewriteRule ^drawings/drawing\.html$ https://deborahgriceartist.uk/portfolio/drawings/? [NC,R=302,L]
RewriteRule ^painting/drawingsonpaper/mixedmedia\.html$ https://deborahgriceartist.uk/portfolio/drawings/? [NC,R=302,L]
RewriteRule ^contact/hello\.php$ https://deborahgriceartist.uk/contact/? [R=302,L]


Then try testing again. If you are happy with the test results, then carefully change the 302s to 301s.
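
As a sketch of that last step, using one rule copied from the file above: the only thing that changes is the number after R=. A 302 is a temporary redirect and a 301 is a permanent one, which browsers may cache, so it is easier to test with 302 first.

```apache
# While testing (temporary redirect):
# RewriteRule ^contact/hello\.html$ https://deborahgriceartist.uk/contact/? [NC,R=302,L]

# After the test passes (permanent redirect), same rule with only the code changed:
RewriteRule ^contact/hello\.html$ https://deborahgriceartist.uk/contact/? [NC,R=301,L]
```

The first line is commented out here only so that the two versions are not both active at once.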


Thank you so much!!! 99% of them are all working on Google now. I found 43 broken links (and counting…). :grimacing: I’m just struggling with https://deborahgriceartist.uk/page and https://deborahgriceartist.uk/about/ - they’re all over google and are sending an error page even after writing this:

RewriteRule ^page\.html$ https://deborahgriceartist.uk/portfolio/? [NC,R=302,L]
RewriteRule ^about\.html$ https://deborahgriceartist.uk/cv/? [R=302,L]

All of the others are working well. I still haven’t turned them to 301s as I thought I’d wait until the morning, when I’m more clear headed.

I was just wondering: is there any code one can add that when someone searches for ‘Debbie Grice Artist’ they could be sent to ‘Deborah Grice Artist’…

You’re such a kind person for helping me so much.

@teefers Hello, I’m back at it and have tried:

RewriteRule ^page/index\.html$ https://deborahgriceartist.uk/portfolio/? [NC,R=302,L]

which doesn’t work either :thinking:

Going back to the way RegEx works: the $ at the end of the pattern says that the string must end with the preceding pattern, just like the ^ at the beginning says the string must begin with the pattern that follows.

So here is one of the URLs you are still having the issue with:

https://deborahgriceartist.uk/page

Now, with mod_rewrite, the part of the URL that you are matching ignores the main domain, so throw this part out:

https://deborahgriceartist.uk/

So what we have left for the string you’re trying to match is:

page

Did you notice that there’s no \.html or \.php at the end? That probably means the URL was a folder. I say probably because folders (directories) usually end with a /.

So you can match the page with a pattern of

^page$

That would make the RewriteRule look like this

RewriteRule ^page$ https://deborahgriceartist.uk/ [NC,R=302,L]

Then you could add a second rule to handle a URL that ends with the /. But there’s a better way: RegEx has an optional match. If a character can be present in a pattern but doesn’t have to be there to match, you can place that character within parentheses followed by a question mark: (/)?

So the rewrite rule that would handle both would look like this:

RewriteRule ^page(/)?$ https://deborahgriceartist.uk/ [NC,R=302,L]

Now take that rule and change page to about and change the destination URL to where you want to go and that should work.
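
For example, the about version might look like the sketch below. The /cv/ destination is an assumption taken from your earlier attempt, so swap in whichever page you actually want:

```apache
# Matches both "about" and "about/" because of the optional (/)?
# Assumption: /cv/ is the wanted destination; change it if not.
RewriteRule ^about(/)?$ https://deborahgriceartist.uk/cv/ [NC,R=302,L]
```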

As for sending people who search for ‘Debbie Grice Artist’ to ‘Deborah Grice Artist’: there’s no real way to do that with code. You could refer to “Debbie Grice Artist” in the content, page titles, h1s, h2s, etc. You could also change or add a folder name of debbie-grice-artist.


I’ve done what you said and they work perfectly! I’m thrilled. I’ve CAREFULLY changed the 302s to 301s so wonder if now is the right time to ask for a Google recrawl of the site?

Also: I’ve found yet more broken links: my excel sheet shows I now have 87 (!) - some of the addresses are like this: https://deborahgriceartist.uk/Prints_Paintings/Drawings.html.html how on earth do I redirect that please?

I’ll start peppering my content with ‘Debbie’ here and there, I daren’t mess with the folder titles now. I think I am a perfect example of ‘a little knowledge is dangerous’. I’ve fired off new file and folder names to Google for five years - no wonder everything was in such a mess. I have a much better understanding of how it all works now, and that is 100% down to you. I am sending heartfelt thanks.



I’d wait until you get all the redirects fixed and set to 301s before asking for a recrawl. Right now, the ones you have fixed will work for people who click on them.


Wow, that is a screwed up URL. This might be one that you have to try live in the htaccess file. Made with Love doesn’t seem to handle it because of the double .html.

So breaking it down:

https://deborahgriceartist.uk/Prints_Paintings/Drawings.html.html

Ignore the:

https://deborahgriceartist.uk/

Leaves the string:

Prints_Paintings/Drawings.html.html

So the pattern should be like this. It starts with ^Prints_Paintings/Drawings.html.html. However, we have two periods (.) that have special meaning to RegEx, so we have to escape those: ^Prints_Paintings/Drawings\.html\.html. Then we want that to be the end as well, so we add the $, and the match pattern becomes ^Prints_Paintings/Drawings\.html\.html$

So the entire rule would look like this (redirecting to the paintings page):

RewriteRule ^Prints_Paintings/Drawings\.html\.html$ https://deborahgriceartist.uk/paintings/? [NC,R=302,L]
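
If you ever want one rule to catch the same address with either one or two .html endings, a possible variation is sketched below. This is an optional extra that assumes both spellings should go to the same page; the destination is simply copied from the rule above.

```apache
# (\.html)+ means "one or more copies of .html", so this matches
# Prints_Paintings/Drawings.html and Prints_Paintings/Drawings.html.html
RewriteRule ^Prints_Paintings/Drawings(\.html)+$ https://deborahgriceartist.uk/paintings/? [NC,R=302,L]
```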

Now, Made with Love doesn’t work with this, which could just be the tool. If it doesn’t work in the htaccess file either, then you might have to let this one end up going to the error page.

Search engines will pick things up in content. I’d make sure I used the full phrase Debbie Grice Artist here and there. Also try to make sure it’s in at least some h2 and h3 tags on a few pages. I’d also make sure it was in a few browser titles as
Debbie Grice Artist
or
Deborah Grice Artist | Debbie Grice Artist


Search engines put more weight on titles and on h1, h2 and h3 tags.

Thanks ever so much - the piece of code has worked perfectly. Goodness knows where I got a .html.html url from!!

I have done as you said regarding renaming folders and have included some h2 tags.

I have just downloaded HeaderPro, as the header facilities within RW8 won’t let me change the font. HeaderPro looks like it’ll need some studying…I’ll cope with the wrong typeface until then.

I have changed all the 302s to 301s and been through my spreadsheet to see that they work. As the results were positive, I have asked for a re-crawl. I suppose I shall need to wait a week or so to see any difference? What with each .jpg being properly titled and the Alt tags having been completed, I am hoping my Google presence will be much more refined and reliable.

The reality is that it can take much longer than a few weeks for the changes to show up in search engines. But as long as the abandoned URLs are now going to a good page, the user experience will be fine until the engines catch up with the changes.
