MySQL and danish characters


(Peter Kroman) #1

Hi,
I am working on a project where I am using a MySQL database. I have made an input field that is working as a search field in the database. It works nicely.
My problem is, that the search field does not accept the special danish characters, æ,ø,å, Æ, Ø, Å.

My database is made with the UTF-8 charset, and I have put in alle the right tages (I think) in the code, listed, below.

I am searching for ideas on how to make this work right with danish characters.

Thanks in advance.

<?php mysql_connect("mysql31.unoeuro.com", "", "", "genealogiskforum_dk_db4") or die("Error connecting to database: ".mysql_error()); /* first is location of the mysql server, usually localhost second - your username third is your password if connection fails it will stop loading the page and display an error */ mysql_select_db("genealogiskforum_dk_db4") or die(mysql_error()); /* the name of database we'are working in */ mysql_set_charset('utf-8'); ?> Search results <?php $query = $_GET['query']; // gets value sent over search form
$min_length = 1;
// you can set minimum length of the query if you want
 
if(strlen($query) >= $min_length){ // if query length is more or equal minimum length then
     
    $query = htmlspecialchars($query); 
    // changes characters used in html to their equivalents, for example: < to &gt;
     
    $query = mysql_real_escape_string($query);
    // makes sure nobody uses SQL injection
     
    $raw_results = mysql_query("SELECT * FROM sogne
        WHERE (`Sogn` LIKE '%".$query."%') OR (`sogne_id` LIKE '%".$query."%')") or die(mysql_error());
        

    // * means that it selects all fields, you can also write: `id`, `title`, `text`
    // articles is the name of our table
     
    // '%$query%' is what we're looking for, % means anything, for example if $query is Hello
    // it will match "hello", "Hello man", "gogohello", if you want exact match use `title`='$query'
    // or if you want to match just full word so "gogohello" is out use '% $query %' ...OR ... '$query %' ... OR ... '% $query'
     
    if(mysql_num_rows($raw_results) > 0){ // if one or more rows are returned do following
         
        while($results = mysql_fetch_array($raw_results)){
        // $results = mysql_fetch_array($raw_results) puts data from database into array, while it's valid it does the loop
         
            echo "<p>".$results['Sogn']."</p>";
            // posts results gotten from database(title and text) you can also show id ($results['id'])
        }
         
    }
    else{ // if there is no matching rows do following
        echo "No results";
    }
     
}
else{ // if query length is less than minimum
    echo "Minimum length is ".$min_length;
}

?>


(Peter Danckwerts) #2

MySQL has no difficulty handling such characters. If the database is set to use the UTF8 character set, it SHOULD work.


(Peter Danckwerts) #3

I don’t think you should be using htmlspecialchars to convert UTF8.


(Peter Kroman) #4

The database is set to UTF8, and it does not work :frowning:


(Peter Danckwerts) #5

No, it’s your query converting to html entities which is the problem:

You should only call this method when echoing the data into HTML.

Don’t store escaped HTML in your database; it will just make queries more annoying.
The database should store your actual data, not its HTML representation.

https://stackoverflow.com/questions/4882307/when-to-use-htmlspecialchars-function


(Peter Kroman) #6

@peterdanckwerts

Thanks for the input. I have tried to remove the line:
$query = htmlspecialchars($query);
from the code.
It makes no changes. It still don’t accept danish characters as a searchstring.
I have build the database in Sequel Pro, and I have not injected anything into it afterwards, so I believe that the database it self is as “clean” as possible.

Do you have any ideas on how to solve this annoying problem :slight_smile:


(Peter Danckwerts) #7

Sorry, I’m afraid I don’t know. Out of curiosity, I entered the name Anders Ångström (OK, I know he was Swedish rather than Danish!) in a SiteLok MySQL database. There was one very interesting result. SiteLok wouldn’t accept andersångström as a valid user name. I changed it to andersangstrom and all was well. The full name, as opposed to the user name is correctly entered and I can search the database for it. This surely indicates that there must be a restriction on the allowable characters in the field (I’ll have a look when I have a chance). Could there be such a thing in your database?


(Peter Kroman) #8

I am using SiteLok too, and I have just tested a username with danish characters. SiteLok does not accept that here either :slight_smile:
Well - i often communicate with Adrian form SiteLok - I will just hear what he has to say about this.
I will get back here later when I have reached him :slight_smile:
Thanks for your interest in this,


(Peter Kroman) #9

@peterdanckwerts

I have got a solution.
There are two different ways to specify the utf8 statement.

in php it goes like this:
mysqli_set_charset($con,“utf8”);

in HTML ig goes like this:
meta charset=“utf-8” lang=“da”

So it is all about to be able to use utf8 and utf-8 in the right places.

Now we know :slight_smile:

Best regards Peter


(Peter Danckwerts) #10

How daft is that? :grinning: Nothing is as simple as it ought to be.