Features / Usability


Does Tikiwiki support UTF-8?

posts: 56 China

I am using Tikiwiki v3.1 on my website, http://www.joomlagate.com/. My database uses utf-8 encoding.

But when I checked the SQL dump file, I noticed that the Chinese characters all showed up as unreadable symbols!

The .sql file was in utf-8 encoding, so why can't the Chinese words be stored as proper Chinese characters?

I remember that when installing Tikiwiki, there was NO option to choose the character set (charset).

So, how can I make Tikiwiki support utf-8 encoded Chinese text?

Thanks.

posts: 4656 Japan

Tiki supports utf-8 encoding natively (i.e., by default). That's how Chinese text can be input and displayed correctly at your site (in /wiki), as other languages are at other Tiki sites. I use Japanese text at a TikiWiki site of mine, but if I look at the page data using phpMyAdmin, the Japanese text, as in the case of your site, appears as unreadable symbols. Apparently this is a problem with the configuration of phpMyAdmin or its connection to the database, which is probably also the situation in your case. You could check the MySQL docs regarding utf-8 for more information.

-- Gary
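
To illustrate the symptom described in the post above, here is a minimal Python sketch of how the same stored bytes can look correct in one tool and unreadable in another. It is not taken from Tiki or phpMyAdmin; the sample text and variable names are made up, and Python's "latin-1" codec only stands in for whatever charset a viewer wrongly assumes.

```python
# Sketch: the same UTF-8 bytes viewed under two different charset assumptions.
text = "中文"  # illustrative sample text, not from the original thread
utf8_bytes = text.encode("utf-8")

# A tool that assumes Latin-1 turns every byte into one separate character.
misread = utf8_bytes.decode("latin-1")

# A tool that assumes UTF-8 recovers the original characters.
correct = utf8_bytes.decode("utf-8")

print(len(utf8_bytes))  # number of bytes the two characters occupy
print(repr(misread))    # unreadable symbols, one per byte
print(correct)          # the original text
```

The bytes in the database are identical in both cases; only the interpretation by the viewing tool differs.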


posts: 56 China

OK, at least you have confirmed what I saw in the SQL dump file.

But I don't think the issue is outside Tikiwiki.

In fact, I want to ask the Tikiwiki developers to re-check this, because I am sure something is wrong in Tiki.

Why am I so sure?

Because I installed Tikiwiki alongside Joomla 1.0.x on the SAME server, with the SAME MySQL (two different databases) and the SAME charset for the db tables. My Joomla also uses utf-8 encoding.

When I checked the SQL dump file for Joomla, the Chinese characters in that file showed up as proper Chinese words, and I made sure the .sql file was in UTF-8 encoding.

Next, I opened the Tiki SQL dump file with the same text editor. The editor showed that the file was utf-8 encoded, but the Chinese characters inside it were UNREADABLE!

Now, do you still think the cause is something outside Tiki?

Thanks.


posts: 4656 Japan

The problem isn't with TikiWiki, from what I can find. It appears to be an issue with MySQL, which only started supporting utf-8 more comprehensively in version 5.0. Do a search for "problem mysql utf-8 chinese dump" along with any popular CMS or wiki software name and you'll find posts similar to yours. If you can get a clean dump from your other software, then it seems you are one of the lucky ones. Please read pages like http://www.gossamer-threads.com/lists/wiki/wikitech/160327, which concerns MediaWiki (the wiki software behind Wikipedia, which supports something like 48 languages via utf-8), and you can see that utf-8 support involving MySQL is quite problematic even for a big operation like that.

I'm curious to know the details of your Joomla database and database connection that give you clean database dumps, such as the server character set and collation and the character set and collation of the database, etc. It isn't a simple configuration, and many people using many software applications have trouble getting dumps that display the characters correctly even though there's no problem with character input and display in HTML pages using the software. If I had more time, I'd install Joomla (again - I've used it in the past), and check out the details myself.

-- Gary


posts: 56 China

Since I can't convince you, I hope you can test this yourself by installing Joomla and Tikiwiki on the same server and seeing the result.

If you use Windows, it is very easy to build a test server with XAMPP, which gives you PHP 5.2.x and MySQL 5.0.x. (I also used MySQL 5.0.x in my test.)

It is good that you speak Japanese. Please just enter some Japanese text into Joomla and Tikiwiki respectively, then export the SQL files and check the difference. You can even install Joomla 1.5.14 and Tikiwiki 3.1 in the same database to see the Tikiwiki bug.

Please do it and you will understand what I have said.


posts: 4656 Japan

I do understand what you said. My point was that a number of wiki and CMS software programs have a problem with MySQL dumps of their data not showing utf-8 characters correctly. Do all of these programs (including MediaWiki) have broken utf-8 support? I don't think so. Full support for utf-8 is quite new for MySQL, and the problem appears to be in how MySQL receives the data, stores it, and exports it.

TikiWiki and these other programs do successfully receive, store, and display utf-8 text, so as far as their own use is concerned, there's no problem.

As I said, it is great that MySQL can cleanly export utf-8 text from Joomla (1.5, not earlier releases, from what I read). I would like to know the details of Joomla 1.5's database connections that enable this success. If I get time in the future, I'll explore this myself, because it's been a curiosity to me (but only a curiosity, since I don't need my TikiWiki database dumps for any external use, and utf-8 handling is just fine within the program).

-- Gary
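
One possible way to model the situation discussed above (text is fine inside the application but garbled in the dump) is the sketch below. It assumes, purely for illustration, that the application talks to the server over a connection declared as Latin-1 while the dump is produced over a UTF-8 connection. All function names are hypothetical, and Python's "latin-1" codec is a simplification of MySQL's own latin1 charset, which differs for a few byte values.

```python
def store_over_latin1_connection(app_text: str) -> str:
    """Model the server receiving UTF-8 bytes on a connection declared as
    Latin-1: each byte is taken to be one Latin-1 character."""
    return app_text.encode("utf-8").decode("latin-1")


def read_over_latin1_connection(stored: str) -> str:
    """Model the application reading back over the same connection and
    treating the returned bytes as UTF-8, which undoes the misreading."""
    return stored.encode("latin-1").decode("utf-8")


def dump_over_utf8_connection(stored: str) -> bytes:
    """Model a dump tool on a UTF-8 connection: the stored characters are
    encoded again as UTF-8, giving a valid UTF-8 file with garbled content."""
    return stored.encode("utf-8")


def repair_double_encoded(dump_text: str) -> str:
    """Reverse the extra encoding step for text that went through this
    exact path. Raises an error if the text did not."""
    return dump_text.encode("latin-1").decode("utf-8")


original = "日本語"  # illustrative sample text
stored = store_over_latin1_connection(original)
dump_text = dump_over_utf8_connection(stored).decode("utf-8")

print(read_over_latin1_connection(stored) == original)  # application view
print(dump_text == original)                            # dump view
print(repair_double_encoded(dump_text) == original)     # after repair
```

Under these assumptions the application never notices a problem, because the misreading on write is cancelled on read, while any tool that connects with a different charset sees the garbled form. Whether this matches a particular installation depends on its actual connection and table charsets.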


posts: 56 China

As for working fine: yes, Tikiwiki works fine with my utf-8 encoded Chinese characters too.

But I reported this bug not only out of curiosity, but also because I want Tikiwiki to be better software, not just good.

Let me show again that this is a bug, using the example of Joomla 1.0.x.

In your last post, you said:

MySQL can cleanly export utf-8 text from Joomla (1.5, not earlier releases, from what I read).


You were half right: Joomla! 1.5 itself supports UTF-8 perfectly, and Joomla 1.0.x (earlier releases) doesn't support utf-8 encoding by default.

You were half wrong: whether utf-8 is supported or not has NOTHING to do with MySQL (version>5.0).

That is to say, we can modify a few files inside Joomla 1.0.x to make it support utf-8 characters as well as Joomla 1.5 does, and you can then see proper Chinese (and Japanese) words in the SQL dump file from Joomla 1.0.x.

Can't believe it? OK, let me give you some evidence.

Someone in China modified the Joomla 1.0.x core files (in fact, no more than two files) to make J1.0 support utf-8 perfectly. Don't forget, he never touched the MySQL source code; he ONLY modified the Joomla 1.0.x core.

You can find the modified J1.0 install package at:

http://www.joomlagate.com/component/option,com_remository/Itemid,48/func,fileinfo/id,952/

Please download it and play with it to see the effect. This package does not include a Japanese language file, but you can test it with the English interface, because we tweaked the English language file to support utf-8 too.

I hope you will really run some tests with this modified Joomla 1.0.15 package, using your Japanese characters, and then export the SQL file to see the effect.

With this test, I ONLY want to tell you that:

Supporting utf-8 perfectly has nothing to do with MySQL (if version>5)!

If we can modify J1.0 to do this, you can modify Tikiwiki to do this, too.

Thanks.


posts: 1630 Canada

The question is simple.

But the answer is not.

Please see: UTF-8.

Thanks!

M ;-)


posts: 56 China

Thank you marclaporte, I have read the material you mentioned.

I noticed the following comment:

Quote:

And since Tikiwiki doesn't specify the database, table, or field encodings in its installation scripts, the data sources are typically created in Latin1


So, why NOT?

Why did the developers not, or why would they not, specify the character encoding in the installation scripts for Tiki?

I suspect that this is the cause of this issue.
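
As a rough illustration of what "specifying the encoding in the installation scripts" could look like, the Python sketch below appends a table-level default charset to a CREATE TABLE statement and pairs it with a SET NAMES statement for the connection. This is not Tiki's installer code: the helper name and the demo_pages table are hypothetical, and utf8 is used here because the thread concerns MySQL 5.0.

```python
def with_utf8_charset(create_table_sql: str) -> str:
    """Append DEFAULT CHARSET=utf8 to a CREATE TABLE statement unless the
    table options after the closing parenthesis already name a charset."""
    stmt = create_table_sql.strip().rstrip(";").rstrip()
    table_options = stmt[stmt.rfind(")") + 1:].upper()
    if "CHARSET" in table_options or "CHARACTER SET" in table_options:
        return stmt + ";"
    return stmt + " DEFAULT CHARSET=utf8;"


# Statement an installer or application could send right after connecting,
# so that the connection charset matches the table charset.
CONNECTION_SETUP = "SET NAMES utf8;"

# Hypothetical table definition, for illustration only.
example = "CREATE TABLE demo_pages (page_id INT, data TEXT) ENGINE=MyISAM;"

print(CONNECTION_SETUP)
print(with_utf8_charset(example))
```

The point of the sketch is only that both sides have to agree: declaring the table charset without also setting the connection charset (or the reverse) can still lead to the mismatch described in this thread.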

