Chinese Article Evaluation Tool

Chinese Article Evaluation Tool evaluates your article by counting the number of unique Chinese characters in it and how many of them are not in the list of the first 500 characters. Those 500 characters cover 72.1% of usage in classical and modern Chinese texts, so to learn Chinese effectively we’d better learn them first. The result helps determine whether an article is easy enough for beginners. It is a good Chinese character counting tool too.

The stripped text, with punctuation marks removed, can be used as material for students to practice Chinese-style punctuation.
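
To make the idea concrete, here is a minimal sketch in Python of the kind of counting described above. It is not the tool's actual code. The `COMMON` string is a tiny hypothetical stand-in for the real 500-character list, the function names are made up for illustration, and only the main CJK Unified Ideographs block (U+4E00 to U+9FFF) is treated as "Chinese characters".

```python
import unicodedata


def is_chinese_char(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def evaluate_article(text, common_chars):
    """Count Chinese characters and list the unique ones outside common_chars."""
    chinese = [ch for ch in text if is_chinese_char(ch)]
    unique = set(chinese)
    uncommon = unique - set(common_chars)
    return {
        "total_chars": len(chinese),
        "unique_chars": len(unique),
        "uncommon_chars": sorted(uncommon),
    }


def strip_punctuation(text):
    """Drop every character whose Unicode category is punctuation (P*)."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


# Hypothetical stand-in for the 500-character list (assumption, not the real list).
COMMON = "的一是不了人我在有他"

sample = "我是中国人,我在学中文。"
result = evaluate_article(sample, COMMON)
stripped = strip_punctuation(sample)
```

An article with a small share of characters outside the common list would be a candidate for beginners. Where to draw that line is a judgment call that this sketch leaves open.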

Version 0.2 may include unspecified updates, enhancements, or bug fixes. More enhancements are coming soon: word counting and a level-determining engine, which will make it easier for teachers and tutors to select appropriate reading materials for their students and to evaluate how well their students perform in their Chinese writing. The output format will be refined once I have a little more time.
Try it at

For more information, visit

Wireless Setup

Setting up a network can be very tricky. One missed step can cost you a sleepless night of pulling your hair out. Here is what happened to me. I have set up wireless access points several times, and recently I thought adding another PC would be a snap. However, I forgot to add that PC to my access point’s access list, and after hours and hours of trying I was ready to faint. 🙂

Here are some hints for adding another PC to your wireless network:

1. Add that PC to your router’s access list. Remember that this is the MAC address, sometimes called the physical address, not the IP address. Sometimes the “properties” show something like 1a.20.1d.2a.32.a2. Make sure to change each . to :.

2. Copy ONLY the HEX key into your wireless adapter’s configuration (if this is a secured network, and I bet yours is), NOT the passphrase, at least on Windows XP. Otherwise you’ll have a difficult time connecting.
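
For the MAC address in hint 1, a small helper can do the separator change so nothing gets mistyped. This is only a sketch: `normalize_mac` is a made-up name, and lowercase output is an assumption, so check what your router's access list expects.

```python
import re


def normalize_mac(raw):
    """Return a MAC address as six colon-separated lowercase hex pairs."""
    # Remove the separators commonly seen in adapter properties: . : - and spaces.
    digits = re.sub(r"[.:\-\s]", "", raw)
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError(f"not a 48-bit MAC address: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2)).lower()


mac = normalize_mac("1a.20.1d.2a.32.a2")
```

The helper rejects anything that is not exactly 12 hex digits, which catches the easy mistake of pasting an IP address instead.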


Next Research

I want to look into AJAX technology (Asynchronous JavaScript and XML), UniServer, EasyPHP, and PHProxy, and see how they work and fit together.


A Survey of Computational Linguistics in China – Compiled Materials (unfinished draft)












School of Computer Science, Harbin Institute of Technology (Li Sheng)



Shanghai Normal University

















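Yu Shiwen. On the Classification of Modern Chinese Words by Grammatical Function.
Zhang Pu. On Semantic Fields. Also in: Advances in Machine Translation Research, Publishing House of Electronics Industry, August 1992.
Zhang Pu. Theory and Methods of Modern Chinese Semantic Analysis for Information Processing. Also in: Journal of Chinese Information Processing, 1991, Vol. 5, No. 3.
Chen Qunxiu, Zhang Pu. A Semantic Classification System of Modern Chinese for Information Processing: Attribute Classification.
Chen Qunxiu, Zhang Pu. A Preliminary Conception of a Support Environment for a Modern Chinese Semantic Dictionary for Information Processing.
Chen Qunxiu. Several Issues in Research on Semantic Classification Systems.
Lu Chuan. The Semantic Network of Modern Chinese.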



Web server sometimes failed to transfer files – a potential MTU issue

For several months I was puzzled by an issue with my web server. It SOMETIMES could not load even mid-sized files, although small files worked fine. It was frustrating because some of my friends could read my files while others said they could not.

It took me a long time to figure out why, with lots of research and experiments. I exposed my web server outside my firewall, reinstalled the web server, reset the modem/router, … none of it seemed to help. Then one day, all of a sudden, I figured out that it was caused by an MTU issue. The MTU should be 1492 for DSL in the router settings. After I set the MTU value correctly in the router, the server worked perfectly.
If your web server sometimes works and sometimes does not, an MTU issue is a likely cause.
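
Where does 1492 come from? Here is a small worked calculation in Python, assuming the DSL line uses PPPoE (an assumption on my part; it is the usual reason for this value). PPPoE adds 8 bytes to every Ethernet frame, which leaves 1492 of the normal 1500 bytes for the IP packet. The function names are just for illustration.

```python
ETHERNET_MTU = 1500   # standard Ethernet payload size in bytes
PPPOE_OVERHEAD = 8    # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20      # IPv4 header without options
TCP_HEADER = 20       # TCP header without options
ICMP_HEADER = 8       # ICMP echo header


def pppoe_mtu(link_mtu=ETHERNET_MTU):
    """Largest IP packet that fits in one PPPoE frame."""
    return link_mtu - PPPOE_OVERHEAD


def tcp_mss(mtu):
    """Largest TCP payload per packet for a given MTU (IPv4, no options)."""
    return mtu - IPV4_HEADER - TCP_HEADER


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits without fragmentation."""
    return mtu - IPV4_HEADER - ICMP_HEADER


mtu = pppoe_mtu()
mss = tcp_mss(mtu)
probe = max_ping_payload(mtu)
```

This would also fit the symptom, though I have not verified it: a small file fits in packets well under the limit, while a bigger file needs full-size packets, and those are the ones that get lost when the MTU is set too high. The ping payload size is handy for testing, since a ping of that size with the "don't fragment" option set should get through and one byte more should not.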

I’ll detail this later, if I have more time and can still remember all the details.


MediaWiki – multiple installations

Back to the multiple installations of the wiki. Here is what I did.
When installing the second wiki, let’s say wiki2, I symlinked everything from the first installation except the file LocalSettings.php and the directory config. I created its own config directory, then installed the second MediaWiki using a different database.
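
The symlinking step can be sketched in Python like this. It is a rough illustration, not the exact commands I ran: `link_shared_files` is a made-up name, the paths in the usage comment are hypothetical, and the new config directory is created empty here, so whatever installer files it needs would still have to be put in place.

```python
import os

# Entries that each wiki keeps for itself instead of sharing via symlink.
EXCLUDED = {"LocalSettings.php", "config"}


def link_shared_files(first_wiki, second_wiki, excluded=EXCLUDED):
    """Symlink every top-level entry of first_wiki into second_wiki,
    skipping the excluded names, and return the names that were linked."""
    os.makedirs(second_wiki, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(first_wiki)):
        if name in excluded:
            continue
        target = os.path.join(second_wiki, name)
        if not os.path.lexists(target):
            source = os.path.abspath(os.path.join(first_wiki, name))
            os.symlink(source, target)
            linked.append(name)
    # The second wiki gets its own (here: empty) config directory.
    os.makedirs(os.path.join(second_wiki, "config"), exist_ok=True)
    return linked


# Usage with hypothetical paths:
# link_shared_files("/var/www/wiki", "/var/www/wiki2")
```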

The benefit is that you do not need to keep multiple copies of the shared files and directories.
Another interesting thing I found when playing with MediaWiki: how about two installations sharing the same database? NO PROBLEM. You gain a lot by sharing many common files and directories as well as the database. I checked that the MediaWiki database differentiates the data from the two installations pretty well using unique IDs, so when you browse one wiki it does not show pages from the wiki you configured in the other place. Nice feature.
However, if your wiki tends to be really big, you may choose to give it a separate database.


Community building: a wiki or a forum?

If you are installing MediaWiki for the first time, you may end up installing several, just like me. The rich features of MediaWiki impressed me, and I decided to move some of my forums to MediaWiki. Why, you may ask? I think a wiki promotes a more interactive community than a forum does. How many of you are fed up with long, long forum threads? You search through one trying to find a final answer. You may jump to the last several posts, but they are comments like “Thank you”, “That really helps”, “Please check another thread at bla bla bla”, etc. You just wanted the final answer to the issue in the thread, but you were overwhelmed by many long, unrelated posts. In a wiki, you are always presented with the final answer from the community, and all the changes are kept in the “History” if you are interested. Neat.

In forums, you can easily read who said what, and when. In a wiki, you always see the most recent version of the current discussion; the “who”, the “when”, and even the “what” are located in the “history” section, which keeps all versions of each page.


MediaWiki Installation Tricks regarding MySQL database versions

When I installed MediaWiki version 1.5 for the first time and was prompted for database features, I used the default option, backward compatibility. I later guessed that it used features of MySQL prior to version 4.1. That caused problems:

First, after installation, you get SQL error 1271 when hitting pages like “Recent Changes”. I later looked into the code and found the issue. Basically, if you run a query like this when the default character set of MySQL is utf-8, you get this SQL error:

select * from some_table where some_col = 'test'

The reason is that some_col has a utf-8 collation, while the string literal 'test' was treated as having a latin1 collation, so the comparison failed under the backward-compatible (pre-4.1) setting.

I got around the problem by inserting something like this in includes/Database.php in the MediaWiki source code:

select * from some_table where convert(some_col using latin1) = 'test'

The SQL error disappeared. However, after that I had a second problem. My links and categories all showed red, regardless of whether the linked pages/categories were defined or not.

I suspect it is the same issue as the first, with comparisons failing due to collation conflicts. I didn’t want to spend too much time on it, so I went ahead and reinstalled, and this time I chose the MySQL version 4/5 database option (NOT backward compatible).

All issues were resolved and mediawiki worked like a charm.
