Tuesday, March 25, 2008

First blog (week 3 + Easter break)

It's been one week since I received the project.

I have done the following:

1. I read papers about readability measurement, such as:

Klare, George R., Assessing Readability, Reading Research Quarterly, Vol. 10

Adaptive Approach to Concordance Readability

I also searched Wikipedia and Google for the linguistic definition of "syllable" and how to count syllables.

I also read the details on the Flesh website.

2. I downloaded the source code from Flesh and tried to understand it, as well as how I could build something similar.

I also got the source code for TextMining from Jorge. I have been trying out the examples to understand more about the library and how its corpus and storage work.

3. Since many other readability measurement programs online use the Flesch-Kincaid scale, I will use it for my project as well. The only challenging part is counting syllables correctly.

4. I could not find any readability formula for Vietnamese. I think it would differ from the English one because of the way words are formed and pronounced.
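
To make the Flesch-Kincaid scale from point 3 concrete, here is a small Python sketch of the two standard published formulas (Flesch Reading Ease and Flesch-Kincaid Grade Level). The function names and the choice of Python are my own assumptions for illustration; this is not how Flesh or the TextMining library implement it.

```python
def _check_counts(words, sentences):
    # Both formulas divide by these counts, so they must be positive.
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must both be greater than zero")


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    _check_counts(words, sentences)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: roughly a US school grade."""
    _check_counts(words, sentences)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

Both functions only need three counts (words, sentences, syllables), which is why the syllable count is the part that has to be right.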

For the coming week (end of week 4):

I will get started on the coding, using the library from Jorge to break down the documents and count the number of syllables, and then work on the readability calculation.
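
A rough sketch of what that counting step could look like, in Python. It uses plain regular expressions as a stand-in for the TextMining library, whose API I am not showing here, and the syllable counter is only a crude vowel-group heuristic for English that will miscount some words. The names `count_syllables` and `text_counts` are hypothetical.

```python
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")
CONSONANT_LE = re.compile(r"[^aeiouy]le$")
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def count_syllables(word):
    """Heuristic: one syllable per vowel group, minus a silent final 'e'."""
    word = word.lower()
    count = len(VOWEL_GROUP.findall(word))
    # A final 'e' is usually silent ("code"), except after consonant + 'l' ("table").
    if word.endswith("e") and not CONSONANT_LE.search(word) and count > 1:
        count -= 1
    # Every word is assumed to have at least one syllable.
    return max(count, 1)


def text_counts(text):
    """Return (words, sentences, syllables) for a piece of English text."""
    # Treat runs of '.', '!' or '?' as sentence boundaries (an assumption;
    # abbreviations such as "e.g." will be split incorrectly).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD.findall(text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables
```

The three counts returned by `text_counts` are exactly the inputs a Flesch-Kincaid calculation needs, so the syllable heuristic can be replaced later without touching the rest.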