Sunday, April 6, 2008

I have started writing the introduction, the background information about readability, and the past achievements for the project.
Counting syllables remains the most difficult part to code. There are a number of different ways to count syllables in English. The source code from Flesh gives a very good way of doing it, but I am still searching for alternative methods.
This week I will finish writing part of my thesis, decide which syllable-counting method to go with, and code the remaining parts.
This should take at most two weeks and be finished before the draft thesis is due to be handed in.
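
To make the syllable problem concrete, here is a minimal sketch of one common heuristic: count groups of consecutive vowels and adjust for a silent final "e". This is not the method used in the Flesh source code. The function name count_syllables and the exact rules are my own assumptions for illustration, and the heuristic will miscount irregular words.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate for an English word (heuristic, not exact)."""
    # Keep letters only, so punctuation and digits do not affect the count.
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return 0
    # Assumption: each run of consecutive vowels (y included) is one syllable.
    count = len(re.findall(r"[aeiouy]+", word))
    # A final "e" is usually silent ("make"), unless the word ends in
    # consonant + "le" ("table"), where the ending forms its own syllable.
    if word.endswith("e") and not re.search(r"[^aeiouy]le$", word) and count > 1:
        count -= 1
    # Every non-empty word has at least one syllable.
    return max(count, 1)
```

For example, count_syllables("readability") gives 5 and count_syllables("make") gives 1. A dictionary lookup would be more accurate for the words it covers, so a fallback like this is probably most useful for words missing from such a dictionary.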

Tuesday, March 25, 2008

First blog (week 3 + Easter break)

It's been one week since I received the project.

I have done the following:

1. I read papers about readability measurement, such as:

Klare, George R., Assessment of Readability, Reading Research Quarterly, Vol. 10

Adaptive Approach to Concordance Readability

Wiki and Google searches for the linguistic definition of "syllable" and how to count syllables

And also the details on the Flesh website

2. I downloaded the source code from Flesh and tried to understand it, as well as how to reproduce it.

I also got the source code for TextMining from Jorge, and I have been trying out the examples to understand more about the library and how the corpus and storage work.

3. Since a lot of other readability measurement programs online use the Flesch-Kincaid scale, I will use the same for my project as well. The only challenging part is counting syllables correctly.

4. I could not find any readability formula for Vietnamese. I think it would be different from the English one because of the way words are formed and pronounced.
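
For reference, the scale mentioned in point 3 can be sketched as follows. "Flesch-Kincaid" is often used for both the Flesch Reading Ease score and the Flesch-Kincaid Grade Level, so both are shown with the commonly published constants. The function names are my own, and the three counts are assumed to have been computed elsewhere.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

Both formulas depend only on average sentence length (words per sentence) and average syllables per word, which is why the syllable count matters so much for the result.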



For the coming week (end of week 4):

I will get started on the coding, using the library from Jorge to break down the documents and count the number of syllables, and then work on the readability calculation.
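
As a rough sketch of the "break down the documents" step, the counts needed for a readability score could be collected like this. It does not use the TextMining library. Plain regular expressions stand in for it here, and the names TextCounts and count_text are hypothetical. The syllable counter is passed in as a function so that any counting method can be plugged in later.

```python
import re
from typing import Callable, NamedTuple


class TextCounts(NamedTuple):
    sentences: int
    words: int
    syllables: int


def count_text(text: str, syllable_counter: Callable[[str], int]) -> TextCounts:
    """Count sentences, words and syllables in a piece of English text."""
    # Assumption: runs of '.', '!' or '?' end a sentence. Abbreviations
    # such as "e.g." will therefore be counted as extra sentences.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # A word is a run of letters, optionally with one inner apostrophe ("don't").
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(syllable_counter(w) for w in words)
    return TextCounts(len(sentences), len(words), syllables)
```

For example, count_text("The cat sat. It was happy!", lambda w: 1) finds 2 sentences and 6 words. The placeholder lambda simply counts one syllable per word and would be replaced by a real syllable counter.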