Overview
Topic status: We're looking for students to study this topic.
There is an ever-growing amount of textual information made available electronically, from news articles to books. These texts offer an opportunity for language teachers to provide reading materials that are tailored to individual students, on topics of their choice beyond the exercises offered by textbooks.
Readability measures have been developed for English readers and allow for the automatic selection of such materials. However, few measures exist for cross-lingual readability. How should we classify the difficulty of a text for someone who is learning a language? Criteria that need to be taken into account include word and phrase frequency, phrase compositionality (can the meaning of a phrase be derived from the meanings of the words composing it?), cognates (words that look the same and mean the same in two languages) and false friends (words that look the same but mean different things in two languages).
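As a rough illustration of two of these criteria, the sketch below computes a mean log word frequency and a cognate ratio for a French sentence read by an English speaker. The frequency table, the bilingual dictionary, the 0.8 similarity threshold and the function names are all hypothetical placeholders, not resources prescribed by this project; a real study would draw on corpus frequency lists and a proper lexicon for the chosen language pair.

```python
import math
from difflib import SequenceMatcher

# Hypothetical toy resources (assumptions for illustration only).
FRENCH_FREQ = {"le": 1000, "la": 900, "chat": 50, "mange": 40, "table": 60}
FR_TO_EN = {"le": "the", "la": "the", "chat": "cat", "mange": "eats", "table": "table"}


def is_cognate(word, translation, threshold=0.8):
    """Treat a word as a cognate if it is spelled much like its translation."""
    return SequenceMatcher(None, word, translation).ratio() >= threshold


def sentence_features(sentence):
    """Return (mean log frequency, cognate ratio) for a whitespace-tokenised sentence."""
    words = sentence.lower().split()
    if not words:
        return 0.0, 0.0
    # Words missing from the table get a count of 1, i.e. a log frequency of 0.
    log_freqs = [math.log(FRENCH_FREQ.get(w, 1)) for w in words]
    # Comparing against the dictionary translation, not just any similar
    # English word, keeps false friends from being counted as cognates.
    cognates = [w for w in words if w in FR_TO_EN and is_cognate(w, FR_TO_EN[w])]
    return sum(log_freqs) / len(words), len(cognates) / len(words)


features = sentence_features("la table")
```

Phrase frequency and compositionality are left out here because they need phrase-level resources that a toy dictionary cannot stand in for.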
Natural language processing resources and tools will be used to extract these parameters for individual sentences, then machine learning algorithms will be applied to derive a readability formula from them.
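To make the idea of deriving a formula concrete, here is a minimal sketch that fits a straight line relating a single sentence parameter to a difficulty score by ordinary least squares. The training values are invented for illustration, and restricting the fit to one feature is a simplification; the project itself leaves the choice of learning algorithm open, and the references point to richer models such as support vector machines.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: mean log word frequency per sentence (x) and a
# human-assigned difficulty rating (y). These numbers are made up.
mean_log_freq = [2.0, 4.0, 6.0]
difficulty = [5.0, 3.0, 1.0]

slope, intercept = fit_line(mean_log_freq, difficulty)


def predict_difficulty(x):
    """Apply the fitted formula to a new sentence's feature value."""
    return slope * x + intercept
```

With these toy values, sentences built from more frequent words receive lower predicted difficulty, which is the direction one would expect.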
Suits: IT student with a background in programming
Hypothesis/Aims: This project assumes that a measure of readability can be derived for any given pair of languages (native and learned), but it will focus on only one pair, chosen by the student. If a valid measure of readability can be demonstrated, it will open up several opportunities for developing educational tools as well as new information retrieval paradigms.
Approaches:
- Extraction of linguistic and statistical parameters for a sentence
- Machine learning algorithms
- Development of a gold standard for learning and empirical evaluation
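One possible way to carry out the empirical evaluation is to check how well the ordering of sentences by predicted difficulty agrees with the ordering in the gold standard. The sketch below uses Spearman's rank correlation for this; the choice of metric, the function names and the sample scores are assumptions made for illustration, and the simple formula used here assumes there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(predicted, gold):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(predicted)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(predicted), ranks(gold)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical scores for four sentences: model output vs. human ratings.
predicted = [0.2, 0.9, 0.5, 0.7]
gold = [1.0, 4.0, 2.0, 3.0]
agreement = spearman(predicted, gold)
```

A value close to 1 would indicate that the derived measure orders sentences much as the human annotators did.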
References:
- Sandra Uitdenbogerd, 2005. Readability of French as a foreign language and its uses. In Proceedings of the Australian Document Computing Symposium pp. 19-25
- Sarah E. Schwarm and Mari Ostendorf, 2005. Reading level assessment using support vector machines and statistical language models. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Ann Arbor, Michigan, pp. 523-530
Study level: Honours
Supervisors (QUT): Dr Laurianne Sitbon
Organisational unit: Science and Engineering Faculty
Contact: Please contact the supervisor.