I am a graduate student of linguistics at the University of California, Santa Barbara. I recently graduated with my Bachelor’s degree from the University of Hong Kong, where I majored in linguistics and statistics and worked on various projects as a student research assistant at the Language Development Lab, HKU.
Brief version: My main interest is in using quantitative techniques from computational corpus linguistics to discover generalisations of typological and psycholinguistic interest, especially with regard to the morphosyntax of Sino-Tibetan languages.
I am primarily interested in the use of modern quantitative methodology in linguistics, including advances at both the model level (e.g. incorporating time-series components into linguistic models, using smoothing splines in additive models to approximate nonlinear relationships) and the inference level (e.g. bootstrap resampling approaches that take into account the dependence between samples, and newer variable selection techniques such as some penalised regression methods). Because these methods often lack closed-form solutions, computational approaches play an important role.
In terms of linguistic interests, I am particularly interested in morphosyntactic alternations: when there is more than one way to say something, when and why do we choose one way over another? I am also interested in phonology, particularly as it interacts with morphosyntax. This is not to give phonetics, semantics or discourse/pragmatics short shrift; the complex interplay of all levels of language is what makes language interesting, and a paper will often have to take into account, or at least control for, other linguistic strata.
I believe the best way to approach the study of language is to treat it as the complex, interacting system that it is. Language arises within the constraints of our physiology and is influenced by properties of our cognitive system, social behaviour and communicative needs. Given the multi-faceted nature of language, a parochial view that ignores interactions between subfields and between types of evidence cannot move us forward. Thus, I believe corpus, typological and experimental data all have important roles to play in the study of language, and I would like to combine different sources of quantitative evidence to answer the same linguistic questions.
While I am interested in typological variation and do not intend to limit myself to any particular language or language family, my main interests lie in Sino-Tibetan, especially (but not only) Sinitic, Tibetic and non-Tibetic Bodish languages. As the Sino-Tibetan family shows a huge amount of internal variation, from the extreme analyticity and complex tonal systems of Sinitic to polysynthetic branches like Kiranti and rGyalrong, generalisations about Sino-Tibetan are likely to be good candidates for general trends in human language.