I have recently obtained my PhD in Computer Science from Instituto Politécnico Nacional,
where I studied in the Natural Language and Text Processing Laboratory
of the Center for Computing Research (Centro de Investigación en Computación).
My advisor is Dr. Alexander Gelbukh.
Broadly defined, my area of research is computational linguistics and natural language processing. More specifically, my main research interest lies in open information extraction from text and its applications to text quality evaluation, in particular the evaluation of text informativeness. Building on these areas, I have also worked on collecting human opinions for the evaluation of subjective aspects of text. Another direction of my research is mapping the output of open information extraction systems onto the RDF data representation model. In parallel to my main focus, I have also worked on semantic similarity measures and system log classification. My other interests lie in lexicography and semiotics.
The code accompanying my thesis on Open Information Extraction in Spanish based on Part-of-Speech Sequences is
During my spring 2014 and winter 2015 internships (at Oracle MDC, Mexico, and IBM, USA, respectively), I developed several variants of an Open Information Extraction system based on rules and heuristics. One of the systems included subsequent conversion of the extractions into the RDF/XML format.
In my summer 2012 internship at Microsoft Research, I explored measures of relational similarity between word pairs through similarity learning in vector space.
I have three years of experience working as a linguist on the commercial ontology-based machine translation project at ABBYY, Inc., Russia.
I was awarded a Microsoft Research Latin America Fellowship in 2012.
In my spare time I listen to music,
take pictures, and go snowboarding... when snow is available :)
Recent Publications (for full list see my CV)
Revista Signos. Estudios de Lingüística, 60, vol. 49.
Open Information Extraction from Real Internet Texts in Spanish Using Constraints over Part-Of-Speech Sequences: Problems of the Method, Their Causes, and Ways for Improvement.
In print, 2016.
KESW 2015, International Conference on Knowledge Engineering and Semantic Web. Moscow, Russia, Sep 30 – Oct 2, 2015.
Bringing The Output of Open Information Extraction to The RDF/XML Format: A Case Study.
“Fast Named Entity Driven Open Information Extraction with Shallow Semantic Interpretation”. Submitted Sept. 22, 2015.
ACL SRW, 2014.
Open Information Extraction for Spanish Language based on Syntactic Constraints.
Richard Tapia Celebration of Diversity in Computing, 2014.
Informativeness and Objectivity of Texts on the Web.
In Proceedings NAACL-2013, 2013.
Combining Heterogeneous Models for Measuring Relational Similarity.
In Proceedings NoDaLiDa’13, 2013.
Using Factual Density to Measure Informativeness of Web Documents.
In Proceedings Dialogue’2013, 2013.
Comparison of Open Information Extraction for Spanish and English.
Richard Tapia Celebration of Diversity in Computing, 2013.
Open Information Extraction for Spanish and Its Application to Measuring Informativeness of Web Documents.
In Computational Linguistics and Intellectual Technologies, 11, Vol. 1, pp. 716-725, 2012.
Exploring context clustering for term translation.
In Avances en Inteligencia Artificial (Advances in Artificial Intelligence), Mexican Society for Artificial Intelligence (SMIA), pp. 45-57, 2012.
Analysis of a cross-lingual application of context clustering (in Spanish: Análisis de una aplicación multilingüe del agrupamiento de textos).
Book chapter of: Avances recientes en sistemas inteligentes (Recent Advances in Intelligent Systems), Mexican Society for Artificial Intelligence (SMIA), pp. 232-241, 2011.
Classification of methods for improvement of WSD and the corresponding evaluation methods (in Spanish).
Alisa Zhila. Formalization of basic semiotic notions in set theoretic terms. In Polibits 42, pp. 83–97, 2010.
Alisa Zhila. Basic semiotic concepts explication in species of structures for their further formal systematization with advantages of extensional approach. In: The Proc. of the 33rd Annual Meeting of the Semiotic Society of America (SSA), pp. 751–771, 2009.
Alisa Zhila, Victor Kapoustyan. Review of semantic models and investigation of the possibilities of their applying to C.S. Peirce’s sign categories interpretation (in Russian). In: Proc. of 10th conference “Grigorievskie chteniya” (Readings in honor of Prof. Grigoriev): Symbols, codes, signs. Moscow, Russia, pp. 141–147, 2008.
Alisa Zhila, Yulia Garaeva. Analysis and synthesis of models used by Melnikov in System Classification of Languages (in Russian). In Abstract Collection for 50th Scientific Conference of the Moscow Institute of Physics and Technology, vol. 9: Innovations and High Technologies, pp. 4–6, 2007.
Elena Mikhailova, Alisa Zhila, Anna Slavutskaya, Mikhail Kulikov, Igor Shevelev. Trajectories of Visual Evoked Potentials Dipole Sources Shifting over Human Brain Cortex (in Russian). In Journal of Higher Nervous Activity 56(6), pp. 555–564, 2007.
Open Information Extraction using Constraints over Part-of-Speech Sequences [pdf]
and the corresponding system ExtrHech