Kiril Ivanov Simov
Curriculum Vitae
Home Address:
11 Razhana Str.,
zh.k. “Nadezhda”,
Sofia 1220,
Bulgaria
Office Address:
Linguistic Modelling Department,
Institute of Information and Communication Technologies,
Bulgarian Academy of Sciences,
25A Acad. G.Bonchev Str.,
1113 Sofia,
Bulgaria
Telephone: home (+359 2) 361 721, office (+359 2) 979 2825, fax (+359 2) 707 273
E-mail: kivs@bultreebank.org, kivs@hotmail.com
Date of birth: 18th April 1961
Place of birth: Sofia, Bulgaria
Nationality: Bulgarian
Languages: Bulgarian (mother tongue), English, Russian
Education
July 1987 – June 1991
PhD Student. Faculty of Mathematics and Mechanics, University of Sofia, Sofia, Bulgaria. Thesis topic: Logical means for processing of linguistic knowledge in HPSG. Completed and defended in 2006.
October 1981 – June 1986
M.Sc. in Computer Science. Faculty of Mathematics and Mechanics, University of Sofia.
1976 – 1979
National Secondary School of Mathematics “Acad. Chakalov”, Sofia, Bulgaria. Honors in all subjects. I was admitted to the University of Sofia without entrance examinations, which were otherwise obligatory in Bulgaria.
Work Experience
April 1, 1988 – present
Associate Professor (since 2007); earlier positions: Research Fellow 3rd, 2nd and 1st degree, Mathematician. Linguistic Modelling Department, Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria.
September 1986 – June 1987
Programmer. Institute for Software Products and Systems, Sofia, Bulgaria. I was responsible for the development of database software.
Supervision
I supervised many Master's students between 1994 and 2014 at the Faculty of Mathematics and Informatics, Sofia University.
I am the supervisor of the PhD student Ginka Ivanova for the period 2012 – 2015.
Membership of Committees
I was a member of the PhD committees for the following people: Milena Slavcheva (Bulgarian Academy of Sciences) – 2011; Diana Grigorova (Technical University Sofia) – 2014.
Teaching
February 2001 – June 2007
Seminar on Computational Corpus Linguistics at the Faculty of Slavonic Languages, St. Kl. Ohridski University (together with Petya Osenova, BulTreeBank, LML, Bulgarian Academy of Sciences, and Krasimira Alexova, St. Kl. Ohridski University).
August 25 – September 8, 2000
“Declarative Knowledge Representation” (with Atanas Kiryakov), “WordNets: Principles and Applications” (with Atanas Kiryakov), and “Computational Morphology” (with Gergana Popova). Summer School on Computational Linguistics and Represented Knowledge, University of Tübingen, Germany. Here are some slides and lecture notes: Declarative Knowledge Representation: LecNote.PS, LecNote.PDF, LecNote.PS.Zip, LecNote.PDF.Zip, Slides.PS, Slides.PDF, Slides.PS.Zip, Slides.PDF.Zip; WordNets: Principles and Applications: Slides.PS, Slides.PDF, Slides.PS.Zip, Slides.PDF.Zip; Computational Morphology: Slides.PS, Slides.PDF, Slides.PS.Zip, Slides.PDF.Zip.
August 22 – September 3, 1999
“Declarative Knowledge Representation” (with Atanas Kiryakov), and “Computational Morphology” (with Gergana Popova). Summer School on Computational Linguistics and Represented Knowledge, University of Tübingen, Germany.
October 1995 – February 1996
October 1998 – February 1999
October 1999 – February 2000
“Declarative Knowledge Representation. A Logical Approach” Faculty of Mathematics and Computer Science, University of Sofia.
October 1989 – February 1990
“Introduction to Computer Science for Linguists” Faculty of Classical and Modern Languages, University of Sofia.
Projects
1988 – 1992
“Advanced Computer Technologies in the Creation of Large Linguistic Knowledge Bases (for Slavonic Languages)”. Sponsored by the Bulgarian National Science Fund in cooperation with the Institute of Russian Language, Russian Academy of Sciences, Moscow, Russia.
The aim of the project was to create a computer dictionary of Bulgarian with 32 000 lexemes and a Russian dictionary with 100 000 lexemes.
I was responsible for the development of the MORPHO-ASSISTANT software system for modelling Bulgarian and Russian morphology which would serve the compilation of large dictionaries. Within this task, I developed a model for the acquisition of lexical information on the basis of the classification of lexemes with respect to a set of morphological classes. The classification itself was done by means of an index over the morphological classes, such that a lexicon writer need only provide minimal information about a particular word in order for it to be correctly classified. The MORPHO-ASSISTANT system can be adapted for use with other inflectional languages.
More specifically, I was responsible for the representation of morphological data, for creating algorithms for the analysis and synthesis of word forms, and for the processing of erroneous word forms. I also developed tools for the acquisition and editing of lexical information and for supporting the building of the lexicon. I designed and implemented the software in full.
December 1993 – December 1994
“Interaction Among Knowledge Bases”, sponsored by the Fund Eureka and the Bulgarian Ministry of Science and Education.
The primary goal of this research was to expand further the ideas behind the MORPHO-ASSISTANT lexical acquisition tool. The lexicon and the grammar were viewed as two knowledge bases that interact in the process of their common usage (e.g. in the parsing of a sentence). The project researched methodologies for establishing semantic correspondences between knowledge bases represented in various knowledge representation languages. This methodology was then applied to the problem of using the ACLRN knowledge representation language as a query language for relational databases. The problem was overcome by building an ACLRN knowledge base together with a semantic correspondence between the terminological part of the knowledge base and the relational schemata of the relational database.
I was the project leader.
January 1995 – December 1996
“Representation of Control Information”, sponsored by the Fund Eureka and the Bulgarian Ministry of Science and Education.
The project built an explicit representation of the control of inference procedures in implementations of declarative knowledge representation languages. This representation allows an expert in some knowledge domain to encode control information to suit specific tasks over that domain. The project developed a special normal form for SRL (Speciate Reentrant Logic developed by Paul John King) theories and an indexing technique over such normal forms. The indexing technique enables the automatic reordering of a theory so that the theory exhibits certain relations between elements of the knowledge represented by the theory. The indexing technique also supports the reorganisation of a theory to suit those requirements of a user that are based on knowledge that is not represented by the theory, such as the environment in which the theory is to be used and the type of problem to be decided.
I was the project leader.
April 1995 – June 1997
“MULTEXT-EAST” – a joint project under the Copernicus Programme of the European Union.
In the project I was responsible for preparing the lexicon for the Bulgarian part. It is a full-form dictionary in which each word form carries the appropriate grammatical information and a pointer to its lexeme. It is based on the morphological dictionary of the Bulgarian language, created in cooperation with Dimitar Popov and Svetlomira Vidinska from the Institute of Bulgarian Language (see the next item).
June 1995 – October 1998
“Creation of a Morphological Dictionary of Bulgarian Language”
On the basis of two machine-readable dictionaries of the Bulgarian language, a methodology was developed for extracting the relevant linguistic knowledge about the inflectional morphological classes of words. Starting with a minimal grammar of Bulgarian word formation, sufficient to analyse the information in the two machine-readable dictionaries, the project arrived at a complete morphological grammar and a morphological dictionary with 75 000 entries, which together can be used as the morphological component of a system for the automatic processing of Bulgarian. The dictionary and the grammar are published as a book (see Publications). I was responsible for developing the methodology for combining the knowledge from the two dictionaries, for forming the morphological classes, and for classifying the lexical items. I also implemented the software for the creation of the dictionary.
January 1998 – July 2000
“CONCEDE” – a joint project under the Copernicus Programme of the European Union.
In the project I was responsible for the encoding and validation of Bulgarian machine-readable dictionaries in SGML schemata for lexical knowledge bases. I implemented software modules for recognising the structures of the lexical items in the machine-readable dictionaries and converting these structures into SGML markup.
October 1998 – September 2000
“Tübingen-Sofia International Graduate Programme in Computational Linguistics and Represented Knowledge (CLARK)” – a joint project between the Seminar für Sprachwissenschaft (SfS), Eberhard-Karls-Universität, Tübingen, Germany and the Linguistic Modelling Laboratory.
The Tübingen-Sofia International Graduate Programme in Computational Linguistics and Represented Knowledge (CLARK) provides a joint teaching and research facility in which doctoral and master’s students, primarily from Bulgaria and Central and Eastern Europe (CEE), pursue their research in the interdisciplinary field of computational linguistics and knowledge representation. The programme is funded by the Volkswagen-Stiftung.
The education of the students follows an apprenticeship model. That is, the students pursue their individual research within a collaborative project-based research environment. At the moment, there are two sub-projects in the framework of CLARK: “XML-based Tool for Corpus Linguistics” and “Neural Networks for MorphoSyntactic Disambiguation”.
I was the scientific coordinator of the programme for Bulgaria.
January 2001 – December 2003
“BIS-21 Center of Excellence in Information Technology”, a European Union-funded project.
I led Workpackage 5: Knowledge-based tools for Linguistic Research.
February 2001 – August 2004 (First phase)
January 2005 – December 2007 (Second phase)
“HPSG-based Syntactic Treebank of Bulgarian (BulTreeBank)” – a joint project between the Seminar für Sprachwissenschaft (SfS), Eberhard-Karls-Universität, Tübingen, Germany and the Linguistic Modelling Laboratory.
The project was funded by the Volkswagen-Stiftung, Federal Republic of Germany, under the Programme “Cooperation with Natural and Engineering Scientists in Central and Eastern Europe”.
I was the project leader. Within the project we created an HPSG-based treebank of Bulgarian and a morphosyntactically annotated corpus of Bulgarian, and developed partial parsing for Bulgarian.
December 2005 – May 2008
LT4eL “Language Technology for eLearning” – a European project (EC FP6 IST STREP 027391).
Our group worked on the following tasks: creation of a domain ontology (computer science for end users), semantic annotation of learning objects, and Bulgarian language resources and tools.
I was the leader of the Bulgarian group and workpackage 3.
April 2006 – March 2008
AsIsKnown “A semantic-based knowledge flow system for the European home textiles industry” – a European project (EC FP6 IST STREP 028044).
Our group worked on the following tasks: creation of a domain ontology (home textile), semantic annotation of magazine articles, and semantic search.
I was the leader of the Bulgarian group and of workpackages 3 and 5.
March 2008 – February 2011
LTfLL “Language Technologies for LifeLong Learning” – a European project (FP7 ICT STREP Grant agreement no. 212578).
Our group worked on the following tasks: improving the semantic annotation of learning objects, sentiment analysis, and a system for knowledge exchange (ontologies, lexicons, learning objects annotated with concepts, etc.).
I was the leader of the Bulgarian group.
January 2008 – December 2010
CLARIN “Common Language Resources and Technology Infrastructure” – a European infrastructure project (FP7-INFRASTRUCTURES-2007-1 Grant agreement no. 212230).
I was the leader of the Bulgarian group.
September 2008 – August 2011
FLaReNet “Fostering Language Resources Network” – a European project (eContent plus, Thematic Network, 212230).
I was the leader of the Bulgarian group.
July 2010 – April 2012
EuroMatrixPlus – Bringing Machine Translation for European Languages to the User (European project IST-231720).
I was the leader of the Bulgarian group.
October 2013 – September 2015
EUCases – EUropean and National CASE Law and Legislation Linked in Open Data Stack
I am the leader of the Bulgarian group.
November 2013 – October 2016
QTLeap – Quality Translation by Deep Language Engineering Approaches
I am the leader of the Bulgarian group.
Practical Experience
PROLOG
Java
XML, SGML, Corpus Encoding Standards, TEI and related areas
Troll, ConTroll – Feature Logic-based systems (University of Tübingen)
Motel – A KL-ONE-based knowledge representation system (Max-Planck-Institut für Informatik, Saarbrücken)
Interests
Logic-based knowledge representation – Terminological and Feature Logics, Conceptual Graphs, KIF, KQML
Knowledge acquisition and reuse, design of ontologies
Natural language processing systems – Grammar Engineering, Acquisition of Linguistic Knowledge, Morphology, HPSG, Corpus Linguistics