Please use this identifier to cite or link to this record: http://hdl.handle.net/10198/11577
Title: From source code identifiers to natural language terms
Authors: Carvalho, Nuno
Almeida, José João
Henriques, Pedro
Pereira, Maria João
Keywords: Program comprehension
Natural language processing
Identifier splitting
Date: 2015
Publisher: Elsevier
Citation: Carvalho, Nuno; Almeida, José João; Henriques, Pedro; Pereira, Maria João (2015) - From source code identifiers to natural language terms. Journal of Systems and Software. ISSN 0164-1212. 100, p. 117-128
Abstract: Program comprehension techniques often explore program identifiers to infer knowledge about programs. The literature has already established source code identifiers as a relevant source of information about programs, as well as their direct impact on future comprehension tasks. Most programming languages enforce some constraints on identifier strings (e.g., white spaces or commas are not allowed). Also, programmers often combine and abbreviate words to devise strings that represent single or multiple domain concepts, in order to increase programming linguistic efficiency (conveying more semantics while writing less). These strings do not always use explicit marks to distinguish the terms used (e.g., CamelCase or underscores), so techniques often referred to as hard splitting are not enough. This paper introduces Lingua::IdSplitter, a dictionary-based algorithm for splitting and expanding the strings that compose multi-term identifiers. It explores the use of general programming and abbreviation dictionaries, but also a custom dictionary automatically generated from the software's natural language content, which is prone to include application domain terms and specific abbreviations. This approach was applied to two software packages written in C, achieving an f-measure of around 90% for correctly splitting and expanding identifiers. A comparison with current state-of-the-art approaches is also presented.
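The difference between hard splitting and dictionary-based splitting with expansion can be illustrated with a minimal Python sketch. This is not the Lingua::IdSplitter algorithm, whose details are not given in this record; the tiny dictionaries, the function names (`hard_split`, `soft_split`, `split_identifier`), and the "fewest terms" heuristic are all assumptions made for illustration.

```python
import re

# Hypothetical toy dictionaries (assumptions); a real tool would use much
# larger general, programming, and project-specific dictionaries.
WORDS = {"time", "sort", "user", "name", "file", "print"}
ABBREVIATIONS = {"usr": "user", "cnt": "count", "buf": "buffer"}


def hard_split(identifier):
    """Split only on explicit marks: underscores and CamelCase boundaries."""
    parts = []
    for chunk in identifier.split("_"):
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", chunk))
    return [p.lower() for p in parts]


def soft_split(token, vocabulary):
    """Segment a token into the fewest known terms; None if no segmentation exists."""
    best = [None] * (len(token) + 1)  # best[i]: best segmentation of token[:i]
    best[0] = []
    for end in range(1, len(token) + 1):
        for start in range(end):
            piece = token[start:end]
            if best[start] is not None and piece in vocabulary:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(token)]


def split_identifier(identifier):
    """Hard split first, then dictionary split each token and expand abbreviations."""
    vocabulary = WORDS | set(ABBREVIATIONS)
    terms = []
    for token in hard_split(identifier):
        # Fall back to the unsplit token when the dictionaries cannot cover it.
        pieces = soft_split(token, vocabulary) or [token]
        terms.extend(ABBREVIATIONS.get(p, p) for p in pieces)
    return terms


print(split_identifier("timesort"))     # no explicit marks: needs the dictionary
print(split_identifier("usr_cnt"))      # hard split, then abbreviation expansion
print(split_identifier("getUserName"))  # CamelCase handled by hard splitting
```

In this sketch, hard splitting alone leaves `timesort` as a single token, while the dictionary pass separates it into two terms. The quality of the result depends entirely on the dictionaries, which is why a dictionary derived from the project's own natural language content can help with domain terms.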
Peer review: yes
URI: http://hdl.handle.net/10198/11577
DOI: http://dx.doi.org/10.1016/j.jss.2014.10.013
ISSN: 0164-1212
Publisher version: http://www.sciencedirect.com/science/article/pii/S0164121214002179
Appears in collections: IC - Artigos em Revistas Indexados ao ISI/Scopus

Files in this record:
File: From Source code.pdf
Size: 1.26 MB
Format: Adobe PDF



All records in the repository are protected by copyright law, with all rights reserved.