Publication

Mining GitHub software repositories to look for programming language cocktails

datacite.subject.fos: Engineering and Technology::Electrical Engineering, Electronic Engineering, Information Engineering
datacite.subject.fos: Humanities::Languages and Literature
datacite.subject.sdg: 04:Quality Education
datacite.subject.sdg: 09:Industry, Innovation and Infrastructure
dc.contributor.author: Loureiro, João
dc.contributor.author: Costa Neto, Alvaro
dc.contributor.author: Pereira, Maria João
dc.contributor.author: Henriques, Pedro Rangel
dc.date.accessioned: 2026-03-18T15:04:12Z
dc.date.available: 2026-03-18T15:04:12Z
dc.date.issued: 2025
dc.description.abstract: In light of specific development needs, it is common to concurrently apply different technologies to build complex applications. Given that lowering risks, costs, and other negative factors, while improving their positive counterparts, is paramount to a better development environment, it becomes relevant to find out what technologies work best for each intended purpose in a project. In order to reach these findings, it is necessary to analyse and study the technologies applied in these projects and how they interconnect and relate to each other. The theory behind Programming Cocktails (meaning the set of programming technologies - Ingredients - that are used to develop complex systems) can support this analysis. However, due to the sheer amount of data that is required to construct and analyse these Cocktails, it becomes unsustainable to manually obtain them. From the desire to accelerate this process comes the need for a tool that automates the data collection and its conversion into an appropriate format for analysis. As such, the project proposed in this paper revolves around the development of a web-scraping application that can generate Cocktail Identity Cards (CIC) from source code repositories hosted on GitHub. Said CICs contain the Ingredients (programming languages, libraries and frameworks) used in the corresponding GitHub repository and follow the ontology previously established in a larger research project to model each Programming Cocktail. This paper presents a survey of current Source Version Control Systems (SVCSs) and web-scraping technologies, an overview of Programming Cocktails and their current foundations, and the design of a tool that can automate the gathering of CICs from GitHub repositories. [eng]
dc.description.sponsorship: This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UID/00319/2023. The work of Maria João and Alvaro was supported by national funds: UID/05757 - Research Centre in Digitalization and Intelligent Robotics (CeDRI); and SusTEC, LA/P/0007/2020 (DOI: 10.54499/LA/P/0007/2020).
dc.identifier.citation: Loureiro, João; Costa Neto, Alvaro; Pereira, Maria João; Henriques, Pedro Rangel (2025). Mining GitHub Software Repositories to Look for Programming Language Cocktails. In 14th Symposium on Languages, Applications and Technologies, SLATE 2025. 135:13, p. 1-16. ISBN 978-395977387-4. DOI: 10.4230/2025.13
dc.identifier.doi: 10.4230/2025.13
dc.identifier.isbn: 978-395977387-4
dc.identifier.issn: 2190-6807
dc.identifier.uri: http://hdl.handle.net/10198/36135
dc.language.iso: eng
dc.peerreviewed: yes
dc.publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
dc.relation: Research Centre in Digitalization and Intelligent Robotics
dc.relation: Associate Laboratory for Sustainability and Technology in Mountain Regions
dc.rights.uri: http://creativecommons.org/licenses/by-sa/4.0/
dc.subject: Software repository mining
dc.subject: Source version control
dc.subject: GitHub scraping
dc.subject: Programming cocktails
dc.title: Mining GitHub software repositories to look for programming language cocktails [eng]
dc.type: conference paper
dspace.entity.type: Publication
oaire.awardNumber: UIDP/05757/2020
oaire.awardNumber: LA/P/0007/2020
oaire.awardTitle: Research Centre in Digitalization and Intelligent Robotics
oaire.awardTitle: Associate Laboratory for Sustainability and Technology in Mountain Regions
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDP%2F05757%2F2020/PT
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/LA%2FP%2F0007%2F2020/PT
oaire.citation.conferencePlace: Faro, Portugal
oaire.citation.endPage: 16
oaire.citation.issue: 13
oaire.citation.startPage: 1
oaire.citation.title: 14th Symposium on Languages, Applications and Technologies, SLATE 2025
oaire.citation.volume: 135
oaire.fundingStream: 6817 - DCRRNI ID
oaire.fundingStream: 6817 - DCRRNI ID
oaire.version: http://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyName: Pereira
person.givenName: Maria João
person.identifier.ciencia-id: C912-4A49-A3B3
person.identifier.orcid: 0000-0001-6323-0071
person.identifier.rid: G-5999-2011
person.identifier.scopus-author-id: 13907870300
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.name: Fundação para a Ciência e a Tecnologia
project.funder.name: Fundação para a Ciência e a Tecnologia
relation.isAuthorOfPublication: a20ccfa6-4e84-4c25-ab0d-8d6ba196ffc2
relation.isAuthorOfPublication.latestForDiscovery: a20ccfa6-4e84-4c25-ab0d-8d6ba196ffc2
relation.isProjectOfPublication: d0a17270-80a8-4985-9644-a04c2a9f2dff
relation.isProjectOfPublication: 6255046e-bc79-4b82-8884-8b52074b4384
relation.isProjectOfPublication.latestForDiscovery: d0a17270-80a8-4985-9644-a04c2a9f2dff

Files

Main
Name: Mining GitHub.pdf
Size: 865.6 KB
Format: Adobe Portable Document Format

License
Name: license.txt
Size: 1.75 KB
Format: Item-specific license agreed upon to submission