Publication

Semi-automated sequence curation for reliable reference datasets in ITS2 vascular plant DNA (meta-)barcoding

dc.contributor.author: Quaresma, Andreia
dc.contributor.author: Ankenbrand, Markus J.
dc.contributor.author: Garcia, Carlos Ariel Yadró
dc.contributor.author: Rufino, José
dc.contributor.author: Honrado, Mónica
dc.contributor.author: Amaral, Joana S.
dc.contributor.author: Brodschneider, Robert
dc.contributor.author: Brusbardis, Valters
dc.contributor.author: Gratzer, Kristina
dc.contributor.author: Hatjina, Fani
dc.contributor.author: Kilpinen, Ole
dc.contributor.author: Pietropaoli, Marco
dc.contributor.author: Roessink, Ivo
dc.contributor.author: Steen, Jozef van der
dc.contributor.author: Vejsnæs, Flemming
dc.contributor.author: Pinto, M. Alice
dc.contributor.author: Keller, Alexander
dc.date.accessioned: 2024-05-03T13:16:53Z
dc.date.available: 2024-05-03T13:16:53Z
dc.date.issued: 2024
dc.description.abstract: One of the most critical steps for accurate taxonomic identification in DNA (meta)-barcoding is to have an accurate DNA reference sequence dataset for the marker of choice. Therefore, developing such a dataset has been a long-term ambition, especially in the Viridiplantae kingdom. Typically, reference datasets are constructed with sequences downloaded from general public databases, which can carry taxonomic and other relevant errors. Herein, we constructed a curated (i) global dataset, (ii) European crop dataset, and (iii) 27 datasets for the EU countries for the ITS2 barcoding marker of vascular plants. To that end, we first developed a pipeline script that entails (i) an automated curation stage comprising five filters, (ii) manual taxonomic correction for misclassified taxa, and (iii) manual addition of newly sequenced species. The pipeline allows easy updating of the curated datasets. With this approach, 13% of the sequences, corresponding to 7% of species originally imported from GenBank, were discarded. Further, 259 sequences were manually added to the curated global dataset, which now comprises 307,977 sequences of 111,382 plant species. [pt_PT]
dc.description.sponsorship: AQ acknowledges the PhD scholarship (2020.05155.BD), funded by the Portuguese Foundation for Science and Technology (FCT). This work was developed in the framework of INSIGNIA – Environmental monitoring of pesticide use through honeybees (SANTE/E4/SI2.788418-SI2.788452-INSIGINIA-PP-1-1-2018) and INSIGNIA-EU - Preparatory action for monitoring of environmental pollution using honey bees (Procurement procedure ENV/2021/OP/0014 of 28-09-2021). FCT provided financial support by national funds (FCT/MCTES) to CIMO (UIDB/00690/2020 and UIDP/00690/2020) and SusTEC (LA/P/0007/2021). [pt_PT]
dc.description.version: info:eu-repo/semantics/publishedVersion [pt_PT]
dc.identifier.citation: Quaresma, Andreia; Ankenbrand, Markus J.; Garcia, Carlos Ariel Yadró; Rufino, José; Honrado, Mónica; Amaral, Joana S.; Brodschneider, Robert; Brusbardis, Valters; Gratzer, Kristina; Hatjina, Fani; Kilpinen, Ole; Pietropaoli, Marco; Roessink, Ivo; Steen, Jozef van der; Vejsnæs, Flemming; Pinto, M. Alice; Keller, Alexander (2024). Semi-automated sequence curation for reliable reference datasets in ITS2 vascular plant DNA (meta-)barcoding. Scientific Data. EISSN 2052-4463. 11:1, p. 1-11 [pt_PT]
dc.identifier.doi: 10.1038/s41597-024-02962-5 [pt_PT]
dc.identifier.eissn: 2052-4463
dc.identifier.uri: http://hdl.handle.net/10198/29711
dc.language.iso: eng [pt_PT]
dc.peerreviewed: yes [pt_PT]
dc.publisher: Nature Portfolio [pt_PT]
dc.relation: LA/P/0007/2021 [pt_PT]
dc.relation: DNA metabarcoding of pollen mixtures for environmental monitoring: qualitative and quantitative robustness based on mock mixtures and honeybee-collected samples from across Europe
dc.relation: Mountain Research Center
dc.relation: Mountain Research Center
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [pt_PT]
dc.subject: Internal transcribed spacer [pt_PT]
dc.subject: Barcode [pt_PT]
dc.subject: Biodiversity [pt_PT]
dc.title: Semi-automated sequence curation for reliable reference datasets in ITS2 vascular plant DNA (meta-)barcoding [pt_PT]
dc.type: journal article
dspace.entity.type: Publication
oaire.awardTitle: DNA metabarcoding of pollen mixtures for environmental monitoring: qualitative and quantitative robustness based on mock mixtures and honeybee-collected samples from across Europe
oaire.awardTitle: Mountain Research Center
oaire.awardTitle: Mountain Research Center
oaire.awardURI: info:eu-repo/grantAgreement/FCT//2020.05155.BD/PT
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F00690%2F2020/PT
oaire.awardURI: info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDP%2F00690%2F2020/PT
oaire.citation.endPage: 11 [pt_PT]
oaire.citation.issue: 1 [pt_PT]
oaire.citation.startPage: 1 [pt_PT]
oaire.citation.title: Scientific Data [pt_PT]
oaire.citation.volume: 11 [pt_PT]
oaire.fundingStream: 6817 - DCRRNI ID
oaire.fundingStream: 6817 - DCRRNI ID
person.familyName: Quaresma
person.familyName: Rufino
person.familyName: Honrado
person.familyName: Amaral
person.familyName: Pinto
person.givenName: Andreia
person.givenName: José
person.givenName: Mónica
person.givenName: Joana S.
person.givenName: M. Alice
person.identifier.ciencia-id: 4F1A-4E4A-3F23
person.identifier.ciencia-id: C414-F47F-6323
person.identifier.ciencia-id: 4712-B40B-4B0E
person.identifier.ciencia-id: 5319-7DE8-BEDA
person.identifier.ciencia-id: F814-A1D0-8318
person.identifier.orcid: 0000-0002-8678-5800
person.identifier.orcid: 0000-0002-1344-8264
person.identifier.orcid: 0000-0002-5126-4693
person.identifier.orcid: 0000-0002-3648-7303
person.identifier.orcid: 0000-0001-9663-8399
person.identifier.scopus-author-id: 57119742600
person.identifier.scopus-author-id: 55947199100
person.identifier.scopus-author-id: 8085507800
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.name: Fundação para a Ciência e a Tecnologia
project.funder.name: Fundação para a Ciência e a Tecnologia
project.funder.name: Fundação para a Ciência e a Tecnologia
rcaap.rights: openAccess [pt_PT]
rcaap.type: article [pt_PT]
relation.isAuthorOfPublication: d417b0ac-c8ee-473a-a355-820b5b9a3f55
relation.isAuthorOfPublication: 1e24d2ce-a354-442a-bef8-eebadd94b385
relation.isAuthorOfPublication: 87f8840d-04b1-427a-bca9-d37eadfc0e9b
relation.isAuthorOfPublication: 42be2cf4-adc4-4e7f-ac60-7aab515b38cd
relation.isAuthorOfPublication: 0667fe04-7078-483d-9198-56d167b19bc5
relation.isAuthorOfPublication.latestForDiscovery: d417b0ac-c8ee-473a-a355-820b5b9a3f55
relation.isProjectOfPublication: e0a6e4aa-533f-4118-baeb-96fc5e870ed8
relation.isProjectOfPublication: 29718e93-4989-42bb-bcbc-4daff3870b25
relation.isProjectOfPublication: 0aac8939-28c2-46f4-ab6b-439dba7f9942
relation.isProjectOfPublication.latestForDiscovery: 29718e93-4989-42bb-bcbc-4daff3870b25

Files

Original bundle
Name: s41597-024-02962-5.pdf
Size: 2.1 MB
Format: Adobe Portable Document Format