9 results
Search Results
- Parameters for vocal acoustic analysis - cured database
  Publication. Fernandes, Joana Filipa Teixeira; Silva, Letícia; Teixeira, Felipe; Guedes, Victor; Santos, Juliana Hermsdorf; Teixeira, João Paulo
  This paper describes the construction and organization of a database of speech parameters extracted from a speech database, and is intended to inform the community that the database is available for future research. The database includes parameters extracted from sounds produced by patients distributed among 19 diseases and by control subjects. The parameters are jitter, shimmer, Harmonic to Noise Ratio (HNR), Noise to Harmonic Ratio (NHR), autocorrelation and Mel Frequency Cepstral Coefficients (MFCC), extracted from the sustained vowels /a/, /i/ and /u/ at high, low and normal tones, and from a short German sentence. The cured database covers 707 pathological subjects (distributed across the various diseases) and 194 control subjects, for a total of 901 subjects.
- Acoustic analysis of chronic laryngitis - statistical analysis of sustained speech parameters
  Publication. Teixeira, João Paulo; Fernandes, Joana Filipa Teixeira; Teixeira, Felipe; Fernandes, Paula Odete
  This paper describes the statistical analysis of a set of features extracted from sustained vowels spoken by patients with chronic laryngitis and by control subjects. The aim is to identify which features can be useful in an intelligent classification system that discriminates between pathologic and healthy voices. The features analysed are jitter, shimmer, Harmonic to Noise Ratio (HNR), Noise to Harmonic Ratio (NHR) and autocorrelation, extracted from the sustained vowels /a/, /i/ and /u/ at low, neutral and high tones. The results showed that, except for absolute jitter, there is no statistically significant difference between male and female voices with respect to the pathologic/healthy classification. Each of the analysed parameters is likely to show a statistical difference between the control and chronic laryngitis groups. This is an important indication that these features can be used in an intelligent system to distinguish healthy voices from chronic laryngitis voices.
- Harmonic to noise ratio measurement - selection of window and length
  Publication. Fernandes, Joana Filipa Teixeira; Teixeira, Felipe; Guedes, Victor; Candido Junior, Arnaldo; Teixeira, João Paulo
  Harmonic to Noise Ratio (HNR) measures the ratio between the periodic and non-periodic components of a speech sound. It has become increasingly important in vocal acoustic analysis for diagnosing pathologic voices. This parameter can be measured with the Praat software, which is commonly accepted by the scientific community as an accurate measure. However, the measurement depends on the type of window used and on its length. This paper analyses the influence of the window and its length. The Hanning, Hamming and Blackman windows were tested with lengths between 6 and 24 glottal periods, using speech files of control and pathologic subjects. The results showed that the Hanning window with a length of 12 glottal periods gives the HNR measures closest to the Praat measures.
- Transfer learning with audioSet to voice pathologies identification in continuous speech
  Publication. Guedes, Victor; Teixeira, Felipe; Oliveira, Alessa Anjos de; Fernandes, Joana Filipa Teixeira; Silva, Letícia; Candido Junior, Arnaldo; Teixeira, João Paulo
  The use of Deep Learning to classify pathologies has increased considerably in recent times. Among the published works there are good results for classification in sustained speech with vowels, but few related works on classification in continuous speech. This work uses the German Saarbrücken Voice Database with the phrase “Guten Morgen, wie geht es Ihnen?” to classify four classes: dysphonia, laryngitis, paralysis of vocal cords and healthy voices. Transfer learning was applied using the AudioSet database. Two models, based on Long Short-Term Memory and on a Convolutional Network, were developed to classify the extracted embeddings, and their best results were compared using cross-validation. The final results reached an F1-score of 40% for the four classes, 66% for dysphonia vs. healthy, 67% for laryngitis vs. healthy and 80% for paralysis vs. healthy.
- Classification of control/pathologic subjects with support vector machines
  Publication. Teixeira, Felipe; Fernandes, Joana Filipa Teixeira; Guedes, Victor; Candido Junior, Arnaldo; Teixeira, João Paulo
  Diagnosing pathologies through vocal acoustic analysis has the advantage of being noninvasive and inexpensive compared to the traditional techniques in use. In this work, Support Vector Machines (SVM) were experimentally tested for diagnosing dysphonia, chronic laryngitis or vocal cords paralysis. Three groups of parameters were tested: jitter, shimmer and HNR; MFCCs extracted from sustained vowels; and MFCCs extracted from a short sentence. The first group proved important for this type of diagnosis, while the second group showed low discriminative power. The SVM functions and methods were also tested on the dataset with and without gender separation. The best accuracy was 71%, obtained with the jitter, shimmer and HNR parameters without gender separation.
- Cured database of sustained speech parameters for chronic laryngitis pathology
  Publication. Fernandes, Joana Filipa Teixeira; Teixeira, Felipe; Fernandes, Paula Odete; Teixeira, João Paulo
  This paper reports the construction and organization of a database of speech parameters extracted from a speech sound database. The database is freely available on the internet, and the paper is also intended to publicize it to the research community. The database includes the parameters extracted from sustained vowels produced by a group of chronic laryngitis patients and by a group of control subjects with similar characteristics in terms of gender and age. The parameters are jitter, shimmer, Harmonic to Noise Ratio (HNR), Noise to Harmonic Ratio (NHR) and autocorrelation, extracted from the sustained vowels /a/, /i/ and /u/ at low, neutral and high tones.
- Outliers treatment to improve the recognition of voice pathologies
  Publication. Silva, Letícia; Hermsdorf, Juliana; Guedes, Victor; Teixeira, Felipe; Fernandes, Joana Filipa Teixeira; Bispo, Bruno; Teixeira, João Paulo
  In some data analysis processes, such as the recognition of pathologies and pathological subjects, anomalous instances in the dataset are an unfavorable situation that can lead to misleading results. This article presents a function that identifies anomalies in a dataset using the boxplot and standard deviation methods. A filling technique was also used to treat these anomalies, in which each anomalous value is replaced by a limit value determined by the boxplot or standard deviation method. To improve the outlier methods, normalization processes based on the z-score, logarithmic and square root methodologies were tested. These outlier treatments were applied to the dataset used for the recognition of vocal pathologies (dysphonia, chronic laryngitis and vocal cords paralysis vs. control), performed by MLP and LSTM neural networks. In the experiments, both the standard deviation and the boxplot methods with z-score normalization proved very useful for pre-processing the dataset for voice pathology recognition. The accuracy improved by between 3 and 13 percentage points.
- DROOd: desidratação de fruta e vegetais por ar seco [DROOd: dehydration of fruit and vegetables by dry air]
  Publication. Fernandes, Joana Filipa Teixeira; Lamas, Ricardo; Pinto, Anaísa; Martins, Rúben; Oliveira, Carlos Manuel Mesquita; Cerdeira, Tânia Filipa Alves; Teixeira, Felipe; Vila Franca, Tiago; Borges, Pedro; Fitas, Tiago; Gouveia, Pedro; Ribeiro, Luís Frölén
  This work presents equipment capable of dehydrating food that could be purchased by small farmers. The proposed equipment dehydrates produce using dry air, with a power equivalent to that of a household appliance, 1.4 kW, and has the capacity to dehydrate up to 4 kg of fruit or vegetables. A simulation is presented of the equipment drying the equivalent of 23 tomatoes, 24 bananas or 21 oranges simultaneously, distributed across 7 individual trays, which take 10, 8 and 9 h, respectively, to dehydrate.
- Long short term memory on chronic laryngitis classification
  Publication. Guedes, Victor; Candido Junior, Arnaldo; Fernandes, Joana Filipa Teixeira; Teixeira, Felipe; Teixeira, João Paulo
  Machine learning classification has been applied for years, and one of its applications is the acoustic analysis of speech for the study of pathologies, one of which is chronic laryngitis. This article presents the results of classifying chronic laryngitis using Long Short Term Memory (LSTM) as the classifier. Relative jitter, relative shimmer and autocorrelation were used as inputs to the LSTM. A dataset of about 1500 instances was used for training, validation and testing across 4 experiments with the LSTM and one with a feedforward Artificial Neural Network (ANN). The LSTM outperformed the feedforward ANN, reaching about 100% accuracy, sensitivity and specificity on the test set, which suggests a promising future for this classification tool in voice pathology diagnosis.
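
The outlier treatment summarized in the "Outliers treatment to improve the recognition of voice pathologies" entry (limits from the boxplot or standard deviation method, with anomalous values replaced by the limit) could be sketched roughly as follows. This is only an illustrative sketch, not the authors' function: the function names, the 1.5 whisker factor, the choice of k = 2 standard deviations, the quartile interpolation method and the sample jitter values are all assumptions made for the example.

```python
import statistics


def boxplot_limits(values, whisker=1.5):
    """Lower/upper fences from the quartiles (whisker factor is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def std_limits(values, k=2.0):
    """Lower/upper limits at mean +/- k standard deviations (k is an assumption)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean - k * std, mean + k * std


def fill_outliers(values, limits):
    """Replace every value outside the limits by the nearest limit value."""
    low, high = limits
    return [min(max(v, low), high) for v in values]


def zscore(values):
    """Z-score normalization: zero mean, unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical jitter measurements (in %) with one anomalous instance.
jitter = [0.4, 0.5, 0.45, 0.55, 0.5, 0.48, 3.2]

filled_boxplot = fill_outliers(jitter, boxplot_limits(jitter))
filled_std = fill_outliers(jitter, std_limits(jitter))
normalized = zscore(filled_boxplot)
```

With these sample values only the last measurement lies outside the limits, so both methods leave the first six values unchanged and pull the last one down to the respective upper limit. The boxplot fence is much tighter here because the quartiles are barely affected by the single extreme value, whereas the mean and standard deviation are.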