I am trying to get my virtual assistant to work, but I get this output: Traceback (most recent call last): File "/home/user/PycharmProjects/AI/main.py", line 279, in <module> voice_data = record_audio("Recording") # get the voice input File "/home/user/PycharmProjects/AI/main.py", line 53, in record_audio audio = r.listen(source, 5, 5) # listen for the audio via source File "/home/user/PycharmProjects/AI/venv/lib/python3.9/site-packages/speech_recognition/__init__.py", line ..
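The truncated traceback points at `r.listen(source, 5, 5)`; in the speech_recognition package those positional arguments are `timeout` and `phrase_time_limit`, and `listen` raises `sr.WaitTimeoutError` when no speech begins within the timeout. A minimal sketch of such a `record_audio` helper (the import is guarded so the sketch stands alone; the prompt text comes from the traceback):

```python
# Sketch of a record_audio helper around speech_recognition's listen();
# the import is guarded because SpeechRecognition/PyAudio may not be installed.
try:
    import speech_recognition as sr
except ImportError:
    sr = None

def record_audio(prompt="Recording"):
    """Listen once on the default microphone and return an AudioData object.

    timeout=5 waits at most 5 s for speech to begin (otherwise
    sr.WaitTimeoutError is raised); phrase_time_limit=5 cuts the
    phrase off after 5 s of speech.
    """
    if sr is None:
        raise RuntimeError("pip install SpeechRecognition pyaudio")
    r = sr.Recognizer()
    with sr.Microphone() as source:   # opening the mic requires PyAudio
        print(prompt)
        r.adjust_for_ambient_noise(source, duration=0.5)
        return r.listen(source, timeout=5, phrase_time_limit=5)
```

Using keyword arguments instead of bare `5, 5` makes it obvious which limit is being hit when the traceback appears.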
This is my first time with Python and I don’t understand the flow: I am using Pocketsphinx for speech-to-text and have a problem with the function createAudio(videoclip). The script doesn’t run the whole transcript() function, or rather it runs for only a few seconds until the next function starts. How can I make it so that the ..
I have been trying to distinguish between words by extracting MFCCs from audio clips (with the librosa library), then applying Dynamic Time Warping and classifying the audio with kNN. For example, I am trying to distinguish between the words "cat" and "anything". My problem is that I can’t find similarities between two different pronunciations of ..
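The pipeline described above can be sketched end to end: a DTW distance over MFCC frame sequences plus a 1-nearest-neighbour decision. DTW is written in plain NumPy here for clarity; in practice the MFCC matrices would come from `librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T` (one row per frame), and `librosa.sequence.dtw` provides an optimized version of the same recurrence.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping cost between two sequences of feature
    vectors (frames x coefficients), Euclidean frame distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # best of insertion, deletion, match from the accumulated grid
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def nearest_word(query, templates):
    """1-NN over (mfcc_matrix, label) templates by DTW distance."""
    return min(templates, key=lambda t: dtw_distance(query, t[0]))[1]
```

Because DTW warps the time axis, two pronunciations of the same word at different speeds should score a much lower distance against each other than against a different word.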
I am testing an Automatic Speech Recognition model on some audio files containing speech in Hindi. I am using WER (Word Error Rate) as the metric. reference (ground truth) – वह शादीशुदा नहीं है hypothesis (model output) – वह शादी शुदा नहीं है I need some way to normalize the reference and hypothesis sentences so ..
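One hedged approach (helper names are ours): apply Unicode NFC normalization and collapse whitespace before scoring, and additionally report a character error rate with spaces stripped, which makes pure segmentation mismatches like शादीशुदा vs. शादी शुदा cost nothing:

```python
import re
import unicodedata

def normalize(s):
    """Canonical Unicode form plus collapsed whitespace."""
    return re.sub(r"\s+", " ", unicodedata.normalize("NFC", s)).strip()

def edit_distance(a, b):
    """Levenshtein distance between two sequences (single-row DP)."""
    d = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, y in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (x != y))
    return d[len(b)]

def wer(ref, hyp):
    """Word error rate over whitespace tokens of the normalized strings."""
    r, h = normalize(ref).split(), normalize(hyp).split()
    return edit_distance(r, h) / len(r)

def cer(ref, hyp):
    """Character error rate with spaces removed, so word-segmentation
    differences alone do not count as errors."""
    r, h = normalize(ref).replace(" ", ""), normalize(hyp).replace(" ", "")
    return edit_distance(r, h) / len(r)
```

On the example pair above, WER counts the split compound as a substitution plus an insertion, while the space-insensitive CER is zero. The jiwer library packages similar normalization transforms if a ready-made solution is preferred.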
https://google.github.io/tacotron/publications/speaker_adaptation/ This is the website where I found the VCTK p240 voice. Is there a way for me to access this voice using Tacotron and add it to my Python program, so that if I provide it text it returns audio in the VCTK p240 voice? Source: Python..
I’m trying to learn how to use the speechrecognition module. I am using Python 3.8. However, I have an error I haven’t been able to resolve (it seems to come from the pyaudio module). Can you help me please? Thanks in advance for your answers! ..
I’m training DeepSpeech from scratch (without a checkpoint) with a language model generated using KenLM, as stated in its documentation. The dataset is the Common Voice dataset for Persian. My configuration is as follows: Batch size = 2 (due to CUDA OOM) Learning rate = 0.0001 Num. neurons = 2048 Num. epochs = 50 Train ..
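For reference, those settings map onto DeepSpeech's command-line training flags roughly as follows (a sketch assuming DeepSpeech 0.7+ flag names; the file paths are placeholders, and the scorer is the KenLM package built per the docs):

```shell
python3 DeepSpeech.py \
  --train_files persian/train.csv \
  --dev_files persian/dev.csv \
  --test_files persian/test.csv \
  --train_batch_size 2 \
  --learning_rate 0.0001 \
  --n_hidden 2048 \
  --epochs 50 \
  --scorer_path persian/kenlm.scorer
```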
Hello, I’m trying to learn how to use the speechrecognition module. I am using Python 3.8. However, I have an error I haven’t been able to resolve (it seems to come from the pyaudio module). Can you help me please? Thanks in advance for your answers! r = sr.Recognizer() with sr.Microphone() as source: print("Say something!") audio = r.listen(source) try: print("Google ..
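A sketch of how the truncated snippet above typically continues, wrapped in a function (the import is guarded so the sketch stands alone; `listen_once` is our name, and the `fr-FR` language code is an assumption):

```python
# Sketch completing the snippet above; SpeechRecognition and PyAudio
# may be missing, so the import is guarded for illustration.
try:
    import speech_recognition as sr
except ImportError:
    sr = None

def listen_once(language="fr-FR"):
    """Record one utterance and return the Google Web Speech transcript."""
    if sr is None:
        raise RuntimeError("pip install SpeechRecognition pyaudio")
    r = sr.Recognizer()
    with sr.Microphone() as source:  # an error here usually means PyAudio trouble
        print("Say something!")
        audio = r.listen(source)
    try:
        return r.recognize_google(audio, language=language)
    except sr.UnknownValueError:     # speech was unintelligible
        return ""
```

On Python 3.8, `pip install pyaudio` often fails without the PortAudio headers; on Debian/Ubuntu installing `portaudio19-dev` first is a common fix, and on Windows a prebuilt PyAudio wheel is the usual route.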
I have speech audio files in WAV format that are 60 seconds each. However, the transcription output gets truncated and captures only about 15% of the length. I have tried this both in my local Jupyter Notebook and through Google Colab. According to the documentation, this request is below the threshold of the API. What ..
I am new to Mycroft AI and I want to implement a skill that is triggered by an external process, in my case a TCP message. What I tried was to run a separate TCP server that listens for messages, and on receiving a message it runs os.system(‘su – pi – ..