pydub | Manipulate audio with a simple and easy high level interface | Speech library
kandi X-RAY | pydub Summary
- Read audio from file
- Create a WAV file from a file
- Log subprocess output
- Return a temporary file descriptor
- Calculate a ratecv
- Format the struct format
- Get a sample from the given size
- Check parameters
- Read a wav file
- Play an audio segment
- Convert lin to lin
- Convert to mono
- Convert a CP
- Compute the factor between two Strings
- Strip silence
- Compute the maximum pp
- Return the number of samples crossing the given size
- Multiply a list of samples
- Bias to given size
- Return HTML representation of the video
- Compute the min and max of the given cpus
- Reverse the contents of a cp
- Find the index of the most common subsequence between two sequences
- Find the best fit between two sequences
- Compute the average power p
- Compute an eq
Trending Discussions on pydub
QUESTION
I'm attempting to write a python project that plays multiple parts of a song at the same time.
For background information, a song is split into "stems", and each stem is played simultaneously to recreate the full song. What I am trying to achieve is using potentiometers to control the volume of each stem, so that the user can mix songs differently. For a comparable product, think of Kanye West's Stem Player.
I can change the volume of the overlaid song at the end, but what I want is to change the volume of each stem with a potentiometer while the song is playing. Is this even possible using pydub? Below is the code I have right now.
from pydub import AudioSegment
from pydub.playback import play
vocals = AudioSegment.from_file("walkin_vocals.mp3")
drums = AudioSegment.from_file("walkin_drums.mp3")
bass = AudioSegment.from_file("walkin_bass.mp3")
vocalsDrums = vocals.overlay(drums)
bassVocalsDrums = vocalsDrums.overlay(bass)
songQuiet = bassVocalsDrums - 20
play(songQuiet)
ANSWER
Answered 2022-Feb-22 at 13:00
I solved this by using pyaudio instead of pydub. With pyaudio, I was able to define a custom stream_callback function. Within this callback, I multiply each stem by a modifier, then add the stems together into one audio output.
def callback(in_data, frame_count, time_info, status):
    global drumsMod, vocalsMod, bassMod, otherMod
    # Read the same number of frames from every stem
    drums = drumsWF.readframes(frame_count)
    vocals = vocalsWF.readframes(frame_count)
    bass = bassWF.readframes(frame_count)
    other = otherWF.readframes(frame_count)
    # Decode the raw bytes as 16-bit samples
    decodedDrums = numpy.frombuffer(drums, numpy.int16)
    decodedVocals = numpy.frombuffer(vocals, numpy.int16)
    decodedBass = numpy.frombuffer(bass, numpy.int16)
    decodedOther = numpy.frombuffer(other, numpy.int16)
    # Scale each stem by its modifier and sum them into one output buffer
    newdata = (decodedDrums*drumsMod + decodedVocals*vocalsMod + decodedBass*bassMod + decodedOther*otherMod).astype(numpy.int16)
    return (newdata.tobytes(), pyaudio.paContinue)
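One detail the callback above leaves open is what happens when the summed stems exceed the 16-bit range. The following is a minimal standard-library sketch of the same "scale each stem, then sum" idea with explicit clamping. The function name `mix_stems` and the sample values are hypothetical and only illustrate the arithmetic; they are not part of the original answer.

```python
from array import array

def mix_stems(stems, gains):
    """Mix equal-length 16-bit PCM byte strings, scaling each stem by its gain."""
    decoded = [array("h", raw) for raw in stems]
    mixed = array("h")
    for samples in zip(*decoded):
        total = sum(s * g for s, g in zip(samples, gains))
        # Clamp to the int16 range so loud passages clip instead of overflowing.
        mixed.append(max(-32768, min(32767, int(total))))
    return mixed.tobytes()

# Hypothetical three-sample "stems" used only to show the arithmetic.
drums = array("h", [1000, -2000, 30000]).tobytes()
vocals = array("h", [500, 500, 30000]).tobytes()
mixed_samples = array("h", mix_stems([drums, vocals], [0.5, 1.0])).tolist()
```

In a real-time callback a vectorised version (for example with numpy) would likely be needed for speed; the pure-Python loop here is only meant to make the clamping step visible.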
QUESTION
Answer: don't set the content/MIME type browser-side with JS; use the browser's native mimeType, then convert server-side (I used PyDub).
Question: I am using JavaScript MediaRecorder, Django, AWS S3 and the JavaScript Web Audio API to record audio files so that users can share voice notes with one another. I've seen scattered answers online about how to record and upload audio data and about the issues with Safari/iOS, but thought this could be a thread to bring them together and confront some of these issues.
Javascript:
mediaRecorder = new MediaRecorder(stream);
// The handler is async so that await can be used inside it
mediaRecorder.onstop = async function (e) {
    var blob = new Blob(
        chunks,
        {
            type: "audio/mp3",
        }
    );
    var formdata = new FormData();
    formdata.append('recording', blob);
    var resp = await fetch(url, { // Your POST endpoint
        method: 'POST',
        mode: 'same-origin',
        headers: {
            'Accept': 'application/json',
            'X-Requested-With': 'XMLHttpRequest',
            'X-CSRFToken': csrf_token,
        },
        body: formdata,
    });
};
Django:
for k, file in request.FILES.items():
    sub_path = "recordings/audio.mp3"
    meta_data = {"ContentType": "audio/mp3"}
    s3.upload_fileobj(file, S3_BUCKET_NAME, sub_path, ExtraArgs=meta_data)
    # then some code to save the s3 URL to my database for future retrieval
Javascript:
var audio_context = new AudioContext();
// Attach the click handler to the #play-audio element
document.querySelector("#play-audio").addEventListener("click", function (e) {
    var url = "https://docplat-bucket.s3.eu-west-3.amazonaws.com/recordings/audio.mp3";
    var request = new XMLHttpRequest();
    request.open('GET', url, true);
    request.responseType = 'arraybuffer';
    request.onload = function () {
        audio_context.decodeAudioData(request.response, function (buffer) {
            playSound(buffer);
        });
    };
    request.send();
});
Results in: "EncodingError: Decoding Failed"
Note however that using the w3 schools demo mp3 url does play the recording: https://docplat-bucket.s3.eu-west-3.amazonaws.com/recordings/t-rex-roar.mp3
Specs:
PC (used to upload recording): Windows 11, Chrome Version 98.0.4758.81 (Official Build) (64-bit)
Django: Version 3.1.7
Mobile (used to play recording): iPhone X, iOS (Version 14.7.1)
Problematic url: https://docplat-bucket.s3.eu-west-3.amazonaws.com/recordings/audio.mp3
Working url: https://docplat-bucket.s3.eu-west-3.amazonaws.com/recordings/t-rex-roar.mp3
(This is my first post so please forgive me if I haven't asked this question in the ideal way :) )
ANSWER
Answered 2022-Feb-07 at 20:59
When you upload the recorded Blob you set the type to 'audio/mp3'. But unless you use a custom library which patches the MediaRecorder, the mimeType of the recording will be whatever the browser likes best. As of now it's 'audio/opus' in Firefox and 'audio/webm' in Chrome.
If you define your Blob like this it should work.
var blob = new Blob(
    chunks,
    {
        type: mediaRecorder.mimeType
    }
);
You would also have to change your server-side code to not use 'audio/mp3' anymore.
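On the server side, one way to stop hard-coding 'audio/mp3' is to derive the container name from the MIME type the browser actually reported. The sketch below is an assumption-laden illustration: the helper name `container_from_mime` and the small alias table are hypothetical, and the resulting string would still need to be something the decoder on the server (for example ffmpeg behind pydub) understands.

```python
# Hypothetical aliases for MIME subtypes whose name differs from the usual
# container/format name; extend or change as needed.
SUBTYPE_ALIASES = {"mpeg": "mp3", "x-wav": "wav", "wave": "wav"}

def container_from_mime(mime_type, default="webm"):
    """Return a container name such as 'webm' or 'ogg' for a MIME type string."""
    # Drop parameters such as ';codecs=opus' and normalise case.
    essence = mime_type.split(";", 1)[0].strip().lower()
    if "/" not in essence:
        return default
    subtype = essence.split("/", 1)[1]
    if not subtype:
        return default
    return SUBTYPE_ALIASES.get(subtype, subtype)
```

In a Django view the input could come from the uploaded file's `content_type` attribute, and the result could then be passed as the `format` argument when decoding before converting to mp3.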
QUESTION
I get an error while installing manimce. I have been trying to install the manimce library on Windows Subsystem for Linux, and running the following command produces this output:
pip install manimce
Collecting manimce
Downloading manimce-0.1.1.post2-py3-none-any.whl (249 kB)
|████████████████████████████████| 249 kB 257 kB/s
Collecting Pillow
Using cached Pillow-8.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
Collecting scipy
Using cached scipy-1.7.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (39.3 MB)
Collecting colour
Using cached colour-0.1.5-py2.py3-none-any.whl (23 kB)
Collecting pangocairocffi<0.5.0,>=0.4.0
Downloading pangocairocffi-0.4.0.tar.gz (17 kB)
Preparing metadata (setup.py) ... done
Collecting numpy
Using cached numpy-1.21.5-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
Collecting pydub
Using cached pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Collecting pygments
Using cached Pygments-2.10.0-py3-none-any.whl (1.0 MB)
Collecting cairocffi<2.0.0,>=1.1.0
Downloading cairocffi-1.3.0.tar.gz (88 kB)
|████████████████████████████████| 88 kB 160 kB/s
Preparing metadata (setup.py) ... done
Collecting tqdm
Using cached tqdm-4.62.3-py2.py3-none-any.whl (76 kB)
Collecting pangocffi<0.9.0,>=0.8.0
Downloading pangocffi-0.8.0.tar.gz (33 kB)
Preparing metadata (setup.py) ... done
Collecting pycairo<2.0,>=1.19
Using cached pycairo-1.20.1.tar.gz (344 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting progressbar
Downloading progressbar-2.5.tar.gz (10 kB)
Preparing metadata (setup.py) ... done
Collecting rich<7.0,>=6.0
Using cached rich-6.2.0-py3-none-any.whl (150 kB)
Collecting cffi>=1.1.0
Using cached cffi-1.15.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (446 kB)
Collecting commonmark<0.10.0,>=0.9.0
Using cached commonmark-0.9.1-py2.py3-none-any.whl (51 kB)
Collecting typing-extensions<4.0.0,>=3.7.4
Using cached typing_extensions-3.10.0.2-py3-none-any.whl (26 kB)
Collecting colorama<0.5.0,>=0.4.0
Using cached colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Collecting pycparser
Using cached pycparser-2.21-py2.py3-none-any.whl (118 kB)
Building wheels for collected packages: cairocffi, pangocairocffi, pangocffi, pycairo, progressbar
Building wheel for cairocffi (setup.py) ... done
Created wheel for cairocffi: filename=cairocffi-1.3.0-py3-none-any.whl size=89650 sha256=afc73218cc9fa1d844d7165f598e2be0428598166b4c3ed9de5bbdc94a0a6977
Stored in directory: /home/yusifer_zendric/.cache/pip/wheels/f3/97/83/8022b9237866102e18d1b7ac0a269769e6fccba0f63dceb9b7
Building wheel for pangocairocffi (setup.py) ... done
Created wheel for pangocairocffi: filename=pangocairocffi-0.4.0-py3-none-any.whl size=19283 sha256=54399796259c6e24f9ab56c5747ab273dcf97fb6fed3e7b54935f9ac49351d50
Stored in directory: /home/yusifer_zendric/.cache/pip/wheels/60/58/92/507a12a5044f7fcda6f4dfd8e0a607cc1fe957bc0dea885906
Building wheel for pangocffi (setup.py) ... done
Created wheel for pangocffi: filename=pangocffi-0.8.0-py3-none-any.whl size=37899 sha256=bea348af93696816b046dd901aa60d29a464460c5faac67628eb7e1ea7d1807d
Stored in directory: /home/yusifer_zendric/.cache/pip/wheels/c4/df/6d/e9d0f79b1545f6e902cc22773b1429de7a5efc240b891ee009
Building wheel for pycairo (pyproject.toml) ... error
ERROR: Command errored out with exit status 1:
command: /home/yusifer_zendric/manim_ce/venv/bin/python /home/yusifer_zendric/manim_ce/venv/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /tmp/tmpuguwzu3u
cwd: /tmp/pip-install-l4hqdegr/pycairo_f4d80b8f3e4840a3802342825adcdff5
Complete output (12 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/cairo
copying cairo/__init__.py -> build/lib.linux-x86_64-3.8/cairo
copying cairo/__init__.pyi -> build/lib.linux-x86_64-3.8/cairo
copying cairo/py.typed -> build/lib.linux-x86_64-3.8/cairo
running build_ext
'pkg-config' not found.
Command ['pkg-config', '--print-errors', '--exists', 'cairo >= 1.15.10']
----------------------------------------
ERROR: Failed building wheel for pycairo
Building wheel for progressbar (setup.py) ... done
Created wheel for progressbar: filename=progressbar-2.5-py3-none-any.whl size=12074 sha256=7290ef8de5dd955bf756b90130f400dd19c2cc9ea050a5a1dce2803440f581e2
Stored in directory: /home/yusifer_zendric/.cache/pip/wheels/2c/67/ed/d84123843c937d7e7f5ba88a270d11036473144143355e2747
Successfully built cairocffi pangocairocffi pangocffi progressbar
Failed to build pycairo
ERROR: Could not build wheels for pycairo, which is required to install pyproject.toml-based projects
(venv) yusifer_zendric@Laptop-Yusifer:~/manim_ce$
(venv) yusifer_zendric@Laptop-Yusifer:~/manim_ce$ pip install manim_ce
ERROR: Could not find a version that satisfies the requirement manim_ce (from versions: none)
ERROR: No matching distribution found for manim_ce
(venv) yusifer_zendric@Laptop-Yusifer:~/manim_ce$ manim example_scenes/basic.py -pql
Command 'manim' not found, did you mean:
command 'maim' from deb maim (5.5.3-1build1)
Try: sudo apt install
(venv) yusifer_zendric@Laptop-Yusifer:~/manim_ce$ sudo apt-get install manim
[sudo] password for yusifer_zendric:
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package manim
(venv) yusifer_zendric@Laptop-Yusifer:~/manim_ce$ pip3 install manimlib
Collecting manimlib
Downloading manimlib-0.2.0.tar.gz (4.8 MB)
|████████████████████████████████| 4.8 MB 498 kB/s
Preparing metadata (setup.py) ... done
Collecting Pillow
Using cached Pillow-8.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
Collecting argparse
Downloading argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Collecting colour
Using cached colour-0.1.5-py2.py3-none-any.whl (23 kB)
Collecting numpy
Using cached numpy-1.21.5-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
Collecting opencv-python
Downloading opencv_python-4.5.4.60-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (60.3 MB)
|████████████████████████████████| 60.3 MB 520 kB/s
Collecting progressbar
Using cached progressbar-2.5-py3-none-any.whl
Collecting pycairo
Using cached pycairo-1.20.1.tar.gz (344 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting pydub
Using cached pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Collecting pygments
Using cached Pygments-2.10.0-py3-none-any.whl (1.0 MB)
Collecting scipy
Using cached scipy-1.7.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (39.3 MB)
Collecting tqdm
Using cached tqdm-4.62.3-py2.py3-none-any.whl (76 kB)
Building wheels for collected packages: manimlib, pycairo
Building wheel for manimlib (setup.py) ... done
Created wheel for manimlib: filename=manimlib-0.2.0-py3-none-any.whl size=212737 sha256=27efe2c226d80cfe5663928e980d3e5f5a164d8e9d0aacea5014d37ffdedb76a
Stored in directory: /home/yusifer_zendric/.cache/pip/wheels/87/36/c1/2db5ed5de9908034108f3c39538cd3367445d9cec01e7c8c23
Building wheel for pycairo (pyproject.toml) ... error
ERROR: Command errored out with exit status 1:
command: /home/yusifer_zendric/manim_ce/venv/bin/python /home/yusifer_zendric/manim_ce/venv/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /tmp/tmp5o2970su
cwd: /tmp/pip-install-sxxp3lw2/pycairo_d372a62d0c6b4c4484391402d21485e1
Complete output (12 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/cairo
copying cairo/__init__.py -> build/lib.linux-x86_64-3.8/cairo
copying cairo/__init__.pyi -> build/lib.linux-x86_64-3.8/cairo
copying cairo/py.typed -> build/lib.linux-x86_64-3.8/cairo
running build_ext
'pkg-config' not found.
Command ['pkg-config', '--print-errors', '--exists', 'cairo >= 1.15.10']
----------------------------------------
ERROR: Failed building wheel for pycairo
Successfully built manimlib
Failed to build pycairo
ERROR: Could not build wheels for pycairo, which is required to install pyproject.toml-based projects
All the libraries are installed except pycairo. pip just shows the "required to install pyproject.toml-based projects" error. In fact, I have already run pip install pyproject.toml and it installed, but the same error still appears.
ANSWER
Answered 2022-Jan-28 at 02:24
apt-get install sox ffmpeg libcairo2 libcairo2-dev
apt-get install texlive-full
pip3 install manimlib # or pip install manimlib
Then:
pip3 install manimce # or pip install manimce
And everything works.
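The log in the question fails specifically on "'pkg-config' not found" while building pycairo, so it can help to confirm that the build tools are on the PATH before re-running pip. This is a small sketch using only the standard library; the function name `missing_tools` and the list of tools checked are assumptions for illustration.

```python
import shutil

def missing_tools(names):
    """Return the subset of command names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]

# Tools to check are an assumption based on the error log above.
not_found = missing_tools(["pkg-config", "ffmpeg"])
if not_found:
    print("Install these before running pip:", ", ".join(not_found))
```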
QUESTION
While trying to use audiosegment.from_file(x.mp3) to open an mp3 file and later convert it to wave format with audio.export(x.mp3, format='wav'), I get the following CouldntDecodeError.
What could be causing this? I am using python=3.9, pydub=0.25.1, audiosegment=0.23.0.
Thanks in advance for the help. Below is the error shown on the console.
CouldntDecodeError Traceback (most recent call last)
/var/folders/vh/nmgr0zd56yd_vs_56q_jy89c0000gn/T/ipykernel_21698/1373432166.py in
1 vad=wb.Vad()
2 filename= '/Users/gulag_dweller/Desktop/Lab_stuff/python_script/Isi_B1.mp3'
----> 3 audio= audiosegment.from_file(filename)
4 audio_wav= audio.export(filename, format ='wav')
5
~/mambaforge/lib/python3.9/site-packages/audiosegment.py in from_file(path)
1131 _name, ext = os.path.splitext(path)
1132 ext = ext.lower()[1:]
-> 1133 seg = pydub.AudioSegment.from_file(path, ext)
1134 return AudioSegment(seg, path)
1135
~/mambaforge/lib/python3.9/site-packages/pydub/audio_segment.py in from_file(cls, file, format, codec, parameters, start_second, duration, **kwargs)
771 if close_file:
772 file.close()
--> 773 raise CouldntDecodeError(
774 "Decoding failed. ffmpeg returned error code: {0}\n\nOutput from ffmpeg/avlib:\n\n{1}".format(
775 p.returncode, p_err.decode(errors='ignore') ))
CouldntDecodeError: Decoding failed. ffmpeg returned error code: 69
Output from ffmpeg/avlib:
ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
built with clang version 11.1.0
ANSWER
Answered 2022-Jan-24 at 17:46
I had the same problem. I read that there are some cases where mp3 files contain AAC audio, but the container format is mpeg4.
So, the solution that worked for me is:
try:
    # First attempt: decode the file as a regular mp3
    audio = audiosegment.from_file(filename, "mp3")
except Exception:
    # Fallback: treat the file as an MPEG-4 container
    audio = audiosegment.from_file(filename, format="mp4")
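The try/except above can be generalised to a list of candidate formats. The sketch below assumes a loader callable that accepts a `format` keyword, as pydub's `AudioSegment.from_file` does; the helper `load_with_fallback` and the `fake_loader` stand-in are hypothetical and exist only so the example runs without any audio files.

```python
def load_with_fallback(loader, path, formats):
    """Try loader(path, format=fmt) for each format and return the first success."""
    if not formats:
        raise ValueError("at least one format is required")
    last_error = None
    for fmt in formats:
        try:
            return loader(path, format=fmt)
        except Exception as exc:  # keep the error and try the next format
            last_error = exc
    raise last_error

def fake_loader(path, format):
    """Stand-in loader that only 'decodes' mp4, to demonstrate the fallback."""
    if format != "mp4":
        raise ValueError("cannot decode as " + format)
    return (path, format)

loaded = load_with_fallback(fake_loader, "Isi_B1.mp3", ["mp3", "mp4"])
```

Re-raising the last error keeps the decoder's original message visible when every format fails.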
QUESTION
I'm working on a speech recognition project and following the example shown in this PythonCode page on Windows 10 with Spyder 5.1.5/Anaconda (Python 3.8.10).
I installed SpeechRecognition and pydub with conda install -c conda-forge, and when I run the following script:
with sr.AudioFile(filename) as source:
    audio_data = r.record(source)
    text = r.recognize_google(audio_data)
    print(text)
or more specifically (text = r.recognize_google(audio_data)), this error message shows up:
OSError: FLAC conversion utility not available - consider installing the FLAC command line application by running `apt-get install flac` or your operating system's equivalent
There's a similar question, but I couldn't find the solution for the Windows environment, where I don't have apt-get install flac or brew.
Following this post, I downloaded the flac.exe file and placed it under C:\Windows\System32. I can run flac on the command line, but the same error shows up when I run the Python script.
Does anyone know how to fix this problem?
ANSWER
Answered 2022-Jan-06 at 20:17
According to the source code, it searches for a flac executable without the .exe extension, which will not work on Windows. If that fails, it looks for a file with a specific name (flac-win32.exe) in the module folder.
You can either try to remove the extension of the file in the System32 folder or put the file in the module folder.
QUESTION
I am trying to convert flac files to mp3 format, using pydub for conversion and mutagen for tags and album art copy.
Convert a flac file to a 320Kbps mp3:
from pydub import AudioSegment
path_flac = 'mc_test/from/01 Lapislazuli.flac'
path_mp3 = 'mc_test/to/01 Lapislazuli.mp3'
flac_audio = AudioSegment.from_file(path_flac, format="flac")
flac_audio.export(path_mp3, format="mp3", bitrate='320K')
Load album art image from flac file and embed it into mp3 file (follow this question):
from mutagen.flac import FLAC
from mutagen.mp3 import MP3
from mutagen.id3 import ID3, APIC
file = FLAC(path_flac)
art = file.pictures[0].data
audio = MP3(path_mp3, ID3=ID3)
audio.tags.add(
    APIC(
        encoding=3,  # 3 is for utf-8
        mime='image/png',  # image/jpeg or image/png
        type=3,  # 3 is for the cover image
        desc=u'Cover',
        data=art
    )
)
audio.save()
I successfully embedded the album art into the mp3 file, and the picture shows in players such as foobar and MPC, but it does not show correctly as the file icon. If I convert the file via foobar, the icon shows correctly, but it does not work with mutagen.
Does anyone know how to make the album art show correctly as the icon?
ANSWER
Answered 2021-Dec-06 at 15:14
Thanks to a suggestion from @diggusbickus, I compared the mp3 files generated by foobar and by pydub. The difference is the encoding.
In the pydub-converted file, whose tags and album art were added by mutagen:
from mutagen import File
path_mp3 = 'mc_test/to/01 Lapislazuli.mp3'
file_mutagen = File(path_mp3)
file_mutagen.tags['APIC:'].encoding
It shows the UTF-8 encoding (3), which probably came from audio.tags.add(APIC(encoding=3)) above.
In foobar-converted file:
path_mp3_foobar = 'mc_test/foobar/01 Lapislazuli.mp3'
file_foobar = File(path_mp3_foobar)
file_foobar.tags['APIC:'].encoding
shows a different encoding.
So I changed my setting to audio.tags.add(APIC(encoding=0)) when embedding the image, and it works: now I can see the album art as an icon preview image. I also did a small survey to check whether other encoding numbers work; the album art shows correctly with encoding=0, 1 and 2.
QUESTION
I am running into an issue when trying to extract mono audio from a stereo file using pydub.
Here is the code:
import os
import wave
import audioop
from pydub import AudioSegment
def cantDeleteLockedFile():
    audiofile = "/Volumes/test/stereotest.wav"
    audiostrip = AudioFileClip(audiofile)
    if audiostrip.nchannels > 1:
        with open(audiofile, "rb") as oaudiofile:
            mono_audios = AudioSegment.from_file(oaudiofile, format="wav")
        # List of AudioSegments, mono files, which can be accessed via [0] and [1]
        mono_audios = mono_audios.split_to_mono()
        audioChannelOne = str(audiofile.rsplit(".", 1)[0]) + "a.wav"
        # This line is locking the stereo file
        mono_left = mono_audios[0].export(audioChannelOne, format="wav")
        # This extracts the mono left track from the stereo track
        # On the same location a file will be created, in this example:
        # "/Volumes/test/stereotesta.wav"
        # This should unlock the file, but doesnt
        mono_left.close()
        # When trying to delete the file here, it will fail
        # without exception raised
        os.remove(audiofile)
        if os.path.exists(audiofile):
            return True
        else:
            return False
    return False
This code is embedded in an API microservice system, so the process does not exit after the function has run. The stereo audio file then stays locked for as long as that microservice is running. The file will not be deleted and the function return value will be "False". If you later navigate to that file on the filesystem and try to delete it manually, that also fails: the file first appears to be deleted, but then it magically pops back up.
I am aware of this issue being discussed on other boards before. However, the proposed solution does not work.
ref: https://github.com/jiaaro/pydub/issues/305
Either I am missing something completely, or perhaps there is a workaround to forcibly unlock a file so that it can be deleted? I did not find a reference online. Basically, I know that pydub is locking the resource, but I can't get it to unlock the wav file behind the AudioSegment.
Happy to read your feedback and suggestions.
Thank you!
ANSWER
Answered 2021-Dec-06 at 11:52
The audio clip (audiostrip) used to check for stereo files also needs to be closed. This was blocking the file on the storage side.
Adding a simple:
audiostrip.close()
solved the problem.
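To make sure the clip is closed even if the channel check raises, the close call can be tied to a context manager. This is a minimal sketch with the standard library's `contextlib.closing`; `FakeClip` is a hypothetical stand-in for whatever clip object exposes `nchannels` and `close()`, so that the example runs without any media files.

```python
from contextlib import closing

class FakeClip:
    """Hypothetical stand-in for a clip object with nchannels and close()."""
    def __init__(self, nchannels):
        self.nchannels = nchannels
        self.closed = False

    def close(self):
        self.closed = True

def is_stereo(clip):
    """Check the channel count and always release the clip afterwards."""
    with closing(clip) as opened:
        return opened.nchannels > 1

clip = FakeClip(nchannels=2)
stereo = is_stereo(clip)
```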
QUESTION
I am looking to combine 10 audio samples in various manners (format - wav probably, but this can be changed to any format as they will be pre-recorded).
from pydub import AudioSegment
sounds = []
sound1 = AudioSegment.from_wav("Dropbox/PIREAD/1.wav")
sound2 = AudioSegment.from_wav("Dropbox/PIREAD/2.wav")
sound3 = AudioSegment.from_wav("Dropbox/PIREAD/3.wav")
sound4 = AudioSegment.from_wav("Dropbox/PIREAD/4.wav")
sound5 = AudioSegment.from_wav("Dropbox/PIREAD/5.wav")
sound6 = AudioSegment.from_wav("Dropbox/PIREAD/6.wav")
sound7 = AudioSegment.from_wav("Dropbox/PIREAD/7.wav")
sound8 = AudioSegment.from_wav("Dropbox/PIREAD/8.wav")
sound9 = AudioSegment.from_wav("Dropbox/PIREAD/9.wav")
sound0 = AudioSegment.from_wav("Dropbox/PIREAD/0.wav")
sounds=[sound1,sound2,sound3,sound4,sound5,sound6,sound7,sound8,sound9,sound0]
combined_sounds = AudioSegment.empty()
for x in range(10):
    for y in range(10):
        combined_sounds += sounds[y]
combined_sounds.export("Dropbox/PIREAD/joinedFile.wav", format="wav")
This is literally me reading the numbers 0-9 and assembling them into one overall wav file.
It works - but it is slow once the loop is extended x=100, x=1000.
Q: How can I speed things up?
The actual order of the numbers will be read from a text$ - for example "354224848179261915075" which happens to be the 100th Fibonacci number.
Cheers Glen
ANSWER
Answered 2021-Oct-20 at 11:24
I believe it's slow because when you loop over x, you repeat operations (the loop over y) which could be computed before the loop over x, then assembled.
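The idea of computing the inner pass once and then assembling can be sketched with plain bytes standing in for audio data. Everything here is illustrative: the `clips` table and the `assemble` helper are hypothetical, and real code would look up recorded clips instead of two-byte placeholders.

```python
def assemble(digits, clips):
    """Join the clip for each digit in one pass instead of repeated +=."""
    return b"".join(clips[d] for d in digits)

# Placeholder "recordings": digit n is represented by two bytes of value n.
clips = {str(n): bytes([n]) * 2 for n in range(10)}

# Build one pass over the digits once, then repeat it, rather than
# re-concatenating every clip on every outer iteration.
one_pass = assemble("1234567890", clips)
repeated = one_pass * 3

# The same helper handles an arbitrary digit string read from text.
fib_audio = assemble("354224848179261915075", clips)
```

With pydub, the equivalent would presumably be to build one combined segment for the inner pass and repeat it (segments can be multiplied by an integer), instead of appending clip by clip inside both loops.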
QUESTION
I am getting errors when training my machine learning model, which checks what a person is feeling while saying something. I am working with librosa, soundfile and MLPClassifier from sklearn. This is my code:
# imported required libraries
import librosa
import soundfile
import os, glob, pickle
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
# created a function that basically gathers features from audio files
def extract_feature(file_name, mfcc, chroma, mel):
    with soundfile.SoundFile(file_name) as sound_file:
        X = sound_file.read(dtype="float32")
        sample_rate=sound_file.samplerate
        if chroma:
            stft=np.abs(librosa.stft(X))
        result=np.array([])
        if mfcc:
            mfccs=np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=40).T, axis=0)
            result=np.hstack((result, mfccs))
        if chroma:
            chroma=np.mean(librosa.feature.chroma_stft(S=stft, sr=sample_rate).T,axis=0)
            result=np.hstack((result, chroma))
        if mel:
            mel=np.mean(librosa.feature.melspectrogram(X, sr=sample_rate).T,axis=0)
            result=np.hstack((result, mel))
    return result
# defined emotions
emotions={
    '01':'neutral',
    '02':'calm',
    '03':'happy',
    '04':'sad',
    '05':'angry',
    '06':'fearful',
    '07':'disgust',
    '08':'surprised'
}
observed_emotions=['calm', 'happy', 'fearful', 'disgust']
# to load data
def load_data(test_size=0.2):
    x,y=[],[]
    for file in glob.glob("data\\Actor_*\\*.wav"):
        file_name=os.path.basename(file)
        emotion=emotions[file_name.split("-")[2]]
        if emotion not in observed_emotions:
            continue
        feature=extract_feature(file, mfcc=True, chroma=True, mel=True)
        x.append(feature)
        y.append(emotion)
    return train_test_split(np.array(x), y, test_size=test_size, random_state=9)
x_train,x_test,y_train,y_test=load_data(test_size=0.23)
print((x_train.shape[0], x_test.shape[0]))
# used the mlpclassifier
model=MLPClassifier(alpha=0.01, batch_size=256, epsilon=1e-08, hidden_layer_sizes=(300,), learning_rate='adaptive', max_iter=500)
# trained my model
model.fit(x_train,y_train)
# This is the part used for unit testing and I am getting a lot of errors
a,b = [],[]
file_name=os.path.basename("data/what.wav")
emotion=emotions[file_name.split("-")[2]]
if emotion not in observed_emotions:
continue
feature=extract_feature(file, mfcc=True, chroma=True, mel=True)
a.append(feature)
b.append(emotion)
This is the error that I am getting. When I try to work around it with other methods, such as using pydub, I get different types of errors. I am a beginner at this and still have a lot to learn, so I hope I can find a way to resolve this.
IndexError Traceback (most recent call last)
in
1 a,b = [],[]
2 file_name=os.path.basename("data/what.wav")
----> 3 emotion=emotions[file_name.split("-")[2]]
4 if emotion not in observed_emotions:
5 continue
IndexError: list index out of range
ANSWER
Answered 2021-Nov-15 at 21:11
Your call to os.path.basename("data/what.wav") returns 'what.wav'
You then split that using "-" as the splitter, which returns ['what.wav'], a list of one element.
But you then try to reference the third element of the list with [2], which throws an exception.
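A defensive way to read the emotion code is to check how many dash-separated fields the file name has before indexing into them. The sketch below reuses the emotion table from the question; the helper name `emotion_from_filename` and the example dash-separated file name are assumptions for illustration.

```python
import os

EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def emotion_from_filename(path, emotions=EMOTIONS):
    """Return the emotion encoded in the third dash-separated field, or None."""
    parts = os.path.basename(path).split("-")
    if len(parts) < 3:
        # Names such as 'what.wav' carry no emotion code at all.
        return None
    return emotions.get(parts[2])

unlabelled = emotion_from_filename("data/what.wav")
labelled = emotion_from_filename("data/Actor_01/03-01-05-01-01-01-01.wav")
```

For a file like what.wav there is no label to recover, so the caller would need to skip the label and only extract features for prediction.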
QUESTION
I'm trying to set the rms level of an AudioSegment in Pydub relative to another file, before overlaying the two. The method I'm trying to use involves setting the relative rms of the first file to be +4 dB more intense than the second. I know rms isn't modifiable, but dBFS is. I'm trying to modify it with apply_gain(), but printing the rms and the dBFS doesn't show any differences before and after calling that method.
At the moment, my code looks something like:
if segmentOne.dBFS > segmentTwo.dBFS:
    gain = segmentOne.dBFS - segmentTwo.dBFS
    segmentTwo.apply_gain(gain)
elif segmentOne.dBFS < segmentTwo.dBFS:
    gain = segmentTwo.dBFS - segmentOne.dBFS
    segmentOne.apply_gain(gain)
segmentOne.apply_gain(6)
segmentOne = segmentOne.overlay(segmentTwo)
I'm not very experienced with audio (at all), so it could be there's something obvious I'm missing. Is there a way of doing what I need using Pydub?
ANSWER
Answered 2021-Nov-05 at 23:42
I understand that apply_gain returns a modified copy of the audio segment.
So you probably must do:
segmentTwo = segmentTwo.apply_gain(gain)
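The gain needed to place one segment a fixed number of dB above another can be computed in a single step from the two dBFS readings. This is a minimal sketch of that arithmetic; the function name `gain_for_offset` and the example levels are hypothetical.

```python
def gain_for_offset(current_dbfs, reference_dbfs, offset_db):
    """Gain in dB that moves current_dbfs to reference_dbfs + offset_db."""
    return (reference_dbfs + offset_db) - current_dbfs

# Example: segment one measures -20 dBFS, segment two -18 dBFS,
# and segment one should end up 4 dB above segment two.
needed_gain = gain_for_offset(-20.0, -18.0, 4.0)
```

Assuming apply_gain returns a new segment as the answer describes, the result would be used as segmentOne = segmentOne.apply_gain(needed_gain), with the return value assigned back.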
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install pydub
Installing pydub is easy, but don't forget to install ffmpeg/avlib (the next section in this doc). Or install the latest dev version from github (or replace @master with a release version like @v0.12.0)….