Questions tagged [audio-processing]
Audio processing involves the study of mathematical and signal processing techniques to understand or alter the nature of audio signals. The kinds of audio signals studied include speech, music, environmental audio, and computer audio. Audio is analyzed in the temporal or spectral domain by applying various filters.
554
questions
58
votes
8
answers
23k
views
Algorithms for determining the key of an audio sample
I am interested in determining the musical key of an audio sample. How would (or could) an algorithm go about trying to approximate the key of a musical audio sample?
Antares Autotune and Melodyne ...
33
votes
4
answers
13k
views
How can I Compare 2 Audio Files Programmatically?
I want to compare 2 audio files programmatically.
For example: I have a sound file in my iPhone app, and then I record another one. I want to check if the existing sound matches the recorded sound or ...
31
votes
1
answer
998
views
Building audio processing Little Endian SDK with NDK
I am trying to use ndk-build to use native code for audio processing from Little Endian in an Android application (I don't have JNI yet).
When I executed ndk-build in jni dir I got ($USER_PATH is ...
26
votes
3
answers
2k
views
deeplearning4j - using an RNN/LSTM for audio signal processing
I'm trying to train an RNN for digital (audio) signal processing using deeplearning4j.
The idea is to have 2 .wav files: one is an audio recording, the second is the same audio recording but processed (...
24
votes
1
answer
9k
views
How to get below 10ms latency using WASAPI shared mode?
According to Microsoft, starting with Windows 10, applications using shared-mode WASAPI can request buffer sizes smaller than 10ms (see https://msdn.microsoft.com/en-us/library/windows/hardware/...
20
votes
2
answers
15k
views
How do I use audio sample data from Java Sound?
This question is usually asked as a part of another question but it turns out that the answer is long. I've decided to answer it here so I can link to it elsewhere.
Although I'm not aware of a way ...
18
votes
3
answers
6k
views
Perceptual similarity between two audio sequences
I would like to get some sort of distance measure between two pieces of audio. For example, I want to compare the sound of an animal to the sound of a human mimicking that animal, and then return a ...
17
votes
4
answers
2k
views
Detecting wind noise [closed]
I want to develop an app for detecting wind from the audio stream.
I need some expert thoughts here, just to give me guidelines or some links. I know this is not an easy task, but I am planning to ...
17
votes
1
answer
687
views
How to set up audio recording and playback on Mac for a VoIP app
I want to record and play back audio on Mac. I have some problems with the settings for Input/Output/ChannelFormat … The code I tried is shown below.
// Setup audio device
- (OSStatus) ...
15
votes
4
answers
12k
views
Algorithm to get the Key and Scale from musical notes? [closed]
From a series of MIDI notes stored in array (with MIDI note number), does an algorithm exist to get the most likely key or scale implied by these notes?
15
votes
4
answers
26k
views
Bpm audio detection Library [closed]
I'm looking for a library that simplifies tempo/bpm audio detection.
Something similar to this http://adionsoft.net/bpm/ , but to use on *NIX machines.
Any language, but preference goes to php, perl, ...
15
votes
2
answers
4k
views
Audio and Signal Processing in Haskell
Do you know of any active attempts at audio synthesis / signal processing in Haskell? Either for live performance or just for offline processing? I am not looking for libraries relying on an external ...
14
votes
9
answers
10k
views
Music Recognition and Signal Processing
I want to build something similar to Tunatic or Midomi (try them out if you're not sure what they do) and I'm wondering what algorithms I'd have to use. The idea I have about the workings of such ...
14
votes
6
answers
19k
views
Sound sample recognition library/code
I don't want sound-to-text software. What I need is the following:
I'll record multiple (say 50+) audio streams (recordings of radio stations)
from those recordings, I'll mark interesting audio clips ...
12
votes
1
answer
55k
views
How to add an external audio track to a video file using VLC or FFMPEG command line
I want to add an audio.mp3 soundtrack to a soundless video.mp4 file using a bash script. What is the correct syntax of the "cvlc" or "ffmpeg" command line?
I've recorded the video with VLC and --no-...
12
votes
5
answers
26k
views
Convert audio to text
I just want to know if there are any built-in or external libraries in Java or C# that allow me to take an audio file, parse it, and extract the text from it.
I need to make an application ...
12
votes
3
answers
23k
views
How to write C++ audio processing applications? [closed]
I'm an Electronics and Telecommunications student, close to graduating. I'm going to work on a project that involves my knowledge of DSP, music and audio in general. I already know all the basic ...
12
votes
1
answer
899
views
Find most dominant audio frequency in sample
I'm trying to create a project that pulls in a live stream audio file from the internet and continuously samples the audio looking for the most dominant frequency for a given time period. The idea is ...
11
votes
2
answers
9k
views
Sound recognition API, SDK (Android) [closed]
I need to make an Android app that can recognize certain sound files created by me, and do an action on recognition. So something similar to Shazam/Soundhound, but with my own sound files.
Is there ...
10
votes
1
answer
14k
views
Adding silent frame to wav file using python
First time posting here, let's see how this goes.
I am trying to write a script in Python that would add a second of silence at the beginning of a wav file, but so far I have been unsuccessful.
...
10
votes
6
answers
18k
views
Library for reading audio files
I want to process audio online/live where I constantly read audio samples from an audio file, process these (e.g. apply some effect), and forward the processed samples to an audio output device like a ...
9
votes
4
answers
12k
views
Trying to change pitch of audio file with scikits.samplerate.resample results in garbage audio from pygame
My problem is related to pitch-shifting audio in Python. I have the current modules installed: numpy, scipy, pygame, and the scikits "samplerate" api.
My goal is to take a stereo file and ...
8
votes
4
answers
9k
views
AVAudioPlayer rate
So I'm trying to play a sound file at a different rate in iOS 5.1.1, and am having absolutely no luck. So far I have tried setting the rate of the AVAudioPlayer:
player = [[AVAudioPlayer alloc] ...
8
votes
5
answers
39k
views
processing an audio wav file with C
I'm working on processing the amplitude of a wav file and scaling it by some decimal factor. I'm trying to wrap my head around how to read and re-write the file in a memory-efficient way while also ...
8
votes
1
answer
5k
views
Using Mutagen to process all accepted file types
What do I need to do in order to process every file type accepted by mutagen, .ogg, .apev2, .wma, flac, mp4, and asf? (I excluded mp3 because it has the most documentation on it)
I'd appreciate it if ...
8
votes
4
answers
7k
views
Find sound effect inside an audio file
I have a load of 3 hour MP3 files, and every ~15 minutes a distinct 1 second sound effect is played, which signals the beginning of a new chapter.
Is it possible to identify each time this sound ...
8
votes
1
answer
6k
views
Open source FSK decoder library? [closed]
I'm looking for a library or tool to decode FSK in wav files, e.g. caller id.
Currently using the tools bundled with vpb-driver for Voicetronix hardware that is available via debian/ubuntu. But this ...
8
votes
1
answer
2k
views
audio comparison with R
I am working on a project where my task deals with speech/audio/voice comparison. This project is used for judging the winner in competitions (mimicry). Practically I need to capture the user's ...
8
votes
1
answer
6k
views
Android Audio effect on wav file and save it
Requirement
On Android, open a .wav file on the SD card, play it, add some effect (like echo, pitch shift, etc.), and save the file with the effect. Simple :(
What I know
I can open and play file using Soundpool or ...
8
votes
0
answers
280
views
How to get & parse the values of iTunes EQ presets
We are trying to implement a music player app with Equalizer presets. We are successful in getting presets from iPod and applying it through audio unit. But, now we need to display sliders and set ...
8
votes
0
answers
777
views
Can I measure distances with sound in an android app?
I've got a number of questions this time, although they all relate to the same problem: I wanted to build a rudimentary sonar in Android, and have no clue as to how possible it is to do such a thing.
...
8
votes
1
answer
2k
views
Get Video and Audio buffer separately while recording video using front camera
I dug a lot on SO and through some nice blog posts, but it seems I have a unique requirement: reading the video and audio buffers separately for further processing while recording is going on.
My use case is ...
7
votes
1
answer
3k
views
How can I obtain the raw audio frames from the microphone in real-time or from a saved audio file in iOS?
I am trying to extract MFCC vectors from the audio signal as input into a recurrent neural network. However, I am having trouble figuring out how to obtain the raw audio frames in Swift using Core ...
7
votes
4
answers
14k
views
Python NumPy - FFT and Inverse FFT?
I've been working with FFT, and I'm currently trying to get a sound waveform from a file with FFT, (modify it eventually), but then output that modified waveform back to a file. I've gotten the FFT of ...
7
votes
2
answers
3k
views
How to determine if an audio track is a Dolby Pro Logic II mixdown
I'm trying to find out if there's a way to determine if an AAC-encoded audio track is encoded with Dolby Pro Logic II data. Is there a way of examining the file such that you can see this information? ...
7
votes
2
answers
5k
views
Transforming Audio Samples From Time Domain to Frequency Domain
As a software engineer, I am facing some difficulties while working on a signal processing problem. I don't have much experience in this area.
What I am trying to do is sample the environmental ...
7
votes
1
answer
3k
views
AVFoundation audio processing using AVPlayer's MTAudioProcessingTap with remote URLs
There is precious little documentation on AVAudioMix and MTAudioProcessingTap, which allow processing to be applied to the audio tracks (PCM access) of media assets in AVFoundation (on iOS). This ...
7
votes
1
answer
2k
views
How to train a machine learning algorithm using MFCC coefficient vectors?
For my final year project I am trying to identify dog/bark/bird sounds in real time (by recording sound clips). I am using MFCC as the audio features. Initially I have extracted altogether 12 MFCC ...
7
votes
2
answers
2k
views
AVAudioRecorder in Swift 3: Get Byte stream instead of saving to file
I am new to iOS programming and I want to port an Android app to iOS using Swift 3. The core functionality of the app is to read the byte stream from the microphone and to process this stream live. So ...
6
votes
1
answer
13k
views
Understanding the shape of spectrograms and n_mels
I am going through these two librosa docs: melspectrogram and stft.
I am working on datasets of audio of variable lengths, but I don't quite get the shapes. For example:
(waveform, sample_rate) = ...
6
votes
1
answer
2k
views
Why are my 8kHz wav file's mel features extracted differently at sr = 16kHz and 44.1kHz?
I'm currently extracting mel features from my baby cry sound dataset and the wav files' sampling rate is 8kHz, 16bit, mono and about 7 sec.
Mel-Spectrogram when sr = 16000
Mel-Spectrogram when sr = ...
6
votes
3
answers
10k
views
Correct way to Convert 16bit PCM Wave data to float
I have a wave file in 16bit PCM form. I've got the raw data in a byte[] and a method for extracting samples, and I need them in float format, i.e. a float[] to do a Fourier Transform. Here's my code, ...
6
votes
1
answer
6k
views
Modify volume gain on audio sample buffer
I want to increase the volume of a buffer containing voice data. The point is I'm using DirectSound and I have one primary and one secondary buffer - all stream mixing is done by hand. In a voice chat all ...
6
votes
1
answer
3k
views
What kind of sound processing algorithm allows you to make visualizations like this?
I'm interested in making an OpenGL visualizer for MP3s as a pet project.
I stumbled upon this YouTube video which demonstrates someone showing off a visualizer being used in conjunction with ...
6
votes
3
answers
8k
views
Scipy io read wavfile error
Whenever I try to read a .wav file, I get the following error.
I have searched everywhere but have made no progress on this.
CODE:
import scipy as sp
import matplotlib.pyplot as plt
sr, y = sp.io.wavfile....
6
votes
1
answer
7k
views
Python find audio frequency and amplitude over time
Here is what I would like to do. I would like to find the audio frequency and amplitude of a .wav file for every, say, 1 ms of that .wav file and save it into a file. I have graphed frequency vs amplitude ...
6
votes
2
answers
2k
views
Best way to verify an mp3 file with python
I have to detect whether a file is a valid mp3 file. So far, I have found two solutions, including:
this solution from Peter Carroll
using try-catch expression:
try:
_ = librosa.get_duration(...
6
votes
1
answer
217
views
Verizon SongID - How is it programmed?
For anyone not familiar with Verizon's SongID program, it is a free application downloadable through Verizon's VCast network. It listens to a song for 10 seconds at any point during the song and then ...
6
votes
6
answers
4k
views
Extracting a specific melody/beat/rhythm from a specific instrument from a mixed wave (or other music format) file
Is it possible to write a program that can extract a melody/beat/rhythm provided by a specific instrument in a wave (or other music format) file made up of multiple instruments?
Which algorithms could ...
6
votes
1
answer
2k
views
Chord Detection Algorithm with the Web Audio API [closed]
First off I'm trying to implement this chord detection algorithm:
http://www.music.mcgill.ca/~jason/mumt621/papers5/fujishima_1999.pdf
I originally implemented the algorithm to use my microphone, but ...