How to use the Audio Processing system call of Menuet 64

SYSTEM CALL 150

From version 0.94, an 'Audio Processing' system call was introduced in Menuet 64. The following functions are available, and may be expanded later:

  1. Complex to complex in-place FFT (split-radix decimation-in-frequency, radices 2-4)
  2. Sample format and wave format converter
  3. FFT convolution kernel
  4. SINC resampling kernel

i) Complex to complex in-place FFT

This is a complex to complex in-place FFT. The split-radix decimation-in-frequency (DIF) algorithm is used for decomposition. The FFT operates on interleaved real and imaginary arrays of double precision floats. Arrays must be aligned on a 16-byte boundary, and the elements of the array must be divided by n before they are passed to the inverse FFT. Currently, only power-of-two lengths are supported, but processing of arbitrary lengths might be added later.

    In:  rbx - 16 - Complex to complex in-place forward FFT (split-radix DIF)
         rcx - n (length n)
         rdx - *RI (interleaved real & imaginary array, double)
    Out: rax -

    In:  rbx - 17 - Complex to complex in-place inverse FFT (split-radix DIF)
         rcx - n (length n)
         rdx - *RI (interleaved real & imaginary array, double)
    Out: rax -

Kept for compatibility, superseded by calls 16 & 17:

    In:  rbx - 11 - Create FFT4 table
         rcx - n (max length 2^n)
    Out: rax - *table or zero

    In:  rbx - 12 - Destroy FFT4 table
         rcx - *table
    Out: rax -

    In:  rbx - 13 - Complex to complex out-place forward FFT (Radix-4 decimation in freq)
         rcx - n (length 2^n)
         rdx - *RI (interleaved real & imaginary array, double)
         r8  - *fft4 table
         r9  - *aux buffer (16-byte aligned length (2^n)*16 ) or zero
    Out: rax -

    In:  rbx - 14 - Complex to complex out-place inverse FFT (Radix-4 decimation in freq)
         rcx - n (length 2^n)
         rdx - *RI (interleaved real & imaginary array, double)
         r8  - *fft4 table
         r9  - *aux buffer (16-byte aligned length (2^n)*16 ) or zero
    Out: rax -

An example is given here (fasm syntax):

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 16              ;16 - forward FFT
    mov   ecx , 4096            ; length = 4096 points
    lea   rdx , [RIarray]       ; *interleaved real and imaginary array
    int   0x60

    ;perform some double precision operations here..

    mov   ecx , 4096            ;division by n
    cvtsi2sd xmm1 , ecx         ; xmm1 (low half) = 4096.0
    unpcklpd xmm1 , xmm1        ; copy it to both halves of xmm1
    lea   rsi , [RIarray]       ; rsi = first real/imaginary pair
  lp:
    movapd xmm0 , [rsi]         ; load one real/imaginary pair
    divpd xmm0 , xmm1           ; divide both values by n
    movapd [rsi], xmm0          ; store the pair back
    add   rsi , 16              ; advance to the next pair
    loop  lp                    ; repeat n times (counts rcx down)

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 17              ;17 - inverse FFT
    mov   ecx , 4096            ; length = 4096 points
    lea   rdx , [RIarray]       ; *interleaved real and imaginary array
    int   0x60
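The "divide by n before the inverse FFT" rule can be checked off-target with a small host-side sketch. It is written in Python and uses a plain recursive radix-2 FFT, not the kernel's split-radix DIF routine; the function names and the sign convention of the forward transform are assumptions made for illustration only.

```python
import cmath

def fft(values, inverse=False):
    """Unnormalised radix-2 FFT of a list of complex numbers.
    The length must be a power of two. Illustrative only: this is a
    textbook recursion, not the split-radix DIF code used by call 16/17."""
    n = len(values)
    if n == 1:
        return list(values)
    sign = 1.0 if inverse else -1.0          # assumed sign convention
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def deinterleave(ri):
    """[re0, im0, re1, im1, ...] -> list of complex numbers."""
    return [complex(ri[i], ri[i + 1]) for i in range(0, len(ri), 2)]

def interleave(values):
    """List of complex numbers -> [re0, im0, re1, im1, ...]."""
    out = []
    for v in values:
        out.extend((v.real, v.imag))
    return out

def round_trip(ri):
    """Forward FFT, division by n, inverse FFT: returns the input again."""
    n = len(ri) // 2
    spectrum = fft(deinterleave(ri))
    scaled = [v / n for v in spectrum]       # the division by n
    return interleave(fft(scaled, inverse=True))

ri_array = [1.0, 0.0, 2.0, -1.0, 0.5, 0.25, -3.0, 4.0]   # 4 complex points
restored = round_trip(ri_array)
```

Without the division step the restored values would come out n times too large, which is why the assembly example scales the array between the two calls.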

ii) Sample format and wave format converter

This is a sample format converter. It is capable of handling conversions from a given format (given type, precision and channels) to the desired format (desired type, precision and channels), eg. from 24-bit 5.1 to 16-bit stereo or vice versa. Channel up or downmixings can be done by the user. Note that a sample format is not the same as a wave format (sample formats are restricted to pcm formats only).

    In:  rbx - 21 - Sample converter init
         rcx - *sciface [out]
         rdx - (max number of samples * max number of input channels) or zero
         r8  - (max number of samples * max number of output channels) or zero
    Out: rax - sc_lasterror

    In:  rbx - 22 - Sample converter deinit
         rcx - sciface
    Out: rax - SC_ERR_OK

    In:  rbx - 23 - Convert samples to internal format (32-bit float)
         rcx - sciface
         rdx - *in
         r8  - *out or zero
         r9  - number of samples or zero
         r10 - in format
    Out: rax - number of bytes read from "in"

    In:  rbx - 24 - Reassign channels
         rcx - sciface
         rdx - *in or zero
         r8  - *out or zero
         r9  - number of samples or zero
         r10 - in format shl 32 + out format
         r11 - *channel reassignment list or zero
    Out: rax - SC_ERR_OK

    In:  rbx - 25 - Convert samples from internal format
         rcx - sciface
         rdx - *in or zero
         r8  - *out
         r9  - number of samples or zero
         r10 - out format
    Out: rax - number of bytes written to "out"

    In:  rbx - 26 - Change sample precision
         rcx - sciface
         rdx - *inout or zero
         r8  - number of samples or zero
         r9  - in format shl 32 + out format
    Out: rax - SC_ERR_OK

    In:  rbx - 27 - Get internal buffers
         rcx - sciface
         rdx - *in buffer [out]
         r8  - *out buffer [out]
    Out: rax - SC_ERR_OK

Required constants for sample converter:

    ; sample format (little endian order)
    ; bits 63-16 : reserved
    ; bits 15-8  : number of channels (1-255)
    ; bits 7-4   : format descriptor (+16=float +128=unsigned)
    ; bits 3-0   : sample type (1=8-bit 2=16-bit 3=24-bit 4=32-bit)

    SC_FORMAT_8B      = (1    )     ;  signed  8-bit
    SC_FORMAT_U8B     = (1+128)     ;unsigned  8-bit
    SC_FORMAT_16B     = (2    )     ;  signed 16-bit
    SC_FORMAT_U16B    = (2+128)     ;unsigned 16-bit
    SC_FORMAT_24B     = (3    )     ;  signed 24-bit
    SC_FORMAT_U24B    = (3+128)     ;unsigned 24-bit
    SC_FORMAT_32B     = (4    )     ;  signed 32-bit
    SC_FORMAT_U32B    = (4+128)     ;unsigned 32-bit
    SC_FORMAT_FLOAT32 = (4+16 )     ;  signed 32-bit float

    SC_FORMAT_M       = (1 shl 8)
    SC_FORMAT_ST      = (2 shl 8)

    define SC_FORMAT_8B_M       (SC_FORMAT_U8B     or SC_FORMAT_M )
    define SC_FORMAT_8B_ST      (SC_FORMAT_U8B     or SC_FORMAT_ST)
    define SC_FORMAT_16B_M      (SC_FORMAT_16B     or SC_FORMAT_M )
    define SC_FORMAT_16B_ST     (SC_FORMAT_16B     or SC_FORMAT_ST)
    define SC_FORMAT_24B_M      (SC_FORMAT_24B     or SC_FORMAT_M )
    define SC_FORMAT_24B_ST     (SC_FORMAT_24B     or SC_FORMAT_ST)
    define SC_FORMAT_32B_M      (SC_FORMAT_32B     or SC_FORMAT_M )
    define SC_FORMAT_32B_ST     (SC_FORMAT_32B     or SC_FORMAT_ST)
    define SC_FORMAT_FLOAT32_M  (SC_FORMAT_FLOAT32 or SC_FORMAT_M )
    define SC_FORMAT_FLOAT32_ST (SC_FORMAT_FLOAT32 or SC_FORMAT_ST)

    SC_ERR_OK                = 0
    SC_ERR_NOT_ENOUGH_MEMORY = -30

A detailed example is given here:

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 21              ;21 - Sample converter init
    lea   rcx , [sciface]       ; pointer to *sciface
    mov   rdx , 4096 * (2)      ; max number of samples * max number of input channels
    mov   r8  , 4096 * (5+1)    ; max number of samples * max number of output channels
    int   0x60
    test  rax , rax
    jnz   error

    ;eg. if you want to convert from stereo to 5.1 in parts, using 4096 samples each time
    ;    you have to set rdx to (4096 * 2) and r8 to (4096 * (5+1))
    ;eg. the size of 4096 samples in 16-bit stereo format is: 2*2*4096 = 16384 bytes
    ;    the size of 4096 samples in 24-bit 5.1 format is:    6*3*4096 = 73728 bytes

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 23              ;23 - Convert samples to internal format (32-bit float)
    mov   rcx , [sciface]       ; sciface
    lea   rdx , [input]         ; pointer to input wave
    xor   r8  , r8              ; output to the "input" internal buffer
    mov   r9  , 4096            ; 4096 samples
    mov   r10 , SC_FORMAT_16B_ST ;16-bit stereo format
    int   0x60

    ;the above piece of code will convert 4096 samples (16-bit stereo) to an
    ;internal format. internal format is 32-bit float, so only the sample type
    ;is changed. note that sample precision is not changed, so after conversion,
    ;samples will be still in the range -32768.0 .. 32767.0. you have to
    ;change sample precision manually if it is necessary (see later)
    ;
    ;the returned value in rax is the number of bytes read from "input"

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 24              ;24 - Reassign channels
    mov   rcx , [sciface]       ; sciface
    xor   rdx , rdx             ; input from the "input" internal buffer
    xor   r8  , r8              ; output to the "output" internal buffer
    mov   r9  , 4096            ; 4096 samples
    mov   r10 , (SC_FORMAT_ST shl 32) + ((5+1) shl 8)
    lea   r11 , [aslist]        ; pointer to the assignment list
    int   0x60

    ;the above piece of code will do the channel reassignment for 4096 samples
    ;from stereo to 5.1. from the input and output formats only the 'number of
    ;channels' is considered. input and output samples are 32-bit float. the
    ;last value is a pointer to the assignment list consisting of dword pairs
    ;if you specify zero as *aslist, default reassignments are done
    ;ie. this is valid only if the 'number of channels' in the input/output
    ;    format are equal
    ;you can reassign channels as many times as you want
    ;the returned value in rax is zero

    aslist dd  0, 0             ;left  -> FL
           dd  1, 1             ;right -> FR
           dd -1, 2             ;no assignment for FC  (zeroed)
           dd -2, 3             ;no assignment for LFE (untouched!)
           dd  0, 4             ;left  -> BL
           dd  1, 5             ;right -> BR

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 25              ;25 - Convert samples from internal format
    mov   rcx , [sciface]       ; sciface
    xor   rdx , rdx             ; input from the "output" internal buffer
    lea   r8  , [output]        ; pointer to the output wave
    mov   r9  , 4096            ; 4096 samples
    mov   r10 , SC_FORMAT_24B + ((5+1) shl 8) ;24-bit 5.1 format
    int   0x60

    ;the above piece of code will convert 4096 reassigned samples to the desired
    ;output format. input samples are 32-bit float. again, note that sample
    ;precision is not changed, so after conversion, samples will be still in the
    ;range -32768.0 .. 32767.0 if the input format was 16-bit
    ;you have to change sample precision manually if it is necessary (see below)
    ;note: this function can be used only after channel reassignment
    ;      the returned value in rax is the number of bytes written to "out"

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 26              ;26 - Change sample precision
    mov   rcx , [sciface]       ; sciface
    xor   rdx , rdx             ; do modifications on the "output" internal buffer
    mov   r8  , 4096            ; 4096 samples
    mov   r9  , ((SC_FORMAT_16B + (5+1) shl 8) shl 32) + SC_FORMAT_24B
    int   0x60

    ;this will end up in a 16-bit -> 24-bit precision conversion (no dithering
    ;is applied). the number of channels in the output format is ignored
    ;input and output samples are 32-bit float
    ;note: this function is not applied by other sc functions
    ;      you must call this function after reassignment and before converting
    ;      from internal format. the returned value in rax is zero

    ;if you are audacious you can get access to the internal conversion buffers:

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 27              ;27 - Get internal buffers
    mov   rcx , [sciface]       ; sciface
    lea   rdx , [inbuf]         ; "input" internal buffer or zero
    lea   r8  , [outbuf]        ; "output" internal buffer or zero
    int   0x60

    ;after the function call inbuf/outbuf will contain the addresses of the input
    ;and output internal conversion buffers respectively. the format of these
    ;buffers is 32-bit float. inbuf can be used only after converting to internal
    ;format (data can be safely modified). outbuf can be used only after channel
    ;reassignment (reassigned data can be safely modified). these buffers are
    ;aligned on a 16-byte boundary, they are useful to do user-specific
    ;up/downmixings or dsp operations. the returned value in rax is zero

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 22              ;22 - Sample converter deinit
    mov   rcx , [sciface]       ; sciface
    int   0x60

    sciface dq 0                ;pointer to sciface
    inbuf   dq 0                ;pointer to "input" internal buffer
    outbuf  dq 0                ;pointer to "output" internal buffer

    ;you probably noticed that these functions have many parameters that can be
    ;set to zero. you may use only parts of the conversions by using these parameters
    ;eg. you want to change the sample type from 24-bit integer to 32-bit float
    ;    in this case you don't have to give memory allocation information to
    ;    the "sample converter init" function, and you have to specify an
    ;    output address to "convert samples to internal format"

wave format converter

This is a wave format type reader. Currently it has minor support: only wav files with "PCM" and "IEEE_FLOAT" tags are supported. Raw format is supported, but it is the application's responsibility to handle raw formats. The aim of "wfc" is to read samples in pcm format, in a way that is interpretable for "sc", so "wfc" can be used in cooperation with "sc" to convert any kind of wave format to an internal 32-bit format. The writing feature (ie. converting from internal format to any format) is not supported at the moment.

    In:  rbx - 71 - Wave format converter init
         rcx - stream type
         rdx - stream param 1
         r8  - stream param 2
         r9  - *wfciface [out]
         r10 - *wfcinfo [out]
    Out: rax - wfc_lasterror

    In:  rbx - 72 - Wave format converter deinit
         rcx - wfciface
    Out: rax - WFC_ERR_OK

    In:  rbx - 73 - Wave format converter read
         rcx - wfciface
         rdx - *out
         r8  - number of samples to read
    Out: rax - number of samples read

Required constants for wave converter:

    WFC_INFO_FORMAT      = 0    ;file format eg. "WFC_FORMAT_RAW" or "WFC_FORMAT_WAV"
    WFC_INFO_FORMAT_TAG  = 4    ;format tag eg. "WFC_FORMAT_TAG_PCM"
    WFC_INFO_CHANNELS    = 8    ;number of channels
    WFC_INFO_RATE        = 12   ;sampling rate in hz
    WFC_INFO_BITSPERSAMP = 16   ;bits per sample
    WFC_INFO_SAMPSIZE    = 20   ;sample size (ie. CHANNELS * BITSPERSAMP / 8)
    WFC_INFO_SCFORMAT    = 24   ;input "sc" format, can be used in cooperation with "sc"
    WFC_INFO_SIZE        = 28   ;stream size in bytes
    WFC_INFOSIZE         = 32

    WFC_FORMAT_RAW = 0
    WFC_FORMAT_WAV = 1

    WFC_FORMAT_TAG_PCM        = 1
    WFC_FORMAT_TAG_IEEE_FLOAT = 3

    WFC_ERR_OK                = 0
    WFC_ERR_NOT_ENOUGH_MEMORY = -40

An example is given here:

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 71              ;71 - Wave format converter init
    mov   rcx , STREAM_TYPE_MEM
    mov   rdx , memory          ; pointer to memory
    mov   r8  , memorysize      ; memory size
    lea   r9  , [wfciface]      ; pointer to *wfciface
    lea   r10 , [wfcinfo]       ; pointer to *wfcinfo structure
    int   0x60
    test  rax , rax
    jnz   error

    ;note: the first three values are stream-specific parameters, currently
    ;      only memory streaming is available

    mov   rax , [wfcinfo]       ;check for WAV file
    mov   eax , [rax + WFC_INFO_FORMAT]
    cmp   eax , WFC_FORMAT_WAV
    jnz   error

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 73              ;73 - Wave format converter read
    mov   rcx , [wfciface]      ; wfciface
    lea   rdx , [output]        ; pointer to output buffer
    mov   r8  , 4096            ; number of samples to read
    int   0x60
    cmp   rax , r8              ;if rax != r8 we are finished
    jnz   end

    ;the returned value in rax is the number of samples read

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 72              ;72 - Wave format converter deinit
    mov   rcx , [wfciface]      ; wfciface
    int   0x60

    wfciface dq 0               ;pointer to wfciface
    wfcinfo  dq 0               ;pointer to wfcinfo
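The bit layout of an "sc" sample format (bits 3-0 sample type, bits 15-8 number of channels) and the "in format shl 32 + out format" packing can be tried out with a small host-side sketch. It is written in Python purely for illustration; the helper names are hypothetical, and only the constant values are taken from the list above.

```python
# Constant values copied from the "sc" constants; helper names are illustrative.
SC_FORMAT_16B = 2
SC_FORMAT_24B = 3
SC_FORMAT_FLOAT32 = 4 + 16
SC_FORMAT_ST = 2 << 8

def sc_format(sample_type, channels):
    """Combine a sample type with a channel count (bits 15-8)."""
    return sample_type | (channels << 8)

def channels_of(fmt):
    """Bits 15-8: number of channels."""
    return (fmt >> 8) & 255

def bytes_per_sample(fmt):
    """Bits 3-0: sample type, which also equals the size in bytes."""
    return fmt & 15

def buffer_bytes(fmt, samples):
    """Bytes needed to hold `samples` samples of every channel."""
    return samples * bytes_per_sample(fmt) * channels_of(fmt)

def pack_in_out(in_fmt, out_fmt):
    """Same value as the fasm expression (in format shl 32) + out format."""
    return (in_fmt << 32) + out_fmt

stereo_16 = SC_FORMAT_16B | SC_FORMAT_ST
surround_24 = sc_format(SC_FORMAT_24B, 5 + 1)

print(buffer_bytes(stereo_16, 4096))     # 16-bit stereo
print(buffer_bytes(surround_24, 4096))   # 24-bit 5.1
print(hex(pack_in_out(SC_FORMAT_ST, (5 + 1) << 8)))  # r10 for call 24
```

The same helpers give the byte counts that calls 23 and 25 should report in rax for a full block, which is a convenient sanity check while debugging a conversion chain.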

iii) FFT convolution kernel

This is a high quality and fast FFT convolution kernel. It is equivalent to a time-domain convolution kernel (also known as a FIR filter). "FFTCV" uses the split-radix DIF FFT to perform convolution. "FFTCV" is also able to switch dynamically between different FFT lengths during processing. Generation of both filter and phase-shifter coefficients is provided. Note that "FFTCV" has a leading and a trailing delay (see later). The multi-channel interface allows convolutions of mono and multi-channel formats up to 48 channels. The only restriction is that the number of input and output channels must be equal.

    In:  rbx - 31 - FFT convolution init
         rcx - *fftcviface
         rdx - max length
    Out: rax - fftcv_lasterror

    In:  rbx - 32 - FFT convolution deinit
         rcx - fftcviface
    Out: rax - FFTCV_ERR_OK

    In:  rbx - 33 - FFT convolution calculate coefficients
         rcx - fftcviface
         rdx - n (fft length 2^n) shl 32 + numbands
         r8  - *bandindices
         r9  - *bandgains
         r10 - *phasetab
    Out: rax - FFTCV_ERR_OK

    In:  rbx - 35 - FFT convolution use coefficients
         rcx - fftcviface
         rdx - *impulse
         r8  - length shl 32 + in format
    Out: rax - fftcv_lasterror

    In:  rbx - 39 - FFT convolution use complex coefficients
         rcx - fftcviface
         rdx - *RI (interleaved real & imaginary array, double)
         r8  - n (fft length 2^n)
    Out: rax - fftcv_lasterror

    In:  rbx - 41 - FFT convolution init multi-channel
         rcx - *fmchiface [out]
         rdx - max length
         r8  - in sc format shl 32 + out sc format
    Out: rax - fftcv_lasterror

    In:  rbx - 42 - FFT convolution deinit multi-channel
         rcx - fmchiface
    Out: rax - FFTCV_ERR_OK

    In:  rbx - 43 - FFT convolution process multi-channel
         rcx - fmchiface
         rdx - *in
         r8  - *out
         r9  - number of samples or -1/-2 for trailing delay / delay length
    Out: rax - number of samples processed

    In:  rbx - 44 - FFT convolution set coefficients multi-channel
         rcx - fmchiface
         rdx - *list of fftcvifaces
    Out: rax - fftcv_lasterror

    In:  rbx - 45 - FFT convolution flush buffers multi-channel
         rcx - fmchiface
    Out: rax - FFTCV_ERR_OK

    In:  rbx - 46 - FFT convolution set state multi-channel
         rcx - fmchiface
         rdx - statemask shl 32 + state
    Out: rax - FFTCV_ERR_OK

    In:  rbx - 47 - FFT convolution stream get info
         rcx - stream type
         rdx - stream param 1
         r8  - stream param 2
         r9  - wfcinfo [out]
    Out: rax - FFTCV_ERR_OK or a valid errorcode

    In:  rbx - 48 - FFT convolution stream process
         rcx - max length shl 32 + stream type
         rdx - stream param 1
         r8  - stream param 2
         r9  - in sc format shl 32 + out sc format
         r10 - *list of fftcvifaces
         r11 - *output [out]
    Out: rax - FFTCV_ERR_OK or a valid errorcode

Required constants for FFTCV:

    FFTCV_EQ_64BANDS   = -1
    FFTCV_EQ_64BANDSV1 = -3
    FFTCV_EQ_32BANDS   = -2
    FFTCV_EQ_32BANDSV1 = -4

    FFTCV_LENGTH_64BANDS   = 16384+1
    FFTCV_LENGTH_64BANDSV1 = 4096+1
    FFTCV_LENGTH_32BANDS   = 8192+1
    FFTCV_LENGTH_32BANDSV1 = 2048+1

    FFTCV_ST_LEADING_DELAY = 1
    FFTCV_ST_LD_APPLIED    = 2

    FFTCV_TRAILING_DELAY        = -1
    FFTCV_TRAILING_DELAY_LENGTH = -2

    FFTCV_ERR_OK                = 0
    FFTCV_ERR_NOT_ENOUGH_MEMORY = -11
    FFTCV_ERR_INVALID_FORMAT    = -13
    FFTCV_ERR_INVALID_OPERATION = -14

A detailed example is given here:

    ;we create a multi-channel interface:

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 41              ;41 - FFT convolution init multi-channel
    lea   rcx , [fmchiface]     ; pointer to *fmchiface
    mov   rdx , FFTCV_LENGTH_32BANDS
    mov   r8  , SC_FORMAT_16B_ST shl 32 + SC_FORMAT_24B_ST
    int   0x60
    test  rax , rax
    jnz   error

    ;"rdx" is set to the max length of samples we want to use for convolution
    ;you can use any length but samples will always be expanded with zeros to
    ;obtain the length of (2^n + 1)
    ;note: if you want to use the 32- or 64-band inbuilt equalizer, pass
    ;      FFTCV_LENGTH_32BANDS   or FFTCV_LENGTH_64BANDS
    ;      FFTCV_LENGTH_32BANDSV1 or FFTCV_LENGTH_64BANDSV1 in "rdx"
    ;
    ;"r8" holds the input/output formats respectively, taken from "sc"

    ;the following pieces of code will calculate the equalization coefficients
    ;for an "empty" interface, and use it for the multi-channel interface

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 31              ;31 - FFT convolution init
    lea   rcx , [fftcviface]    ; pointer to *fftcviface
    mov   rdx , FFTCV_LENGTH_32BANDS
    int   0x60
    test  rax , rax
    jnz   error

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 33              ;33 - FFT convolution calculate coefficients
    mov   rcx , [fftcviface]    ; fftcviface
    mov   rdx , FFTCV_EQ_32BANDS
    xor   r8  , r8              ; pointer to bandindices
    lea   r9  , [bandgains]     ; pointer to gaintable
    lea   r10 , [phasetab]      ; pointer to phasetable
    int   0x60

    ;"rdx" is set to (FFTLENGTHN shl 32) + NUMBANDS where "FFTLENGTHN"
    ;is the length of the FFT n, eg. with FFTLENGTHN = 10 we will use
    ;2^(10+1) = 2048 point FFTs
    ;"NUMBANDS" is the number of eq bands, must be used with "r8" ie. bandindices
    ;note: if you want to use the inbuilt 32- or 64-band equalizer, pass
    ;      FFTCV_EQ_32BANDS   or FFTCV_EQ_64BANDS
    ;      FFTCV_EQ_32BANDSV1 or FFTCV_EQ_64BANDSV1 in "rdx"
    ;      in this case "bandindices" will be ignored
    ;
    ;"bandindices" is a table, its length is "NUMBANDS" (see appendix A)
    ;"bandgains" is the gain table consisting of floats, its length is "NUMBANDS"
    ;"phasetab" is the phase table consisting of floats, its length is "NUMBANDS"
    ;
    ;the returned value in rax is zero

    ;now, set up the list for the multi-channel interface (max 48 channels)

    lea   rdi , [fftcvifaces_list]
    mov   rax , [fftcviface]
    mov   rcx , 48
    rep   stosq                 ;use the same fftcviface for all 48 entries

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 44              ;44 - FFT convolution set coefficients multi-channel
    mov   rcx , [fmchiface]     ; fmchiface
    lea   rdx , [fftcvifaces_list]
    int   0x60

    ;the returned value in rax is zero on success

    ;the coefficients have been calculated, we destroy the "empty" interface

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 32              ;32 - FFT convolution deinit
    mov   rcx , [fftcviface]
    int   0x60

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 43              ;43 - FFT convolution process multi-channel
    mov   rcx , [fmchiface]     ; fmchiface
    lea   rdx , [input]         ; pointer to the input wave
    lea   r8  , [output]        ; pointer to the output wave
    mov   r9  , 4096            ; number of samples (can be any value, even zero)
    int   0x60

    mov   ecx , SC_FORMAT_24B_ST ;update output position
    and   ecx , 15              ;bits 3-0 is the sample type or "size"
    imul  rax , rcx             ; samples * bytes per sample
    mov   ecx , SC_FORMAT_24B_ST ;bits 15-8 is the number of channels
    shr   ecx , 8               ;
    and   ecx , 255             ;see "sc"
    imul  rax , rcx             ; * number of channels
    add   r8  , rax             ; r8 = next output position

    ;the above piece of code will process 4096 samples, output position is
    ;updated. the returned value in rax is the number of samples processed
    ;note: "FFTCV" will return samples in blocks, therefore many calls to
    ;      "process multi-channel" may return zero. the maximum length of the
    ;      returned block is the length of the FFT, eg. if you passed 10 as
    ;      FFTLENGTHN to "calculate coefficients" rax is maximum 2^11 = 2048

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 43              ;43 - FFT convolution process multi-channel
    mov   rcx , [fmchiface]     ; fmchiface
    xor   rdx , rdx             ; no input
    mov   r9  , FFTCV_TRAILING_DELAY ;get trailing delay after processing
    int   0x60

    ;note: "FFTCV" has a leading and trailing delay, both have the length 2^(n-1)
    ;eg. if you passed 10 as FFTLENGTHN to "calculate coefficients" delay length
    ;is 2^9 = 512. the filter delay is required because of "clicks" but after
    ;processing all of the samples you have to call "process multi-channel"
    ;with the "number of samples" parameter being FFTCV_TRAILING_DELAY
    ;
    ;it is possible to remove these delays by calling "set state multi-channel"
    ;this function must be called before processing, and note that even with the
    ;delays removed you still have to call "process multi-channel" with
    ;FFTCV_TRAILING_DELAY

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 46              ;46 - FFT convolution set state multi-channel
    mov   rcx , [fmchiface]     ; fmchiface
    mov   rdx , FFTCV_ST_LEADING_DELAY shl 32 + FFTCV_ST_LEADING_DELAY
    int   0x60

    ;if you want to seek in a waveform, just call

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 45              ;45 - FFT convolution flush buffers multi-channel
    mov   rcx , [fmchiface]
    int   0x60

    ;to destroy the multi-channel interface, just call

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 42              ;42 - FFT convolution deinit multi-channel
    mov   rcx , [fmchiface]
    int   0x60

    fftcviface       dq 0
    fmchiface        dq 0
    fftcvifaces_list times 48 dq 0
    bandgains        times 32 dd 1.0
    phasetab         times 32 dd 0.0

An easier way:

The stream convolver interface is provided for the easiest usage and is especially attractive for permanent and non-realtime processing. With this interface it's easy to equalize even a WAV file.

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 47              ;47 - FFT convolution stream get info
    mov   rcx , STREAM_TYPE_MEM
    mov   rdx , memory
    mov   r8  , memorysize
    lea   r9  , [wfcinfos]      ;wfcinfos structure, see "wave format converter"
    int   0x60
    test  rax , rax
    jnz   error

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 48              ;48 - FFT convolution stream process
    mov   rcx , FFTCV_LENGTH_32BANDS shl 32 + STREAM_TYPE_MEM
    mov   rdx , memory
    mov   r8  , memorysize
    mov   eax , [wfcinfos + WFC_INFO_SCFORMAT] ;the input format must be obtained
    mov   r9  , rax             ;and passed in the high dword
    shl   r9  , 32              ;
    and   eax , (255 shl 8)     ;the output format is constructed by
    or    eax , SC_FORMAT_FLOAT32 ;using the channel info of the input format
    or    r9  , rax             ;and passed in the low dword
    lea   r10 , [fftcvifaces_list]
    lea   r11 , [output]
    int   0x60
    test  rax , rax
    jnz   error

    ;note: the coefficients must be calculated in the same way as described in the
    ;      previous subsection, the list also has to be set up in the same way
    ;"r9" holds the input and output formats
    ;"r9" is built up in a way that the input format is obtained and the number of
    ;     channels in the input and output formats must be equal
    ;"r10" is a pointer to a list of fftcv interfaces (see previous subsection)
    ;"r11" is a pointer to *output

    lea   rcx , [buffer]        ;you can calculate the number of bytes
    sub   rcx , [output]        ;processed after "stream process"
    neg   rcx

    output           dq buffer
    fftcvifaces_list times 48 dq 0
    wfcinfos         times WFC_INFOSIZE db 0 ;32 bytes
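The size relations quoted in the comments of this section (coefficient length 2^n + 1, FFT length 2^(n+1), delay length 2^(n-1), and the "FFTLENGTHN shl 32 + NUMBANDS" packing) can be summarised in a short host-side sketch. It is written in Python for illustration only; the function names are hypothetical, and the formulas are taken directly from the comments above rather than from the kernel source.

```python
def fftcv_sizes(fftlengthn):
    """Sizes implied by FFTLENGTHN, as stated in the section's comments."""
    return {
        "coefficients": 2 ** fftlengthn + 1,   # zero-expanded filter length
        "fft_points": 2 ** (fftlengthn + 1),   # also the largest returned block
        "delay": 2 ** (fftlengthn - 1),        # leading and trailing delay
    }

def pack_calc_rdx(fftlengthn, numbands):
    """Same value as the fasm expression (FFTLENGTHN shl 32) + NUMBANDS."""
    return (fftlengthn << 32) + numbands

def advance_output(offset, samples, bytes_per_sample, channels):
    """Byte offset of the next output block after `samples` were returned."""
    return offset + samples * bytes_per_sample * channels

sizes = fftcv_sizes(10)
print(sizes)                               # sizes for FFTLENGTHN = 10
print(hex(pack_calc_rdx(10, 2)))           # rdx for a 2-band equalizer
print(advance_output(0, 2048, 3, 2))       # one full block of 24-bit stereo
```

Because "process multi-channel" may return zero samples for several calls and then a whole block, sizing the output buffer for at least `fft_points` samples per call looks like the safe choice.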

iv) SINC resampling kernel

This is a high quality sample rate converter. Quality and speed can be changed by choosing the appropriate filter table at initialization time. The resampler is able to handle irrational sampling rates. Time-varying conversion (eg. LFO) is not supported at the moment. Note that the resampler has a trailing delay; in contrast with the FFT convolution kernel, the leading delay is removed. The multi-channel interface allows resampling of mono and multi-channel formats up to 48 channels. The only restriction is that the number of input and output channels must be equal.

    In:  rbx - 51 - Create SINC filter table
         rcx - sinc table name
         rdx - *sinc table [out]
         r8  - *table size or zero [out]
    Out: rax - sinc_lasterror

    In:  rbx - 52 - Destroy SINC filter table
         rcx - sinc table
    Out: rax - SINC_ERR_OK

    In:  rbx - 61 - SINC init multi-channel
         rcx - *smchiface [out]
         rdx - sinc table
         r8  - in sc format shl 32 + out sc format
         xmm0 - max downsampling ratio (double)
         xmm1 - max upsampling ratio (double)
    Out: rax - sinc_lasterror

    In:  rbx - 62 - SINC deinit multi-channel
         rcx - smchiface
    Out: rax - SINC_ERR_OK

    In:  rbx - 63 - SINC process multi-channel
         rcx - smchiface
         rdx - *in
         r8  - *out
         r9  - number of samples or -1/-2 for trailing delay / delay length
         xmm0 - input rate (double)
         xmm1 - output rate (double)
    Out: rax - number of samples processed

    In:  rbx - 64 - SINC flush buffers multi-channel
         rcx - smchiface
    Out: rax - SINC_ERR_OK

    In:  rbx - 65 - Sinc resampling stream get info
         rcx - stream type
         rdx - stream param 1
         r8  - stream param 2
         r9  - wfcinfo [out]
    Out: rax - SINC_ERR_OK or a valid errorcode

    In:  rbx - 66 - Sinc resampling stream process
         rcx - sinctable name shl 32 + stream type
         rdx - stream param 1
         r8  - stream param 2
         r9  - in sc format shl 32 + out sc format
         r10 - input rate shl 32 + output rate
         r11 - *output [out]
    Out: rax - SINC_ERR_OK or a valid errorcode

Required constants for Sinc Resampling:

    SINC_TABLE_VERY_HIGH_QUALITY = 0   ;should be used for permanent resampling
    SINC_TABLE_HIGH_QUALITY      = 1
    SINC_TABLE_MEDIUM_QUALITY    = 2   ;for average real-time resampling
    SINC_TABLE_LOW_QUALITY       = 3
    SINC_TABLE_VERY_LOW_QUALITY  = 4

    SINC_ST_LEADING_DELAY = 1
    SINC_ST_FETCH         = 2
    SINC_ST_DECIMATION    = 4
    SINC_ST_FIRST_START   = 8

    SINC_TRAILING_DELAY        = -1
    SINC_TRAILING_DELAY_LENGTH = -2

    SINC_ERR_OK                = 0
    SINC_ERR_NOT_ENOUGH_MEMORY = -1
    SINC_ERR_INVALID_FORMAT    = -3

A detailed example is given here:

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 51              ;51 - Create SINC filter table
    mov   ecx , SINC_TABLE_HIGH_QUALITY
    lea   rdx , [sinctable]     ; pointer to *sinctable
    xor   r8  , r8              ; pointer to *sinctablesize or zero
    int   0x60
    test  rax , rax
    jnz   error

    ;the above piece of code creates the specified table for processing. the table
    ;can be saved if desired, but this is not recommended because of version changes
    ;the returned value in rax is zero or a valid errorcode

    ;we create a multi-channel interface:

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 61              ;61 - SINC init multi-channel
    lea   rcx , [smchiface]     ; pointer to *smchiface
    mov   rdx , [sinctable]     ; sinctable
    mov   r8  , SC_FORMAT_16B_ST shl 32 + SC_FORMAT_24B_ST
    movsd xmm0 , [maxdownsampratio] ;maximum downsampling ratio
    movsd xmm1 , [maxupsampratio]   ;maximum upsampling ratio
    int   0x60
    test  rax , rax
    jnz   error

    ;the maximum downsampling ratio (outrate/inrate) is used for buffering,
    ;and has importance only if you want to perform downsampling
    ;the maximum upsampling ratio (outrate/inrate) is used for upsampling only
    ;the returned value in rax is zero or a valid errorcode

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 63              ;63 - SINC process multi-channel
    mov   rcx , [smchiface]     ; smchiface
    lea   rdx , [input]         ; pointer to the input wave
    lea   r8  , [output]        ; pointer to the output wave
    mov   r9  , 4096            ; 4096 samples
    movsd xmm0 , [inrate]       ; input rate
    movsd xmm1 , [outrate]      ; output rate
    int   0x60

    mov   ecx , SC_FORMAT_24B_ST ;update output position
    and   ecx , 15              ;bits 3-0 is the sample type or "size"
    imul  rax , rcx             ; samples * bytes per sample
    mov   ecx , SC_FORMAT_24B_ST ;bits 15-8 is the number of channels
    shr   ecx , 8               ;
    and   ecx , 255             ;(see "scextern.inc")
    imul  rax , rcx             ; * number of channels
    add   r8  , rax             ; r8 = next output position

    ;the above piece of code will resample 4096 samples from "44.1khz 16-bit stereo"
    ;to "48khz 24-bit stereo", output position is updated. the returned value in rax
    ;is the number of samples processed
    ;note: the returned value is maximum one value greater than
    ;      "number of samples" * (outrate/inrate)

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 63              ;63 - SINC process multi-channel
    mov   rcx , [smchiface]     ; smchiface
    xor   rdx , rdx             ; no input
    mov   r9  , SINC_TRAILING_DELAY ;get trailing delay
    movsd xmm0 , [inrate]       ; input rate
    movsd xmm1 , [outrate]      ; output rate
    int   0x60

    ;note: the resampler has a delay of an arbitrary size. this comes out in a
    ;      "leading" and "trailing" delay, but leading is removed by default.
    ;      after processing all of the samples you have to call "process
    ;      multi-channel" with the "number of samples" parameter being
    ;      SINC_TRAILING_DELAY
    ;      you can obtain the "trailing" delay length by calling "process
    ;      multi-channel" with SINC_TRAILING_DELAY_LENGTH, use this for buffer
    ;      allocation

    ;if you want to seek in a waveform, just call

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 64              ;64 - SINC flush buffers multi-channel
    mov   rcx , [smchiface]     ; smchiface
    int   0x60

    ;to destroy the multi-channel interface, just call

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 62              ;62 - SINC deinit multi-channel
    mov   rcx , [smchiface]     ; smchiface
    int   0x60

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 52              ;52 - Destroy SINC filter table
    mov   rcx , [sinctable]
    int   0x60

    sinctable        dq 0
    smchiface        dq 0
    maxdownsampratio dq 1.0
    maxupsampratio   dq 1.08843537414965986 ;48000.0/44100.0
    inrate           dq 44100.0
    outrate          dq 48000.0

An easier way:

The stream resampling interface is provided for the easiest usage and is especially attractive for permanent and non-realtime processing. With this interface it's easy to resample even a WAV file.

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 65              ;65 - Sinc resampling stream get info
    mov   rcx , STREAM_TYPE_MEM
    mov   rdx , memory
    mov   r8  , memorysize
    lea   r9  , [wfcinfos]      ;wfcinfos structure, see "wave format converter"
    int   0x60
    test  rax , rax
    jnz   error

    mov   eax , 150             ;audio processing syscall
    mov   ebx , 66              ;66 - Sinc resampling stream process
    mov   rcx , SINC_TABLE_HIGH_QUALITY shl 32 + STREAM_TYPE_MEM
    mov   rdx , memory
    mov   r8  , memorysize
    mov   eax , [wfcinfos + WFC_INFO_SCFORMAT] ;the input format must be obtained
    mov   r9  , rax             ;and passed in the high dword
    shl   r9  , 32              ;
    and   eax , (255 shl 8)     ;the output format is constructed by
    or    eax , SC_FORMAT_FLOAT32 ;using the channel info of the input format
    or    r9  , rax             ;and passed in the low dword
    mov   r10d, [wfcinfos + WFC_INFO_RATE] ;r10 - sample rate
    shl   r10 , 32              ;high dword - in rate
    or    r10 , 48000           ; low dword - out rate
    lea   r11 , [output]
    int   0x60
    test  rax , rax
    jnz   error

    ;"r9" holds the input and output formats
    ;"r9" is built up in a way that the input format is obtained and the number of
    ;     channels in the input and output formats must be equal
    ;"r10" is built up in a way that the input rate is obtained
    ;"r11" is a pointer to *output

    lea   rcx , [buffer]        ;you can calculate the number of bytes
    sub   rcx , [output]        ;processed after "stream process"
    neg   rcx

    output   dq buffer
    wfcinfos times WFC_INFOSIZE db 0 ;32 bytes
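The ratio and rate arithmetic used by the resampler examples can be reproduced with a short host-side sketch, written in Python for illustration. The function names are hypothetical, and the output bound is only one reading of the note "maximum one value greater than number of samples * (outrate/inrate)"; it is not taken from the kernel source.

```python
def resampling_ratio(inrate, outrate):
    """Ratio passed in xmm0/xmm1 of call 61 (outrate/inrate)."""
    return outrate / inrate

def max_output_samples(samples, inrate, outrate):
    """Assumed upper bound for rax of call 63: the whole-sample part of
    samples * (outrate/inrate), plus one. Integer arithmetic avoids rounding."""
    return (samples * outrate) // inrate + 1

def pack_rates(inrate, outrate):
    """Same value as r10 of call 66: input rate shl 32 + output rate."""
    return (inrate << 32) | outrate

ratio = resampling_ratio(44100, 48000)
bound = max_output_samples(4096, 44100, 48000)
r10 = pack_rates(44100, 48000)

print(ratio)      # value stored in maxupsampratio
print(bound)      # output samples to reserve per 4096-sample call
print(hex(r10))   # packed rates for "stream process"
```

Sizing the output buffer from `bound` times the output sample size and channel count, plus the length reported for SINC_TRAILING_DELAY_LENGTH, should leave room for the final flush call.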

Appendix A

ABOUT BANDINDICES:
------------------

the equalizer is created in a way that the frequency response is designed by hermite interpolation, then inverse-FFT'd into the time domain and finally windowed by a windowing function. now the length of the coefficients is 2^n + 1. at this stage the coefficients are zero expanded to the length of 2^(n+1) and forward FFT'd again.

what is FFT? it's a function which converts from the time domain into the frequency domain. for example you have a 48-khz (in reality this is 24khz) sampled waveform. you convert a "chunk", let's say 2048 samples from that waveform, with FFT to the frequency domain (now you have got a spectrum of that chunk). you choose which freqs you want to hear, keep those freqs (clear the others), then inverse FFT that chunk back into the time domain, and the cleared freqs will no longer exist.

so "bandindices" is a pointer to an integer table whose entries are indices into the frequency domain. eg. if you have two bands only, let's say that they are 10 and 20, their gains are 0.5 and 1.0 respectively, and the sampling freq is 48khz (24khz in reality), you have got the following thing:

let's say that you pass 10 as FFTLENGTHN to "FFTCV calculate coefficients", so n is 10; the length of the FFT chunk is 2^(10+1) = 2048

NOTE: the frequency-domain FFT convolver is the same as the time-domain convolver (FIR filter), however the FFT convolver requires FFTs of the length 2^(n+1). by the way the complex FFT uses negative frequencies too, therefore you can give values to "bandindices" in the range 1..512

    10/512*24000 = 468.75hz   we will adjust volume to 50% for this band
    20/512*24000 = 937.5 hz   we will adjust volume to 100% for this band

volume for bands between these indices is interpolated with a hermite curve

above example (2-bands eq):

    FFTLENGTHN equ 10
    NUMBANDS   equ 2

    bandindices dd 10 , 20
    bandgains   dd 0.5, 1.0
    phasetab    dd 0.0, 0.0
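The index-to-frequency arithmetic of this appendix can be sketched on a host machine as follows. The code is Python and purely illustrative; the function names are hypothetical, and the number of usable indices (2^(n-1), ie. 512 for n = 10) is inferred from the 1..512 range quoted above rather than taken from the kernel source.

```python
def band_frequency(index, fftlengthn, sample_rate):
    """Centre frequency in hz of a bandindices entry.
    Assumes 2^(n-1) usable indices spanning 0 .. sample_rate/2."""
    bins = 2 ** (fftlengthn - 1)          # 512 for FFTLENGTHN = 10
    return index / bins * (sample_rate / 2)

def band_index(frequency, fftlengthn, sample_rate):
    """Nearest bandindices entry for a frequency, clamped to 1..2^(n-1)."""
    bins = 2 ** (fftlengthn - 1)
    index = round(frequency / (sample_rate / 2) * bins)
    return min(max(index, 1), bins)

print(band_frequency(10, 10, 48000))   # first band of the 2-band example
print(band_frequency(20, 10, 48000))   # second band of the 2-band example
print(band_index(1000.0, 10, 48000))   # index closest to 1 khz
```

A helper like `band_index` is handy when an equalizer is specified in hz: convert each centre frequency once, then store the results in the "bandindices" table.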