Readme file for Oxygen beamforming source code distribution
-----------------------------------------------------------

This is a very early beta distribution of beamforming source code from
MIT LCS. There is only one source file, main.c, and one header file,
main.h. You can compile everything using the Makefile included in the
distribution.

The input to the program is a text file containing floating point
values for the signal read on each of the microphones. For n
microphones, each line of the text file represents one temporal sample
and should contain n floating point values separated by spaces, e.g.:

   -1.8569790e-004 -9.0919049e-004 3.6711283e-004 -1.0073081e-005 ...

There are several modes of operation for the program:

1. Single-beam mode. The most basic mode processes the microphone data
   and calculates the output for one beam focused on a particular
   point in space. The coordinates of the microphones and the focus
   point are specified inside main.c (eventually to be moved to a
   separate file).

2. Far-field search mode. This mode assumes a far-field source (which
   means we have a planar wavefront) and a linear array, and sweeps
   over 180 degrees in the plane of the array. The microphone
   coordinates are specified in main.c, and the NUM_ANGLES constant
   defines how many angle values are tested (a value of 180 means one
   beam per degree). The energy of the signal over a particular window
   (ANGLE_ENERGY_WINDOW_SIZE) is computed for each beam. The direction
   with the maximum energy is taken to be the direction that the
   speech signal is coming from, and is printed out by the program.

3. Near-field hill-climbing mode. This mode accepts a starting
   coordinate and attempts to "hill-climb" through the space, seeking
   the maximum energy. Each of the x, y, and z coordinates is
   perturbed in the positive and negative directions at each time
   interval (GRID_ENERGY_WINDOW_SIZE) by a step size (GRID_STEP_SIZE).
   This perturbation, along with the original coordinate, produces
   seven coordinates to be tested. The coordinate with the maximum
   energy replaces the current reference coordinate. For instance, if
   we have a starting reference coordinate of (1,1,0) and our step
   size is 0.01, we will evaluate the energy for the following seven
   beams:

      (1,1,0)
      (0.99,1,0)
      (1.01,1,0)
      (1,0.99,0)
      (1,1.01,0)
      (1,1,-0.01)
      (1,1,0.01)

   Now let's say the beam (1,1.01,0) has the maximum energy; then this
   coordinate will replace the original reference coordinate of
   (1,1,0).

For modes 2 and 3, nothing is written to disk; the result is only
printed. This is because we have just started to work with these
methods and have not yet applied them in real systems.

This code is currently being ported to RAW.

To get a list of parameters for the delay_and_sum executable that is
generated when the source is compiled, just run ./delay_and_sum.

Sample data is included with the program, in the data directory, for
both a near-field and a far-field source. The README.txt file in each
directory specifies the microphone and source positions. The data1
file, when processed with a beamformer aligned in the proper
direction, should produce something like a sinc function (see
http://ccrma-www.stanford.edu/~jos/Interpolation/sinc_function.html).
The data2 file should produce an audio signal of a woman saying "the
simplest method". If the beamformer is aligned properly, the noise
should be significantly lower than in the signal from any single
microphone (use print_datafile.pl to isolate one microphone). You can
convert the data file that the program produces to a wave file using
sox.

---------------------------------
Eugene Weinstein
ecoder@mit.edu
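
Appendix: sketch of one hill-climbing step

The hill-climbing step described for mode 3 can be sketched in C as
shown here. This is only an illustration under stated assumptions, not
the code in main.c: the point3 type, the energy_fn callback, and the
toy_energy function are hypothetical names introduced for this
example. A real callback would compute the beam energy over
GRID_ENERGY_WINDOW_SIZE samples, and the step argument would be
GRID_STEP_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* A point in space, in the same units as the microphone coordinates.
 * (Hypothetical type; main.c may represent coordinates differently.) */
typedef struct { double x, y, z; } point3;

/* Hypothetical callback: energy of the beam focused at p over one window. */
typedef double (*energy_fn)(point3 p, void *ctx);

#define NUM_CANDIDATES 7

/* Fill out[] with the reference point and its six axis-aligned neighbours. */
static void make_candidates(point3 ref, double step, point3 out[NUM_CANDIDATES])
{
    assert(step > 0.0);
    for (int i = 0; i < NUM_CANDIDATES; i++)
        out[i] = ref;
    out[1].x -= step;  out[2].x += step;
    out[3].y -= step;  out[4].y += step;
    out[5].z -= step;  out[6].z += step;
}

/* One hill-climbing step: return the candidate with the highest energy.
 * Ties keep the earliest candidate, so the reference point wins a tie. */
static point3 hill_climb_step(point3 ref, double step, energy_fn energy, void *ctx)
{
    point3 cand[NUM_CANDIDATES];
    make_candidates(ref, step, cand);

    size_t best = 0;
    double best_e = energy(cand[0], ctx);
    for (size_t i = 1; i < NUM_CANDIDATES; i++) {
        double e = energy(cand[i], ctx);
        if (e > best_e) {
            best_e = e;
            best = i;
        }
    }
    return cand[best];
}

/* Toy stand-in for a real beam energy: largest at the point stored in ctx. */
static double toy_energy(point3 p, void *ctx)
{
    const point3 *src = ctx;
    double dx = p.x - src->x, dy = p.y - src->y, dz = p.z - src->z;
    return -(dx * dx + dy * dy + dz * dz);
}
```

Calling hill_climb_step with the reference (1,1,0), a step of 0.01,
and toy_energy with its peak at (1,2,0) moves the reference one step
in the positive y direction, which matches the worked example in the
mode 3 description. Calling it repeatedly, once per energy window,
gives the climb.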