Java: Convert stereo audio to mono audio bytes

I'm trying to do some audio processing and I'm really confused about the stereo-to-mono conversion. I've read about stereo-to-mono conversion on the internet.

As far as I know, I can take the left channel and the right channel, sum them and divide by 2. But when I dump the result back into a WAV file, I get a lot of noise. I know that noise can be produced while manipulating the data, if there is some overflow in a byte variable.
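Summing two 16-bit samples easily fits in an int, so the averaging step by itself can be done without overflow; a minimal sketch with made-up sample values:

// Average one left/right pair of 16-bit samples in int arithmetic,
// so the intermediate sum cannot overflow (sample values are made up):
short left = 30000, right = 25000;
int sum = left + right;          // both operands are widened to int before adding
short mono = (short)(sum / 2);   // 27500, still fits in 16 bits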

This is the class I use to retrieve chunks of byte[] data from an MP3 file:

public class InputSoundDecoder {

private int BUFFER_SIZE = 128000;
private String _inputFileName;
private File _soundFile;
private AudioInputStream _audioInputStream;
private AudioFormat _audioInputFormat;
private AudioFormat _decodedFormat;
private AudioInputStream _audioInputDecodedStream;

public InputSoundDecoder(String fileName) throws UnsuportedSampleRateException{
    this._inputFileName = fileName;
    this._soundFile = new File(this._inputFileName);
    try{
        this._audioInputStream = AudioSystem.getAudioInputStream(this._soundFile);
    }
    catch (Exception e){
        e.printStackTrace();
        System.err.println("Could not open file:" + this._inputFileName);
        System.exit(1);
    }

    this._audioInputFormat = this._audioInputStream.getFormat();

    this._decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 1, 44100, false);
    this._audioInputDecodedStream = AudioSystem.getAudioInputStream(this._decodedFormat, this._audioInputStream);

    /** Supported sample rates */
    switch((int)this._audioInputFormat.getSampleRate()){
        case 22050:
                this.BUFFER_SIZE = 2304;
            break;

        case 44100:
                this.BUFFER_SIZE = 4608;
            break;

        default:
            throw new UnsuportedSampleRateException((int)this._audioInputFormat.getSampleRate());
    }

    System.out.println ("# Channels:" + this._decodedFormat.getChannels());
    System.out.println ("Sample size (bits):" + this._decodedFormat.getSampleSizeInBits());
    System.out.println ("Frame size:" + this._decodedFormat.getFrameSize());
    System.out.println ("Frame rate:" + this._decodedFormat.getFrameRate());

}

public byte[] getSamples(){
    byte[] abData = new byte[this.BUFFER_SIZE];
    int bytesRead = 0;

    try{
        bytesRead = this._audioInputDecodedStream.read(abData,0,abData.length);
    }
    catch (Exception e){
        e.printStackTrace();
        System.err.println("Error getting samples from file:" + this._inputFileName);
        System.exit(1);
    }

    if (bytesRead > 0)
        return abData;
    else
        return null;
}

}

That means that every time I call getSamples, it returns an array that looks like:

buff = {Lchannel,Rchannel,Lchannel,Rchannel,Lchannel,Rchannel,Lchannel,Rchannel ...}

The processing routine for converting to mono is the following:

byte[] buff = null;
while( (buff = _input.getSamples()) != null ){

    /** Convert to mono */
    byte[] mono = new byte[buff.length/2];

    for (int i = 0 ; i < mono.length/2; ++i){
        int left = (buff[i * 4] << 8) | (buff[i * 4 + 1] & 0xff);
        int right = (buff[i * 4 + 2] << 8) | (buff[i * 4 + 3] & 0xff);
        int avg = (left + right) / 2;
        short m = (short)avg; /* Mono is an average between 2 channels (stereo) */
        mono[i * 2] = (byte)((short)(m >> 8));
        mono[i * 2 + 1] = (byte)(m & 0xff);
    }
}

And it is written into a WAV file with:

     public static void writeWav(byte [] theResult, int samplerate, File outfile) {
        // now convert theResult into a wav file
        // probably should use a file if samplecount is too big!
        int theSize = theResult.length;


        InputStream is = new ByteArrayInputStream(theResult);
        //Short2InputStream sis = new Short2InputStream(theResult);

        AudioFormat audioF = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED,
                samplerate,
                16,
                1,          // channels
                2,          // framesize
                samplerate,
                false
        );

        AudioInputStream ais = new AudioInputStream(is, audioF, theSize);

        try {
            AudioSystem.write(ais, AudioFileFormat.Type.WAVE, outfile);
        } catch (IOException ioe) {
            System.err.println("IO Exception; probably just done with file");
            return;
        }


    }

Using 44100 as the sample rate.
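A call to it looks roughly like this (monoBytes and out.wav are just placeholder names for this sketch):

// Sketch of invoking writeWav: monoBytes stands in for the accumulated
// mono samples and out.wav is a placeholder output file.
byte[] monoBytes = new byte[0]; // placeholder for the converted samples
writeWav(monoBytes, 44100, new File("out.wav"));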

Bear in mind that the byte[] array I have is actually already PCM, as the mp3 -> PCM conversion is done by specifying:

 this._decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 1, 44100, false);
this._audioInputDecodedStream = AudioSystem.getAudioInputStream(this._decodedFormat, this._audioInputStream);

As I said, when writing to the WAV file I get a lot of noise. I intend to apply an FFT to every chunk of bytes, but I think the results are not correct because the sound is so noisy.

I say this because I take two songs, one of them being a 20-second crop of the other, and when I compare the FFT results of the crop against the corresponding 20-second subset of the original, they don't match at all.

I think this is because the stereo -> mono conversion is wrong.

Hope someone has some insight into this,

Regards.


As pointed out in the comments, the endianness may be wrong. Also, converting to a signed short and shifting it may cause the first byte to become 0xFF.

Try:

int HI = 0; int LO = 1;
int left = (buff[i * 4 + HI] << 8) | (buff[i * 4 + LO] & 0xff);
int right = (buff[i * 4 + 2 + HI] << 8) | (buff[i * 4 + 2 + LO] & 0xff);
int avg = (left + right) / 2;
mono[i * 2 + HI] = (byte)((avg >> 8) & 0xff);
mono[i * 2 + LO] = (byte)(avg & 0xff);

Then switch the values of HI and LO to see whether it gets better.
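If the data really is little-endian (which is what the decoded format above asks for, since its bigEndian flag is false), the whole loop would end up looking roughly like the sketch below. It reuses the InputSoundDecoder and writeWav from the question and assumes writeWav is visible from this class; the ByteArrayOutputStream accumulator and the file names are placeholders added for the example.

import java.io.ByteArrayOutputStream;
import java.io.File;

// End-to-end sketch: decode, downmix to mono as little-endian 16-bit PCM, write a WAV.
public class MonoDownmixSketch {

    public static void main(String[] args) throws Exception {
        InputSoundDecoder input = new InputSoundDecoder("song.mp3"); // placeholder input file
        ByteArrayOutputStream monoOut = new ByteArrayOutputStream();

        final int HI = 1, LO = 0; // little-endian: low byte first, high byte second

        byte[] buff;
        while ((buff = input.getSamples()) != null) {
            byte[] mono = new byte[buff.length / 2];

            for (int i = 0; i < buff.length / 4; ++i) {
                // Each stereo frame is 4 bytes: 16-bit left sample, then 16-bit right sample.
                int left  = (buff[i * 4 + HI] << 8)     | (buff[i * 4 + LO] & 0xff);
                int right = (buff[i * 4 + 2 + HI] << 8) | (buff[i * 4 + 2 + LO] & 0xff);

                int avg = (left + right) / 2; // averaged in int, so no overflow here

                mono[i * 2 + HI] = (byte) ((avg >> 8) & 0xff);
                mono[i * 2 + LO] = (byte) (avg & 0xff);
            }
            monoOut.write(mono);
        }

        // writeWav is the helper from the question, assumed to be in scope here.
        writeWav(monoOut.toByteArray(), 44100, new File("mono.wav")); // placeholder output file
    }
}

Doing the averaging in an int and only splitting back into bytes at the very end also avoids the overflow mentioned at the top of the question.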