Hello,
I'm using UOS to capture microphone input for voice recognition with Vosk (Raspberry Pi 3B+, Lazarus). I created a loop procedure that calls the Vosk method AcceptWaveform. The problem is a type error on this call:

Vfinal := FTVoskRecognizer.AcceptWaveform(PSingle(FBufRcFromUos), length(FBufRcFromUos));

How can I get FBufRcFromUos as a PAnsiChar? The buffer comes from this function:

// Get current buffer
function uos_InputGetBuffer(PlayerIndex: cint32; InputIndex: cint32): TDArFloat;
|
In reply to this post by lucarnet
Thanks. (This is my first time using your forum.)
The code:

....
procedure Tmainvosk3_for.btn_record_stopClick(Sender: TObject);
begin
  for_message('=================================================');
  for_message(FTVoskRecognizer.GetFinalResult);
  for_message('=================================================');
  uos_Stop(PlayerIndex1);
  ClosePlayer1;
end;

procedure Tmainvosk3_for.btn_record_startClick(Sender: TObject);
var
  outformatst: string;
  outformat, numchan: integer;
  VFileNomSave: string;
  vmestmp: string;
begin
  //In1Index := uos_AddFromDevIn(PlayerIndex1);
  //In1Index := uos_AddFromDevIn (PlayerIndex1,-1, -1, -1, -1, -1, -1, -1);//-1);
  //In1Index := uos_AddFromDevIn(PlayerIndex1,-1, -1, -1, -1, 1, -1, -1); //For input integer 32 sample:
  In1Index := uos_AddFromDevIn(PlayerIndex1,-1, -1, -1, -1, 2, -1, -1); //For input integer 16 sample:
  // add input from mic with custom parameters
  // PlayerIndex : Index of an existing Player
  // Device ( -1 is default Input device )
  // Latency ( -1 is latency suggested )
  // SampleRate : default : -1 (44100)
  // OutputIndex : OutputIndex of existing Output
  //   ( -1 : all output, -2 : no output, other integer : existing output )
  // SampleFormat : -1 default : Int16 : (0: Float32, 1: Int32, 2: Int16)
  // FramesCount : -1 default : 4096 ( > = safer, < = better latency )
  numchan := uos_InputGetChannels(PlayerIndex1, In1Index);
  uosPlayers[PlayerIndex1].StreamIn[In1Index].Data.Channels := 1; // force MONO for VOSK
  ....
end;

procedure Tmainvosk3_for.LoopProcPlayer1;
var
  aBufLen: integer;
  FBufRcFromUos: TDArFloat; // my buffer
  Vfinal: cardinal;
  LBuffer: TBytes; //TSingleArrayDyn;
  i, j: integer;
  VtbytesTmp: tbytes;
  Vsingle: cfloat; //single;
  vind: integer;
begin
  try
    // ShowPosition;
    // ShowLevel;
    FBufRcFromUos := uos_InputGetBuffer(PlayerIndex1, In1Index);
    aBufLen := Length(FBufRcFromUos);
    // process LBuffer
    if FTVoskRecognizer = nil then
      for_message('FTVoskRecognizer=nil')
    else
    begin
      Vfinal := FTVoskRecognizer.AcceptWaveform(Psingle(FBufRcFromUos), abuflen);
      // @returns 1 if silence occurred and you can retrieve a new utterance with the result method
      //          0 if decoding continues
      //         -1 if an exception occurred
      case Vfinal of
        1 : for_message(FTVoskRecognizer.GetResult);
        0 : for_message(FTVoskRecognizer.GetPartialResult);
        -1 : for_message('exception vosk_recognizer_accept_waveform');
      end;
    end;
  except
    on E: Exception do
      for_message('LoopProcPlayer1:' + E.message);
  end;
end;
|
Administrator
|
Hello Lucarnet and welcome to the uos forum.

> numchan := uos_InputGetChannels(PlayerIndex1, In1Index);
> uosPlayers[PlayerIndex1].StreamIn[In1Index].Data.Channels:=1; //forcage a MONO pour VOSK

You cannot force the input device to mono or stereo; that is defined by the hardware of your sound card. But you can use a DSP to convert to mono/stereo, see the Simple Player demo.

> Vfinal := FTVoskRecognizer.AcceptWaveform(Psingle(FBufRcFromUos),abuflen);

Sorry, but I don't know that library. Where is the source?
|
Administrator
|
Re-hello.
> How can I get an FBufRcFromUos of type Pansichar?

I'm not sure I understand. The audio buffers used by uos hold samples that can be int16, int32 or float32, so they are numbers. They are not ansichar.
|
Administrator
|
If your library only accepts 16-bit samples from an array of 16-bit integers (uos uses an array of float, even to store 16-bit samples), you can do something like this:

var
  arrayofint16: array of cint16;
  i: integer;
....
FBufRcFromUos := uos_InputGetBuffer(PlayerIndex1, In1Index);
aBufLen := Length(FBufRcFromUos);
setlength(arrayofint16, aBufLen);
for i := 0 to aBufLen - 1 do
  arrayofint16[i] := round(FBufRcFromUos[i]);
Vfinal := FTVoskRecognizer.AcceptWaveform(@arrayofint16[0], abuflen);

But what will you do next? If you want to feed arrayofint16 back into the uos loop, it is better to use a DSP; the Simple Player demo shows how to do that (for example stereo to mono). In that case you should convert arrayofint16 back into an array of float:

for i := 0 to aBufLen - 1 do
  Updated_FBufRcFromUos[i] := arrayofint16[i];

But I don't know your library, so all this is only from my crystal ball.
|
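The conversion described above can also be written as a small standalone helper, which makes it easy to try outside the uos loop. This is only a sketch: TFloatArray is a stand-in for uos's TDArFloat (assumed here to be an array of 32-bit floats), FloatBufferToInt16 is a hypothetical name, and the clamping step is an addition that the post above does not include.

```pascal
program FloatToInt16Sketch;
{$mode objfpc}{$H+}

type
  TFloatArray = array of Single;    // stand-in for uos TDArFloat (assumption)
  TInt16Array = array of SmallInt;  // 16-bit signed samples

// Hypothetical helper: round each float-stored sample to a 16-bit integer,
// clamping values that fall outside the SmallInt range instead of overflowing.
function FloatBufferToInt16(const Src: TFloatArray): TInt16Array;
var
  i: Integer;
  v: Int64;
begin
  Result := nil;
  SetLength(Result, Length(Src));
  for i := 0 to High(Src) do
  begin
    v := Round(Src[i]);
    if v > High(SmallInt) then
      v := High(SmallInt)
    else if v < Low(SmallInt) then
      v := Low(SmallInt);
    Result[i] := SmallInt(v);
  end;
end;

var
  Samples: TFloatArray;
  Converted: TInt16Array;
  i: Integer;
begin
  // Made-up sample values, only to exercise the helper.
  SetLength(Samples, 4);
  Samples[0] := 0.0;
  Samples[1] := 1200.4;
  Samples[2] := -32768.0;
  Samples[3] := 40000.0;   // out of range, gets clamped to 32767

  Converted := FloatBufferToInt16(Samples);
  for i := 0 to High(Converted) do
    WriteLn(Converted[i]);
end.
```

In the loop procedure, the result would take the place of arrayofint16. Whether the recognizer expects the length in samples or in bytes depends on the Vosk wrapper in use, which this thread does not show.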
In reply to this post by fredvs
Hello, thank you for this advice.
The issue has been resolved, and my voice recognition is working fine. Next problem: I want to stop recording after a detected silence of x seconds. Can you help me? I'm getting a bit lost with all the UOS functions. Thanks. |
Administrator
|
Hello.
Nice that your voice recognition project is working. To detect silence, an easy way is to update LoopProcPlayer1. For example:

// global variable
var
  incbuffer: integer = 0;

// in btn_record_startClick, after adding the input, add this:
procedure Tmainvosk3_for.btn_record_startClick(Sender: TObject);
begin
  ....
  In1Index := uos_AddFromDevIn(PlayerIndex1,-1, -1, -1, -1, 2, -1, -1); //For input integer 16 sample:
  uos_InputSetLevelEnable(PlayerIndex1, In1Index, 2); // add this
  incbuffer := 0; // add this
  ...
end;

// update LoopProcPlayer1
procedure Tmainvosk3_for.LoopProcPlayer1;
begin
  // adjust the level that counts as silence; exactly 0 is probably too strict
  if uos_InputGetLevelLeft(PlayerIndex1, In1Index) +
     uos_InputGetLevelRight(PlayerIndex1, In1Index) < 0.1 then
    inc(incbuffer)
  else
    incbuffer := 0; // the silence was broken, so reset incbuffer to 0

  // you need to decide how many buffers make up your silence time
  if incbuffer > 10 then
    uos_Stop(PlayerIndex1)
  else
  begin
    // your code for recognition.
  end;
end;
|
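To turn "x seconds of silence" into the buffer count used above, one option is to divide the silence duration by the duration of one buffer and round up. The sketch below assumes the loop procedure runs once per buffer and that each buffer holds the default 4096 frames at 44100 Hz mentioned in the parameter comments earlier in the thread; BuffersForSilence is a hypothetical helper, not a uos function.

```pascal
program SilenceTimeoutSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: number of consecutive quiet buffers that add up
// to at least Seconds of audio, rounded up to a whole buffer.
function BuffersForSilence(Seconds: Double;
  SampleRate, FramesPerBuffer: Integer): Integer;
var
  exact: Double;
begin
  exact := Seconds * SampleRate / FramesPerBuffer;
  Result := Trunc(exact);
  if Result < exact then
    Inc(Result); // round up so the timeout is never shorter than requested
end;

begin
  // Assumed defaults: 44100 Hz sample rate, 4096 frames per buffer.
  // 2 s * 44100 / 4096 = 21.53..., so this prints 22.
  WriteLn(BuffersForSilence(2.0, 44100, 4096));
end.
```

The result would replace the fixed 10 in the test on incbuffer. If the input is opened with a different sample rate or frame count, the same values should be passed to the helper.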