52 changes: 52 additions & 0 deletions testing/windows_local_audio_client.py
@@ -0,0 +1,52 @@
import os
import sys
import struct
import numpy as np
import time
import requests
import pyaudiowpatch as pyaudio

def translator():
    DOCKER_IP = "http://0.0.0.0:8000/"
Owner
This should be a module-level constant or a parameter passed into the translator function, because the server's IP address is not static and the user may need to change it.
Consider letting the user pass the address and port as command-line arguments.
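
One possible way to take the address and port from the command line, as a minimal sketch using the standard `argparse` module. The flag names `--host` and `--port`, the helper names, and the `127.0.0.1` default are assumptions for illustration; only port `8000` comes from the diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line options for the audio client (flag names are hypothetical)."""
    parser = argparse.ArgumentParser(description="Windows local audio client")
    # Assumed default: loopback address. The diff hard-codes 0.0.0.0 instead.
    parser.add_argument("--host", default="127.0.0.1",
                        help="address of the server that receives audio data")
    parser.add_argument("--port", type=int, default=8000,
                        help="port the server listens on")
    return parser


def server_url(host: str, port: int) -> str:
    """Build the base URL that the endpoint path is appended to."""
    return f"http://{host}:{port}/"


# Possible usage inside main():
#   args = build_parser().parse_args()
#   translator(server_url(args.host, args.port))
```

With this shape, `translator` would accept the base URL as a parameter instead of defining it locally.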

    print("TRANSLATOR IS RUNNING")
Owner
Using print statements like this is bad practice. Use the logging module instead.
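
A minimal sketch of what the switch to `logging` could look like. The logger name, the format string, and the helper names `configure_logging` and `announce_start` are assumptions, not part of the diff.

```python
import logging

# Hypothetical logger name; any stable, module-specific name would do.
logger = logging.getLogger("windows_local_audio_client")


def configure_logging(level: int = logging.INFO) -> None:
    """Set up root logging once, typically at the start of main()."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )


def announce_start() -> None:
    """Replacement for the bare print() at the top of translator()."""
    logger.info("Translator is running")
```

The same logger could also replace the `print(f"Error: {e}")` call in `main()`, for example with `logger.exception(...)` so the traceback is kept.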


    # open PyAudio manager via context manager
    with pyaudio.PyAudio() as p:
        # open audio stream via context manager
        with p.open(format=pyaudio.paInt32, channels=2, rate=48000, input=True, frames_per_buffer=1024) as stream:
Owner
Audio settings vary with the user's setup, so it would be better not to hard-code these values.

Also, the input device is never specified. See the examples in the PyAudioWPatch repo for the intended usage.
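
One way to avoid the hard-coded values is to derive the stream parameters from the device-info dictionary that PyAudio returns for the chosen device. The sketch below is an assumption about how that could be structured: `stream_kwargs` is a hypothetical helper, and how the device index is chosen (for example a WASAPI loopback device, as in the PyAudioWPatch examples) is left to the caller.

```python
def stream_kwargs(device_info: dict, sample_format: int,
                  frames_per_buffer: int = 1024) -> dict:
    """Derive keyword arguments for p.open() from a PortAudio device-info dict.

    device_info is expected to have the keys PyAudio reports for a device:
    "index", "maxInputChannels" and "defaultSampleRate".
    """
    return {
        "format": sample_format,
        "channels": int(device_info["maxInputChannels"]),
        "rate": int(device_info["defaultSampleRate"]),
        "input": True,
        "input_device_index": int(device_info["index"]),
        "frames_per_buffer": frames_per_buffer,
    }


# Hypothetical usage (not executed here; chosen_index is an assumption):
#   with pyaudio.PyAudio() as p:
#       info = p.get_device_info_by_index(chosen_index)
#       with p.open(**stream_kwargs(info, pyaudio.paInt32)) as stream:
#           ...
```

The sample format stays a parameter because the normalization step in the loop depends on it.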

            print("Recording started...")
            while True:
                # read a chunk of raw audio data from the stream
                raw_audio_data = stream.read(1024)

                # convert raw audio data to floating-point values
                float_audio_data = np.frombuffer(raw_audio_data, dtype=np.int32) / (2 ** 31)  # normalize to range [-1.0, 1.0]

                # gather audio data in real time
                data = np.abs(float_audio_data[-20:])
                avg = np.average(float_audio_data[-20:])
                peak = np.max(float_audio_data[-20:])

                payload = {
                    "avg": float(avg),
                    "peak": float(peak),
                    "data": data.tolist(),
                    "source": "Windows Device"
Owner
The source tag is supposed to identify which audio input is being used. Saying 'Windows Device' is fine, but it would be more helpful to grab the actual name or ID of the input device being recorded.
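
As a rough sketch, the tag could be built from the same device-info dictionary PyAudio reports for the input device, which includes `"name"` and `"index"` keys. The helper name `source_tag`, the fallback string, and the output format are assumptions.

```python
def source_tag(device_info: dict) -> str:
    """Return a human-readable tag for the payload's "source" field."""
    name = str(device_info.get("name", "unknown device"))
    index = device_info.get("index")
    # Include the index when it is known, since device names can repeat.
    if index is None:
        return name
    return f"{name} (index {index})"


# Hypothetical usage when building the payload:
#   payload["source"] = source_tag(info)
```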

                }

                # send audio data to Docker container
                with requests.Session() as session:
Owner
Opening a session that is closed immediately defeats the purpose of using a session. It would be better either to call requests.post directly or to pass a session into this function so that every POST request reuses it.
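
A sketch of the second option, where one session is created outside the loop and passed in. `post_audio` is a hypothetical helper; the `timeout` value and the `raise_for_status()` call are additions that the diff does not contain. The endpoint path `audio_in` is taken from the diff.

```python
def post_audio(session, base_url: str, payload: dict,
               timeout: float = 5.0) -> dict:
    """Send one payload over an already-open session and return the JSON reply.

    session is expected to behave like requests.Session, so the underlying
    connection can be reused across calls.
    """
    response = session.post(base_url + "audio_in", json=payload, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of ignoring them
    return response.json()


# Hypothetical usage: open the session once, outside the read loop.
#   with requests.Session() as session:
#       while True:
#           ...
#           post_audio(session, base_url, payload)
```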

                    response = session.post(DOCKER_IP + "audio_in", json=payload).json()

def main():
    # make sure the fifo file exists (if required)
    # not needed for Windows adaptation

    try:
        translator()
    except Exception as e:
        print(f"Error: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main()