dsproxy

Introduction

The goal was complete control over a retargetable /dev/dsp. Existing solutions such as esddsp and /dev/dspd implement only part of the OSS ioctl set, so they cannot retarget audio from programs that explicitly manipulate audio fragment parameters and the like; such programs produce choppy audio or none at all. dsproxy was created to address this.

Hooking into the kernel

After compiling the module with make, insert it into the kernel with insmod dsproxy.o. The memory requirements of the code itself are quite minimal (however, add 116k for every "virtual" device created [see below]):

[root@arjuna ram]# lsmod
Module                  Size  Used by
dsproxy                 9284   0
nls_iso8859-1           2020   1  (autoclean)
nls_cp437               3548   1  (autoclean)
vfat                    9180   1  (autoclean)
fat                    30464   1  (autoclean) [vfat]
opl3                   11208   1 
sb                     33620   1 
uart401                 5968   1  [sb]
sound                  57240   0  [opl3 sb uart401]
soundlow                 300   0  [sound]
soundcore               2372   6  [sb sound]
ncr53c8xx              52264   0

At this point you have to define some character special devices in /dev:

[root@arjuna ram]# cd /dev
[root@arjuna ram]# mv dsp dsp_x
[root@arjuna ram]# mknod dsp c 121 2
[root@arjuna ram]# mv mixer mixer_x
[root@arjuna ram]# mknod mixer c 121 3

You can also use make install to add the module to your system and create the necessary devices.
Now you are ready to liberate the audio out of your programs and into files on the disk.

An example of using dsproxy

After inserting the kernel module, start the reader process:

8 :/tmp/modules> ./dsproxy_reader -e -f madonna.pcm
SNDCTL_DSP_SETFMT   => AFMT_S16_LE AFMT_S16_NE
SNDCTL_DSP_STEREO   => yes
SNDCTL_DSP_SPEED    => 44100

Start the writer process (mtv):

2 :/pub/videos_old> mtv "Madonna_-_Beautiful_Stranger_(calderon_club_mix).mpg"

Let the audio source you are hijacking run to completion, then kill the reader with CTRL-C. (Note that the SNDCTL debug messages appear only after mtv has been started.)

During recording, you will see a debug message with the running time of the audio captured so far.

Now convert the audio to an xmms-friendly and mp3-able format:

9 :/tmp/modules> sox -r 44100 -c 2 -w -s -t raw madonna.pcm madonna.wav
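
If sox is not at hand, the same conversion can be sketched with Python's standard wave module. This is only an illustration: it assumes the capture used the parameters reported by the reader above (44100 Hz, stereo, signed 16-bit little-endian), and the function name raw_to_wav is made up for this example.

```python
import wave

def raw_to_wav(raw_path, wav_path, rate=44100, channels=2, sample_bytes=2):
    """Wrap headerless PCM in a WAV container without altering the samples."""
    with open(raw_path, "rb") as src, wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(sample_bytes)  # 2 bytes = 16-bit samples
        dst.setframerate(rate)
        while True:
            chunk = src.read(64 * 1024)  # copy in modest chunks
            if not chunk:
                break
            dst.writeframesraw(chunk)
        # closing the WAV file fills in the final data length in the header
```

Calling raw_to_wav("madonna.pcm", "madonna.wav") would then play the role of the sox command above. No byte swapping is needed because 16-bit WAV data is also signed little-endian.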

To avoid the intermediate raw file, you can pipe the audio directly into an encoder, e.g. lame:

dsproxy_reader -e -x -s | lame -r -x -s 44.1 -h -b 128 --add-id3v2 --tt 'Title' --ta 'Artist' --tl 'Album' --ty 2005 - audio.mp3

Note that the pipe might run into a problem after 2^32 bytes, which is approximately 6:45:47.89 (h:mm:ss) of audio at 44100 Hz, 16-bit stereo.
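
The figure can be checked with a short calculation: 16-bit stereo at 44100 Hz produces 44100 * 2 * 2 = 176400 bytes per second. The helper below is a sketch written for this README, not part of dsproxy.

```python
def wrap_time(sample_rate=44100, channels=2, sample_bytes=2, limit=2**32):
    """Return (hours, minutes, seconds) of audio that fit into `limit` bytes."""
    bytes_per_second = sample_rate * channels * sample_bytes  # 176400 by default
    total_seconds = limit / bytes_per_second
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return int(hours), int(minutes), seconds

h, m, s = wrap_time()
print(f"{h}:{m:02d}:{s:05.2f}")  # 6:45:47.89
```

Lower sample rates or mono output push the limit out proportionally; for example, wrap_time(22050, 1) gives four times the duration.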

Kernelspace messages

The dmesg log shows the following information (annotated):

(1) dsproxy_main.o: installed module
(2) dsproxy_main.o: Opening device for uid 500 (dsproxy_s -> cc05e000)
(3) dsproxy_main.o: Opening device for uid 500 (dsproxy_s -> cc05e000)
(4) dsproxy_audio.o: fragsize: 4096, nfrags: 1, uid 500
(5) dsproxy_main.o: Releasing device 2 (write), uid 500
(6) dsproxy_main.o: Releasing device 2 (read), uid 500
(7) dsproxy_main.o: freeing device for uid 500 (dsproxy_s -> cc05e000)
(8) dsproxy_main.o: removed module

(1) the dsproxy module was inserted via insmod dsproxy.o
(2) the dsproxy_reader program was started via ./dsproxy_reader -e -f madonna.pcm
(3) mtv was started via mtv "Madonna_-_Beautiful_Stranger_(calderon_club_mix).mpg"
(4) mtv adjusted the audio fragment parameters
(5) mtv terminated and released /dev/dsp
(6) dsproxy_reader was killed
(7) dsproxy.o detected that the audio pipe was empty on both sides so it was deallocated
(8) the dsproxy module was removed via rmmod dsproxy
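
For background on entry (4): in the OSS API, an application requests fragment parameters with the SNDCTL_DSP_SETFRAGMENT ioctl, whose integer argument carries the maximum number of fragments in the upper 16 bits and the base-2 logarithm of the fragment size in the lower 16 bits. The sketch below shows only that encoding. The helper names are invented for illustration, and nothing here claims to mirror how dsproxy stores these values internally or which request mtv actually issued.

```python
def encode_setfragment(max_fragments, fragment_size):
    """Pack an OSS SETFRAGMENT argument: (max_fragments << 16) | log2(size)."""
    if fragment_size <= 0 or fragment_size & (fragment_size - 1):
        raise ValueError("fragment size must be a power of two")
    selector = fragment_size.bit_length() - 1  # 4096 -> 12
    return ((max_fragments & 0xFFFF) << 16) | selector

def decode_setfragment(arg):
    """Return (max_fragments, fragment_size_in_bytes) from a packed argument."""
    return (arg >> 16) & 0xFFFF, 1 << (arg & 0xFFFF)

arg = encode_setfragment(2, 4096)
print(hex(arg))  # 0x2000c
```

The example values (two fragments of 4096 bytes) were chosen to match the fragment size in the log and the two-fragment limit mentioned under v0.2.16 in the change log.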

Theory of operations

After the dsproxy module is inserted, it waits for calls to the open method in its file_operations table. A reader process then pulls bytes from the dsproxy device and disposes of them as it sees fit. A writer process is then started that writes bytes into the device. Starting the reader first is recommended, as some programs (such as xmms) act oddly or lose synchronization if the reader is not immediately ready to accept sound data.

Effectively, dsproxy is nothing more than a named pipe that emulates an OSS device. Writes and ioctls are tagged so they can be decomposed in the reader process from the combined data stream. Since dsproxy emulates an OSS device, it can be used to run xmame, rvplayer7, and what have you without the need for a soundcard in the machine creating sounds. It can probably also be used to capture sound from Windows programs running under vmware, but this has not been tried.
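
To make the idea of a tagged stream concrete, here is a minimal sketch of how a reader might split such a stream back into audio and ioctl records. The record layout is an assumption made for illustration: the change log only states that the length field is two bytes, so the one-byte tag, the tag values, and the little-endian byte order below are all hypothetical and are not taken from the dsproxy sources.

```python
import struct

TAG_PCM = 0x01    # hypothetical tag value for a block of audio data
TAG_IOCTL = 0x02  # hypothetical tag value for a forwarded ioctl

# Assumed header layout: 1-byte tag followed by a 2-byte little-endian length.
HEADER = struct.Struct("<BH")

def pack_record(tag, payload):
    """Prefix a payload with its tag and length."""
    return HEADER.pack(tag, len(payload)) + payload

def split_records(stream):
    """Yield (tag, payload) pairs from a buffer holding complete records."""
    offset = 0
    while offset < len(stream):
        tag, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        yield tag, stream[offset:offset + length]
        offset += length
```

A reader built this way would hand TAG_PCM payloads to the sound device or output file and interpret TAG_IOCTL payloads as parameter changes, which is what lets a single pipe carry both kinds of traffic in order.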

Unlike /dev/dsp, multiple "virtual" sound devices may be spawned. dsproxy matches the reader end of the pipe with the writer end via the process uid, so that each uid appears to have its own sound device. This strange allocation of the sound device was chosen so that multiple users may run on the same machine through a mechanism such as vnc and, from a user/process standpoint, all appear to have their own sound card!
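
The pairing by uid can be pictured with a toy model in which each uid maps to one shared pipe object that is freed once both ends have been released, as in entries (2) through (7) of the kernel log above. This is a userspace sketch under that reading of the log; the class and field names are invented, and the real module's bookkeeping may differ.

```python
class VirtualDevices:
    """Toy model: one pipe per uid, freed when both ends are closed."""

    def __init__(self):
        self.by_uid = {}

    def open(self, uid, end):
        """Open the 'read' or 'write' end; both ends of a uid share one pipe."""
        dev = self.by_uid.setdefault(uid, {"read": False, "write": False})
        dev[end] = True
        return dev

    def release(self, uid, end):
        """Close one end and drop the pipe once neither end is open."""
        dev = self.by_uid[uid]
        dev[end] = False
        if not dev["read"] and not dev["write"]:
            del self.by_uid[uid]
```

With this model, a reader and a writer running as uid 500 get the same pipe, while a process running as uid 501 gets a separate one.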

Reader command line flags

Usage:
------
./dsproxy_reader [options]

-d              disable fragment datapacing
-e              dump "english language" ioctls to stderr
-f filename     output pcm audio data to specified file
-h              help
-i              dump ioctls to stderr
-n hostname     connect to dsproxy server at hostname port 9138
-r              open/close real dsp device when reset ioctl is encountered
-s              output pcm audio data to stdout
-x              don't output to /dev/dsp[_x]

A sample network server

Instead of running dsproxy_reader, if you invoke dsproxy_server on the machine running dsproxy.o, you can retarget sound data across a network. On the machine you wish to play the sound on, simply enter

./dsproxy_reader -e -n servername

(where servername is the hostname of the machine serving the audio) in order to hear the sound. You may wish to increase buffering by changing DSPROXY_FIFO_SIZE to a larger value such as (12*4096). This may cause some applications to malfunction, however, because the synchronization ioctls in dsproxy_reader and dsproxy_server need some work. A notable example is xmame.

Change Log

v0.2.17.4: Code cleanup. (Christian Wolff)

v0.2.17.3: Corrected include location for kernel header files. (Christian Wolff)

v0.2.17.2: Added continuously updating running time. (Christian Wolff)

v0.2.17.1: Adaptation to Linux 2.4 kernel, added install script, added running time display. (Christian Wolff)

v0.2.17: Integrated 2.4.x compatibility patches from Jonathan Sambrook, thanks!

v0.2.16: Added fragment datapacing in reader. Mainly, it sets the destination soundcard to use a maximum of two fragments such that the rest will "back up" and allow the retargeted sound to play at more-or-less the same speed as it does on the destination machine.

v0.2.12-15: Reporting of GETOSPACE values has been improved for higher compatibility with xmms. Stopped emission of duplicate mixer volume levels in order to cut down on traffic when sliding volume bars in mixer programs. Also added semaphores around mixer value changes in case a user has multiple mixers running at once (e.g., aumix is one mixer and mtv has its own). Added semaphores around read/write/ioctl since a fork() can cause multiple outstanding write ops, etc. (The inode semaphores guard against this normally but we have disabled them for virtualization.)

v0.2.10-11: Added more semaphore locking for critical ops. Discovered that the 0.2.7 fix causes kernel oopses if the user opens then closes the read end of the pipe without opening a writer. This was caused by a misplaced assignment to the inode field of struct dsproxy_s (it should have been in read but was in write, since the user can't open the writer without a pre-existing reader).

v0.2.9: Additional verify_area checks.

v0.2.8: Added low water marker that scales by frequency up to 1024 bytes for writes in order to keep from buffer thrashing with competing reads. This keeps audio from breaking up in x11amp when the read server gets behind and leaves tiny holes to fill in the audio fifo.

v0.2.7: Fixed a potential deadlock condition resulting from "dirty mixer" ioctls not waking up the inq wait_queue. Removed uid fields across the board in the tagged data, shortened the length field down to two bytes, and adjusted the i_sem field in the inode accordingly. I have remoted audio from mpg123 to vnc on five machines simultaneously with no skipping using the new version of dsproxy, then ran out of machines to run more vnc sessions on. I will provide the patches for vnc when they are ready for general use.

v0.2.6: Discovered that the high load averages and write() problems when using dsproxy multiuser occurred because the i_atomic_write field in the inode needs to be adjusted appropriately for the write end of the pipe. The semaphore ops on this field will cause kernel panics on machines compiled with the INODE_PARANOIA flag defined in fs/inode.c. This problem does not manifest on my RedHat 6.2 machine, but may occur on other distributions. You have been warned.

v0.2.5: Now use kernel semaphores for locking of critical sections rather than sti/cli to improve performance.

v0.2.4: Disabled signal handling during write() ops so as not to lose write synchronizations on multiples of four bytes for stereo samples that would cause channel inversion and white noise when Ctrl-C/Z is pressed.

v0.2.3: Redid mechanism for memory allocation as it pertains to open() interlocks. Apparently, all pagefault problems are fixed! Currently, the ioctl() for fragment status is #ifdef'd out. Change that if your program depends on it (e.g., rvplayer, mame). Note that the extra 64k per virtual sound device is needed to fix incorrectly written programs such as mpg123 that do not seem to check on the return value of write()!

v0.2.2: Fixed some pagefault problems dealing with buffer wraps on read() and write().

v0.2.1: Added /dev/mixer functionality. Now remote apps can control the local mixer.

v0.2.0: Added preliminary network capability. Note that occasionally the sound loses "sync" on client machines because samples span multiples of four bytes and sometimes this synchronization is lost. This will be fixed in the future.

v0.1.0: Initial release. /dev/mixer functionality may be needed in the future for some applications, but for now things run fine. This release was based on a version of Henrik Johansson's MPEG2 driver for the audio portion of the Hollywood Plus that was later expanded to include the missing OSS ioctls.

06jun01 Ram Sivrasubramanium (ram@linux-workshop.com)
02aug03 Christian Wolff (sub-dsproxy[at]scara[dot]com)