Abstract
Playing small wave files with DirectSound requires little buffer management; you can simply load the entire sound into memory and play it. With larger wave files, though, you should be more efficient in your memory usage, especially if you will be playing multiple sounds simultaneously. Streaming is a technique of using a small buffer to play a large file by filling the buffer with data from the file at the same rate that data is taken from the buffer and played.
In this article I discuss the techniques required to stream wave files from disk and play them using the DirectSound application programming interface (API). I chose to implement my solution in C++, but the techniques presented here apply to a C implementation as well.
Introduction to DirectSound
DirectSound is the 32-bit audio API for Windows® 95 and Windows NT® that replaces the 16-bit wave API introduced in Windows 3.1. It provides device-independent access to audio accelerator hardware, giving you access to features like real-time mixing of audio streams and control over volume, panning (left/right balance control), and frequency shifting during playback. DirectSound also provides low-latency playback (on the order of 20 milliseconds) so that you can better synchronize sounds with other events. DirectSound is available in the DirectX 2 SDK.
Just the Facts, Ma'am
I'm going to stick to the subject of streaming wave files and not rehash all of the basics of DirectSound.
If you want to experiment with DirectSound or build the STREAMS sample application, you'll need the DirectX 2 SDK. This SDK is available in the July release of Microsoft Developer Network Development Platform. If you don't subscribe to the Development Platform, have we got a deal for you! For a limited time (how limited is still up in the air), you can download the DirectX SDK from this Web site. You'll have to be a real bit hound though--it's over 34MB! Even with a 28.8 Kbps modem, you're looking at 4 to 5 hours of download time. Don't forget to disable call waiting!
If you're already familiar with DirectSound and don't want to read this entire article to get the goodies, skip to the Quick Fix section for a summary of what you need to know about streaming wave files with DirectSound.
How Streaming Works
The purpose of streaming is to use a relatively small buffer to play a large file. Specific implementations vary, but visualize streaming by imagining continually pouring water into a barrel with a hole in it. The idea is to keep enough water in the barrel so that the flow out of it is uninterrupted. In our case, the barrel is a sound buffer and the water is wave data. Let's carry this metaphor a bit further and say that to put water in the barrel, we have to fetch it from a lake with a bucket. The challenge of streaming, then, is to get the proper-sized bucket and a helper who can carry the bucket between the lake and the barrel fast enough to keep up with the outflow from the barrel. If the barrel (buffer) runs out of water (wave data), the flow (sound) is interrupted.
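To make the barrel metaphor concrete, here is a minimal simulation in portable C++. It is only a sketch: the function name BufferUnderruns and the one-millisecond tick model are assumptions made for illustration, not part of DirectSound or the STREAMS sample. The buffer starts full, drains steadily, and is refilled to capacity at a fixed service interval.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the barrel: the buffer starts full, drains at a
// steady rate, and is topped up to capacity once every serviceMs.
// Returns true if playback ever runs dry before the next refill arrives.
bool BufferUnderruns(std::size_t bufferMs, std::size_t serviceMs, std::size_t totalMs)
{
    assert(serviceMs > 0);
    std::size_t level = bufferMs;          // milliseconds of audio in the buffer
    for (std::size_t t = 1; t <= totalMs; ++t)
    {
        if (level == 0)
            return true;                   // nothing left to play during this millisecond
        --level;                           // one millisecond of audio is played
        if (t % serviceMs == 0)
            level = bufferMs;              // the helper arrives and refills the barrel
    }
    return false;
}
```

In this model a two-second buffer serviced every 500 milliseconds never runs dry, while the same buffer serviced every 2500 milliseconds does.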
Streaming with DirectSound
If you've worked with the low-level wave API in Windows 3.1, you're probably familiar with the waveOutWrite function. This function sends a block of wave data to the driver; when the driver is finished playing the buffer, it notifies the application and returns the buffer. To keep the drivers satisfied, the application must use at least two buffers and be able to fill a buffer with data in less time than it takes the driver to play a buffer. The following diagram illustrates the streaming mechanism used with the low-level wave API:
Double-buffer streaming with 16-bit wave API
The streaming mechanism used with DirectSound is a different beast altogether. With DirectSound, you create a secondary buffer object (I'll explain the "secondary" part of this jargon in a bit). This buffer is owned by DirectSound, and you must query the buffer to determine how much of the wave data has been played and how much space in the buffer is available to be filled with additional data. Conceptually, this mechanism is identical to a traditional circular buffer with head and tail pointers. The following diagram illustrates the streaming mechanism used with DirectSound:

Single-buffer streaming with DirectSound
With single-buffer streaming, the application is responsible for writing sound data into the buffer before the driver plays the data. The application should keep the buffer as full as possible to prevent any interruptions in sound playback. The DirectSound name for these buffers is secondary buffers. Each of these secondary buffers can have a different format. During playback, DirectSound mixes the data from all of the secondary buffers into a primary buffer. There is only one primary buffer and its format determines the output format. Applications do not write wave data directly to the primary buffer.
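The circular-buffer bookkeeping can be sketched in portable C++ without DirectSound. The RingBuffer class below is hypothetical: it models only the idea of a write position chasing a play position around a fixed block of memory, with the "driver" simulated by a Play method. It does not reproduce the DirectSound interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a streaming buffer: the "player" advances
// m_playPos, the application advances m_writePos, and both wrap at the end.
class RingBuffer
{
public:
    explicit RingBuffer(std::size_t size)
        : m_data(size), m_playPos(0), m_writePos(0), m_filled(0)
    {
        assert(size > 0);
    }

    // Bytes the application may write without overwriting unplayed data.
    std::size_t FreeSpace() const { return m_data.size() - m_filled; }

    // Copies up to cb bytes in, wrapping at the end; returns bytes written.
    std::size_t Write(const std::uint8_t* src, std::size_t cb)
    {
        std::size_t n = cb < FreeSpace() ? cb : FreeSpace();
        for (std::size_t i = 0; i < n; ++i)
        {
            m_data[m_writePos] = src[i];
            m_writePos = (m_writePos + 1) % m_data.size();
        }
        m_filled += n;
        return n;
    }

    // Simulates the driver consuming up to cb bytes; returns bytes played.
    std::size_t Play(std::uint8_t* dst, std::size_t cb)
    {
        std::size_t n = cb < m_filled ? cb : m_filled;
        for (std::size_t i = 0; i < n; ++i)
        {
            dst[i] = m_data[m_playPos];
            m_playPos = (m_playPos + 1) % m_data.size();
        }
        m_filled -= n;
        return n;
    }

private:
    std::vector<std::uint8_t> m_data;
    std::size_t m_playPos;    // next byte the player will consume
    std::size_t m_writePos;   // next byte the application will fill
    std::size_t m_filled;     // bytes written but not yet played
};
```

The application's job in streaming is to keep FreeSpace close to zero; the rest of the article is about doing that on a timer.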
Polling vs. Interrupt-Driven Buffer Monitoring
Single-buffer streaming requires that the application monitor the buffer and supply it with sound data when necessary. There are two approaches to implementing buffer monitoring:
- Continuously polling the buffer.
- Periodically monitoring the buffer with an interrupt-driven routine.
The second approach, using interrupts to periodically monitor the buffer levels, is the most commonly used solution to the problem of maintaining a streaming buffer. This is the solution I chose to implement in the STREAMS sample application. The first approach, continuous polling, needlessly consumes CPU cycles.
A C++ Implementation of Streaming
The STREAMS sample application includes a C++ implementation of streaming with DirectSound. I chose to do a C++ implementation of streaming for several reasons:
- DirectSound's native interface is based on C++
- I have not seen any other C++ implementations of streaming with DirectSound
- I like to program in C++
You don't have to use C++ to work with DirectSound, but since DirectSound is based on the Component Object Model (COM), C++ is the native interface. If you choose to use C, the DirectX 2 SDK provides macros that allow you to access DirectSound methods in C-language programs. For a C-language implementation of streaming with DirectSound, check out the DSSTREAM sample in the DirectX 2 SDK.
Design Goals
My primary design goal was to create some reusable objects that implement streaming with DirectSound. I didn't want to introduce the complexities of COM or OLE, so the objects are reusable at the source-code level. I wanted the objects to have high-level interfaces and be easy to use in an application.
The STREAMS sample application uses the Microsoft Foundation Class (MFC) Library, a C++ application framework. I didn't base any of my streaming classes on MFC, so if you're using a different application framework, you should be able to reuse this code easily.
Building the STREAMS Sample Application
The STREAMS sample-application package includes source code for one target executable, STREAMS.EXE. I've included a project file for Visual C++, Version 4.0. The following table summarizes the files required to make STREAMS.EXE. If you're not using Visual C++, you can use this table to easily recreate the project in your favorite IDE.
| File | Description |
| --- | --- |
| ASSERT.C | Source file containing basic assert services. |
| DEBUG.C | Source file containing basic debug services. |
| AUDIOSTREAM.CPP | Source file containing implementation of AudioStreamServices and AudioStream objects. |
| TIMER.CPP | Source file containing implementation of Timer object. |
| WAVEFILE.CPP | Source file containing implementation of WaveFile object. |
| STREAMS.CPP | Source file for application. |
| STREAMS.RC | Resource script file. |
| WINMM.LIB | System library file. |
| DSOUND.LIB | System library file. |
The key source files are AUDIOSTREAM.CPP, TIMER.CPP, and WAVEFILE.CPP. These files contain the source for all of the objects required to implement wave streaming with DirectSound. The ASSERT.C and DEBUG.C files contain source for some simple debug and assert macros. The remaining source file, STREAMS.CPP, contains the source for a basic MFC-based application.
To build the STREAMS sample application, you'll need the Win32 SDK and the DirectX 2 SDK. To run STREAMS.EXE, you need the DirectX 2 runtime libraries and, of course, a sound card.
A Top-Down View
Before I get into the implementation of the objects that support streaming (the AudioStreamServices, AudioStream, Timer, and WaveFile objects), let's take a look at how these objects are used in the STREAMS sample application.
STREAMS is built on a basic two-object MFC model for frame window applications. The two objects are CMainWindow and CTheApp, based on CFrameWnd and CWinApp, respectively. The following is the declaration of the CMainWindow class taken from STREAMS.H:
class CMainWindow : public CFrameWnd
{
public:
AudioStreamServices * m_pass; // ptr to AudioStreamServices object
AudioStream *m_pasCurrent; // ptr to current AudioStream object
CMainWindow();
//{{AFX_MSG( CMainWindow )
afx_msg void OnAbout();
afx_msg void OnFileOpen();
afx_msg void OnTestPlay();
afx_msg void OnTestStop();
afx_msg void OnUpdateTestPlay(CCmdUI* pCmdUI);
afx_msg void OnUpdateTestStop(CCmdUI* pCmdUI);
afx_msg int OnCreate(LPCREATESTRUCT lpCreateStruct);
afx_msg void OnDestroy();
//}}AFX_MSG
DECLARE_MESSAGE_MAP()
};
Note the two data members m_pass and m_pasCurrent. These data members hold pointers to an AudioStreamServices object and an AudioStream object, respectively. For simplicity, the STREAMS sample application allows only a single wave file to be opened at a time. The m_pasCurrent member contains a pointer to an AudioStream object created from the currently open wave file.
Creating and Initializing the AudioStreamServices Object
Before a window uses streaming services, it must create an AudioStreamServices object. The following code shows how the OnCreate handler for the CMainWindow class creates and initializes an AudioStreamServices object:
int CMainWindow::OnCreate(LPCREATESTRUCT lpCreateStruct)
{
    if (CFrameWnd::OnCreate(lpCreateStruct) == -1)
        return -1;
    // Create and initialize AudioStreamServices object.
    m_pass = new AudioStreamServices;
    if (m_pass)
    {
        m_pass->Initialize (m_hWnd);
    }
    // Initialize ptr to current AudioStream object
    m_pasCurrent = NULL;
    return 0;
}
Each window using streaming services must create an AudioStreamServices object and initialize it with a window handle. This requirement comes directly from the architecture of DirectSound, which apportions services on a per-window basis so that the sounds associated with a window can be muted when the window loses focus.
Creating an AudioStream Object
Once a window has created and initialized an AudioStreamServices object, the window can create one or more AudioStream objects. The following code is the command handler for the File Open menu item:
void CMainWindow::OnFileOpen()
{
    CString cstrPath;
    // Create standard Open File dialog. The filter string must not contain
    // spaces around the '|' separators, or they become part of the pattern.
    CFileDialog * pfd
        = new CFileDialog (TRUE, NULL, NULL,
                           OFN_EXPLORER | OFN_NONETWORKBUTTON | OFN_HIDEREADONLY,
                           "Wave Files (*.wav)|*.wav||", this);
    // Show dialog
    if (pfd->DoModal () == IDOK)
    {
        // Get pathname
        cstrPath = pfd->GetPathName();
        // Delete current AudioStream object
        if (m_pasCurrent)
        {
            delete (m_pasCurrent);
        }
        // Create new AudioStream object
        m_pasCurrent = new AudioStream;
        m_pasCurrent->Create ((LPSTR)(LPCTSTR (cstrPath)), m_pass);
    }
    delete (pfd);
}
Two lines of code are required to create an AudioStream object:
m_pasCurrent = new AudioStream;
m_pasCurrent->Create ((LPSTR)(LPCTSTR (cstrPath)), m_pass);
What looks like typecasting to LPCTSTR on the cstrPath parameter is actually a CString operator that extracts a pointer to a read-only C-style null-terminated string from a CString object. You might also be wondering why I didn't just create a constructor for the AudioStream class that accepts a pointer to a filename instead of making a Create member function to take the filename. I didn't do this because the operation can fail, and in C++ you can't easily return an error code from a constructor.
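The constructor-plus-Create pattern can be shown in isolation. The Resource class below is a hypothetical sketch, not code from the STREAMS sample: the constructor only sets defaults and cannot fail, while Create does the work that can fail and reports the result as a return value.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical class showing the constructor-plus-Create pattern.
class Resource
{
public:
    Resource() : m_fp(nullptr) {}            // cannot fail: only sets defaults
    ~Resource() { Destroy(); }

    // Does the work that can fail and reports the outcome to the caller.
    bool Create(const char* pszFilename)
    {
        assert(pszFilename != nullptr);
        Destroy();                           // allow Create to be called again
        m_fp = std::fopen(pszFilename, "rb");
        return m_fp != nullptr;
    }

    void Destroy()
    {
        if (m_fp)
        {
            std::fclose(m_fp);
            m_fp = nullptr;
        }
    }

    bool IsOpen() const { return m_fp != nullptr; }

private:
    std::FILE* m_fp;
    Resource(const Resource&);               // not copyable (owns a file handle)
    Resource& operator=(const Resource&);
};
```

A caller can then test the result directly, for example `if (!res.Create("sound.wav")) { /* report the error */ }`, where the filename is only a placeholder.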
Controlling an AudioStream Object
Once you've created an AudioStream object, you can begin playback with the Play method. The following is the command handler for the Test Play menu item:
void CMainWindow::OnTestPlay()
{
    if (m_pasCurrent)
    {
        m_pasCurrent->Play ();
    }
}
And here's the command handler for the Test Stop menu item:
void CMainWindow::OnTestStop()
{
    if (m_pasCurrent)
    {
        m_pasCurrent->Stop ();
    }
}
This code is so simple, I don't think it really needs any explanation. The only control methods I implemented for AudioStream objects are Play and Stop. In a real application, you'd probably want to add some more functionality.
The Timer and WaveFile Objects
Now that I've given you a look at how to use the AudioStreamServices and AudioStream objects in an application, let's dig into their implementation. I'll begin with two helper objects, Timer and WaveFile, that are used by AudioStream objects.
The Timer Object
The Timer object is used to provide timer services that allow AudioStream objects to service the sound buffer periodically. Here's the declaration for the Timer class:
class Timer
{
public:
Timer (void);
~Timer (void);
BOOL Create (UINT nPeriod, UINT nRes, DWORD dwUser,
TIMERCALLBACK pfnCallback);
protected:
static void CALLBACK TimeProc(UINT uID, UINT uMsg, DWORD dwUser,
DWORD dw1, DWORD dw2);
TIMERCALLBACK m_pfnCallback;
DWORD m_dwUser;
UINT m_nPeriod;
UINT m_nRes;
UINT m_nIDTimer;
};
The Timer object uses the multimedia timer services provided through the Win32 timeSetEvent function. These services call a user-supplied callback function at a periodic interval specified in milliseconds. The Create member does all of the work here:
BOOL Create (UINT nPeriod, UINT nRes, DWORD dwUser, TIMERCALLBACK pfnCallback);
The nPeriod and nRes parameters specify the timer period and resolution in milliseconds. The dwUser parameter specifies a DWORD that is passed back to you with each timer callback. The pfnCallback parameter specifies the callback function. Here's the source for Create:
BOOL Timer::Create (UINT nPeriod, UINT nRes, DWORD dwUser,
                    TIMERCALLBACK pfnCallback)
{
    BOOL bRtn = SUCCESS; // assume success
    // Set data members
    m_nPeriod = nPeriod;
    m_nRes = nRes;
    m_dwUser = dwUser;
    m_pfnCallback = pfnCallback;
    // Create multimedia timer
    if ((m_nIDTimer = timeSetEvent (m_nPeriod, m_nRes, TimeProc,
                                    (DWORD) this, TIME_PERIODIC)) == NULL)
    {
        bRtn = FAILURE;
    }
    return (bRtn);
}
After stuffing the four parameters into data members, Create calls timeSetEvent and passes the this pointer as the user-supplied data to the multimedia timer callback. This data is passed back to the callback to identify which Timer object is associated with the callback.
Before I lose you here, take a look at the declaration of the Timer::TimeProc member function. It must be declared as static so that it can be used as a C-style callback for the multimedia timer set with timeSetEvent. Because TimeProc is a static member function, it's not associated with a Timer object and does not have access to the this pointer. Here's the source for TimeProc:
void CALLBACK Timer::TimeProc(UINT uID, UINT uMsg, DWORD dwUser,
                              DWORD dw1, DWORD dw2)
{
    // dwUser contains ptr to Timer object
    Timer * ptimer = (Timer *) dwUser;
    // Call user-specified callback and pass back user specified data
    (ptimer->m_pfnCallback) (ptimer->m_dwUser);
}
TimeProc contains two action-packed lines of code. The first line simply casts the dwUser parameter to a pointer to a Timer object and saves it in a local variable, ptimer. The second line of code dereferences ptimer to call the user-supplied callback and pass back the user-supplied data. I could have done away with the first line of code altogether and just cast dwUser to access the data members of the associated Timer object, but I wrote it this way to better illustrate what's going on. Note that when I say "user-supplied" here, I'm talking about the user of the Timer object, which in this case is an AudioStream object.
In similar fashion, any object that uses a Timer object must supply a callback that is a static member function and supply its this pointer as the user-supplied data for the callback. For example, here's the code from AudioStream::Play that creates the Timer object:
// Kick off timer to service buffer
m_ptimer = new Timer ();
if (m_ptimer)
{
    m_ptimer->Create (m_nBufService, m_nBufService, DWORD (this), TimerCallback);
}
And here's the static member function that serves as a callback for the Timer object:
BOOL AudioStream::TimerCallback (DWORD dwUser)
{
    // dwUser contains ptr to AudioStream object
    AudioStream * pas = (AudioStream *) dwUser;
    return (pas->ServiceBuffer ());
}
All the important work is done in the AudioStream::ServiceBuffer routine. You could move everything into AudioStream::TimerCallback, but because it's static, you'd have to use the this pointer contained in dwUser to access all class members. I think using a separate nonstatic member function results in code that is easier to read.
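The static-callback pattern used by Timer and AudioStream can be tried outside Windows. The sketch below is hypothetical: RawCallback and FireTimer stand in for a C-style timer API, and Counter plays the role of the object that wants to be called back. It uses a pointer-sized integer for the user data rather than a 32-bit DWORD, since a pointer does not fit in 32 bits on a 64-bit build.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical C-style timer API: it only knows a plain function pointer
// and one pointer-sized value of user data.
typedef void (*RawCallback)(std::uintptr_t user);

// Stand-in for the system firing the timer once.
inline void FireTimer(RawCallback pfn, std::uintptr_t user)
{
    assert(pfn != nullptr);
    pfn(user);
}

class Counter
{
public:
    Counter() : m_nTicks(0) {}

    // Registers the static thunk and passes `this` along as user data.
    void Tick()
    {
        FireTimer(&Counter::Thunk, reinterpret_cast<std::uintptr_t>(this));
    }

    int GetTicks() const { return m_nTicks; }

private:
    // Static: no `this`, so it has the plain-function signature the API wants.
    static void Thunk(std::uintptr_t user)
    {
        Counter* self = reinterpret_cast<Counter*>(user);
        self->OnTick();                      // hand off to a normal member function
    }

    void OnTick() { ++m_nTicks; }

    int m_nTicks;
};
```

Because each call carries its own object pointer, two Counter objects sharing the same static thunk keep separate tick counts.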
The WaveFile Object
In addition to an object to encapsulate multimedia timer services, I needed an object to represent a wave file, so I created the WaveFile class. The following is the class declaration for the WaveFile class:
class WaveFile
{
public:
WaveFile (void);
~WaveFile (void);
BOOL Open (LPSTR pszFilename);
BOOL Cue (void);
UINT Read (BYTE * pbDest, UINT cbSize);
UINT GetNumBytesRemaining (void) { return (m_nDataSize - m_nBytesPlayed); }
UINT GetAvgDataRate (void) { return (m_nAvgDataRate); }
UINT GetDataSize (void) { return (m_nDataSize); }
UINT GetNumBytesPlayed (void) { return (m_nBytesPlayed); }
UINT GetDuration (void) { return (m_nDuration); }
BYTE GetSilenceData (void);
WAVEFORMATEX * m_pwfmt;
protected:
HMMIO m_hmmio;
MMRESULT m_mmr;
MMCKINFO m_mmckiRiff;
MMCKINFO m_mmckiFmt;
MMCKINFO m_mmckiData;
UINT m_nDuration; // duration of sound in msec
UINT m_nBlockAlign; // wave data block alignment spec
UINT m_nAvgDataRate; // average wave data rate
UINT m_nDataSize; // size of data chunk
UINT m_nBytesPlayed; // offset into data chunk
};
This class was designed expressly to stream wave file data, hence there are none of the traditional file I/O functions for operations such as seeking, writing, and creating new files. The following table describes the purpose of each of the member functions in the WaveFile class:
| Function | Description |
| --- | --- |
| Open | Opens a wave file. |
| Cue | Cues a wave file for playback. |
| Read | Reads a given number of data bytes. |
| GetNumBytesRemaining | Returns the number of data bytes remaining to be read. |
| GetAvgDataRate | Returns the average data rate in bytes per second. |
| GetDataSize | Returns the total number of wave data bytes. |
| GetNumBytesPlayed | Returns the number of data bytes that have been read. |
| GetDuration | Gets the duration of the wave file in milliseconds. |
| GetSilenceData | Returns a byte of data representing silence. |
I chose to use the Win32 Multimedia File I/O services (MMIO) for implementation of WaveFile objects because these services take care of the basics of parsing the chunks in Resource Interchange File Format (RIFF) files. Since the point of this article is to explain streaming with DirectSound, I'm not going to explain the WaveFile code in detail. Take my word for it: the biggest challenge in writing this code was properly handling the myriad of errors that can occur when accessing files.
Silence, Please!
There is one detail I do want to explain. Implementing the AudioStream class required that blocks of data representing silence be written to the sound buffer (if you read the remainder of this article, you'll learn why). Since the data representing silence depends on the format of the wave file, I added a GetSilenceData member function to the WaveFile class. Word size for pulse-code modulation (PCM) formats can range from one byte for 8-bit mono to four bytes for 16-bit stereo, as shown in the following table.
| PCM Format | Word Size | Silence Data |
| --- | --- | --- |
| 8-bit mono | 1 byte | 0x80 |
| 8-bit stereo | 2 bytes | 0x8080 |
| 16-bit mono | 2 bytes | 0x0000 |
| 16-bit stereo | 4 bytes | 0x00000000 |
Rather than make the AudioStream code deal with the different word sizes for different wave file formats, I took advantage of the fact that regardless of word size, silence data for PCM formats can be represented by a single repeated byte. Thus, the GetSilenceData function returns a BYTE. This shortcut saved me from having to write a lot of extra code.
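The single-byte shortcut follows from the table: 8-bit PCM is unsigned with its midpoint at 0x80, and 16-bit PCM is signed with its midpoint at zero, so every byte of a silent block is the same value regardless of channel count. A minimal sketch, using the hypothetical helper names SilenceByte and FillSilence rather than the actual WaveFile code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns the byte that represents silence for PCM data of the given
// sample width. 8-bit PCM is unsigned (midpoint 0x80); 16-bit is signed (0).
std::uint8_t SilenceByte(unsigned bitsPerSample)
{
    assert(bitsPerSample == 8 || bitsPerSample == 16);
    return bitsPerSample == 8 ? 0x80 : 0x00;
}

// Fills cb bytes with silence; the channel count does not matter because
// every byte of a silent block has the same value.
void FillSilence(std::uint8_t* pbDest, std::size_t cb, unsigned bitsPerSample)
{
    std::memset(pbDest, SilenceByte(bitsPerSample), cb);
}
```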
The AudioStreamServices Object
The DirectSound interface consists of two objects, IDirectSound and IDirectSoundBuffer. The IDirectSound object represents the DirectSound services for a single window. Services are apportioned on a per-window basis to facilitate muting a sound stream when a window loses the input focus. I created the AudioStreamServices class to wrap the IDirectSound object:
class AudioStreamServices
{
public:
AudioStreamServices (void);
~AudioStreamServices (void);
BOOL Initialize (HWND hwnd);
LPDIRECTSOUND GetPDS (void) { return m_pds; }
protected:
HWND m_hwnd;
LPDIRECTSOUND m_pds;
};
As you can see, this is a pretty light class. In addition to a constructor and destructor, there are two member functions, Initialize and GetPDS. The GetPDS function returns the pointer to the IDirectSound object created by the Initialize function. The Initialize function takes a window handle and creates and initializes an IDirectSound object. Here's the code for the Initialize function:
// Initialize
BOOL AudioStreamServices::Initialize (HWND hwnd)
{
    BOOL fRtn = SUCCESS; // assume success
    if (m_pds == NULL)
    {
        if (hwnd)
        {
            m_hwnd = hwnd;
            // Create IDirectSound object
            if (DirectSoundCreate (NULL, &m_pds, NULL) == DS_OK)
            {
                // Set cooperative level for DirectSound. Normal means our
                // sounds will be silenced when our window loses input focus.
                if (m_pds->SetCooperativeLevel (m_hwnd, DSSCL_NORMAL) == DS_OK)
                {
                    // Any additional initialization goes here
                }
                else
                {
                    // Error
                    DOUT ("ERROR: Unable to set cooperative level\n\r");
                    fRtn = FAILURE;
                }
            }
            else
            {
                // Error
                DOUT ("ERROR: Unable to create IDirectSound object\n\r");
                fRtn = FAILURE;
            }
        }
        else
        {
            // Error, invalid hwnd
            DOUT ("ERROR: Invalid hwnd, unable to initialize services\n\r");
            fRtn = FAILURE;
        }
    }
    return (fRtn);
}
The Initialize function creates an IDirectSound object by calling the DirectSoundCreate function. The first parameter to the DirectSoundCreate call is NULL to request the default DirectSound device. The second parameter is a pointer to a location that DirectSoundCreate fills with a pointer to an IDirectSound object. The pointer returned by DirectSoundCreate provides an interface for accessing IDirectSound member functions.
After successfully creating an IDirectSound object, the Initialize code calls the SetCooperativeLevel member function, specifying the DSSCL_NORMAL flag to set the normal cooperative level. This is the lowest cooperative level--other levels are available if you require more control of DirectSound's buffers. For example, in normal cooperative level, the format of audio output is always 8-bit 22 kHz mono. To change to another output format, you have to set the priority cooperative level (DSSCL_PRIORITY) and call the SetFormat function.
The AudioStream Object
Now we're down to the good stuff. I've explained how to use AudioStreamServices and AudioStream objects in an application. I've described the Timer and WaveFile objects that are used to provide periodic timer services and read wave files. Now I'm going to explain the implementation of the AudioStream object, the object that actually streams wave files using DirectSound. Here's the AudioStream class declaration:
class AudioStream
{
public:
AudioStream (void);
~AudioStream (void);
BOOL Create (LPSTR pszFilename, AudioStreamServices * pass);
BOOL Destroy (void);
void Play (void);
void Stop (void);
protected:
void Cue (void);
BOOL WriteWaveData (UINT cbSize);
BOOL WriteSilence (UINT cbSize);
DWORD GetMaxWriteSize (void);
BOOL ServiceBuffer (void);
static BOOL TimerCallback (DWORD dwUser);
AudioStreamServices * m_pass; // ptr to AudioStreamServices object
LPDIRECTSOUNDBUFFER m_pdsb; // sound buffer
WaveFile * m_pwavefile; // ptr to WaveFile object
Timer * m_ptimer; // ptr to Timer object
BOOL m_fCued; // semaphore (stream cued)
BOOL m_fPlaying; // semaphore (stream playing)
DSBUFFERDESC m_dsbd; // sound buffer description
LONG m_lInService; // reentrancy semaphore
UINT m_cbBufOffset; // last write position
UINT m_nBufLength; // length of sound buffer in msec
UINT m_cbBufSize; // size of sound buffer in bytes
UINT m_nBufService; // service interval in msec
UINT m_nDuration; // duration of wave file
UINT m_nTimeStarted; // time (in system time) playback started
UINT m_nTimeElapsed; // elapsed time in msec since playback started
};
In addition to a standard constructor and destructor, there are four public interface methods: Create, Destroy, Play, and Stop. The purpose of these methods should be obvious from the names I've given them.
The main players here are the Create and Play methods, and a third method, ServiceBuffer, that is not part of the public interface. Here is an explanation of the role each of these methods plays in streaming wave files:
- Create opens a wave file, creates a sound buffer, and cues the stream for playback.
- Play begins DirectSound playback and launches a timer to service the sound buffer.
- ServiceBuffer determines how much of the sound buffer is free and fills the free space with wave data (or with silence data if all wave data has been sent to the buffer). ServiceBuffer also maintains an elapsed time count and stops playback when all of the wave file has been played.
Creating the Sound Buffer
Before creating a sound buffer, you must open the wave file to determine its format, average data rate, and duration. Here's the corresponding code from the Create method:
// Create a new WaveFile object
if (m_pwavefile = new WaveFile)
{
    // Open given file
    if (m_pwavefile->Open (pszFilename))
    {
        // Calculate sound buffer size in bytes
        m_cbBufSize = (m_pwavefile->GetAvgDataRate () * m_nBufLength) / 1000;
        m_cbBufSize = (m_cbBufSize > m_pwavefile->GetDataSize ())
                      ? m_pwavefile->GetDataSize ()
                      : m_cbBufSize;
        // Get duration of sound (in milliseconds)
        m_nDuration = m_pwavefile->GetDuration ();
        . . .
    }
}
After opening the file, Create determines the required size of the sound buffer and the duration of the sound. The size of the sound buffer is calculated from the average data rate and the default buffer length in milliseconds (the m_nBufLength data member). The default buffer length is set to a constant in the AudioStream constructor. I chose to use a two-second sound buffer, but it's a good idea to experiment with your particular application. The timer interval for servicing the sound buffer should be no more than half of the buffer length. I used a 250-millisecond service interval, one-eighth the length of the sound buffer. You can adjust the buffer length and buffer service intervals in the STREAMS sample application by changing the DefBufferLength and DefBufferServiceInterval constants in the AUDIOSTREAM.CPP file:
const UINT DefBufferLength = 2000;
const UINT DefBufferServiceInterval = 250;
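The buffer-size arithmetic is easy to check in isolation. The sketch below restates it as two hypothetical helper functions (CalcBufferSize and ServiceIntervalOk are names invented here, not part of the sample). It uses a 64-bit intermediate so that the product of data rate and buffer length cannot overflow, which is an addition of this sketch rather than something the original code does.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold bufferMs of audio at avgBytesPerSec, never larger
// than the file's data chunk. The 64-bit intermediate avoids overflow.
std::uint32_t CalcBufferSize(std::uint32_t avgBytesPerSec,
                             std::uint32_t bufferMs,
                             std::uint32_t dataSize)
{
    assert(bufferMs > 0);
    std::uint64_t cb = (static_cast<std::uint64_t>(avgBytesPerSec) * bufferMs) / 1000;
    return cb > dataSize ? dataSize : static_cast<std::uint32_t>(cb);
}

// Rule of thumb from the text: service the buffer at least twice per
// buffer length.
bool ServiceIntervalOk(std::uint32_t bufferMs, std::uint32_t serviceMs)
{
    return serviceMs > 0 && serviceMs * 2ull <= bufferMs;
}
```

For example, 16-bit stereo audio at 44.1 kHz has a data rate of 176,400 bytes per second, so a two-second buffer needs 352,800 bytes. A file smaller than that simply gets a buffer the size of its data chunk.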
After successfully opening the wave file and calculating the required buffer size, Create creates a DirectSound sound buffer by initializing a DSBUFFERDESC structure and calling IDirectSound::CreateSoundBuffer:
// Create sound buffer
HRESULT hr;
memset (&m_dsbd, 0, sizeof (DSBUFFERDESC));
m_dsbd.dwSize = sizeof (DSBUFFERDESC);
m_dsbd.dwBufferBytes = m_cbBufSize;
m_dsbd.lpwfxFormat = m_pwavefile->m_pwfmt;
hr = (m_pass->GetPDS ())->CreateSoundBuffer (&m_dsbd, &m_pdsb, NULL);
The lpwfxFormat element of the DSBUFFERDESC structure points to a WAVEFORMATEX structure specifying the format of the wave file. Currently, DirectSound will not play compressed wave formats. The CreateSoundBuffer method will fail for any formats that are not PCM. Note that no flags are specified for DSBUFFERDESC.dwFlags. This causes CreateSoundBuffer to create a looping secondary buffer, which is the proper type of buffer for streaming.
Filling the Sound Buffer with Wave Data
After successfully creating the sound buffer, Create calls the AudioStream::Cue method to prepare the stream for playback. Cue resets the buffer pointers and the file pointer and then calls AudioStream::WriteWaveData to fill the buffer with data from the wave file. The following is the source for WriteWaveData:
BOOL AudioStream::WriteWaveData (UINT size)
{
    HRESULT hr;
    LPBYTE lpbuf1 = NULL;
    LPBYTE lpbuf2 = NULL;
    DWORD dwsize1 = 0;
    DWORD dwsize2 = 0;
    DWORD dwbyteswritten1 = 0;
    DWORD dwbyteswritten2 = 0;
    BOOL fRtn = SUCCESS;
    // Lock the sound buffer
    hr = m_pdsb->Lock (m_cbBufOffset, size, &lpbuf1, &dwsize1, &lpbuf2, &dwsize2, 0);
    if (hr == DS_OK)
    {
        // Write data to sound buffer. Because the sound buffer is circular,
        // we may have to do two write operations if locked portion of buffer
        // wraps around to start of buffer.
        ASSERT (lpbuf1);
        if ((dwbyteswritten1 = m_pwavefile->Read (lpbuf1, dwsize1)) == dwsize1)
        {
            // Second write required?
            if (lpbuf2)
            {
                if ((dwbyteswritten2 = m_pwavefile->Read (lpbuf2, dwsize2)) == dwsize2)
                {
                    // Both write operations successful!
                }
                else
                {
                    // Error, didn't read wave data completely
                    fRtn = FAILURE;
                }
            }
        }
        else
        {
            // Error, didn't read wave data completely
            fRtn = FAILURE;
        }
        // Update our buffer offset and unlock sound buffer
        m_cbBufOffset = (m_cbBufOffset + dwbyteswritten1 + dwbyteswritten2)
                        % m_cbBufSize;
        m_pdsb->Unlock (lpbuf1, dwbyteswritten1, lpbuf2, dwbyteswritten2);
    }
    else
    {
        // Error locking sound buffer
        fRtn = FAILURE;
    }
    return (fRtn);
}
WriteWaveData reads a given number of data bytes from the wave file and writes the data to the sound buffer. To write data to a DirectSound sound buffer, you must first call the IDirectSoundBuffer::Lock method to get write pointers. No, that's not a typo: Lock returns two pointers. Usually, the second pointer will be returned as NULL, but if the write operation spans the end of the buffer, the second pointer will be a valid address (the beginning of the buffer). That's the nature of circular buffers. No problem though; the resulting code is still pretty simple and straightforward.
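The reason a lock can hand back two regions is plain arithmetic on the circular buffer. The following sketch computes the same split in portable C++; LockedRegion, SplitWrite, and AdvanceOffset are hypothetical names for illustration and are not part of DirectSound, which does this work inside Lock.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of the one or two spans a wrapped write touches.
struct LockedRegion
{
    std::size_t offset1, size1;   // first span: from the write offset toward the end
    std::size_t offset2, size2;   // second span: from the start of the buffer (size 0 if unused)
};

// Splits a write of cb bytes at `offset` in a circular buffer of bufSize bytes.
LockedRegion SplitWrite(std::size_t bufSize, std::size_t offset, std::size_t cb)
{
    assert(bufSize > 0 && offset < bufSize && cb <= bufSize);
    LockedRegion r;
    std::size_t toEnd = bufSize - offset;
    r.offset1 = offset;
    r.size1 = cb < toEnd ? cb : toEnd;
    r.offset2 = 0;
    r.size2 = cb - r.size1;       // whatever did not fit wraps to the start
    return r;
}

// New write offset after the write, using the same modulo update that
// WriteWaveData applies to m_cbBufOffset.
std::size_t AdvanceOffset(std::size_t bufSize, std::size_t offset, std::size_t cb)
{
    assert(bufSize > 0);
    return (offset + cb) % bufSize;
}
```

Writing 300 bytes at offset 800 of a 1000-byte buffer yields a first span of 200 bytes and a second span of 100 bytes, and the next write begins at offset 100.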
Beginning Playback
The AudioStream::Play method begins playback by calling the IDirectSoundBuffer::Play method and creating a timer to service the sound buffer:
// Begin DirectSound playback
HRESULT hr = m_pdsb->Play (0, 0, DSBPLAY_LOOPING);
if (hr == DS_OK)
{
    // Save current time (for elapsed time calculation)
    m_nTimeStarted = timeGetTime ();
    // Kick off timer to service buffer
    m_ptimer = new Timer ();
    if (m_ptimer)
    {
        m_ptimer->Create (m_nBufService, m_nBufService, DWORD (this),
                          TimerCallback);
    }
    . . .
}
Note that the call to IDirectSoundBuffer::Play includes the DSBPLAY_LOOPING flag to specify that playback continue until explicitly stopped. Play also sets the m_nTimeStarted data member to the current system time (in milliseconds) to allow calculation of the time that has elapsed since playback was started.
Servicing the Sound Buffer
The Timer object created by AudioStream::Play periodically calls the ServiceBuffer routine to perform the following tasks:
- Maintain an elapsed time count.
- Determine if playback is complete and stop if necessary.
- Fill sound buffer with more wave data or with silence data if all wave data has been sent to buffer.
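The third task in this list reduces to a small decision: given the free space in the buffer and the wave data still unread, how many bytes of each kind should be written? The sketch below isolates that decision as a pure function. FillPlan and PlanFill are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: how to fill the free space in the buffer.
struct FillPlan
{
    std::uint32_t cbWave;      // bytes to copy from the wave file
    std::uint32_t cbSilence;   // bytes of silence to pad with
};

// Uses as much wave data as is available and pads the rest with silence,
// so the free space is always filled completely.
FillPlan PlanFill(std::uint32_t cbFree, std::uint32_t cbRemaining)
{
    FillPlan plan;
    plan.cbWave = cbRemaining < cbFree ? cbRemaining : cbFree;
    plan.cbSilence = cbFree - plan.cbWave;
    assert(plan.cbWave + plan.cbSilence == cbFree);
    return plan;
}
```

The three outcomes (all wave data, all silence, or a mix of both) correspond to the three cases a service routine has to handle.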
The following is the complete source for ServiceBuffer:
LONG lInService = FALSE; // reentrancy semaphore
BOOL AudioStream::ServiceBuffer (void)
{
    BOOL fRtn = TRUE;
    // Check for reentrance
    if (InterlockedExchange (&lInService, TRUE) == FALSE)
    {   // Not reentered, proceed normally
        // Maintain elapsed time count
        m_nTimeElapsed = timeGetTime () - m_nTimeStarted;
        // Stop if all of sound has played
        if (m_nTimeElapsed < m_nDuration)
        {
            // All of sound not played yet, send more data to buffer
            // Determine free space in sound buffer
            DWORD dwFreeSpace = GetMaxWriteSize ();
            if (dwFreeSpace)
            {
                // See how much wave data remains to be sent to buffer
                DWORD dwDataRemaining = m_pwavefile->GetNumBytesRemaining ();
                if (dwDataRemaining == 0)
                {   // All wave data has been sent to buffer
                    // Fill free space with silence
                    if (WriteSilence (dwFreeSpace) == FAILURE)
                    {   // Error writing silence data
                        fRtn = FALSE;
                    }
                }
                else if (dwDataRemaining >= dwFreeSpace)
                {   // Enough wave data remains to fill free space in buffer
                    // Fill free space in buffer with wave data
                    if (WriteWaveData (dwFreeSpace) == FAILURE)
                    {   // Error writing wave data
                        fRtn = FALSE;
                    }
                }
                else
                {   // Some wave data remains, but not enough to fill free space
                    // Write wave data, fill remainder of free space with silence
                    if (WriteWaveData (dwDataRemaining) == SUCCESS)
                    {
                        if (WriteSilence (dwFreeSpace - dwDataRemaining) == FAILURE)
                        {   // Error writing silence data
                            fRtn = FALSE;
                        }
                    }
                    else
                    {   // Error writing wave data
                        fRtn = FALSE;
                    }
                }
            }
            else
            {   // No free space in buffer for some reason
                fRtn = FALSE;
            }
        }
        else
        {   // All of sound has played, stop playback
            Stop ();
        }
        // Reset reentrancy semaphore
        InterlockedExchange (&lInService, FALSE);
    }
    else
    {   // Service routine reentered. Do nothing, just return
        fRtn = FALSE;
    }
    return (fRtn);
}
I feel like the code pretty much speaks for itself here (that's why I included all of this rather lengthy routine). There are a couple of things I want to explain, however. The first is the call to InterlockedExchange. This is a nifty Win32 synchronization mechanism that I'm using to detect whether the ServiceBuffer routine is reentered. It's possible that you could still be servicing the buffer when another timer interrupt comes along. If ServiceBuffer is reentered, it simply returns immediately without doing anything.
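The same test-and-set idea can be sketched in portable C++ with std::atomic, which makes it easy to experiment with outside of Windows. This is an illustration only, not the STREAMS code: g_inService and ServiceOnce are hypothetical names, and the counter stands in for the real buffer-filling work.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the lInService flag used by ServiceBuffer.
static std::atomic<bool> g_inService{false};

// Returns true if this call acquired the guard and did the work,
// false if another call was already in progress (reentered).
bool ServiceOnce(int& workCounter)
{
    // exchange() atomically sets the flag and returns its previous value,
    // which is the role InterlockedExchange plays in ServiceBuffer.
    if (g_inService.exchange(true))
    {
        return false;   // already being serviced; do nothing
    }

    ++workCounter;      // placeholder for filling the sound buffer

    // Release the guard; the flag must still have been set by this call.
    bool wasHeld = g_inService.exchange(false);
    assert(wasHeld);
    (void)wasHeld;
    return true;
}
```

Because the test and the set happen in one atomic step, two timer callbacks that arrive at nearly the same moment can never both see the flag as clear.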
I also want to explain why you need to write silence data to the sound buffer. DirectSound has no concept of when playback of a wave file is complete; it just happily cycles through the sound buffer playing whatever data is there until it's told to stop. The ServiceBuffer routine keeps track of how much time has elapsed since playback was started and stops playback as soon as enough time has elapsed to play the entire wave file. Since you can't stop playback at the exact millisecond that the last wave data byte is played, you have to follow the wave data with data representing silence. If you don't do this, DirectSound plays whatever stale data is left in the buffer, and you get a random blip of sound at the end of the wave file.
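What "data representing silence" looks like depends on the sample size. For uncompressed PCM, 8-bit samples are unsigned with the midpoint at 128, while 16-bit samples are signed, so silence is zero. Here is a minimal sketch under those assumptions. SilenceByte and FillSilence are hypothetical helper names, and the sketch writes into an ordinary memory block rather than a locked DirectSound buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the byte value that represents silence in PCM data.
// 8-bit PCM is unsigned (midpoint 0x80); 16-bit PCM is signed (zero).
inline std::uint8_t SilenceByte(unsigned bitsPerSample)
{
    assert(bitsPerSample == 8 || bitsPerSample == 16);
    return bitsPerSample == 8 ? 0x80 : 0x00;
}

// Hypothetical helper: fill cb bytes at dest with silence for the format.
inline void FillSilence(std::uint8_t* dest, std::size_t cb,
                        unsigned bitsPerSample)
{
    std::memset(dest, SilenceByte(bitsPerSample), cb);
}
```

Filling an 8-bit buffer with zeros instead of 0x80 would produce a full-scale negative offset rather than silence, which is audible as a click.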
Managing the Read-and-Write Cursors
Two offsets are required to manage data in a circular buffer. Traditionally these offsets are called the head and the tail of the buffer. I can never remember which is the head and which is the tail, so I like to call these two offsets the "read cursor" and the "write cursor." In this case, the read cursor identifies the location in the buffer where DirectSound is reading wave data and the write cursor identifies the location where we need to write the next block of wave data.
If you take a look at the IDirectSoundBuffer::GetCurrentPosition method, you'll see that it returns a read cursor and a write cursor. Looks easy enough. At least that's what I thought, but that's not exactly correct. It took me several days of hair-pulling to figure out that the write cursor returned by GetCurrentPosition was not the write cursor I needed to manage a sound buffer. Don't you hate it when things don't work like you want them to?
To manage a sound buffer with DirectSound, you must maintain your own write cursor. In the AudioStream class I represent the write cursor with the m_cbBufOffset data member. Each time you write wave data to the sound buffer, you must increment m_cbBufOffset and check to see if it has wrapped around to the beginning of the buffer. It's not difficult code to write, but it certainly took me a while to discover that I couldn't use the write cursor provided by DirectSound! The following code is a helper method called by ServiceBuffer to determine how much of the sound buffer has already been played (in other words, how much data can be written to the sound buffer):
DWORD AudioStream::GetMaxWriteSize (void)
{
DWORD dwWriteCursor, dwPlayCursor, dwMaxSize;
// Get current play position
if (m_pdsb->GetCurrentPosition (&dwPlayCursor, &dwWriteCursor) == DS_OK)
{
if (m_cbBufOffset <= dwPlayCursor)
{
// Our write position trails play cursor
dwMaxSize = dwPlayCursor - m_cbBufOffset;
}
else // (m_cbBufOffset > dwPlayCursor)
{
// Play cursor has wrapped
dwMaxSize = m_cbBufSize - m_cbBufOffset + dwPlayCursor;
}
}
else
{
// GetCurrentPosition call failed
ASSERT (0);
dwMaxSize = 0;
}
return (dwMaxSize);
}
GetMaxWriteSize provides a good illustration of how to manage the read and write cursors. You may also want to look at the WriteWaveData method presented earlier and see how m_cbBufOffset is used with the IDirectSoundBuffer::Lock method to get an actual write pointer in the sound buffer.
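To make the cursor arithmetic easy to check by hand, here is the same logic written as free functions over plain integers. This is a sketch, not the AudioStream code: FreeSpace mirrors the two branches of GetMaxWriteSize, and AdvanceWriteOffset shows one way to do the increment-and-wrap step for your own write cursor. Both names are hypothetical, and the sketch assumes the buffer is small enough that the addition cannot overflow 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Bytes that can be written starting at writeOffset without overtaking
// the play cursor. Mirrors the two branches of GetMaxWriteSize.
std::uint32_t FreeSpace(std::uint32_t writeOffset, std::uint32_t playCursor,
                        std::uint32_t bufSize)
{
    assert(writeOffset < bufSize && playCursor < bufSize);
    if (writeOffset <= playCursor)
    {
        // Our write position trails the play cursor
        return playCursor - writeOffset;
    }
    // Free space runs to the end of the buffer, then from the start
    // of the buffer up to the play cursor
    return bufSize - writeOffset + playCursor;
}

// New write offset after writing cbWritten bytes, wrapping at bufSize.
std::uint32_t AdvanceWriteOffset(std::uint32_t writeOffset,
                                 std::uint32_t cbWritten,
                                 std::uint32_t bufSize)
{
    assert(bufSize > 0 && cbWritten <= bufSize);
    return (writeOffset + cbWritten) % bufSize;
}
```

With an 8000-byte buffer, a write offset of 7000 and a play cursor of 1000 leave 2000 bytes free (1000 at the end of the buffer plus 1000 at the start), and writing those 2000 bytes moves the write offset to 1000.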
Now I'll bet you're wondering what the deal is with the write cursor maintained by DirectSound. No, it's not broken; that's the way it was designed to operate! DirectSound's write cursor specifies the position in the buffer where it is safe to write data. During playback, DirectSound won't allow you to write to the section of the sound buffer that begins with its play cursor and ends with its write cursor. Typically, this is about 15 milliseconds' worth of data. DirectSound does not change its write cursor when you write data to a sound buffer; the write cursor always tracks the play cursor and leads it by about 15 milliseconds during playback.
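The geometry of that off-limits section is easy to get wrong when it straddles the end of the buffer, so here is a small sketch of the test. InUnsafeRegion is a hypothetical helper for illustration only. DirectSound enforces the restriction itself, and nothing in the STREAMS sample needs to call a function like this.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if offset lies in the section that begins at
// DirectSound's play cursor and ends just before its write cursor.
bool InUnsafeRegion(std::uint32_t offset, std::uint32_t playCursor,
                    std::uint32_t dsWriteCursor, std::uint32_t bufSize)
{
    assert(offset < bufSize && playCursor < bufSize && dsWriteCursor < bufSize);
    if (playCursor <= dsWriteCursor)
    {
        // Region is one contiguous span inside the buffer
        return offset >= playCursor && offset < dsWriteCursor;
    }
    // Region wraps past the end of the buffer back to the beginning
    return offset >= playCursor || offset < dsWriteCursor;
}
```

For a 44.1 kHz 16-bit stereo stream, 15 milliseconds is 2646 bytes, so with the play cursor at byte 1000 the section covers bytes 1000 through 3645.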
Quick Fix: A Summary of Streaming with DirectSound
The following list summarizes what you need to know about streaming wave files with DirectSound:
- DirectSound uses a single sound buffer. For streaming, you need to create a looping secondary buffer by calling the IDirectSound::CreateSoundBuffer method without specifying either the DSBCAPS_STATIC or DSBCAPS_PRIMARYBUFFER flags in the DSBUFFERDESC structure.
- The required size of the sound buffer depends on the format of the wave file you are streaming. For example, a 44.1 kHz 16-bit stereo file will require a much larger sound buffer than an 11.025 kHz 8-bit mono file. I recommend using a one- or two-second sound buffer.
- Use the Win32 multimedia timer services to provide a periodic timer interrupt to service the sound buffer. The timer interval depends on the size of the sound buffer and the data rate of the wave file you are streaming. I recommend using a timer interval that is one-fourth the playing time of your sound buffer. For example, with a two-second sound buffer, use a timer interval of 500 milliseconds.
- There are two pointers used to manage the contents of the sound buffer, a play cursor and a write cursor. DirectSound maintains the play cursor, which you can obtain with the IDirectSoundBuffer::GetCurrentPosition method. You must maintain your own write cursor to determine how much wave data to write into the buffer and where to write the data. Don't use the write cursor maintained by DirectSound for this purpose.
- DirectSound will continue to play the contents of the sound buffer until you tell it to stop. After you've written all of the wave file data into the sound buffer, you must write data representing silence to the buffer until you determine that all of the wave file data has been played. To determine when all of the data has been played, calculate the duration of the wave file and keep track of how much time has elapsed since you began playback.
- DirectSound only plays PCM data formats. Compressed wave formats are not supported. To play compressed wave data, you must first expand the data into PCM format before writing the data to a DirectSound sound buffer.
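The sizing recommendations in the list above come down to a little arithmetic on the PCM format. The sketch below works through it in plain C++. PcmFormat and the function names are hypothetical and are not taken from the STREAMS sample; a real application would read the same fields from the wave file's WAVEFORMATEX header.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of an uncompressed PCM format.
struct PcmFormat
{
    std::uint32_t samplesPerSec;   // e.g., 44100 or 11025
    std::uint16_t channels;        // 1 = mono, 2 = stereo
    std::uint16_t bitsPerSample;   // 8 or 16
};

// Data rate of the stream in bytes per second.
inline std::uint32_t BytesPerSecond(const PcmFormat& f)
{
    assert(f.bitsPerSample % 8 == 0);
    return f.samplesPerSec * f.channels * (f.bitsPerSample / 8u);
}

// Size in bytes of a sound buffer that holds the given playing time.
inline std::uint32_t BufferBytes(const PcmFormat& f, std::uint32_t seconds)
{
    return BytesPerSecond(f) * seconds;
}

// Timer interval: one-fourth of the buffer's playing time.
inline std::uint32_t TimerIntervalMs(std::uint32_t bufferSeconds)
{
    return bufferSeconds * 1000u / 4u;
}

// Playing time in milliseconds of cbData bytes of wave data.
inline std::uint32_t DurationMs(const PcmFormat& f, std::uint32_t cbData)
{
    // 64-bit intermediate avoids overflow for large wave files
    return static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(cbData) * 1000u / BytesPerSecond(f));
}
```

A two-second buffer needs 352,800 bytes for 44.1 kHz 16-bit stereo but only 22,050 bytes for 11.025 kHz 8-bit mono, which is why the required buffer size depends so heavily on the format.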