Overview
The MsgProcessor class is responsible for embedding binary secret messages into the hidden representations produced by the encoder. It converts binary messages into learned embeddings that are added to the audio features before watermark generation.
The MsgProcessor is an internal component of AudioSealWM and is typically not instantiated directly by users.
Initialization
Parameters
nbits (int): Number of bits in the secret message. Must be greater than 0. This determines the capacity of the watermark (e.g., 16 bits = 65,536 unique messages).
hidden_size (int): Dimension of the encoder output features. Must match the encoder’s output dimension so that the message embedding can be added to the hidden representation.
Methods
forward
Embed a binary message into the encoder’s hidden representation.
Parameters
hidden: Encoder output tensor of shape (batch, hidden_size, frames). This is the intermediate audio representation before watermark generation.
msg: Binary message tensor of shape (batch, nbits). Each value must be 0 or 1, representing the bits of the secret message.
Returns
Modified hidden representation of shape (batch, hidden_size, frames) with the message embedded. This tensor is then passed to the decoder to generate the watermark.
How It Works
The MsgProcessor uses an embedding layer to encode messages:
- Embedding Creation: For each bit position i in the message, two embeddings are learned:
  - Embedding for bit value 0 at position i, stored at index 2*i
  - Embedding for bit value 1 at position i, stored at index 2*i + 1
- Message Encoding: The binary message selects one of the two embeddings at each bit position.
- Feature Addition: The sum of all selected embeddings is added to every frame of the hidden representation.
As a result:
- The message is embedded uniformly across all time frames
- Each bit contributes independently to the watermark
- The embeddings are learned during training, so the model can optimize them for robustness
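The index layout described above can be traced by hand. The following is a minimal pure-Python sketch, not the library's code; the helper name `embedding_indices` is hypothetical and only illustrates how a bit at position i selects row 2*i or 2*i + 1.

```python
from typing import List


def embedding_indices(bits: List[int]) -> List[int]:
    """Map each bit to its row in a (2 * nbits)-row embedding table.

    Hypothetical helper: position i with value b selects row 2*i + b.
    """
    for b in bits:
        if b not in (0, 1):
            raise ValueError("message bits must be 0 or 1")
    return [2 * i + b for i, b in enumerate(bits)]


message = [1, 0, 1, 1]
indices = embedding_indices(message)
print(indices)  # [1, 2, 5, 7]
```

Every position owns its own pair of rows, so two different messages of the same length never select the same set of rows.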
Example Usage
While you typically don’t use MsgProcessor directly, here’s how it works internally:
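The sketch below re-implements the behavior described on this page in PyTorch so that the shapes can be checked end to end. It is an illustration under assumptions, not AudioSeal's source: the class name `SketchMsgProcessor` and its `embedding` attribute are hypothetical, and the sizes (16 bits, 128 channels, 50 frames) are arbitrary.

```python
import torch
from torch import nn


class SketchMsgProcessor(nn.Module):
    """Illustrative stand-in for the behavior described on this page."""

    def __init__(self, nbits: int, hidden_size: int):
        super().__init__()
        if nbits <= 0:
            raise ValueError("nbits must be greater than 0")
        self.nbits = nbits
        self.hidden_size = hidden_size
        # Two learned rows per bit position: row 2*i for 0, row 2*i + 1 for 1.
        self.embedding = nn.Embedding(2 * nbits, hidden_size)

    def forward(self, hidden: torch.Tensor, msg: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, hidden_size, frames); msg: (batch, nbits) of 0/1 values.
        positions = 2 * torch.arange(msg.shape[-1], device=msg.device)
        indices = (positions.unsqueeze(0) + msg).long()  # (batch, nbits)
        selected = self.embedding(indices)  # (batch, nbits, hidden_size)
        summed = selected.sum(dim=1)  # (batch, hidden_size)
        # The trailing axis of size 1 broadcasts over all frames.
        return hidden + summed.unsqueeze(-1)


torch.manual_seed(0)
processor = SketchMsgProcessor(nbits=16, hidden_size=128)
hidden = torch.randn(2, 128, 50)
msg = torch.randint(0, 2, (2, 16))
out = processor(hidden, msg)
print(out.shape)  # torch.Size([2, 128, 50])
```

The output keeps the shape of `hidden`, and the added offset is identical in every frame, which matches the uniform embedding across time described above.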
Integration in AudioSealWM
The MsgProcessor is integrated into the generator pipeline:
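As a rough picture of that pipeline, the sketch below chains an encoder, an optional message step, and a decoder. Everything in it is a stand-in chosen so the shapes line up: `watermark_sketch`, `embed`, and the two convolution layers are hypothetical and are not AudioSeal's networks or method names. Raising an error when a message is missing is a choice made for this sketch only.

```python
from typing import Callable, Optional

import torch
from torch import nn

MsgFn = Callable[[torch.Tensor, torch.Tensor], torch.Tensor]


def watermark_sketch(
    x: torch.Tensor,
    encoder: nn.Module,
    decoder: nn.Module,
    msg_processor: Optional[MsgFn] = None,
    message: Optional[torch.Tensor] = None,
) -> torch.Tensor:
    """Hypothetical pipeline: encode, optionally embed a message, decode."""
    hidden = encoder(x)  # (batch, hidden_size, frames)
    if msg_processor is not None:
        if message is None:
            raise ValueError("this sketch requires a message when a processor is set")
        hidden = msg_processor(hidden, message)
    return decoder(hidden)  # watermark signal


# Stand-in layers (assumptions for illustration only).
encoder = nn.Conv1d(1, 8, kernel_size=4, stride=4)
decoder = nn.ConvTranspose1d(8, 1, kernel_size=4, stride=4)
table = nn.Embedding(2 * 4, 8)  # nbits=4, hidden_size=8


def embed(hidden: torch.Tensor, msg: torch.Tensor) -> torch.Tensor:
    idx = 2 * torch.arange(msg.shape[-1]) + msg  # (batch, nbits)
    return hidden + table(idx).sum(dim=1).unsqueeze(-1)


audio = torch.randn(2, 1, 64)
bits = torch.randint(0, 2, (2, 4))
with_msg = watermark_sketch(audio, encoder, decoder, embed, bits)
zero_bit = watermark_sketch(audio, encoder, decoder)  # no message step
print(with_msg.shape, zero_bit.shape)
```

Passing no processor skips the message step entirely, which corresponds to the zero-bit (detection-only) configuration.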
Design Considerations
Message Capacity
The number of bits determines the message space (2^nbits unique messages):
- 8 bits: 256 unique messages
- 16 bits: 65,536 unique messages
- 32 bits: about 4.3 billion unique messages (4,294,967,296)
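These figures follow directly from 2 raised to the number of bits. A small sketch, with `message_capacity` as a hypothetical helper name:

```python
def message_capacity(nbits: int) -> int:
    """Number of distinct messages representable with nbits binary digits."""
    if nbits <= 0:
        raise ValueError("nbits must be greater than 0")
    return 2 ** nbits


for nbits in (8, 16, 32):
    print(f"{nbits} bits: {message_capacity(nbits):,} unique messages")
```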
Hidden Size
The hidden_size parameter must match the encoder’s output dimension:
- Typical values: 128, 256, or 512
- Larger hidden sizes allow for more complex message embeddings
- Must be coordinated with the overall model architecture
Zero-Bit Watermarking
For detection-only watermarking without messages, the MsgProcessor is not used (msg_processor=None in AudioSealWM).
Attributes
nbits: Number of bits in the secret message.
hidden_size: Dimension of the encoder output.
msg_processor: Embedding layer with shape (2 * nbits, hidden_size). Maps each bit value at each position to a learned vector.
Technical Details
Embedding Indices
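The embedding index for bit position i with value msg[i] is 2*i + msg[i], so position i always reads from rows 2*i and 2*i + 1 of the embedding table.
Broadcasting
The summed message embedding has shape (batch, hidden_size). It is expanded along the time axis and added to every frame of the (batch, hidden_size, frames) hidden representation.
See Also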
- AudioSealWM - Generator that uses MsgProcessor
- AudioSealDetector - Detector that decodes messages
- Secret Messages Guide - Working with custom messages
