🔊 UniFlow-Audio Inference Demo
Multi-task Audio Generation System based on UniFlow-Audio
Note: For TTS, due to restrictions of the HuggingFace Space, the g2p phonemizer used here is inconsistent with the one used during training, so TTS output may be affected. Please refer to INFERENCE_CLI.md for guidance on calling the model from the CLI.
    
Inputs: Model Name, Audio Caption, Guidance Scale (1 to 10), Sampling Steps (1 to 100)
Examples
| Audio Caption | Guidance Scale | Sampling Steps |
|---|---|---|
Inputs: Model Name, Music Caption, Guidance Scale (1 to 10), Sampling Steps (1 to 100)
Examples
| Music Caption | Guidance Scale | Sampling Steps |
|---|---|---|
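Inputs: Model Name, Text to Synthesize, Reference Speaker Audio, Guidance Scale (1 to 10), Sampling Steps (1 to 100)
Examples
| Text to Synthesize | Reference Speaker Audio | Guidance Scale | Sampling Steps |
|---|---|---|---|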
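Inputs: Singer, Model Name, Lyrics, Note Sequence, Note Durations, Guidance Scale (1 to 10), Sampling Steps (1 to 100)
Examples
| Singer | Lyrics | Note Sequence | Note Durations | Guidance Scale | Sampling Steps |
|---|---|---|---|---|---|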
Usage Instructions
- Lyrics Format: Use AP for pauses, e.g., AP你要相信AP相信我们会像童话故事里AP
- Note Format: Separate notes with |, use spaces between simultaneous notes, and use rest for rests
- Duration Format: Note durations in seconds, separated by |
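The three formats above can be checked before submitting a request. The following is a minimal sketch, not the demo's own parser: the function name `parse_svs_fields`, the note names such as `D4`, the rule that each non-pause character is one syllable, and the rule that every note slot needs exactly one duration are all assumptions made for illustration.

```python
import re


def parse_svs_fields(lyrics: str, notes: str, durations: str):
    """Split the Lyrics, Note Sequence, and Note Durations fields into lists.

    Hypothetical helper: the alignment rules checked here are assumptions,
    not behaviour confirmed by the UniFlow-Audio demo.
    """
    # "AP" marks a pause; every other non-space character is treated as one syllable.
    syllables = re.findall(r"AP|\S", lyrics)

    # Slots are separated by "|"; notes inside one slot are separated by spaces.
    note_slots = [slot.split() for slot in notes.split("|")]

    # Durations are given in seconds, one per slot (assumed).
    duration_slots = [float(value) for value in durations.split("|")]

    if any(len(slot) == 0 for slot in note_slots):
        raise ValueError("empty note slot: use 'rest' for rests")
    if len(note_slots) != len(duration_slots):
        raise ValueError(
            f"{len(note_slots)} note slots but {len(duration_slots)} durations"
        )
    if any(d <= 0 for d in duration_slots):
        raise ValueError("durations must be positive numbers of seconds")

    return syllables, note_slots, duration_slots


# Example input with made-up note names and durations.
syllables, note_slots, duration_slots = parse_svs_fields(
    "AP你要相信AP", "rest | D4 | D4 E4", "0.5 | 0.25 | 0.4"
)
```

A mismatch between the number of note slots and the number of durations raises a `ValueError`, which is cheaper to catch locally than after a model run.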
Inputs: Model Name, Noisy Speech, Guidance Scale (1 to 10), Sampling Steps (1 to 100)
Examples
| Noisy Speech | Guidance Scale | Sampling Steps |
|---|---|---|
Inputs: Model Name, Low Sample Rate Audio, Guidance Scale (1 to 10), Sampling Steps (1 to 100)
Examples
| Low Sample Rate Audio | Guidance Scale | Sampling Steps |
|---|---|---|
Inputs: Model Name, Input Video, Guidance Scale (1 to 10), Sampling Steps (1 to 100)
Examples
| Input Video | Guidance Scale | Sampling Steps |
|---|---|---|
📝 Notes
- Model Name: Choose from UniFlow-Audio-large, UniFlow-Audio-medium, or UniFlow-Audio-small
- Guidance Scale: Controls how strongly the input condition steers the output
- Sampling Steps: Number of flow matching sampling steps
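To make the last two settings concrete, here is a toy sketch of how a guidance scale and a step count typically enter a flow matching sampler. It assumes classifier-free guidance and a plain Euler integrator, neither of which is confirmed here as UniFlow-Audio's actual method; `toy_velocity`, `TARGET`, `guided_velocity`, and `euler_sample` are hypothetical names, and the velocity function stands in for a trained network.

```python
TARGET = [1.0, -2.0]  # made-up "conditioned" output for the toy model


def toy_velocity(x, t, conditioned):
    """Stand-in for a learned velocity field (assumption, not the real model).

    It points from the current state toward TARGET when conditioned,
    and toward the origin when unconditioned.
    """
    goal = TARGET if conditioned else [0.0] * len(x)
    return [(g - xi) / (1.0 - t) for g, xi in zip(goal, x)]


def guided_velocity(v_cond, v_uncond, guidance_scale):
    """Classifier-free guidance: a scale of 1.0 returns v_cond unchanged."""
    return [u + guidance_scale * (c - u) for c, u in zip(v_cond, v_uncond)]


def euler_sample(velocity_fn, x0, num_steps, guidance_scale):
    """Integrate from t=0 to t=1 in num_steps equal Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v_cond = velocity_fn(x, t, True)
        v_uncond = velocity_fn(x, t, False)
        v = guided_velocity(v_cond, v_uncond, guidance_scale)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


sample = euler_sample(toy_velocity, [0.0, 0.0], num_steps=10, guidance_scale=1.0)
```

In this sketch each step evaluates the velocity function twice, so under these assumptions runtime grows linearly with Sampling Steps, while Guidance Scale only changes how the two predictions are mixed.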
💡 Tip: Models are downloaded automatically on first run, so please be patient.