LaSAFT

Latent Source Attentive Frequency Transformation for Conditioned Source Separation

Official GitHub Repo: ws-choi/Conditioned-Source-Separation-LaSAFT

Paper: arXiv

Colab Interactive Demo

Track Information

  1. Vanishing paycheck - Stella Jang
  2. Villain - Stella Jang
  3. La Vie EN Rose - Stella Jang
  4. Feel this breeze - (Prod. JoeSwan) - HyungWoo & Sunmin
  5. Keep the Love Alive - Kyul, TM
  6. HongJinHo - GUIN
  7. 두근대 - Sllo
  8. Ms. Seductive - Jeff Bernat (cover) - Woosung Choi
  9. Footprints - Woosung Choi
  10. Rain - Woosung Choi

We will upload other tracks after dealing with copyright issues.

[Audio samples: for each of tracks 1-10, the Original mix and the separated Vocals, Drums, Bass, and Other stems]

Application: Drum practice


Model Configuration

from lasaft.source_separation.conditioned.cunet.models.dcun_tfc_gpocm_lasaft import (
    DCUN_TFC_GPoCM_LaSAFT_Framework,
)

args = {}

# FFT params
args['n_fft'] = 2048
args['hop_length'] = 1024
args['num_frame'] = 128

# SVS Framework
args['spec_type'] = 'complex'
args['spec_est_mode'] = 'mapping'

# Other Hyperparams
args['optimizer'] = 'adam'
args['lr'] = 0.001
args['dev_mode'] = False
args['train_loss'] = 'spec_mse'
args['val_loss'] = 'raw_l1'

# DenseNet Hyperparams
args['n_blocks'] = 7
args['input_channels'] = 4
args['internal_channels'] = 24
args['first_conv_activation'] = 'relu'
args['last_activation'] = 'identity'
args['t_down_layers'] = None
args['f_down_layers'] = None
args['tif_init_mode'] = None

# TFC_TDF Block's Hyperparams
args['n_internal_layers'] = 5
args['kernel_size_t'] = 3
args['kernel_size_f'] = 3
args['tfc_tdf_activation'] = 'relu'
args['bn_factor'] = 16
args['min_bn_units'] = 16
args['tfc_tdf_bias'] = True
args['num_tdfs'] = 6
args['dk'] = 32

# Conditioning Hyperparams
args['control_vector_type'] = 'embedding'
args['control_input_dim'] = 4
args['embedding_dim'] = 32
args['condition_to'] = 'decoder'

args['control_n_layer'] = 4
args['control_type'] = 'dense'
args['pocm_type'] = 'matmul'
args['pocm_norm'] = 'batch_norm'


model = DCUN_TFC_GPoCM_LaSAFT_Framework(**args)
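
The FFT parameters above fix the shape of the spectrogram the model sees. The following is a minimal sketch of that arithmetic, not code from the LaSAFT repository. It assumes a centered, one-sided STFT and 44.1 kHz stereo audio; neither is stated in the configuration above, and `sample_rate`, `segment_samples`, and `stacked_channels` are illustrative names.

```python
# Illustrative arithmetic only; this does not import or call the lasaft package.
n_fft = 2048
hop_length = 1024
num_frame = 128
sample_rate = 44100  # assumption: 44.1 kHz audio, not stated in the config

# A one-sided STFT keeps n_fft // 2 + 1 frequency bins.
freq_bins = n_fft // 2 + 1

# Assumption: a centered STFT, which yields 1 + num_samples // hop_length frames.
# The shortest segment that gives exactly num_frame frames is then:
segment_samples = hop_length * (num_frame - 1)
frames = 1 + segment_samples // hop_length
segment_seconds = segment_samples / sample_rate

# Assumption: stereo audio with spec_type='complex' stacks the real and
# imaginary parts of each channel, which would be consistent with
# input_channels = 4.
audio_channels = 2
stacked_channels = audio_channels * 2

print(freq_bins, segment_samples, frames, stacked_channels)
print(round(segment_seconds, 2))
```

Under these assumptions, each input segment covers roughly three seconds of audio, and the four stacked channels match `args['input_channels'] = 4`.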