Somewhat silly question: to what extent does analyzing music/sound via spectrogram images provide "enough" information for use in deep learning systems (like ResNet), compared to something like MusicNet?
They're pretty fundamental. Speech-to-text networks like DeepSpeech convert audio to an MFCC power spectrogram; others use an STFT magnitude spectrogram.
It's a 2-D (amplitude by frequency and time) representation of something that's usually 1-D (amplitude over time).
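For concreteness, here's a minimal sketch of both representations, assuming librosa as the feature-extraction library (my choice, not something named above) and a placeholder audio file:

```python
# Sketch: turning a 1-D waveform into the 2-D representations above.
# librosa and the path "audio.wav" are illustrative assumptions.
import numpy as np
import librosa

# Mono waveform: shape (n_samples,) -- amplitude over time.
y, sr = librosa.load("audio.wav", sr=16000, mono=True)

# STFT magnitude spectrogram: shape (n_freq_bins, n_frames) --
# amplitude over frequency and time.
stft_mag = np.abs(librosa.stft(y, n_fft=512, hop_length=160))

# MFCC features of the kind DeepSpeech-style front ends use:
# shape (n_mfcc, n_frames).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=26)

# Either array can be treated as a single-channel image and fed to
# an image network such as ResNet.
print(stft_mag.shape, mfcc.shape)
```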
I think some networks have tried using the discrete wavelet transform too, but that's outside my knowledge area.
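A similarly hedged sketch of the wavelet route, using PyWavelets (again my pick, with a synthetic signal standing in for real audio):

```python
# Sketch: multi-level discrete wavelet transform of a waveform.
# PyWavelets (pywt) and the random signal are illustrative assumptions.
import numpy as np
import pywt

# Dummy 1-second waveform at 16 kHz.
y = np.random.randn(16000).astype(np.float32)

# wavedec returns [approximation, detail_n, ..., detail_1] coefficient
# arrays, one per scale -- a multi-resolution view of the signal.
coeffs = pywt.wavedec(y, wavelet="db4", level=5)
for i, c in enumerate(coeffs):
    print(f"coefficient array {i}: {len(c)} samples")
```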
Thanks for the explanation. I checked out your HN profile and then your SoundCloud: were many of the tracks you've posted generated as part of your research into "adversarial audio examples"?