Discussion

High quality audio player


SP4CEBAR 2024-01-12 21:59 (Edited)

Since any sound can be defined as a spectrum of frequencies, it is possible (with external tools) to find the spectrum for each of NX's sounds:

This information can be used to turn an audio file into a series of combinations of NX sounds whose spectra most closely resemble the spectra of the sounds in the file. These combinations could even be encoded in one or more NX music files. The conversion can be done in NX or entirely with external tools.


SP4CEBAR 2024-01-13 10:39 (Edited)

Such external tools would likely be a programming language with libraries or packages to do the following:

The program itself would need to be able to


McPepic 2024-01-13 16:02

I was actually thinking about this problem before and wondered:
Wouldn’t this problem be suited to a neural network?

It would then be possible to hook up an audio file to the input nodes and the sound parameters to the output nodes. The program would then be able to train itself by reproducing the resulting wave from the predicted parameters, comparing that wave to the input wave, and adjusting the weights based on the discrepancy.

Usually, however, a network has a fixed number of input nodes, so maybe you could break the input data into chunks that are processed individually by the neural network. Then, on playback, NX could cycle through the parameters from the neural network at the start of each chunk.
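
The chunking idea could be sketched in Python roughly like this. The function name `split_into_chunks` and the zero-padding of the last chunk are assumptions made for illustration, not something NX or any particular neural network library prescribes:

```python
def split_into_chunks(samples, chunk_size):
    """Split a list of samples into chunks of exactly chunk_size values.

    The last chunk is padded with zeros (an assumption) so that every
    chunk matches a fixed number of input nodes.
    """
    chunks = []
    for start in range(0, len(samples), chunk_size):
        chunk = list(samples[start:start + chunk_size])
        chunk.extend([0.0] * (chunk_size - len(chunk)))  # pad the final chunk
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    print(split_into_chunks([1, 2, 3, 4, 5], 2))
```

Each chunk would then be fed to the network separately, and the predicted parameters applied in order during playback.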

Just some ideas, though.


SP4CEBAR 2024-01-13 16:13

Yeah, that would work. Maybe each sound can be divided into a low number (like 8) of frequency bands. Such a low number likely wouldn't affect the final sound that much, since the result probably still won't be high quality by today's standards.


SP4CEBAR 2024-02-01 22:37

I have a linear algebra exam tomorrow, and right now my mind is loaded with formulas. This gave me a new take on the matter:
1. Fast Fourier transform each sample into a set of N frequency bands
2. make a vector with N dimensions to store the intensities of the frequency bands for each sample
3. the span of these sample vectors forms a vector space
4. find the multiples of each of these vectors that add up to a target vector (which is the set of frequency band intensities that make up your target audio's sound)
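
Steps 1 and 2 could be sketched in Python as below. This is only an illustration under some assumptions: it uses a naive O(n^2) discrete Fourier transform from the standard library instead of a real FFT, the name `band_intensities` is made up, and averaging equal-width groups of bins is just one possible way to form the bands:

```python
import cmath
import math


def band_intensities(samples, num_bands=8):
    """Return a vector of num_bands average spectrum magnitudes.

    Uses a naive DFT (not an FFT) over the bins below the Nyquist
    frequency, then averages neighbouring bins into equal-width bands.
    """
    n = len(samples)
    half = n // 2  # for real input, bins above Nyquist mirror the lower ones
    band_width = half // num_bands
    if band_width == 0:
        raise ValueError("need at least 2 * num_bands samples")
    magnitudes = []
    for k in range(half):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        magnitudes.append(abs(acc) / n)
    return [sum(magnitudes[b * band_width:(b + 1) * band_width]) / band_width
            for b in range(num_bands)]


if __name__ == "__main__":
    # A sine wave that completes 5 cycles in 64 samples lands in band 1 (bins 4-7).
    wave = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
    print(band_intensities(wave))
```

The returned list is the N-dimensional vector from step 2. One such vector per NX sound, plus one for the target audio, is what steps 3 and 4 would operate on.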

However, this approach relies on being able to generate the frequency bands of each NX sound.


SP4CEBAR 2024-02-02 09:01

You can solve this math problem (a system of equations) using the reduced row echelon form of a matrix in which each column is one of the vectors, with the target vector as the last column.
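
A minimal sketch of that row reduction in Python, using exact fractions from the standard library. The function name `rref` and the tiny example vectors are made up for illustration, and real band intensities would be noisy floats for which a least-squares fit may be the more practical choice:

```python
from fractions import Fraction


def rref(matrix):
    """Return the reduced row echelon form of matrix (a list of rows)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # find a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        # clear this column in every other row
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows


if __name__ == "__main__":
    # Columns: sample vector 1 = (1, 0, 2), sample vector 2 = (0, 1, 1),
    # target = 3 * vector 1 + 2 * vector 2 = (3, 2, 8).
    augmented = [[1, 0, 3],
                 [0, 1, 2],
                 [2, 1, 8]]
    for row in rref(augmented):
        print([str(x) for x in row])
```

In the reduced matrix, the last column of each pivot row holds the multiple for the corresponding sample vector (3 and 2 in this example). A row of zeros means that equation added no new information.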


SP4CEBAR 2024-02-04 16:50 (Edited)

The system to be solved will look something like this:
vec_target_audio = a*vec_sample_1 + a*f*vec_sample_1_frequency + b*vec_sample_2 + b*f*vec_sample_2_frequency + ...

All single-letter variables are scalars that can be solved for as long as there are fewer scalars than vector dimensions and the vectors aren't linearly dependent (a dependent vector adds no useful new information to the system).

f is the frequency, using the full 16 bits that NX offers.

The other scalars are the volumes of the sounds: they will be reduced to a 4-bit-value after the system has been solved.

Each vector (vec) contains the information of an audio sample, or it contains the changes that happen as the frequency increases. All vectors have as many dimensions as there are frequency bands (this number is chosen, and will likely be 8 or fewer).

Each sample is a wave type. Including all pulse widths, I count 4+15=19 wave types. To solve this system I may need a lot more frequency bands, and I need to make sure that enough vectors add new information to the system.

This expression assumes that raising the volume of a sample makes all frequency bands louder by equal amounts.

The two vectors of each sample probably have to be integrated into a single vector with elements like: "2 + 5*f", otherwise it may be harder to solve.

I think the system can almost be solved with exactly 19 frequency bands (vector dimensions) and 19 independent vectors. If some of the vectors are dependent, then the number of vectors and frequency bands needs to be reduced. The problem is that:

So this system can be solved like this for all scalars except "f"


SP4CEBAR 2024-02-06 03:51

Having too many solutions is more favorable than having no solutions. It isn't much of a problem, as it gives some play to the function that truncates the volumes down to 4-bit values and limits the voices to 4.
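
Such a truncating function might look roughly like this in Python. The name `pick_voices`, clamping negative volumes to zero, and scaling relative to the loudest volume are all assumptions made for this sketch. Only the 4-bit range (0 to 15) and the limit of 4 voices come from the discussion above:

```python
def pick_voices(volumes, max_voices=4, levels=15):
    """Keep the max_voices loudest volumes and round them to 0..levels.

    Negative solutions are clamped to 0 and the loudest volume maps to
    levels (both are assumptions); all other entries are set to 0.
    """
    clamped = [max(v, 0.0) for v in volumes]
    quantized = [0] * len(volumes)
    peak = max(clamped, default=0.0)
    if peak == 0:
        return quantized
    loudest = sorted(range(len(clamped)), key=lambda i: clamped[i],
                     reverse=True)[:max_voices]
    for i in loudest:
        quantized[i] = round(clamped[i] * levels / peak)
    return quantized


if __name__ == "__main__":
    # Six solved volumes, one of them negative; only four voices survive.
    print(pick_voices([2.0, 10.0, -2.0, 4.0, 6.0, 0.5]))
```

With several valid solutions available, one could run each of them through a function like this and keep whichever loses the least after rounding.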


McPepic 2024-02-06 16:41

I don’t really know much about it, but would the LFO be useful at all in a program like this?


SP4CEBAR 2024-02-08 16:32

If you encoded it as an NX music file, the LFO would allow you to reach more frequency values than you could otherwise. Outside of files, the LFO isn't that useful because we have access to the 16-bit frequency registers.

