Abbey Road De-Mix Technology Discussion

I always wanted to get the TR-1 interactive disc, as it sounded like it would have been fun to fool with the music. By the time I got around to seriously pursuing the purchase, the CD-i was no longer available. I also recall it being rather expensive at the time. I did eventually get the regular disc - not the 'lite' version. I did my own mix of a couple of the tunes I really liked in Pro Tools, which was nice to be able to do even if it wasn't the same as the CD-i.
 
By the way, for those interested in "Music Source Separation" / de-mix, there are a ton of tools available now. Some paid, some free. The "best" tends to change over time, as a lot of these are research projects competing with each other using structured benchmarks and source material.

A good listing/free access is here: Online music/voice separator based on neural nets

My current favorite, in both quality and number of stem types available, is Vocal Remover | Isolate Voice & Instrumental Online | LALAL.AI
You can pay as you go, so pennies per stem.

My 2nd choice (this one free) would be Demucs; 3rd choice (also free) is Spleeter.
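If you want to try the free route without too much command-line wrangling, Spleeter can also be driven from a few lines of Python. A minimal sketch, assuming Spleeter is installed (for example via pip install spleeter); the file and folder names are just placeholders:

# Minimal sketch of driving Spleeter from Python; paths are placeholders.
from spleeter.separator import Separator

# 'spleeter:4stems' yields vocals, drums, bass and other;
# 'spleeter:2stems' and 'spleeter:5stems' (adds piano) also exist.
separator = Separator('spleeter:4stems')

# Writes one WAV per stem into output_dir/<track name>/
separator.separate_to_file('some_track.wav', 'output_dir')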
 
After a nice hike this morning, I spent the afternoon messing around with that first free-access de-mix program you noted. I made a 5.1 of Born To Be Wild. It's a simple, amateurish remix with vocals, bass and drums in the front, guitar in the left rear and organ in the right rear. After numerous iterations and tinkering, I'm actually surprised at how good it sounds. And it's even better upmixed with Auro-3D. There's only one obvious 'digital bobble' in the vocal, at about 1:30 into the song.
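The stem-to-channel routing itself is simple enough to sketch in a few lines of hypothetical Python (this is not my actual session, just the idea; the file names, gains and the choice of putting the vocal in the centre are placeholders/assumptions):

# Rough sketch of the routing described above: vocals, bass and drums up
# front, guitar in the left rear, organ in the right rear. Assumes the stems
# share a sample rate and length, and 5.1 channel order L, R, C, LFE, Ls, Rs.
import numpy as np
import soundfile as sf

stems = {}
for name in ('vocals', 'bass', 'drums', 'guitar', 'organ'):
    data, sr = sf.read(name + '.wav')
    stems[name] = data.mean(axis=1) if data.ndim > 1 else data   # fold to mono

out = np.zeros((len(stems['vocals']), 6))
out[:, 0] = stems['bass'] + stems['drums']   # left front
out[:, 1] = stems['bass'] + stems['drums']   # right front
out[:, 2] = stems['vocals']                  # centre ("in the front")
out[:, 4] = stems['guitar']                  # left rear
out[:, 5] = stems['organ']                   # right rear
# out[:, 3] (LFE) left empty; bass management can take care of the low end.

sf.write('born_to_be_wild_5_1.wav', out, sr)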

I am now tempted to try the LALAL program.
 
Has anyone tried a combination of matrix decoding and upmixing - maybe using a matrix decoder [Surround Master?] to create LB and RB, and upmixing to create LF and RF (possibly from the stereo Abbey Road album [trying not to go too far OT...])?


Kirk Bayne
 
Hehe, you're hooked!

Here are some additional techniques I have used for stem-based up-remix:

Making your own "other" or "remaining" stem, in cases where your stems came from different tools or your tool didn't provide an "other" stem for the selection of stems you are using. This is done by summing your stems and subtracting the result (or mixing it in with inverted phase) from the original stereo (see the sketch after this list).
This "other" stem can be blended with your stems, at a lower level, to reduce artifacts or make things sound more "together", and/or placed in its own channel or set of output channels (including being upmixed).
Upmixing stems to 5.1 and placing the 6 channels in different final channels, or panned between channels. This may mean mixing the 5.1 down to 4.0 for, say, the height channels in 7.1.4 or 5.1.4, or for the 4 rear channels (two surround, two height surround), or the front 4, etc.
Drums and vocals typically get upmixed. In 7.1.4 I tend to put drums, upmixed to quad, in the heights, and vocals in their corresponding 5.1 channels, except I use the rear heights instead of the rears (backing vocals typically come from behind and above).
Combining a stem's left and right, or splitting out the left and right, and then "stereoizing" the result with a crossover. One could "stereoize" with delay or reverb, but I don't like to add anything to my up-remixes. For 7.1.4 I typically combine the bass stem's left and right and split it with a crossover (tuned by ear so the output sounds even), then place it in the side channels, or mixed halfway back in a 5.1.4 or 5.1 mix. This crossover technique keeps things from sounding like they are coming from the center channel, even though they are placed in other channels (a minimal sketch of this appears at the end of this post).
Panning stems between channels, e.g. piano wider than the fronts, or wider than the fronts AND elevated, in 5.1.4 or 7.1.4.
(Assuming you don't have Atmos tools) you can pan in immersive formats with free plugins from here:
[attached screenshot: 1660425822397.png]
FYI, I tend to use Plogue Bidule for up-remix rather than a traditional DAW, because there aren't any channel/bus/assignment limitations and I can quickly prototype things, etc. I have premade "groups" for building up a stems-to-7.1.4 up-remix that can accommodate mono, stereo, quad, 5.1, or 7.1.4 upmixed stems, etc.: stereo and upmix assignments and pans, crossovers, 12-channel plugins (made from lower channel count plugins), etc.
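For the "other"/"remaining" stem trick in the first bullet above, here's a rough Python sketch of the idea (not my actual Bidule setup; file names are placeholders, and it assumes every file has the same length, sample rate and channel count):

# Build an "other" stem by summing the stems you have and subtracting the
# result (i.e. mixing it in with inverted phase) from the original stereo.
import numpy as np
import soundfile as sf

original, sr = sf.read('original_stereo.wav')
stems = [sf.read(name)[0] for name in ('vocals.wav', 'drums.wav', 'bass.wav')]

summed = np.sum(stems, axis=0)   # everything the separator accounted for
other = original - summed        # whatever is left over

sf.write('other.wav', other, sr)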

You can PM me for examples.
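In the meantime, here's a similarly rough sketch of the crossover split from the list above, applied to a bass stem headed for the side channels (Butterworth filters stand in for whatever crossover you prefer, and the 120 Hz point is only a placeholder; in practice it's tuned by ear):

# Sum the bass stem to mono, split it with complementary low/high-pass
# filters, and send the two halves to two different output channels.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

bass, sr = sf.read('bass_stem.wav')          # stereo stem from the separator
mono = bass.mean(axis=1)                     # combine left and right

crossover_hz = 120.0                         # placeholder; tune by ear
lo = sosfiltfilt(butter(4, crossover_hz, 'lowpass', fs=sr, output='sos'), mono)
hi = sosfiltfilt(butter(4, crossover_hz, 'highpass', fs=sr, output='sos'), mono)

# Write the two halves as a pair destined for, say, the side channels.
sf.write('bass_sides.wav', np.column_stack([lo, hi]), sr)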
 
Web-based only? Hmmm...
(Your top pick, that is.)

I tried Spleeter a while ago. Installed all the cryptic command-line this and that. Used the high-fidelity trained versions. It was impressive on the surface, for sure. I took a mix I made and produced stems from it myself to compare against. The Spleeter stems sounded lo-fi and a little terrible compared to the real thing.

That's the problem. You can still do some impressive things, but the depth and fidelity of the music go out the window. Those original stereo or mono mixes just still sound better. In nuance, anyway.

Well, that would have been my problem with Spleeter anyway. I might have to try this web-based thing. Have you or anyone else given them a mix you made yourself, where you were able to compare their results against your own stems?

The other thing I'd probably want to do is throw some suspect live recording at it to see what it can take apart. The AI probably isn't programmed for that though.
 
In my one and only shot at making a 5.1 from stereo (post #23 above), listening to the stems alone definitely exposed artifacts. These artifacts sounded similar to those produced by a logic matrix decoder. However, when I put everything back together into the 5.1 mix, the results sounded very good. That surprised me.

I would still prefer to upmix using real-time devices such as the Surround Master when the results are satisfying. But with a song such as Born To Be Wild, which yields a rather lame result that way, I might want to make a few 5.1 mixes (or better stereo mixes) from stems extracted using a de-mix program. The song really must be near and dear to me to warrant the time to fool with it.
 
So yeah, I guess another form of "de-mix", as opposed to AI/ML music source separation, is using an upmixer to separate things based on where they were placed in the stereo mix.

Some early stereo recordings are good candidates for this, where things are mixed more discretely than in modern music, with lead and backing instruments or even lead and backing vocals panned hard left and right.

That puts those elements firmly in the LS and RS channels, by themselves, in a 5.1 upmix. So for those types of mixes, I have developed post-upmix processing scripts that remix things into a more pleasing, modern-sounding mix, with the lead in both fronts (sounding like stereo) and the backing element in the rears (sounding like stereo).

So far I have scripts for lead vocals hard panned (think early Beatles) and a separate pair for lead and backing instruments hard panned, with vocals in the center.

For the hard-panned lead vocal scripts, the vocals are placed in C, with what was in C moved to LF and RF and mixed with what was already there in the upmix. The hard-panned backing channel is split with a crossover and sent to LS and RS.

Those scripts are labeled "Fix Early Stereo Lead in LS" and "Fix Early Stereo Lead in RS" and included in SpecScript: SpecScript 1.7 - 5.1 AND 7.1 upmix scripts and utilities
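To make the channel moves concrete, here's a rough sketch of the "lead ended up in LS" case. This is not the actual SpecScript code, just the idea in Python; it assumes a 5.1 WAV in L, R, C, LFE, Ls, Rs channel order, and the 0.5 centre fold-in gain and the 300 Hz crossover point are placeholders:

# Remap a 5.1 upmix of early stereo with the lead vocal hard panned left:
# move the lead (Ls) to the centre, fold the old centre into the fronts, and
# crossover-split the hard-panned backing element (Rs) into the two rears.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

upmix, sr = sf.read('upmixed_5_1.wav')               # shape (frames, 6)
L, R, C, LFE, Ls, Rs = (upmix[:, i] for i in range(6))

out = np.zeros_like(upmix)
out[:, 2] = Ls                # lead vocal (was hard left) goes to the centre
out[:, 0] = L + 0.5 * C       # old centre mixed into the fronts
out[:, 1] = R + 0.5 * C
out[:, 3] = LFE

xover_hz = 300.0              # placeholder; adjust per source
out[:, 4] = sosfiltfilt(butter(4, xover_hz, 'lowpass', fs=sr, output='sos'), Rs)
out[:, 5] = sosfiltfilt(butter(4, xover_hz, 'highpass', fs=sr, output='sos'), Rs)

sf.write('remixed_5_1.wav', out, sr)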

Today and earlier this week I developed scripts for early stereo where the lead and backing instruments are hard panned, but the vocals are already mixed in the center.

These scripts leave the vocals in C, run the lead instrument through a crossover and mix it into LF and RF with what was already there, panned between C and the fronts.

The backing instrument is run through a crossover and sent to the rear channels.

Adjusting the crossover points will probably be done via a dialog box. For instance, I first developed these scripts working with Neil Young's "Down By the River", which has two guitar parts hard panned left and right. Today I tried them with the "Born to Be Wild" that AR Surround mentioned and got good results, except the Hammond organ needed a different crossover point. By adjusting that crossover point I was able to split the left and right hands of the organ part, so those end up in the rears.

These new scripts will be included in a forthcoming version of SpecScript.

Back to the de-mixing topic (as opposed to how to re-mix easily with scripts): SpecScript can carve mixes up into 7 parts, again based on where things were placed in the original stereo mix. I have experimented with going to 9 and 11 parts, but things start to sound "thin", with more objectionable artifacts, at least when using multiple stages of "CenterCut", which is how SpecScript works. I do have other prototypes that sound better at upmixing directly to 9 or 11 parts, but I haven't made end-user-ready versions of those methods yet.
 
I took a shot at mixing a 5.1 of another song, Drive My Car, from the 2009 remaster. It's my favorite song on Rubber Soul. I'm really happy with how this one came out. I used better de-mixing algorithms: Ultimate Vocal Remover HQ to extract the vocals and Demucs3 Model B for the instruments. This mix is far superior to the amateurish job I did on Born To Be Wild.

Hopefully, Rubber Soul will be released in 2024 in Atmos and I won't have to deal with doing the rest of the album...if that's even possible.
 
Hi all,

I've been reading this thread with interest while on holiday and was desperate to add my 2 pence worth but wanted to wait until I got home so that I could upload a copy of a batch script I have been working on for a while now but have not had the time to release into the wild. With not a small amount of help from Zeeround (many thanks!), I have written an up/remix script using some of the same software/techniques employed in his new SpecScript in combination with AI source separation tools - mainly Demucs. I haven't documented it properly yet, but if anyone wants to give it a test drive, attached is a zip file with the script and all necessary support files (all programs used are free/open source). I have done quite a lot of test conversions, but it has not been tested out on multiple computers, so there may be some teething issues.

Use as follows:
  • Unzip to a location of your choice
  • Drag & Drop one or more stereo files onto the 'PrepForDemucs.bat' batch file. MP3, WAV & FLAC all work ok - in theory, it should work with any audio file format that SoX can read
  • Follow the dialogue box prompts to choose a destination, and the script will then prepare a set of files ready for separation into stems - a box should pop up to give you instructions for this. I use the online tools at Online music/voice separator based on neural nets, but you can also use a local installation of Demucs/Spleeter if you prefer. You should also be able to use LALAL.AI, but I haven't tried this and the follow-on script may need tweaking for it to work.
  • Once the separation into stems is complete, these should all be saved in the 'demucs' folder created in your chosen destination directory
  • Next, double click the 'PostDemucs.bat' file in the destination directory and the script will provide you with some options on how you would like your tracks to be upmixed.
  • Currently, the following options are available for each stem:
    • Mono to Centre - Sum to mono and place in the centre channel
    • Crossover Stereoize - Sum to mono, run lowpass and highpass filters at 150Hz, output lowpass to LF and highpass to RF
    • Stereo to Front - Place the stereo stem to LF and RF
    • Stereo to Rear - Place the stereo stem to Ls and Rs
    • 3.0 Sides to Front - Split to centre and sides using CentreCutCL. Centre to (C) and Sides to LF and RF
    • 3.0 Sides to Rear - Split to centre and sides using CentreCutCL. Centre to (C) and Sides to Ls and Rs
    • Upmix to 4.0 using CentreCutCL
    • Upmix to 4.0 using CentreCutCL and rotate all speakers 180 degrees
    • Upmix to 4.0 using CentreCutCL and rotate all speakers 90 degrees anti clockwise
    • Upmix to 4.0 using CentreCutCL and rotate all speakers 90 degrees clockwise
    • Upmix to 4.0 and mirror front to back - e.g. LF to Ls and RF to Rs
    • Upmix to 5.0 using CentreCutCL
  • Defaults are bass/drums/piano - Upmix to 4.0; vocals/other - Upmix to 5.0. An LFE channel will also be added using all stems with a lowpass filter (a rough sketch of this LFE step appears after this list).
  • The script will then move, rename and neatly organise the files from the 'demucs' folder to a new subfolder called 'stems' and proceed to up/remix according to your selected preferences. There is an ini file in the bin folder in which you can change several default settings if you wish to tinker.
  • Once the re/upmix is complete, the files are run through the same unlimited.vst mastering process used by Zeeround's SpecScript before being tagged. The mastering stage is done on a 'whole album' basis to preserve the dynamic relationship between tracks.
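For the LFE step mentioned in the defaults above, the idea boils down to something like this rough sketch (not the actual batch script, which uses SoX and other command-line tools; the 120 Hz cutoff and the file names are placeholders):

# Derive an LFE channel by summing all the stems, folding to mono and
# low-pass filtering the result.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

stems = []
for name in ('vocals.wav', 'drums.wav', 'bass.wav', 'other.wav'):
    data, sr = sf.read(name)
    stems.append(data)

mono = np.sum(stems, axis=0).mean(axis=1)    # sum of stems, folded to mono
sos = butter(4, 120.0, 'lowpass', fs=sr, output='sos')
lfe = sosfiltfilt(sos, mono)

sf.write('lfe.wav', lfe, sr)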
I've had some really good results using this process and have found that the quality of the final FLAC files is far better than with my previous upmix methods, which used a combination of DTS Neural and Spleeter. I have especially found that sources that previously gave disappointing results come out very well by moving things around a little. For example, Massive Attack's 'Blue Lines' and 'Protection' have loads of reverb on the vocals but not much else going on in the rears using previous techniques; by re/upmixing with bass/drums to '4.0', vocals to '5.0' and other to '4.0 mirrored', they sound great!

I think there is scope to add many more upmix options - perhaps incorporating some of the variations available in SpecScript. The number of combinations is virtually endless and the quality of the AI stem separation algorithms will only improve over time (they have come on massively in just the last couple of years). Whilst output is currently only 5.1 (that is all I have available on my setup), it could easily be expanded to 7.1/9.1 and overheads could also be added.

Any feedback would be most welcome...
 

Attachments

  • Demucs+CC_Stereo_to_5.1v0.2b.zip
    5.2 MB
I just noticed this on lalal.ai's blog:

"On our roadmap, there is also a brand new stem that LALAL.AI has never supported, wind instruments."​

No idea when it will happen, but interesting.
 
May be blasphemy, but I was wondering if there's a "crossing the Rubicon" moment coming up with the concept of separating voices and musical instruments in an existing mix.

Computer models of human voice characteristics and of the various musical instruments can (IMHO) soon be created. The voice(s) and instrument(s) in the mix wouldn't need to actually be separated, just identified and completely recreated (the voices/instruments in the mix would only be used to identify what they are; all-new, completely fake audio [solely of the voice(s)/instrument(s)] would be created, which could be remixed any which way).

Anyone think this is where demix tech is headed (music [in the demix/remix] is solely a computer recreation of the original music)?


Kirk Bayne
 
FYI Lalal.ai has a new announcement this morning:

https://www.lalal.ai/blog/piano-syn...ynth&utm_medium=email&utm_campaign=pianosynth
Piano and Synth stems are now on their newer platform (faster, higher quality)

Wind instrument stems coming next

Desktop app coming

I split out the piano on Who Are You last night. LALAL.AI tells me that the algorithm used is "Cassiopeia." However, this announcement says the new algorithm is "Phoenix." I just noticed that there is a little toggle switch at the bottom of the separation page. It has to be set to the left for the newest algorithm. If it is set to the right it uses the previous algorithm:

[attached screenshots: lalal.png, phoenix.png]


I will redo the split using "Phoenix" for comparison to "Cassiopeia" and report back here later.
 
Here is the comparison of the piano stem extracted from Who Are You. The top two tracks were extracted last night using "Cassiopeia." The lower two tracks were extracted this afternoon using "Phoenix." The Phoenix extraction is visibly cleaner than the Cassiopeia extraction and has more dynamics. Phoenix also sounds cleaner than Cassiopeia in isolation.

I don't know how great a difference it will make when all of the stems are put back together, but I am definitely glad to have the cleaner piano stem with which to work. Note that I had cleaned up some artifacts in the Cassiopeia extraction last night. What you see below includes my cleanup work on Cassiopeia. (Looks like I missed some 'junk' in the middle there.) The Phoenix extraction is untouched as received from LALAL.AI.

[attached comparison image: Cassiopeia vs Phoenix Piano.jpg]
 