Monthly archives: October, 2016

Moresampler 0.7.2 Release

Moresampler 0.7.2 is a transitional release between 0.7.x and 0.8.x: Arpasing is not supported yet, but some oto-generation features, such as loading files from index.csv, are implemented. It also includes some code refactoring and bug fixes.

The oto generator can now generate CVVC oto entries. Regardless of the input format (hiragana or romaji), it outputs in any format you want. Once a voicebank is loaded, a prompt in the command line window asks for the output format.

In response to feedback that Moresampler's oto generator creates a lot of redundant (in a diphone synthesis sense) unit aliases, I've added an option that only updates the existing entries in a given oto.ini file, without creating any new entries. Note that to enable this feature you first need to have a "vanilla" oto.ini under the voicebank directory.
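To make the update-only idea concrete, here is a minimal Python sketch. It is not Moresampler's actual code. It assumes the common oto.ini line layout `file.wav=alias,offset,consonant,cutoff,preutterance,overlap` and assumes entries are matched by the (file name, alias) pair; the function names `parse_oto_line` and `update_existing_only` are hypothetical.

```python
def parse_oto_line(line):
    """Split 'file.wav=alias,offset,consonant,cutoff,preutterance,overlap'
    into a ((filename, alias), [timing fields]) pair."""
    filename, _, rest = line.partition("=")
    fields = rest.split(",")
    return (filename, fields[0]), fields[1:]


def update_existing_only(vanilla_lines, generated_lines):
    """Refresh timings of entries already present in the vanilla oto.ini.

    Assumption: an entry is identified by (filename, alias). Generated
    entries with no counterpart in the vanilla file are dropped, so no
    new aliases are created.
    """
    generated = dict(parse_oto_line(l) for l in generated_lines if "=" in l)
    result = []
    for line in vanilla_lines:
        if "=" not in line:
            result.append(line)  # keep blank or malformed lines untouched
            continue
        key, _old_timings = parse_oto_line(line)
        if key in generated:
            filename, alias = key
            result.append(f"{filename}={alias}," + ",".join(generated[key]))
        else:
            result.append(line)  # nothing generated for it: keep as-is
    return result


vanilla = ["a.wav=a,0,0,0,0,0", "ka.wav=ka,0,0,0,0,0"]
generated = [
    "a.wav=a,10,50,-200,30,10",
    "a.wav=a k,300,50,-100,30,10",  # extra alias, not in the vanilla file
    "ka.wav=ka,20,60,-250,40,15",
]
updated = update_existing_only(vanilla, generated)
```

In this example the extra `a k` alias is discarded, while the two entries that already exist in the vanilla file receive the newly generated timings.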

Initial Results & Facts on Arpasing

Shortly after the release of the Arpasing proposal, Adlez recorded a test voicebank and sent it back to me in almost no time. I'd like to thank Uchuu and BagHeadChan for their contributions as well. Their data really helped me work out a roughly working solution to voicebank labeling, so I could start actually making songs with Arpasing voicebanks.

Here are two short samples I've created with Adlez's voicebank:

Unravel (English ver, lyrics by Lucy)

("Oh won't you tell me, please just tell me, explain how this should work. I fear who could it be, that lives inside of me. My conscience cracking, mind reacting, surrounded by the world. But here you're smiling bright, completely blind to life.")

Tokio Funka (English ver, lyrics by Jayn)

("In a hazy town with deception all around, there's a dusty cloud that drifts downward to the ground.")

Proposal of Arpasing, English UTAU Reclist & Specs

At the end of this two-month initial design phase, I proudly present Arpasing, a scientifically-designed English recording script and naming standard for UTAU.

Immediately after Moresampler 0.7.0's release, people were asking me if the oto generation feature would ever support English. I said yes, but later found that the method for Japanese wasn't directly applicable to English, because of the myriad of different recording schemes, each with its own phonetic notation. Writing wrappers for each of them seemed to be quite a troublesome business, so I began to explore whether it's possible to unify the existing solutions into one.

I also tried to tailor the corpus design methods from academic publications and apply them to this very specific problem. The result may appear a bit bizarre and intimidating at first glance, but it is much more concise than anything before it. It achieves 96% diphone coverage in 120 tri-syllable utterances, and 42% triphone coverage in another 100 utterances.
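For readers unfamiliar with the term, diphone coverage can be estimated roughly as follows. This is only a sketch under stated assumptions, not the procedure used to design Arpasing: it takes the target set to be every ordered pair of phonemes in an inventory (a real design may exclude impossible pairs or include silence), and the names `diphones` and `diphone_coverage` are hypothetical.

```python
def diphones(phonemes):
    """Return the set of adjacent phoneme pairs in one utterance."""
    return set(zip(phonemes, phonemes[1:]))


def diphone_coverage(utterances, inventory):
    """Fraction of target diphones that appear in at least one utterance.

    Assumption: the target set is all ordered pairs over `inventory`.
    """
    target = {(a, b) for a in inventory for b in inventory}
    covered = set()
    for utterance in utterances:
        covered |= diphones(utterance) & target
    return len(covered) / len(target)


# Toy example: two phonemes give four possible diphones;
# the single utterance "k a k" contains (k, a) and (a, k).
coverage = diphone_coverage([["k", "a", "k"]], ["a", "k"])
```

Triphone coverage can be computed the same way by collecting adjacent triples instead of pairs.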