Category: UTAU/Moresampler

Help me with revising the experimental Arpasing recording script

Hello folks,

After 300 CPU-hours of intense computation, a new recording script for Arpasing has been generated.

Unlike the lists you've seen before, this one consists of actual meaningful words and phrases selected from a small subset of public-domain books on Project Gutenberg. It differs from CMU Arctic (which is also based on Gutenberg) in that the new Arpasing script is designed for singing synthesis. Here is a glimpse of the first few lines:

1 was Matthew's consolatory rejoinder
2 he stood before the wide opening
3 we shall begin our researches here
4 with his usual craving on him
5 but this little line of dancing men
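
The post does not say how these lines were picked from the source text. One common approach to this kind of corpus design is greedy selection: repeatedly take the candidate line that adds the most diphones not yet covered. The sketch below illustrates that idea only and is not the actual procedure behind the script. The function names and the three hand-transcribed candidates are assumptions made for the example.

```python
def diphones(phonemes):
    """Return the set of adjacent phoneme pairs in one utterance."""
    return set(zip(phonemes, phonemes[1:]))


def greedy_select(candidates, max_lines):
    """Pick up to max_lines utterances, each adding the most new diphones."""
    covered = set()
    chosen = []
    remaining = dict(candidates)
    while remaining and len(chosen) < max_lines:
        # Candidate whose diphones overlap least with what is already covered.
        best = max(remaining, key=lambda text: len(diphones(remaining[text]) - covered))
        gain = diphones(remaining[best]) - covered
        if not gain:
            break  # nothing left would add coverage
        covered |= gain
        chosen.append(best)
        del remaining[best]
    return chosen, covered


# Toy candidates with hand-written ARPABET-style transcriptions (illustrative only).
candidates = {
    "he stood": ["hh", "iy", "s", "t", "uh", "d"],
    "we shall": ["w", "iy", "sh", "ae", "l"],
    "he shall": ["hh", "iy", "sh", "ae", "l"],
}
chosen, covered = greedy_select(candidates, max_lines=3)
print(chosen, len(covered))
```

In this toy run "he shall" is never chosen, because every diphone in it is already covered by the two lines picked before it.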

Moresampler 0.8.3 Release

This release fixes two bugs that cause crashes. The crashes are intermittent and have been reported by only two users so far, so there's no immediate need to upgrade to 0.8.3 if the previous version works fine under your setup.

Download Moresampler 0.8.3 here


Moresampler 0.8.2 Release

Our users in Japan discovered long ago that Moresampler adds a vibe to the voice when the modulation parameter is set to 100%. (FYI: modulation restores the pitch fluctuation in the recording.) Although this feature wasn't deliberately designed, it comes as a surprisingly nice byproduct of Moresampler's sophisticated algorithm and architecture. The mod 100% vibe isn't perfect yet, though: in some cases, just as with mod 100% on many other engines, the pitch goes completely off, making the voice sound almost "drunk". In this release I present a fix for this problem, and I'd recommend that everyone give it a try.

Inspired by the modulation parameter, a new flag 'Mp' has been added that randomly perturbs the pitch curve. The number after Mp controls the degree of perturbation. Although the range is 0 to 100, a small number around 5 should be enough to notice the difference.
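
To give a feel for what "randomly perturbing the pitch curve" can mean, here is a minimal sketch that adds a slowly drifting random offset to a list of per-frame pitch values. It is not Moresampler's implementation. The function name, the `max_cents` and `smoothing` parameters, and the low-pass-filtered noise model are all assumptions chosen for illustration; only the 0 to 100 range of the degree comes from the release note.

```python
import random


def perturb_pitch(f0_hz, mp, max_cents=100.0, smoothing=0.9, seed=None):
    """Add a slowly varying random offset to a pitch curve.

    f0_hz: per-frame pitch values in Hz (0 marks an unvoiced frame)
    mp:    perturbation degree, 0..100 (0 leaves the curve untouched)
    """
    if not 0 <= mp <= 100:
        raise ValueError("mp must be in 0..100")
    rng = random.Random(seed)
    scale_cents = max_cents * mp / 100.0  # hypothetical mapping from degree to cents
    offset = 0.0
    out = []
    for f0 in f0_hz:
        # One-pole low-pass on white noise gives a smooth random drift.
        offset = smoothing * offset + (1.0 - smoothing) * rng.gauss(0.0, 1.0)
        if f0 <= 0:
            out.append(f0)  # leave unvoiced frames alone
        else:
            out.append(f0 * 2.0 ** (scale_cents * offset / 1200.0))
    return out


flat = [220.0] * 8 + [0.0] * 2
wobbly = perturb_pitch(flat, mp=5, seed=1)
print([round(f, 2) for f in wobbly])
```

Working in cents (1200 per octave) keeps the perturbation proportional to the pitch, so high and low notes are detuned by the same musical amount.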

An Arpasing-related bug in the oto generator is also fixed. Moresampler 0.8.2 is fully compatible with the recently updated Arpasing 0.2.


Moresampler 0.8.1 Release

Thanks to @ 's report, I found a severe bug in the recently upgraded pitch estimator: a standard deviation value was mistakenly treated as a variance. As a result, strong noise was added to the input before pitch estimation, which tremendously reduced the accuracy of pitch and voicing detection. This fix is the only change in Moresampler 0.8.1, but it is a very important one.
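
The effect of this kind of mix-up is easy to reproduce in isolation. In the sketch below, a small intended standard deviation is mistakenly read as a variance, so the noise generator ends up using its square root instead. The value 1e-4 and the helper names are hypothetical, and none of this is Moresampler's actual code.

```python
import math
import random


def add_noise(signal, noise_std, rng):
    """Add zero-mean Gaussian noise with the given standard deviation."""
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def rms(values):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


intended_std = 1e-4                  # hypothetical target noise level
buggy_std = math.sqrt(intended_std)  # what results if 1e-4 is read as a variance

silence = [0.0] * 20000
rng = random.Random(0)
correct_rms = rms(add_noise(silence, intended_std, rng))
buggy_rms = rms(add_noise(silence, buggy_std, rng))
ratio = buggy_rms / correct_rms
print(f"noise is about {ratio:.0f}x stronger than intended")
```

For any value below 1 the square root is larger than the value itself, so the smaller the intended noise, the worse the error. With these example numbers the noise comes out roughly 100 times (40 dB) too strong.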

Download Moresampler 0.8.1 here


0.8.1 (Mar. 29, 2017) Download

  • Bug fix: a severe bug in the recently upgraded pitch estimator.

Introducing Arpasing for English UTAUloids

A few months ago I uploaded a document proposing a new English UTAU recording script with detailed specifications. The new standard, named Arpasing, is basically an attempt to replicate a unit-selection-based speech synthesizer in UTAU. Thanks to a few users who recorded the very first Arpasing voicebanks in spite of the lack of clear instructions, we're now able to further explore the uncharted land with Moresampler 0.8.0's built-in support for Arpasing oto generation. Here I'm launching another tool, this time for actually creating USTs with Arpasing.

Please keep in mind that Arpasing is an experiment, and we won't know if it's going to work well until more effort goes into revising the tools & voicebanks.


Moresampler 0.8.0 Release

Here is the long-awaited Moresampler 0.8.0.


Moresampler 0.7.2 Release

Moresampler 0.7.2 is a transitional release between 0.7.x and 0.8.x: Arpasing is not supported yet, but some oto-generation features, such as loading files from index.csv, are implemented. There is also some code refactoring, along with bug fixes.

The oto generator is now able to generate CVVC oto entries. Regardless of the input format (hiragana/romaji), it can output in whichever format you want. Once a voicebank is loaded, a prompt in the command line window asks for the output format.

In response to feedback that Moresampler's oto generator creates a lot of redundant (in a diphone synthesis sense) unit aliases, I've added an option that only updates the existing entries in a given oto.ini file without creating any new entries. Note that to enable this feature you first need to have a "vanilla" oto.ini under the voicebank directory.
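
The "update only" behaviour can be pictured as a merge keyed on the aliases already present in the vanilla oto.ini. The sketch below assumes the usual oto.ini line layout (`file.wav=alias,offset,consonant,cutoff,preutterance,overlap`) and made-up timing values. The function names are hypothetical, and this is an illustration of the idea rather than how Moresampler does it internally.

```python
def parse_oto(text):
    """Parse oto.ini text into {(wav, alias): [the five timing fields as strings]}."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        wav, params = line.split("=", 1)
        fields = params.split(",")
        entries[(wav, fields[0])] = fields[1:]
    return entries


def update_existing(vanilla, generated):
    """Keep vanilla's entries and order; take new values only where the key already exists."""
    return {key: generated.get(key, values) for key, values in vanilla.items()}


def format_oto(entries):
    """Serialize entries back into oto.ini lines."""
    return "\n".join(
        f"{wav}={alias},{','.join(values)}" for (wav, alias), values in entries.items()
    )


# A "vanilla" oto.ini lists the aliases you want, with placeholder timings.
vanilla = "ka.wav=ka,0,0,0,0,0\nka.wav=a k,0,0,0,0,0"
# Hypothetical generator output: the same aliases plus one extra unit.
generated = (
    "ka.wav=ka,35,90,-250,60,20\n"
    "ka.wav=a k,260,40,-120,30,10\n"
    "ka.wav=k,30,20,-40,10,5"
)
result = format_oto(update_existing(parse_oto(vanilla), parse_oto(generated)))
print(result)
```

The extra `k` alias from the generated set is dropped because it has no counterpart in the vanilla file, which is the point of the option.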


Initial Results & Facts on Arpasing

Shortly after the release of the Arpasing proposal, Adlez recorded a test voicebank and sent it back to me in almost no time. I'd like to thank Uchuu and BagHeadChan for their contributions as well. Their data really helped me work out a roughly working solution to voicebank labeling, so I could start actually making songs with Arpasing voicebanks.

Here are two short samples I've created with Adlez's voicebank:

Unravel (English ver, lyrics by Lucy)

("Oh won't you tell me, please just tell me, explain how this should work. I fear who could it be, that lives inside of me. My conscience cracking, mind reacting, surrounded by the world. But here you're smiling bright, completely blind to life.")

Tokio Funka (English ver, lyrics by Jayn)

("In a hazy town with deception all around, there's a dusty cloud that drifts downward to the ground.")


Proposal of Arpasing, English UTAU Reclist & Specs

At the end of this two-month initial design phase, I proudly present Arpasing, a scientifically-designed English recording script and naming standard for UTAU.

Immediately after Moresampler 0.7.0's release, people were asking me if the oto generation feature would ever support English. I said yes, but later found that the method for Japanese wasn't directly applicable to English because of the myriad of different recording schemes, each with its own phonetic notation. Writing wrappers for each of them seemed quite troublesome, so I started looking into whether the existing solutions could be unified into one.

I also tried to tailor the corpus design methods from academic publications and apply them to this very specific problem. The result may appear a bit bizarre and intimidating at first glance, but it is much more concise than anything before it. It achieves 96% diphone coverage in 120 tri-syllable utterances, and 42% triphone coverage in another 100 utterances.
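
For readers unfamiliar with the term, coverage here means the share of phoneme pairs (diphones) or triples (triphones) that appear at least once in the script. The sketch below shows one way such a figure could be computed. It assumes the denominator is every possible combination of the phoneme inventory, which may differ from the definition used in the Arpasing specification, and the three-phoneme inventory is a toy example.

```python
from itertools import product


def ngrams(phonemes, n):
    """Set of all runs of n consecutive phonemes in one utterance."""
    return {tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1)}


def coverage(utterances, inventory, n):
    """Fraction of all possible n-phoneme sequences that occur in the utterances."""
    possible = set(product(inventory, repeat=n))
    seen = set()
    for phonemes in utterances:
        seen |= ngrams(phonemes, n)
    return len(seen & possible) / len(possible)


# Toy inventory and script, for illustration only.
inventory = ["k", "ae", "t"]
script = [["k", "ae", "t"], ["t", "ae", "k"]]
diphone_cov = coverage(script, inventory, 2)   # 4 of 9 possible pairs
triphone_cov = coverage(script, inventory, 3)  # 2 of 27 possible triples
print(f"diphones: {diphone_cov:.0%}, triphones: {triphone_cov:.0%}")
```

The number of possible triphones grows with the cube of the inventory size, which is why triphone coverage stays far below diphone coverage for a script of similar length.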


Looking for someone to join Moresampler development

Within a year Moresampler has grown from a small project with frequent crashes into something fast and robust, built on top of 10 libraries. The latest research outcomes from academic audio and speech processing have been implemented, bringing synthesis quality up to an unprecedented level.

To date, Moresampler has been designed, developed and maintained by a single person (me), but it's getting increasingly hard to keep up with the growing number of feature requests & bug reports. At the same time, there are ideas and areas I would like to explore that are beyond the reach of a single developer (or would take tremendous time and effort to accomplish). To make Moresampler even better, I'm looking for a collaborator to join the R&D.