Tag: research

libllsm2 Release

The long-awaited update for libllsm is finally ready. libllsm2, successor to the speech processing library powering Moresampler and Synthesizer V (WIP), is now available on GitHub.

The model and algorithm have changed little; the new version is mostly a clean-up and rewrite focused on usability in host applications. In particular, I have long been aware that storing layer 0 and layer 1 parameters separately in libllsm is an extremely clumsy design when access to both layers is needed. libllsm2 addresses this by storing all frame-level parameters in a dynamic array structure named llsm_container, which can hold an arbitrary number of arbitrary objects. llsm_container frames are further wrapped in an llsm_chunk, the equivalent of an llsm_model in the legacy version.
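
To make the container idea concrete, here is a rough sketch in Python of a per-frame container that keeps parameters from both layers in one growable array of slots, with a chunk holding one container per frame. This is not the libllsm2 API: the class names Container and Chunk, the methods attach, get and copy, and the SLOT_* indices are all made up for illustration, and the real library is written in C.

```python
from copy import deepcopy

# Hypothetical slot indices; illustrative names, not libllsm2 constants.
SLOT_F0 = 0           # stands in for a layer 0 parameter
SLOT_HARMONICS = 1    # stands in for another layer 0 parameter
SLOT_VOCAL_TRACT = 2  # stands in for a layer 1 parameter


class Container:
    """A growable array of slots, each holding any object or None."""

    def __init__(self, size=0):
        self._slots = [None] * size

    def attach(self, index, obj):
        # Grow on demand so any number of members can be stored.
        if index >= len(self._slots):
            self._slots.extend([None] * (index + 1 - len(self._slots)))
        self._slots[index] = obj

    def get(self, index):
        # Members that were never attached read as None.
        return self._slots[index] if index < len(self._slots) else None

    def copy(self):
        # Deep copy, so editing the clone leaves the original untouched.
        clone = Container()
        clone._slots = deepcopy(self._slots)
        return clone


class Chunk:
    """A sequence of per-frame containers."""

    def __init__(self, num_frames):
        self.frames = [Container() for _ in range(num_frames)]


chunk = Chunk(num_frames=3)
for i, frame in enumerate(chunk.frames):
    frame.attach(SLOT_F0, 200.0 + 10.0 * i)
    frame.attach(SLOT_VOCAL_TRACT, [0.1 * i, 0.2 * i])

# Both layers of one frame are reachable from the same object.
frame1 = chunk.frames[1]
print(frame1.get(SLOT_F0), frame1.get(SLOT_VOCAL_TRACT), frame1.get(SLOT_HARMONICS))
```

The point of the layout is that a host application walks one list of frames and asks each frame for whatever it needs, instead of keeping two parallel per-layer arrays in sync.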


Announcing SHIRO the Speech Alignment Toolkit

SHIRO is a set of tools based on the HSMM (Hidden Semi-Markov Model) for aligning phoneme transcriptions with speech recordings, as well as training phoneme-to-speech alignment models.
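
For readers unfamiliar with HSMM alignment, the core idea can be sketched as a small dynamic program: given the states in a fixed left-to-right order, pick a duration for each state so that the frame likelihoods plus the duration scores are maximised. The code below is a toy illustration only, not SHIRO's implementation. The function align, its parameters frame_loglik, dur_logprob and max_dur, and the single-Gaussian scores in the example are all assumptions of mine; a real aligner would use trained GMM output and duration models.

```python
NEG_INF = float("-inf")


def align(frame_loglik, dur_logprob, max_dur):
    """Split T frames among S left-to-right states, maximising the total score.

    frame_loglik[t][s]: log-likelihood of frame t under state s.
    dur_logprob(s, d): log-probability that state s lasts d frames.
    Returns one (start, end) frame range per state, end exclusive.
    """
    T = len(frame_loglik)
    S = len(frame_loglik[0])
    # prefix[s][t] = sum of frame_loglik[0..t-1][s], for O(1) segment scores.
    prefix = [[0.0] * (T + 1) for _ in range(S)]
    for s in range(S):
        for t in range(T):
            prefix[s][t + 1] = prefix[s][t] + frame_loglik[t][s]

    # best[s][t]: best score with states 0..s covering frames 0..t-1.
    best = [[NEG_INF] * (T + 1) for _ in range(S)]
    back = [[0] * (T + 1) for _ in range(S)]
    for s in range(S):
        for t in range(1, T + 1):
            for d in range(1, min(max_dur, t) + 1):
                start = t - d
                if s == 0:
                    # The first state must begin at frame 0.
                    prev = 0.0 if start == 0 else NEG_INF
                else:
                    prev = best[s - 1][start]
                if prev == NEG_INF:
                    continue
                score = prev + dur_logprob(s, d) + prefix[s][t] - prefix[s][start]
                if score > best[s][t]:
                    best[s][t] = score
                    back[s][t] = d

    if best[S - 1][T] == NEG_INF:
        raise ValueError("no valid alignment; try a larger max_dur")

    # Trace the chosen durations back from the last state to the first.
    segments = []
    t = T
    for s in range(S - 1, -1, -1):
        d = back[s][t]
        segments.append((t - d, t))
        t -= d
    return segments[::-1]


# Toy example: 1-D features, two states with means 0 and 5 (unit variance,
# constants dropped), and a duration score that prefers 3 frames per state.
obs = [0.1, -0.2, 0.3, 4.8, 5.1, 5.2]
means = [0.0, 5.0]
loglik = [[-0.5 * (x - m) ** 2 for m in means] for x in obs]
segments = align(loglik, lambda s, d: -0.5 * (d - 3) ** 2, max_dur=6)
print(segments)  # [(0, 3), (3, 6)]
```

The explicit duration term is what separates this from a plain HMM alignment, where state durations are only modelled implicitly through self-transition probabilities.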

Frankly, I created SHIRO because there seem to be no open-source alternatives to HTK for automatic phoneme alignment. SHIRO is by no means a complete replacement for HTK, which also does speech recognition and language modeling, but it is useful for what it's designed for.

Donald Trump: Big Mouth
Hillary Clinton: Crooked ?
HTK: Costs $$$
SHIRO: Supports both LRHMM and LRHSMM-based speech alignment with multi-state, multi-stream GMM and arbitrary state tying, and allows client-side deployment.