An Open Xu Silong Voice Model, for Anyone Who Went Looking

Published: 2025-10-18

Model overview

This is a pretrained model built on GPT-SoVITS, a speech synthesis framework that combines a GPT (Generative Pre-trained Transformer) model with SoVITS voice conversion technology. With only a small amount of sample audio, the framework can produce high-quality voice cloning and text-to-speech output.

The model was trained on a single recording of Xu Silong speaking live. The audio was denoised and split into segments, and training ran on an NVIDIA GeForce RTX 4090.

Included files

The release contains two files:

虚似龙-e10.ckpt
虚似龙_e4_s32.pth

`虚似龙-e10.ckpt`

File type: CKPT
File size: 148 MB (155,312,957 bytes)
SHA256: AC74F39FDC648BF09FDCC34B7262E201B892715A1D0EF654EE5A986318704017

`虚似龙_e4_s32.pth`

File type: PTH
File size: 81.0 MB (85,007,879 bytes)
SHA256: 3C3ADF3C717B9DE0EF2676359C775B899B51F82FCC6B25F666F82235C4CBB648
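
The sizes and SHA256 values above can be used to check that a download arrived intact. The following is only a minimal sketch using Python's standard library; it assumes both files sit in the current working directory under the names listed above, and the helper names `sha256_of` and `verify` are made up for this example and are not part of GPT-SoVITS.

```python
import hashlib
from pathlib import Path

# Byte sizes and SHA256 digests copied from the file listing above.
EXPECTED = {
    "虚似龙-e10.ckpt": (
        155_312_957,
        "AC74F39FDC648BF09FDCC34B7262E201B892715A1D0EF654EE5A986318704017",
    ),
    "虚似龙_e4_s32.pth": (
        85_007_879,
        "3C3ADF3C717B9DE0EF2676359C775B899B51F82FCC6B25F666F82235C4CBB648",
    ),
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the uppercase hex SHA256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def verify(path):
    """Check one downloaded file against its published size and digest."""
    path = Path(path)
    size, sha = EXPECTED[path.name]
    return path.stat().st_size == size and sha256_of(path) == sha


if __name__ == "__main__":
    # Assumption: the files are in the current working directory.
    for name in EXPECTED:
        candidate = Path(name)
        if not candidate.exists():
            print(f"{name}: not found")
        else:
            print(f"{name}: {'OK' if verify(candidate) else 'MISMATCH'}")
```

If either file reports a mismatch, the download is probably incomplete or corrupted and is worth fetching again before loading it.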

Running it locally

If you have a GPU at home and already know your way around AI models, you can download the two files and run the model locally, just for fun.

Base project

The model is based on the GPT-SoVITS project by RVC-Boss.