EnCodec, Meta's new audio codec


EnCodec is a codec that uses a neural network and achieves a compression ratio of roughly 10x

Meta (formerly Facebook) recently unveiled its new audio codec, EnCodec, which uses machine learning techniques to increase the compression ratio without losing quality.

The new method can compress and decompress audio in real time while achieving state-of-the-art size reductions. The codec can be used both for streaming audio in real time and for encoding audio that will later be stored in files.

"Today, we are detailing the progress our Fundamental AI Research (FAIR) team has made in the area of AI-powered audio hyper-compression. Imagine listening to a friend's audio message in an area with poor connectivity without it stalling or glitching. Our research shows how we can use AI to help us achieve this."

EnCodec provides two models that are ready to download:

  1. A causal model that uses a 24 kHz sampling rate, supports only monophonic audio, and was trained on a variety of audio data (suitable for encoding speech). The model can be used to pack audio data for transmission at bit rates of 1.5, 3, 6, 12 and 24 kbps.
  2. A non-causal model that uses a 48 kHz sampling rate, supports stereo sound, and was trained on music only. The model supports bit rates of 3, 6, 12 and 24 kbps.
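The two models listed above can be summarized as a small lookup table. The following is only a sketch: the names `ModelSpec`, `MODELS`, `pick_model` and `raw_pcm_kbps` are hypothetical and are not part of the EnCodec code base, and the 16-bit sample depth used for the uncompressed comparison is an assumption that the article does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str                # hypothetical label, not an official model name
    sample_rate_hz: int
    channels: int
    causal: bool
    bitrates_kbps: tuple     # bit rates listed in the article


MODELS = (
    ModelSpec("causal-24k-mono", 24_000, 1, True, (1.5, 3, 6, 12, 24)),
    ModelSpec("noncausal-48k-stereo", 48_000, 2, False, (3, 6, 12, 24)),
)


def pick_model(channels: int, bitrate_kbps: float) -> ModelSpec:
    """Return the first model that supports the channel count and bit rate."""
    for spec in MODELS:
        if spec.channels == channels and bitrate_kbps in spec.bitrates_kbps:
            return spec
    raise ValueError(f"no model for {channels} channel(s) at {bitrate_kbps} kbps")


def raw_pcm_kbps(spec: ModelSpec, bits_per_sample: int = 16) -> float:
    """Bit rate of uncompressed PCM audio (16-bit depth is an assumption)."""
    return spec.sample_rate_hz * spec.channels * bits_per_sample / 1000


if __name__ == "__main__":
    spec = pick_model(channels=1, bitrate_kbps=6)
    print(spec.name, "raw PCM:", raw_pcm_kbps(spec), "kbps")
```

Under that 16-bit assumption, the comparison with raw PCM gives a sense of how far the listed bit rates sit below uncompressed audio.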

For each model, an additional language model has been prepared, which allows a significant increase in the compression ratio (up to 40%) without loss of quality. Unlike previous projects that applied machine learning to audio compression, EnCodec can be used not only to pack speech but also to compress music at a 48 kHz sampling rate, which is comparable to the quality of audio CDs.
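The article does not explain how a language model can shrink the stream further. One common mechanism, assumed here, is entropy coding: if a model predicts which codes are likely, frequent codes can be stored in fewer bits. The sketch below only illustrates that idea with a made-up probability distribution; the codebook size of 1024 and the skewed distribution are assumptions, and the result is not meant to reproduce the 40% figure.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits: average cost per symbol for an ideal entropy coder."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


CODEBOOK_SIZE = 1024  # assumption, not stated in the article

# Without a predictive model, every code is treated as equally likely.
uniform = [1 / CODEBOOK_SIZE] * CODEBOOK_SIZE

# Made-up prediction: one code is expected 30% of the time,
# and the remaining probability is spread evenly over the other codes.
skewed = [0.3] + [0.7 / (CODEBOOK_SIZE - 1)] * (CODEBOOK_SIZE - 1)

if __name__ == "__main__":
    print("bits per code, no model:  ", entropy_bits(uniform))
    print("bits per code, with model:", entropy_bits(skewed))
```

The better the predictions match the actual stream, the larger the saving; a model that predicts nothing useful yields no gain.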

According to the developers of the new codec, compared with MP3 at a bit rate of 64 kbps, they were able to increase the audio compression ratio by about ten times while keeping the same level of quality (for example, where MP3 needs 64 kbps of bandwidth, 6 kbps is enough for EnCodec to transmit audio of the same quality).
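The "about ten times" figure follows directly from the two bit rates quoted above. As a quick worked example, the snippet below computes the ratio and the resulting sizes for a clip; the three-minute duration and the helper name `size_bytes` are hypothetical and chosen only for illustration.

```python
def size_bytes(bitrate_kbps: float, seconds: float) -> float:
    """Size of a constant-bit-rate stream, ignoring container overhead."""
    return bitrate_kbps * 1000 * seconds / 8


MP3_KBPS = 64        # from the article
ENCODEC_KBPS = 6     # from the article
CLIP_SECONDS = 180   # hypothetical three-minute clip

ratio = MP3_KBPS / ENCODEC_KBPS
mp3_size = size_bytes(MP3_KBPS, CLIP_SECONDS)
encodec_size = size_bytes(ENCODEC_KBPS, CLIP_SECONDS)

if __name__ == "__main__":
    print(f"ratio: {ratio:.1f}x")
    print(f"MP3:     {mp3_size / 1e6:.2f} MB")
    print(f"EnCodec: {encodec_size / 1e6:.3f} MB")
```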

"This data can then be decoded using a neural network. We achieve a compression rate of roughly 10x compared with MP3 at 64 kbps, without a loss of quality. While these techniques have been explored before for speech, we are the first to make them work for 48 kHz stereo audio (i.e., CD quality), which is the standard for music distribution."

The codec architecture is built on a neural network with a "transformer" architecture and is based on four components: an encoder, a quantizer, a decoder and a discriminator:

  • The encoder extracts parameters from the audio data and converts them into a packed stream at a lower frame rate.
  • The quantizer (RVQ, Residual Vector Quantizer) converts the encoder's output stream into sets of packets, compressing the information according to the selected bit rate. The quantizer's output is a compressed representation of the data that is suitable for transmission over a network or storage on disk.
  • The decoder decodes the compressed representation of the data and reconstructs the original sound wave.
  • The discriminator improves the quality of the generated samples, taking into account a model of human auditory perception.
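The quantizer step is the least intuitive of the four, so here is a toy sketch of residual vector quantization in plain Python. It is not EnCodec's implementation: the codebooks are random rather than learned, all function names are hypothetical, and the frame rate (75 frames per second) and codebook size (1024 entries) used in the bit-rate helper are assumptions that this article does not state.

```python
import math
import random


def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes what the previous one missed."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices


def rvq_decode(indices, codebooks):
    """Sum the chosen entries; passing fewer indices uses fewer stages (lower bit rate)."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out


def make_codebooks(n_stages, size, dim, seed=0):
    """Random toy codebooks. Each one contains the zero vector, so adding a
    stage can never make the reconstruction worse."""
    rng = random.Random(seed)
    books = []
    for stage in range(n_stages):
        scale = 0.5 ** stage  # later stages model finer corrections
        entries = [[0.0] * dim]
        entries += [[rng.uniform(-scale, scale) for _ in range(dim)]
                    for _ in range(size - 1)]
        books.append(entries)
    return books


def sq_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bitrate_kbps(n_stages, codebook_size=1024, frames_per_second=75):
    """Bit rate implied by the number of stages (default figures are assumptions)."""
    return n_stages * math.log2(codebook_size) * frames_per_second / 1000


if __name__ == "__main__":
    books = make_codebooks(n_stages=4, size=64, dim=8)
    frame = [random.Random(1).uniform(-1, 1) for _ in range(8)]
    codes = rvq_encode(frame, books)
    for k in range(len(codes) + 1):
        print(k, "stages -> error", sq_error(frame, rvq_decode(codes[:k], books)))
```

Because every stage only refines the residual left by the previous one, a receiver can drop the later indices and still decode a coarser version of the same frame, which is one way a single model could serve several bit rates.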

Regardless of the quality level and bit rate, the models used for encoding and decoding have modest resource requirements (the computation needed for real-time operation can be performed on a single CPU core).

Finally, for those interested, the reference implementation of EnCodec is written in Python using the PyTorch framework and is licensed under CC BY-NC 4.0 (Creative Commons Attribution-NonCommercial), for non-commercial use only.

If you are interested in learning more about it, you can check the details at the following link.

