DevChoco

A development knowledge archive that preserves real-world code and debugging context

Tech News

Show HN: I trained a 9M speech model to fix my Mandarin tones

The author built a 9M-parameter speech model to fix their own Mandarin pronunciation. The model was trained on roughly 300 hours of data and runs in the browser.

#mandarin #speech-model #conformer-ctc #onnx-runtime #language-learning #ai

Source: Hacker News (https://simedw.com/2026/01/31/ear-pronunication-via-ctc/)

  • Model: 9M-parameter Conformer-CTC
  • Data: ~300 hours (AISHELL + Primewords)
  • Quantization: INT8 (11MB)
  • Runtime: runs 100% in the browser on ONNX Runtime Web
  • Function: scores syllable pronunciation and tones (using Viterbi forced alignment)
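
The summary mentions Viterbi forced alignment but does not show how the author implements it. As a rough illustration only, the sketch below aligns a known target sequence to per-frame CTC log-probabilities using the standard blank-interleaved state layout. The function names `ctc_forced_align` and `token_spans`, the blank id of 0, and the toy probabilities are all assumptions made for this example, not details taken from the original project.

```python
import math

NEG_INF = float("-inf")

def ctc_forced_align(log_probs, targets, blank=0):
    """Viterbi-align `targets` to frames; return (state path, log-score, states).

    log_probs: T frames, each a list of V log-probabilities (hypothetical input,
               e.g. the log-softmax output of a CTC acoustic model).
    targets:   token ids, without blanks, that the speaker was asked to say.
    """
    if not targets or not log_probs:
        raise ValueError("need at least one frame and one target token")
    # CTC state layout: [blank, y1, blank, y2, ..., yL, blank]
    ext = [blank]
    for tok in targets:
        ext += [tok, blank]
    T, S = len(log_probs), len(ext)
    score = [[NEG_INF] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    # A path may start on the leading blank or on the first token.
    score[0][0] = log_probs[0][ext[0]]
    score[0][1] = log_probs[0][ext[1]]
    for t in range(1, T):
        for s in range(S):
            # Predecessors: stay, advance by one, or skip the blank
            # between two different tokens.
            cands = [s]
            if s >= 1:
                cands.append(s - 1)
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(s - 2)
            best = max(cands, key=lambda p: score[t - 1][p])
            if score[t - 1][best] == NEG_INF:
                continue  # state not reachable at this frame
            score[t][s] = score[t - 1][best] + log_probs[t][ext[s]]
            back[t][s] = best
    # A valid path ends on the last token or on the trailing blank.
    s = S - 1 if score[T - 1][S - 1] >= score[T - 1][S - 2] else S - 2
    total = score[T - 1][s]
    if total == NEG_INF:
        raise ValueError("too few frames to fit the target sequence")
    path = [0] * T
    for t in range(T - 1, -1, -1):
        path[t] = s
        s = back[t][s]
    return path, total, ext

def token_spans(path, ext, blank=0):
    """Collapse a state path into (token_id, first_frame, last_frame) tuples."""
    spans = []
    prev_state = None
    for t, s in enumerate(path):
        if ext[s] != blank:
            if s == prev_state:
                tok, start, _ = spans[-1]
                spans[-1] = (tok, start, t)  # extend the current token
            else:
                spans.append((ext[s], t, t))  # a new token starts here
        prev_state = s
    return spans

# Toy example: vocabulary {0: blank, 1: "A", 2: "B"}, six frames.
frames = [
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1],
]
log_frames = [[math.log(p) for p in row] for row in frames]
path, total, ext = ctc_forced_align(log_frames, targets=[1, 2])
spans = token_spans(path, ext)  # [(1, 0, 1), (2, 3, 4)]
```

Once each token has a frame span, a scorer could compare the model's probabilities inside that span against the expected syllable and tone. How the original project turns alignments into scores is not described in this summary.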

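Opinions

Opinions from the comments and discussion, collected for reference. (Do not treat them as established fact; checking the context is recommended.)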

  • Hacker News · @vunderba: When I was living in Taiwan, one of the ways I forced myself to remember to pronounce the tones distinctly was by waving my hand in front of me, tracing the arc of each character’s tone. It helped a lot even if I did look like an insane expat conducting an invisible orchestra. One more thing: there's quite a bit …
  • Hacker News · @rahimnathwani: This is incredible. When I was first learning Chinese (casually, ~20 years ago), my teacher used some Windows software that drew a diagram of the shape of my pronunciation, so she could illustrate what I was getting wrong in some objective way. The thing you've built is so good, and I would have loved to have it …
  • Hacker News · @jellojello: This is amazing, if you feel like opening an entire language to being learned more easily.. Farsi is a VERY overlooked language, my wife/her family speak it but it's so difficult finding great language lessons (it's also called Persian/Dari)
  • Hacker News · @simedw: Thank you. I had a quick look at Farsi datasets, and there seem to be a few options. That said, written Farsi doesn’t include short vowels… so can you derive pronunciation from the text using rules?
  • Hacker News · @simedw: For accents, I’ve mostly tested with a few friends so far. I’m wondering whether region should be a parameter, because training on all dialects might make the system too lax.
