The Cantonese Scrolls – A Cantonese language learning mental RPG (cantoscrolls.com)
gfaure 1 days ago [-]
> - Characters starting with the vowel i sound more an e. Therefore, "to invite", 請 (cing2), sounds more like ceng2, and "to hear/listen", 聽 (ting1), sounds more like teng1.

As a Cantonese speaker, I love the effort here! However, the above isn't correct. This is an example of vernacular vs. literary pronunciation, and 請 has both pronunciations, depending on context. For instance, 請 is ceng2 when used as the verb "to invite", but cing2 in compounds like jiu1 cing2 邀請.

It shouldn't be conflated with the phenomenon later in that same paragraph about 懶音 "lazy pronunciation".

fearedbliss 20 hours ago [-]
Thanks for that! Yup I'm well aware of the differences between literary and casual (and of course the differences between Standard Chinese and Written Cantonese). My goal for this project is to help preserve and teach the Cantonese language based on my understanding (which is still improving), but more importantly teaching it as a completely independent language because that's what it is. In this instance Standard Chinese or any sort of literary pronunciation is essentially useless to me since people aren't speaking that way, and I also am a strong believer of writing the way you speak. Mandarin speakers also used to have this problem until the mid 1800s when the transition from 古文 to 白話 took place, and standardized on Beijing dialect.

And you are definitely right about 懶音. They are both explained in the same section not because they are the same thing but because they are both modifications that occur in how sounds are pronounced.

gfaure 20 hours ago [-]
> In this instance Standard Chinese or any sort of literary pronunciation is essentially useless to me since people aren't speaking that way

Thank you for creating this! But I'm afraid this is the misunderstanding -- words like san1 cing2 申請 are very much everyday words, even though the reading of the character is deemed literary. You should think of characters like 請 and 聽 as just having multiple in-context pronunciations, some of which you should learn, some of which you probably don't need to.

fearedbliss 19 hours ago [-]
Definitely. I'm not saying the word 申請 itself isn't used in normal speech, but more that the pronunciation of 請 in normal speech would sound more like an e. So I would rather teach people the normal way people would pronounce words and not the literary form, since as I said before, I also do want to stay away from Standard Chinese as much as possible and teach Written Cantonese. It will take me some time to continue to extract the essence of the language and document it at the core level. Once I've extracted the language it could (and should) be used to create full literary writings in Written Cantonese, and not need to use nor ever learn Standard Chinese. If my target audience is to speak to Cantonese people specifically and not every single person of any Chinese language in existence, then writing in Written Cantonese is enough for my purposes and goals.

I definitely appreciate the feedback :). Thank you!

qazxcvbnm 16 hours ago [-]
As a native speaker I assure you that 請 is pronounced differently in context, and that both readings are perfectly Cantonese.

For a clearer example, see 平: 大平賣 ("vernacular" reading; peng4) (lit. big cheap sale; i.e. sale) vs 平面 ("literary" reading; ping4) (lit. flat surface; i.e. surface). peng4 is the "vernacular" reading but is used exclusively for the meaning "cheap". ping4 is "literary" but is used everywhere else. "Vernacular" versus "literary" is a linguistic classification; it does not necessarily mean that either reading is more common than the other. Both readings exist.
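
One way to picture this is as a lookup keyed by the word rather than by the character alone. A minimal sketch, using only the readings mentioned in this thread; the `readingsByWord` table and the `readingIn` helper are hypothetical names for illustration:

```javascript
// Readings of a character depend on the word it appears in.
// Entries are limited to examples given in this thread.
const readingsByWord = {
  '平': { '大平賣': 'peng4', '平面': 'ping4' },
  '請': { '邀請': 'cing2', '申請': 'cing2' },
};

// Returns the reading of `char` inside `word`,
// or null when the table has no entry for that pair.
function readingIn(char, word) {
  const entry = readingsByWord[char];
  return (entry && entry[word]) || null;
}

console.log(readingIn('平', '大平賣')); // peng4
console.log(readingIn('平', '平面'));   // ping4
```

A real table would need far more entries, plus a default reading for a character used on its own (e.g. ceng2 for 請 as the verb "to invite").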

fearedbliss 10 hours ago [-]
I'm happy to hear that both can be used; I have normally heard the "e" shift used in a relatively consistent way. I normally say "大減價" for a big sale, but I understand your point was more about demonstrating where the "i" may be used. Although I would say that it really depends on the speaker and how they feel. I would most likely use the "e" shift consistently and be perfectly understood.
qazxcvbnm 10 hours ago [-]
The example is meant to demonstrate that whether I say 平 as ping or peng changes the meaning of the sentence. It does not depend on how I feel.

Another example: I buy some raisins, and the lady gives me some. I say: 咁多! If I pronounce "gam3do1", I say "That's so much!". If I pronounce "gam3doe1", I say "That's so little!".

(Yes, contrived. Almost always the former is used. But both can be.)

fearedbliss 10 hours ago [-]
I definitely read that as "gam3 do1". I never heard "doe1" for this character, there is always something more to learn ;).
fredrikholm 1 days ago [-]
Heartwarming to see learning material for Cantonese.

Does anyone have any good material for speaking? I've grown up watching Cantonese movies and have tones and pronunciation down ok, but I'm finding it hard to progress through private study with regards to reaching conversational levels of vocab/grammar.

jhsvsmyself 1 days ago [-]
Hop on OpenAI voice mode and start speaking. I do this for Mandarin and Spanish, it even speaks my native Afrikaans! Valuable and only $20 a month. Also, it's great because it can speak about anything, which I find refreshing rather than having domain specific resources.
tdeck 20 hours ago [-]
I've found it often answers a Cantonese question in Mandarin. I'm not sure if that's because it converts to written Chinese, then answers the text, then reads it in Mandarin. But sometimes if you use a Cantonese specific word like 乜野 it'll be (presumably) converted to written Cantonese and then will answer in Cantonese.
wenc 19 hours ago [-]
What model are you using? I'm using ChatGPT 4o (paid) in Advanced Voice Mode and it's speaking to me in perfect Cantonese.
fearedbliss 20 hours ago [-]
哈哈。嗰個AI覺得廣東話同國語一樣。 (Haha. That AI thinks Cantonese and Mandarin are the same.)

-.-

k_sze 20 hours ago [-]
Right off the bat, 小念頭, siu2 lim6 tau4 seems to be incorrectly romanized. 念 should be nim6, not lim6.

lim6 is what we call 懶音 (literally "lazy pronunciation").

The statement that "Cantonese has no formal standardization for its phonetic and writing systems" deserves a better explanation. There are in fact multiple standards, as can be seen on the website of the Multi-function Chinese Character Database developed by the Chinese University of Hong Kong:

- https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/initials.php

- https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/finals.php

- https://humanum.arts.cuhk.edu.hk/Lexis/lexi-mf/tones.php

When you search for a Chinese character on the Multi-function Chinese Character Database website, by default, it will give you the jyutping (粵拼) romanization (you can select a different romanization system from the drop-down at the bottom left of the screen).
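
Those three tables roughly correspond to the three parts of a Jyutping syllable: an optional initial, a final, and a tone number. A rough sketch of splitting a syllable along those lines; the `splitJyutping` helper is hypothetical, and the pattern only checks the overall shape, not whether the final actually appears in the CUHK tables:

```javascript
// Jyutping initials: b p m f d t n l g k ng h gw kw w z c s j.
// The final must start with a vowel letter (or y, as in "yut"),
// or be one of the syllabic nasals "m" / "ng". Tones are 1-6.
const JYUTPING_SYLLABLE =
  /^(ng|gw|kw|[bpmfdtnlgkhwzcsj])?([aeiouy][a-z]*|m|ng)([1-6])$/;

// Splits e.g. "siu2" into { initial: 's', final: 'iu', tone: 2 }.
// Returns null when the input does not look like a Jyutping syllable.
function splitJyutping(syllable) {
  const match = JYUTPING_SYLLABLE.exec(syllable);
  if (!match) return null;
  return {
    initial: match[1] || '',
    final: match[2],
    tone: Number(match[3]),
  };
}

console.log(splitJyutping('nim6')); // { initial: 'n', final: 'im', tone: 6 }
console.log(splitJyutping('lim6')); // { initial: 'l', final: 'im', tone: 6 }
```

Seen this way, nim6 and lim6 differ only in the initial, which is the part the 懶音 merger affects.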

fearedbliss 20 hours ago [-]
I'm aware of this and I've made a conscious decision to use the lazy pronunciation. I'm actually thinking of switching to lazy pronunciation as the standard pronunciation for a lot of words, the same as what some authors have done. It's more difficult and weird to teach people a pronunciation that not many people are using. At this point the lazy pronunciation has pretty much dominated the normal language development.

我都打詠春。喺學院,我師傅同埋我哋都話"lim". 如果你話"nim"會係好奇怪。我推介你試吓明粵卷唔使係一百%啱。如果係八十到九十%啱同大部分人明你講乜嘢,我覺得OK。學生可以繼續學廣東話。越來越好。

snapetom 18 hours ago [-]
I agree 100% to use the lazy pronunciation. I'm a native speaker, and I don't think I've ever heard the formal except when talking to others who are trying to learn Cantonese.
qazxcvbnm 15 hours ago [-]
I cannot corroborate the experience of the other commenter. 念 is very definitively nim6, with lim6 as a common variant/lazy pronunciation. It is not at all uncommon that words have a variant pronunciation when used in certain contexts (e.g. as jargon or slang), such as Wing Chun in your case, but the word itself is certainly not properly or even commonly pronounced the "lazy" way otherwise (cf. 執念 zap1nim6; 萬念俱灰 maan6nim6keoi1fui1; 惡念 ok3nim6).

There are many cases where variant pronunciations sound natural, but (individually) "correct" pronunciations sound unnatural (e.g. 土瓜灣 tou2gwaa1waan1 "correct" but unnatural, tou2gwaa1waan4 "variant" and natural). I have never encountered a case where a correct pronunciation would be unnatural where "lazy" pronunciations would be natural.

On another note, your Cantonese sounds a bit unnatural (without many of the connecting words between thoughts) (typical for learners). A more natural phrasing might be something like this: 我打詠春嘅。學院(?)入面我師父同我哋都係讀lim嘅。反而如果有人讀nim會好奇怪。我想你明白粵卷唔使百分百啱晒嘅。講嘢可能淨係使啱八成到九成,大部份人都明你講嘅嘢。咁都OK嘅。學廣東話嘅學生可以多啲地方繼續學廣東話就可以越來越好。

A few notes on your usage:

"我都打詠春" 都 here seems unnecessary, as you have not discussed 詠春 previously.

"學院" slightly dubious, usually used for more academic purposes. For martial arts, 館 or 武館 may be more typical, but I suppose it is possible that your school brands itself as 學院.

"我師傅" "師傅" and "師父" sound the same, but mean quite different things. 師傅 is a polite name for someone working in some field, but 師父 is your master. (In fact this is another example of "lazy" or variant pronunciations, which is emphatically not always the correct pronunciation; 父 here is fu2, but by itself is definitively fu6)

"都話"lim"" "話" refers more to the content of what someone says than to how it is pronounced; "讀" would make clear that we are discussing pronunciation.

"我推介你試吓明粵卷唔使係一百%啱" to be honest I'm not completely sure what this sentence means. I hope what I put was what you intended.

"一百%" 百分百 is more natural

"八十到九十%" 八成到九成 (lit. 8 10%s to 9 10%s) is more natural

"明你講乜嘢" (sounds like: understand what on earth you're saying) "乜嘢" emphasises the object like a question (i.e. sounds like a rhetorical question; when not a question, usually would be used when expressing frustration/complaining/scolding); no need to emphasise, "明你講嘅嘢" would be fine.

"學生可以繼續學廣東話。越來越好。" disjointed sentences, not completely sure your intended meaning. I hope what I put was what you intended.

fjdjshsh 11 hours ago [-]
Most of these corrections seem stylistic / "I would say it this way to make it clear" rather than actual mistakes.
qazxcvbnm 11 hours ago [-]
Yes, I have mainly put awkward usages in the latter section, which will hopefully be useful to the author. If you compare the paragraphs, the outright grammatical problems I simply fixed. The author’s language is definitely comprehensible to a patient listener, but it can be much better still.
fearedbliss 11 hours ago [-]
Thank you for the corrections. I definitely agree that while I've learned a lot of Cantonese in the past 11 years, I still have a long way to go, and I'm loving the process. Given that Cantonese is my third language (I've studied a bunch of other languages, but Cantonese is the third language I'm actually specializing in for personal reasons), it makes sense that it sounds unnatural, at least for now. For a portion of it I do agree with the other commenter that it is a stylistic choice. The % part is just me using what's already popular in spoken Cantonese, which is the "pou-cent" style rather than the more traditional way. You are right regarding 學院: my school advertises itself in this way, hence my usage of it ;D. But overall, you got what I was trying to say, which is the most important part of the process. I'll continue to improve over time :).
Arn_Thor 11 hours ago [-]
In my 10 years in HK, including months worth of weekly language classes, “n” and “l” were treated utterly interchangeably. Not to say that it’s not worth the effort in some contexts to make a distinction, but if the aim is to get a functional understanding of the language the “lazy” version is useful.
qazxcvbnm 11 hours ago [-]
A counterexample for you: 聶 nip6, never lip6. 獵 lip6, never nip6. 焫 naat3, "lazily" laat3. 辣 laat6, never naat6. (Yes, almost everywhere interchangeable)
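
The asymmetry above (n- can drift to l-, never the reverse, and some words never drift at all) could be sketched like this. The `NEVER_MERGED` set and the `lazyVariant` helper are hypothetical, and the exception list holds only the one example given here:

```javascript
// Characters that keep their n- initial, per the example above.
// A real list would need to be collected from native speakers.
const NEVER_MERGED = new Set(['聶']);

// Returns the "lazy" l- variant of an n- syllable, or null when
// there is none: l- syllables never become n-, the ng- initial is
// a different sound, and listed exceptions keep their n-.
function lazyVariant(char, jyutping) {
  const hasPlainN = jyutping.startsWith('n') && !jyutping.startsWith('ng');
  if (!hasPlainN || NEVER_MERGED.has(char)) return null;
  return 'l' + jyutping.slice(1);
}

console.log(lazyVariant('焫', 'naat3')); // laat3
console.log(lazyVariant('辣', 'laat6')); // null
console.log(lazyVariant('聶', 'nip6'));  // null
```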
Arn_Thor 3 hours ago [-]
Touché
10594891 1 days ago [-]
I'm confused, is there supposed to be some kind of game on the site that isn't loading for me?

There are references to things like "dungeons" and "encounters" and so on, but all that I see are a bunch of pages with lists of words/phrases in boxes.

fearedbliss 20 hours ago [-]
Haha. As described in the title of the game, it's a Mental RPG. The game takes place in your mind. The site only gives you the environment for you to simulate in your mind. Think about Dungeons and Dragons "Theater of the Mind" but for language learning. Every encounter with a word is a battle. You defeat the enemy once you understand it. It's honor system based given that only you would know if you actually understand it.
littlekey 24 hours ago [-]
I was about to say the same thing, the whole thing is just confusing to me. There doesn't appear to be any game.
arch-choot 20 hours ago [-]
Pretty cool! I've been living in HK for 7 years now and not moved past the basic few phrases - mostly because English gets you so far there's no "forcing factor" (vs. in Tokyo you'd be kinda forced to learn Japanese).

One suggestion, though it would be quite high effort: have you considered also adding a button or something for the pronunciation? I think the hardest part for learners is knowing their reading of the jyutping sounds correct (especially with all the tones).

fearedbliss 20 hours ago [-]
Good question. I have actually, and was recently playing around with that. The problem isn't necessarily the recordings (since I've accepted that I'm willing to record all the possible Cantonese sounds for all the words myself - I'm not a fan of AI and I don't use it in my personal life if I can avoid it), but more the nature of the project. I haven't figured out a way to get the base directory of the site so that I can load all of the audio files from a specific directory, especially when it is being used in offline mode. If I were only running this project from a web server, it would be easy to get the website root and just append the audio location, but if you download the project for offline use, there doesn't seem to be an easy way to access the base dir from the file:// protocol perspective. I have some ideas but I'll need to experiment with that over time. As a stopgap measure I've added the "Core Audio Reference" page, which has my recordings of all of the tones, initials, and finals. But it's still not the best experience.

Once I figure out the base dir thing I'll be able to easily just make every automatic romanization that you see an a href which points to the mp3.
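
One possible approach, sketched under assumptions: a classic (non-module) script knows its own absolute URL through `document.currentScript.src`, even under file://, so the project root can be derived from wherever that script lives. The layout (`js/app.js` next to `audio/`), the `data-jyutping` attribute, and the `audioUrlFor` helper are all hypothetical and not taken from the actual project:

```javascript
// Hypothetical layout: <root>/js/app.js and <root>/audio/<syllable>.mp3.
// Resolving '../audio/...' against the script's own URL climbs out of
// js/ to the project root, wherever that happens to be on disk.
function audioUrlFor(scriptUrl, syllable) {
  const file = `../audio/${encodeURIComponent(syllable)}.mp3`;
  return new URL(file, scriptUrl).href;
}

// Browser-only wiring; skipped when run outside a page.
if (typeof document !== 'undefined' && document.currentScript) {
  // Capture now: currentScript is null once the script has finished running.
  const scriptUrl = document.currentScript.src;
  document.addEventListener('click', (event) => {
    const el = event.target.closest('[data-jyutping]');
    if (el) new Audio(audioUrlFor(scriptUrl, el.dataset.jyutping)).play();
  });
}

console.log(audioUrlFor('file:///home/u/cantoscrolls/js/app.js', 'cing2'));
// file:///home/u/cantoscrolls/audio/cing2.mp3
```

Each nested page would still reference the script with its own relative path (e.g. `../../js/app.js`), but the audio lookup itself no longer depends on nesting depth. Whether audio playback from file:// behaves the same in every browser is something to verify.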

dbtc 19 hours ago [-]
You couldn't just have an 'audio' folder next to 'dungeons'?
fearedbliss 11 hours ago [-]
Hey @dbtc, I did think about doing this but originally decided to try to find another solution. The reason is that atm my directory structure is nested (with a variable depth of nesting), and I also wanted to be able to use the audio files from anywhere in the app, which means outside of the dungeons directory. But you bringing it up again makes me wonder whether I'm realistically going to be using them outside of the dungeons anyway. Also, even if the main folder containing the audio files is called something else, as long as I have a flatter directory structure, I can make it work. I'll consider a flatter, more predictable approach, since having the audio files for each word is much more beneficial than not.

Thank you!

tlyleung 19 hours ago [-]
I'm thinking of improving my Cantonese by reading a 金庸 novel like 倚天屠龍記. Hoping to find an online resource similar to this website that has the traditional Chinese characters, Jyutping romanisation and translation all aligned, along with the spoken audio as well.
999900000999 1 days ago [-]
Very neat!

I've only studied Mandarin though, are you aware of a similar game for that ?

fearedbliss 1 days ago [-]
Not that I'm aware of. I believe I'm the first person to frame (and in a way "open source") a language program in this type of way. My perspective is a mixture of a lot of different things including my love for Diablo 1 (and its minimalism and simplicity) and 2 (Original, not Resurrected).
Svoka 1 days ago [-]
This is amazing! Consider potentially adding support for the Web Speech API to make it easier. Something like

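    // Pick an installed Cantonese (Hong Kong) voice; getVoices() can be empty until voices load
    const voice = speechSynthesis.getVoices().filter(e => e.lang === 'zh-HK').at(-1)
    const utterance = new SpeechSynthesisUtterance('你好')
    utterance.lang = 'zh-HK' // language hint in case no matching voice was found
    if (voice) utterance.voice = voice
    speechSynthesis.speak(utterance)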
fearedbliss 20 hours ago [-]
Thanks for the suggestion. Take a look at my other reply above, but The Cantonese Scrolls is a project that needs to be able to run offline, directly in your web browser, with no network calls. I allow everyone to download the entire project and self study offline or self host it if they want. This serves as both a backup mechanism, and a defense mechanism against any potential government censorship by countries that don't want Cantonese to exist.
mdaniel 7 hours ago [-]
It's your project, I just wanted to point out that the code snippet the GP posted works offline, subject to the user having the voices installed. My `speechSynthesis.getVoices().filter(e => e.lang==='zh-HK')` returned nothing but it still tried to speak something anyway, even with the networking turned off on my computer

https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_...
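
A slightly more defensive sketch of the same idea, which may explain the behaviour described above: when no matching voice is found, the utterance is still spoken with whatever default voice the browser picks. The `pickCantoneseVoice` and `speakCantonese` helpers are hypothetical names, and treating `yue`-prefixed or underscore-separated language tags as Cantonese is an assumption about how some platforms label their voices:

```javascript
// Chooses a Cantonese voice from a list of SpeechSynthesisVoice-like
// objects, preferring locally installed ones so nothing needs the network.
function pickCantoneseVoice(voices) {
  const isCantonese = (v) => {
    const lang = v.lang.replace('_', '-').toLowerCase();
    return lang === 'zh-hk' || lang.startsWith('yue');
  };
  const matches = voices.filter(isCantonese);
  return matches.find((v) => v.localService) || matches[0] || null;
}

// Speaks `text`; returns false when no Cantonese voice was available,
// so the page can fall back to recorded audio or hide the button.
function speakCantonese(text) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.lang = 'zh-HK';
  // Note: getVoices() can be empty right after page load in some browsers.
  const voice = pickCantoneseVoice(speechSynthesis.getVoices());
  if (voice) utterance.voice = voice;
  speechSynthesis.speak(utterance);
  return voice !== null;
}

// Browser-only usage; skipped when the API is not present.
if (typeof speechSynthesis !== 'undefined') {
  speakCantonese('你好');
}
```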

chongli 1 days ago [-]
I have been using Duolingo for a couple years to learn some Mandarin. Its resources are pretty good, if a bit dodgy here and there (some occasional weird pronunciations or translations).

I'm really excited about a Cantonese resource though. I have a few Cantonese-speaking friends and I would love to learn some myself!

fearedbliss 20 hours ago [-]
Thank you :). Definitely feel free to use, download and share 粵卷 for your learning. I'll be continuing to make improvements to it over time. This is a life long project of mine so expect more improvements as my Cantonese develops.
ilamont 22 hours ago [-]
I've seen weird tones, but IMHO the biggest limitation of Duolingo Mandarin is the lack of a traditional character option.
peterburkimsher 23 hours ago [-]
Pingtype isn't a game, but you might find it interesting. It does the pinyin and word-by-word romanisation of whatever text you put in.

https://pingtype.github.io

KerryJones 24 hours ago [-]
Came here to ask for the same thing.
neuroelectron 19 hours ago [-]
Just looking at the game, I'm guessing you need to already have some knowledge of Cantonese writing?
fearedbliss 10 hours ago [-]
Not necessarily; the way I see it, you can just jump in and start learning and practicing. Is there a correct way to write the strokes? Technically yes (fun fact: Chinese writing is designed for right-handed people. My wife is a native speaker from Hong Kong but she's a lefty, so she adjusted all stroke writing to the left-handed perspective ;D). However, I would say that not writing the strokes properly is alright when the main goal is to learn how to read/speak/hear Cantonese. Will you have to unlearn bad habits or re-train yourself to write the strokes properly? Yes, this may be a possibility. Of course, you could just research how to write the strokes from the beginning.

The Cantonese Scrolls is a project to document and teach the language, and record the characters for its direct written form. It doesn't contain every single aspect or tool for language learning, but the core material to preserve the language, let's say 250 years later when someone unearths it. This is also why I called it a scroll, since I view it as a sort of digital ancient document written on parchment paper haha.

koshergweilo 21 hours ago [-]
This is awesome, there aren't many good resources to learn Cantonese, especially compared to Mandarin
awongh 20 hours ago [-]
It's crazy that in some ways Cantonese is a dying language, or at least one that's being sidelined by the government, esp. as Hong Kong is being fully transitioned back to China.

But just as many people speak Cantonese as speak Italian.

fearedbliss 20 hours ago [-]
It won't die as long as we keep using it. Use it or lose it pretty much. I do think there are a lot of people that might want to learn Canto, but since it doesn't have any official standardization and there is an active push by greater forces to suppress it, it would be difficult for anyone outside of Hong Kong to learn it. This is where my 粵卷 comes in. I want to continue improving and refining my teaching strategy over the years and create a standard that people can use to learn the language.

The Cantonese Scrolls is designed to run directly in your web browser (thus cross-platform) and to be portable (it's conveniently available anywhere you have a computer, whether a small mobile phone or a desktop), so you can study on the train, but most importantly it's designed to be offline and downloadable. I want the 粵卷 to be a Cantonese learning resource that is freely available for anyone in the world to view/download, for anyone that has the desire to learn, and for it to outlive me. I do see this being a lifelong project, so I hope I can continue improving it over the course of my life.

inkyoto 15 hours ago [-]
Cantonese is hardly a dying language as there are approximately 110 million native speakers of Cantonese across Southern China, South East Asia and Malaysia. Its status in Hong Kong, on the other hand, is indeed changing due to the HK government having to appease the central government.

But just like most people in Gwong Dung learn to speak Cantonese as their first language despite having to speak the national language (Mandarin) too, the same will continue in Hong Kong as well. There is just too much history and cultural legacy associated with Cantonese that will keep the language going strong for generations to come, 咁…加油啦!

fearedbliss 11 hours ago [-]
Cantonese has no:

- Standard

- Country that speaks it (only Guangdong, Hong Kong, Macau, and the Overseas Diaspora speak it).

- Learners of the language are primarily within the above territories and learn it through exposure, not through actual proper education in Cantonese itself. But rather learn Standard Chinese writing in school, and just "auto-translate" to Cantonese in their mind. I would even say most Cantonese people don't know how to write Cantonese given the "defaulting to Standard Chinese". A lot of the times when I ask a Cantonese speaker how to write something in Cantonese, they don't know. This is due to many reasons which I won't elaborate atm in order to not make this post longer than what it already is. While this is good in "getting away with being able to write Chinese and communicate with the broader Chinese speaking world", it is not good for Cantonese itself. This is why we have so many words in Cantonese that don't even have a character. It also explains why eventually people did want to create more and more characters and luckily we were at least able to get the Hong Kong Supplementary Set standardized: https://en.wikipedia.org/wiki/Hong_Kong_Supplementary_Charac...

There has never been a point in Chinese history where the Chinese government has had this much power, and we already have plenty of evidence that they suppress and would like to eliminate the language from existence, and create their own view of what a "Chinese Person" and "Chinese Identity" is. Furthermore, we've already lost or mostly lost a lot of other Chinese languages, Hakka being one of them. It would be extremely difficult to find a Hakka speaker and Hakka learning resources at this point in time.

My timeline for when Cantonese will die is far into the future, outside of my lifetime. But I would put it within the next 300-500 years or less. Given this context, it is better to start the preservation and education efforts as early as possible.

I'm not expecting you to agree with me, but I wanted to give you some context. I'm not alone in my views.

ViktorRay 1 days ago [-]
This is pretty cool! Well done!
fearedbliss 20 hours ago [-]
Thank you :).
ww520 1 days ago [-]
This is excellent. Gamify learning. Nice way to learn.
fearedbliss 20 hours ago [-]
Thank you :).
johnzim 1 days ago [-]
Awesome!
fearedbliss 20 hours ago [-]
Thank you :).