I have been dutifully adding items to my FLEx dictionary as I study Japanese. Additionally, I have entered some text for the parser to work with. Now, it appears that the parser expects spaces between words – which is a bit of a problem for Japanese. But it appears from the description of “compounding rules” that I should be able to use them in some measure to overcome this.
So, a concrete example maybe. Take the phrase 高い山. This should be parsed as “high (adj) non-past-suffix (adj) mountain (n).” However, FLEx refuses to recognize the non-past suffix. Here are some facts:
- I have each of these lexemes entered properly as far as I can tell (adjectives have the headword without the final -い, though it is present in the citation form).
- I have created a Left-Headed compound rule with Adjective as the right stem and Noun as the left stem.
- I have created an Adjective template with an obligatory affix slot for time, and added the affix -い to it.
- I have tested with this affix marked both as suffix and as suffixing-interfix. I think it must be marked suffixing-interfix to allow FLEx’s compound parsing to work, but that doesn’t seem to make it succeed, so I’m not sure.
Okay, so it is actually more complicated than that. I also have the adjective いい “good” entered in the system (with the same difference between lexeme and citation form). Because of this, the parser is actually returning “high (adj) good (adj) mountain (n).” Basically, rather than treating the suffix as a suffix, it is mixing it up with “good”. (If this compound were actually written out, it would be 高いいい山 or 高い_いい_山.) It appears that the parser is using the lexeme forms of all terms without respecting their templates (which contain the required suffix).
So, playing around, I marked the adjective template as “Requires More Derivation”, which actually did work, sort of. Now it returns BOTH of the previous parses: the proper one AND the one containing “good”. I certainly wasn’t expecting that!
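To make the ambiguity concrete, here is a toy Python sketch of what I think is going on. To be clear, this is NOT how FLEx's parser is implemented. The lexicon layout and the `segmentations` and `respects_template` helpers are my own invention for illustration. It only shows that the unspaced string 高い山 can be split in two ways when only lexeme forms are matched, and that the “good” reading would drop out if every adjective stem were required to take the -い suffix.

```python
from typing import List, Tuple

# Hypothetical toy lexicon: (lexeme form, gloss, kind).
# The entries mirror the ones described in the post; the layout is an assumption.
Morph = Tuple[str, str, str]
LEXICON: List[Morph] = [
    ("高", "high", "adj"),
    ("い", "good", "adj"),        # lexeme form of いい, without the final -い
    ("い", "NONPAST", "suffix"),  # the adjective non-past suffix
    ("山", "mountain", "noun"),
]


def segmentations(text: str) -> List[List[Morph]]:
    """Return every way to split `text` into lexicon forms (no grammar checks)."""
    if not text:
        return [[]]
    results: List[List[Morph]] = []
    for entry in LEXICON:
        form = entry[0]
        if text.startswith(form):
            for rest in segmentations(text[len(form):]):
                results.append([entry] + rest)
    return results


def respects_template(parse: List[Morph]) -> bool:
    """Toy stand-in for an adjective template with an obligatory suffix slot:
    every adjective stem must be directly followed by the suffix, and the
    suffix may only appear directly after an adjective stem."""
    for i, (_, _, kind) in enumerate(parse):
        nxt = parse[i + 1][2] if i + 1 < len(parse) else None
        prev = parse[i - 1][2] if i > 0 else None
        if kind == "adj" and nxt != "suffix":
            return False
        if kind == "suffix" and prev != "adj":
            return False
    return True


def glosses(parse: List[Morph]) -> List[str]:
    """Return just the glosses of a parse, for readable output."""
    return [gloss for _, gloss, _ in parse]


all_parses = segmentations("高い山")
filtered = [p for p in all_parses if respects_template(p)]

print("forms only:   ", [glosses(p) for p in all_parses])
print("with template:", [glosses(p) for p in filtered])
```

In this toy model, matching on forms alone yields both “high good mountain” and “high NONPAST mountain”, which is the pair of parses I am seeing. Enforcing the obligatory suffix leaves only the second one, which is the behavior I was hoping the template would give me.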
If I were to write out 高い_山 instead of 高い山, the parser works things out correctly – so it must be something I am not understanding about how compound rules are supposed to work, or about category templates. Any suggestions? Otherwise, I may just have to enter spaces between words, which I didn’t want to do because the texts I am entering don’t have them.