Where this notation comes from and what it means
I have seen in a few places such as here and here this sort of notation:
el-ler-imiz-in (Turkish)
hand-plr.-1st plr.-genitive case, ‘of our hands’
or:
kapi-ja-u-lau-nngit-tuq
stab-PASS-be-PST-NEG-3S.INTR.DECL
‘It was not stabbed.’
or:
nanuq kapi-ja-u-lau-nngit-tuq angunasukti-mut
polar bear stab-PASS-be-PST-NEG-3S.INTR.DECL hunter-OBL
‘The polar bear was not stabbed by the hunter.’
or:
iso|i|ssa auto|i|ssa in (the) big cars
big.ADJ‐def‐in car.N‐def‐in
It seems like it.is.dot.separated to some degree, alternating between words and some sort of keywords. But I'm not totally sure, and I don't know what the keywords mean in many cases.
I'm wondering (a) what this notation is, and (b) what all the labels that can be used are and what they mean (or a resource that lists them). That way I would be better able to interpret what the authors are saying. Maybe there is a standard format for it too, or the abbreviations are standardized.
Basically looking for a list that says what the pieces are:
ADJ: Adjective
def-in: ...
PASS: ...
...
Not just the ones listed here but any that I may encounter. Thank you.
As for def-in, look at the Finnish word iso|i|ssa. i encodes def (definite) and ssa encodes in (the English word "in"), so "isoissa" means "in the big...". – Wilson
Sep 3 at 8:11
I meant to say, it's not def-in; it's def and in, separated by a hyphen. – Wilson
Sep 3 at 8:11
2 Answers
Some of your examples have switched the roles of dots and of hyphens.
It seems like it.is.dot.separated to some degree
That's right. We want to use spaces to mark word boundaries, so we need some other way to mark morpheme boundaries. The dots are suggesting boundaries between morphemes. I say suggesting, since it's not always clear where they are:
gås (Norwegian)
goose.SING
gjess (Norwegian)
goose.PL
In that example, by attaching .SING or .PL, I showed that the grammatical number is encoded as part of the word. If I said
cikap (Ainu)
bird
or
cikap utar (Ainu)
bird PL
"birds"
by writing PL separately I show that those are separate words. Another of your examples:
kapi-ja-u-lau-nngit-tuq
stab-PASS-be-PST-NEG-3S.INTR.DECL
That morpheme -tuq (I think it's usually spelled -toq?) really does mean 3S.INTR.DECL: third person singular, intransitive, declarative.
Basically looking for a list that says what the pieces are:
If you're after a definitive, complete list, I could be wrong, but I don't think there is such a thing. For singular/singulative alone, I've sometimes seen S, sometimes SG, sometimes SING. There is also no definitive and complete list of all grammatical features, so these tokens are somewhat ad hoc rather than formally specified. But in my opinion, a good book will have a glossary at the end explaining what you need to know.
But an incomplete list here should help you through a lot of the parsing you'll want to do. And a knowledge of the language the gloss is in is of course a big help.
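To make the variation described above concrete (S, SG, SING all standing for the same thing), here is a minimal Python sketch of a label lookup. The table and the `expand` function are hypothetical and deliberately incomplete; the expansions reflect common usage, and any particular author's glossary takes precedence.

```python
# Hypothetical, incomplete lookup table. Expansions follow common usage;
# a given author may define a label differently in their own glossary.
EXPANSIONS = {
    "SG": "singular",
    "PL": "plural",
    "PASS": "passive",
    "PST": "past",
    "NEG": "negative",
    "INTR": "intransitive",
    "DECL": "declarative",
    "OBL": "oblique",
    "INF": "infinitive",
    "ADJ": "adjective",
    "DEF": "definite",
    "GEN": "genitive",
}

# Spelling variants seen in this thread, mapped to one canonical label.
# Assumption: "S" means singular here; some authors use it for "subject".
VARIANTS = {"S": "SG", "SING": "SG", "PLR": "PL"}

PERSONS = {"1": "first person", "2": "second person", "3": "third person"}


def expand(label: str) -> str:
    """Return a readable expansion, or the label unchanged if unknown."""
    key = label.strip(".").upper()
    # A leading digit marks grammatical person, as in "3S".
    if key[:1] in PERSONS:
        person, rest = PERSONS[key[0]], key[1:]
        return person if not rest else f"{person} {expand(rest)}"
    key = VARIANTS.get(key, key)
    return EXPANSIONS.get(key, label)
```

For example, `expand("SING")` and `expand("S")` both give "singular", and `expand("3S")` gives "third person singular". Unknown labels are returned unchanged, which is the cue to check the author's own abbreviation list.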
Oh that list is great, thank you!
– Lance Pollard
Sep 3 at 9:32
@LancePollard: See also the glossaries in Crystal's Cambridge Encyclopedias, books where all of your questions so far can find answers.
– jlawler
Sep 3 at 15:38
Wilson's answer is great, but I'd like to clarify one point.
As a general rule, hyphens separate morphemes in the source language, and dots separate morphemes in the target language that aren't separate in the source language.
As an example, take the Latin word amō, "I love". I would gloss it as follows.
am-ō
love-1P.SING.PRES
In other words, the single morpheme -ō here indicates first person, singular number, and present tense. Since they're all part of the same morpheme in the source language, I separate them with dots instead of hyphens.
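The two-level structure just described (hyphens between morphemes, dots inside one morpheme's gloss) can be sketched in a few lines of Python. The function name `parse_gloss` is made up for illustration, and the sketch assumes plain ASCII hyphens and no abbreviation-final periods such as "plr.".

```python
def parse_gloss(gloss_word: str) -> list[list[str]]:
    """Split one glossed word on hyphens (morpheme boundaries), then split
    each morpheme's gloss on dots (several meanings carried by one morpheme)."""
    return [morpheme.split(".") for morpheme in gloss_word.split("-")]


print(parse_gloss("love-1P.SING.PRES"))
# [['love'], ['1P', 'SING', 'PRES']]
```

The outer list has one entry per source-language morpheme; the inner lists show how many gloss elements each morpheme carries.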
The standard authority for glossing is the Leipzig Glossing Rules. As they put it:
Segmentable morphemes are separated by hyphens, both in the example and
in the gloss. There must be exactly the same number of hyphens in the
example and in the gloss.
And later:
When a single [source language] element is rendered by several […] elements (words or abbreviations) [in the gloss], these are separated by periods.
The example they give is:
çık-mak
come.out-INF
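The hyphen-count rule quoted above lends itself to a mechanical check. The following Python sketch pairs each source morpheme with its gloss and raises an error when the counts differ. The function name `align` is hypothetical, and the sketch assumes words are separated by spaces and morphemes by ASCII hyphens.

```python
def align(source: str, gloss: str) -> list[tuple[str, str]]:
    """Pair source morphemes with their glosses, enforcing equal counts."""
    src_words, gloss_words = source.split(), gloss.split()
    if len(src_words) != len(gloss_words):
        raise ValueError("source and gloss have different numbers of words")
    pairs = []
    for src, gl in zip(src_words, gloss_words):
        src_parts, gl_parts = src.split("-"), gl.split("-")
        # Same number of hyphens means same number of parts.
        if len(src_parts) != len(gl_parts):
            raise ValueError(f"hyphen mismatch: {src!r} vs {gl!r}")
        pairs.extend(zip(src_parts, gl_parts))
    return pairs


print(align("çık-mak", "come.out-INF"))
# [('çık', 'come.out'), ('mak', 'INF')]
```

Note that a gloss such as "polar bear" for the single word nanuq would fail the word-count check here; under the period rule it would be written polar.bear.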
The Leipzig rules also give a full list of "standard" glossing abbreviations, which are the ones you'll generally find without explanation. Anything not on that list will usually be explained thoroughly by the author. While not all linguists follow the Leipzig rules, most major publications do.
EDIT: As Dannii points out in the comments, the first example has the abbreviation plr. with a period, which is decidedly non-standard and goes against the Leipzig rules. Just ignore the period in that example; it adds nothing.
The one extra thing to note is that in the first example it's using a dot to show an abbreviation: "plr." This is non-standard, and it's no wonder it confused the OP.
– curiousdannii
Sep 3 at 22:20
Maybe more precise to say that hyphens split and dots join ...?
– Adam Bittlingmayer
Sep 4 at 11:16
All glosses and abbreviations are typically listed at the beginning of each book. Usually, the author chooses them himself. Hyphens often indicate morpheme boundaries; dots indicate grammemes within a morpheme.
– Aharon M. Vertmont
Sep 3 at 8:04