Very good question indeed!
(and you're right)
Indeed, the corpus contains many proper nouns, so for the most accurate processing, uppercase should be taken into account.
However, to really cope with it, we would need at least one of the following (or even both):
- distinct common noun/proper noun tags, which we don't have in the "universal" tagset;
- or an end-of-sentence detector, which we did not introduce.
Once this has been decided, then, you're right, we should be consistent and stick to it, exactly the way you mention.
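To make the consistency point concrete, here is a minimal sketch in Python. It assumes the corpus is already split into tokenized sentences (which is exactly what an end-of-sentence detector would give us). The function name `normalize_case` and its `lowercase_all` flag are hypothetical, chosen for illustration only; this is not what we actually did in the exercise. Note also that lowercasing the first token is a crude heuristic: it also lowercases a proper noun that happens to start a sentence.

```python
def normalize_case(sentences, lowercase_all=False):
    """Return a case-normalized copy of a list of tokenized sentences.

    Assumption: `sentences` is a list of token lists, i.e. sentence
    boundaries are already known.
    """
    normalized = []
    for sentence in sentences:
        if lowercase_all:
            # Option 1: ignore case everywhere (proper-noun cue is lost).
            tokens = [tok.lower() for tok in sentence]
        elif sentence:
            # Option 2: lowercase only the sentence-initial token,
            # keep the case of all other tokens.
            tokens = [sentence[0].lower()] + list(sentence[1:])
        else:
            tokens = []
        normalized.append(tokens)
    return normalized


# Whatever option is chosen, apply the same one to training and test data.
train = normalize_case([["The", "cat", "saw", "Paris"]])
test = normalize_case([["Paris", "is", "big"]])
```

With the default option, `train` keeps "Paris" capitalized inside the sentence, while the sentence-initial "Paris" in `test` becomes "paris", which shows the limit of the heuristic when proper nouns have no tag of their own.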
Thanks for pointing this out!