Difference between revisions of "Tools: Corpus linguistics"
Revision as of 05:54, January 10, 2018
The MarineLives wiki contains nearly six million words of semi-diplomatically transcribed early and mid-C17th legal and commercial text. This is one of the larger collections of text derived from C17th manuscript sources, and is certainly the largest English-language collection derived from early and mid-C17th legal and commercial manuscript sources.
The MarineLives project team is keen to explore the corpus linguistic potential of the material and welcomes approaches from corpus and historical linguists interested in discussing it.
Potential etymological use
- ASSEVERATION
- "...the sayd Dirick Dobler in discourse with this deponent touching the premisses did assure this deponent with much asseveration that the sayd goods were all free and for Hamburgh and merchants there living onely or to that purpose..." (Jan 1653/54, English style)[1]
- RUCKOO
- "[At Braxil] tooke in a sort of ffish called mannettee and dying-stuff called ruckoo..." (Oct 1655)[2]
[ADD DATA]
Potential natural language processing use
[ADD DATA]
Potential ground base for machine learning
- Source of data on C17th handwritten orthographical variation
[ADD DATA]
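As an illustration only, the following minimal sketch shows one way spelling variants from the transcriptions (for example "sayd", "onely", "tooke", "ffish" and "premisses" in the quotations above) might be paired with modern forms to produce labelled data on orthographical variation. The modern word list, the function name `suggest_modern` and the similarity cutoff are all assumptions made for this example; they are not part of the MarineLives project's tooling.

```python
from difflib import get_close_matches

# Hypothetical modern-spelling lexicon (an assumption for this sketch);
# a real lexicon would be far larger.
MODERN = ["said", "only", "took", "fish", "premises", "living"]


def suggest_modern(token, lexicon=MODERN, cutoff=0.6):
    """Return the closest modern spelling for a C17th token, or None.

    Similarity is difflib's sequence-matching ratio; the cutoff of 0.6
    is an arbitrary choice for this sketch, not a tuned value.
    """
    matches = get_close_matches(token.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Variant spellings taken from the quotations on this page.
tokens = ["sayd", "onely", "tooke", "ffish", "premisses"]
pairs = {t: suggest_modern(t) for t in tokens}

for variant, modern in pairs.items():
    print(variant, "->", modern)
```

Simple string similarity of this kind will miss words with no modern equivalent in the lexicon (such as "ruckoo") and can mis-pair short words, so any pairs it produces would need checking by a human reader before being used as training data.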
Potential creation of glossaries