Tools: Corpus linguistics
Revision as of 07:13, January 10, 2018
The MarineLives wiki contains nearly six million words of semi-diplomatically transcribed early and mid-C17th legal and commercial text. This is one of the larger collections of text derived from C17th manuscript sources, and certainly the largest English-language collection derived from early and mid-C17th legal and commercial manuscript sources.
The MarineLives project team is keen to explore the corpus linguistic potential of the material, and welcomes approaches from corpus and historical linguists interested in discussing this potential.
Potential etymological use
- ASSEVERATION
- "...the sayd Dirick Dobler in discourse with this deponent touching the premisses did assure this deponent with much asseveration that the sayd goods were all free and for Hamburgh and merchants there living onely or to that purpose..." (Jan 1653/54, English style)[1]
- MANNAGERIE
- "...the same were wholely left (as beleeveth) to the mannagerie of the foresayd William Warren who disposed of them at the place predeposed and went with the proceede of them to the Canaries..." (Jun 1658)[2]
- "...hee this deponent is a factor and agent to the articulate Antonio Rodrigues Robles here at London and soe hath bin for these five yeares last past or thereabouts and imployed by him in keepeing his accompts and mannagerie of his merchandizeing affayres..." (Jul 1658)[3]
- RUCKOO
- "[At Braxil] tooke in a sort of ffish called mannettee and dying-stuff called ruckoo..." (Oct 1655)[4]
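For rare words such as these, a keyword-in-context (KWIC) concordance is a common first step in corpus work. The following is a minimal sketch in Python, assuming the transcriptions are available as plain-text strings. The function name `kwic` and the sample string are illustrative only and are not part of any MarineLives tooling.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left, match, right) context tuples for each occurrence of keyword.

    Matching is case-insensitive and restricted to whole words.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

# Sample text taken from the RUCKOO quotation above.
sample = ("tooke in a sort of ffish called mannettee "
          "and dying-stuff called ruckoo")

for left, word, right in kwic(sample, "ruckoo", width=20):
    print(f"{left:>20} [{word}] {right}")
```

Run over the full corpus with dated folios attached to each text, a listing of this kind could help to locate early attestations of a word.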
Potential creation of glossaries
- EXAMPLE: Communal C17th Textiles, Garments and Dyestuffs Glossary
Potential Natural Language Processing use
In 2014, MarineLives collaborated with a team at the University of Mannheim Informatics Department, led by Professor Kai Eckert, to explore the application of Natural Language Processing to the MarineLives corpus.
The output of this collaboration was a paper given in Reykjavik at a workshop on Language Resources and Technologies for Processing and Linking Historical Documents and Archives in association with the LREC Conference, May 2014.[5]
Potential ground base for machine learning
- Source of data on C17th handwritten orthographical variation
[ADD DATA]
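One way such orthographical data might be put to use is to pair variant spellings with candidate modern forms. The following is a minimal sketch using Python's standard `difflib` module. The word list `MODERN_LEXICON`, the function name `suggest_modern_form` and the similarity cutoff are assumptions made for illustration, and the variants are drawn from the quotations above.

```python
from difflib import get_close_matches

# Hypothetical modern word list; a real experiment would need a full lexicon.
MODERN_LEXICON = ["said", "took", "fish", "accounts", "goods"]

def suggest_modern_form(variant, lexicon=MODERN_LEXICON, cutoff=0.7):
    """Return the closest modern spelling, or None if nothing is similar enough."""
    matches = get_close_matches(variant.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Variant spellings that occur in the quoted depositions.
for variant in ["sayd", "tooke", "ffish", "accompts"]:
    print(variant, "->", suggest_modern_form(variant))
```

Simple string similarity of this kind will miss many historical variants and produce some false matches, so any pairs it suggests would need checking by a reader familiar with the manuscripts.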
- ↑ HCA 13/68 f.555r
- ↑ HCA 13/72 f.377v
- ↑ HCA 13/72 f.386r
- ↑ HCA 13/70 f.612r
- ↑ Ritze, Dominique; Zirn, Cäcilia; Greenstreet, Colin; Eckert, Kai; Ponzetto, Simone Paolo (2014). Named Entities in Court: The MarineLives Corpus. In: Language Resources and Technologies for Processing and Linking Historical Documents and Archives - Deploying Linked Open Data in Cultural Heritage Workshop, associated with the LREC 2014 Conference, 26-30 May 2014, Reykjavik. Conference or workshop item, accessed 10/01/2018