How to query Lexique with Python

This example shows how to use Python to select four random sets of twenty words each from Lexique382: high-frequency nouns, low-frequency nouns, high-frequency verbs, and low-frequency verbs. (If you have not already installed Python, go to https://www.anaconda.com/distribution/, select your OS (Windows, macOS or Linux), and download the Python 3.7 installer.)


""" Example of selecting items from the Lexique382 database """

import pandas as pd

lex = pd.read_csv('http://www.lexique.org/databases/Lexique382/Lexique382.tsv', sep='\t')

# alternatively, you can download the table locally:
# lex = pd.read_csv("Lexique382.tsv", sep='\t')

# displays the first rows to check that the table loaded correctly
print(lex.head())

# restricts the search to words with a length between 5 and 8 letters
subset = lex.loc[(lex.nblettres >= 5) & (lex.nblettres <= 8)]

# separates nouns and verbs into two dataframes:
noms = subset.loc[subset.cgram == 'NOM']
verbs = subset.loc[subset.cgram == 'VER']

# splits based on lexical frequency
noms_hi = noms.loc[noms.freqlivres > 50.0]
noms_low = noms.loc[(noms.freqlivres < 10.0) & (noms.freqlivres > 1.0)]

verbs_hi = verbs.loc[verbs.freqlivres > 50.0]
verbs_low = verbs.loc[(verbs.freqlivres < 10.0) & (verbs.freqlivres > 1.0)]

# chooses random items from each of the 4 subsets:
N = 20
noms_hi.sample(N).ortho.to_csv('nomhi.txt', index=False)
noms_low.sample(N).ortho.to_csv('nomlo.txt', index=False)
verbs_hi.sample(N).ortho.to_csv('verhi.txt', index=False)
verbs_low.sample(N).ortho.to_csv('verlo.txt', index=False)
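
Two properties of sample(N) are worth keeping in mind: it raises a ValueError when a subset holds fewer than N rows, and it returns a different selection on every run. The following is a minimal sketch of one way to guard against both. The helper name pick_words and the seed parameter are hypothetical, and the toy table is invented data standing in for Lexique382. Only the column names ortho, cgram, nblettres, and freqlivres come from the script above.

```python
import pandas as pd


def pick_words(df, n, seed=0):
    """Return n randomly chosen spellings (column 'ortho') from df.

    Hypothetical helper for illustration; not part of Lexique or pandas.
    """
    if len(df) < n:
        raise ValueError(f"only {len(df)} candidates, {n} requested")
    # random_state fixes the draw so the same items come back on every run
    return df.sample(n, random_state=seed).ortho.tolist()


# Toy stand-in for the Lexique382 table (invented values, illustration only)
toy = pd.DataFrame({
    "ortho": ["maison", "jardin", "cheval", "parler", "donner", "village"],
    "cgram": ["NOM", "NOM", "NOM", "VER", "VER", "NOM"],
    "nblettres": [6, 6, 6, 6, 6, 7],
    "freqlivres": [120.0, 60.0, 5.0, 80.0, 90.0, 55.0],
})

# Same kind of filter as in the script: high-frequency nouns
toy_noms_hi = toy.loc[(toy.cgram == "NOM") & (toy.freqlivres > 50.0)]
chosen = pick_words(toy_noms_hi, 2, seed=42)
print(chosen)
```

With the real table, the same helper could be called as pick_words(noms_hi, N), and the resulting list written to a file one word per line.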

Lexique

Boris New & Christophe Pallier
