This blog post explains how to load a bigram language model from a JSON file with Python. A bigram language model records how often one word follows another, and those counts can be used to generate random text. A simple statistical model like this is easy to store in a JSON file.
The json module has to be imported so the program can read the JSON file. The defaultdict class from the collections module has to be imported so the program can build up the language model as it is read from the file.
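The loader in this post expects the JSON file to hold a list of objects with the keys word1, word2 and count. As a minimal sketch, assuming that layout, a small test file could be written like this. The counts and the file name sample_result.json are made up for illustration.

```python
import json

# Hypothetical bigram counts: "word2" followed "word1" this many times.
sample_rows = [
    {"word1": "this", "word2": "is", "count": 3},
    {"word1": "is", "word2": "a", "count": 2},
    {"word1": "is", "word2": "the", "count": 1},
]

# Write the rows to a file in the layout the loader expects.
with open("sample_result.json", "w") as file:
    json.dump(sample_rows, file, indent=2)
```

Pointing the loader at this file instead of result.json should be enough to try it out on known data.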
from collections import defaultdict
import json
An empty defaultdict with dict as its default factory is created to hold the language model that is extracted from the JSON file.
model = defaultdict(dict)
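The reason for using defaultdict(dict) rather than a plain dict can be shown with a short comparison. This is only an illustrative sketch; the names plain, nested and plain_failed are not part of the program.

```python
from collections import defaultdict

plain = {}
try:
    plain["is"]["a"] = 2  # fails: the inner dict for "is" does not exist yet
except KeyError:
    plain_failed = True
else:
    plain_failed = False

nested = defaultdict(dict)
nested["is"]["a"] = 2  # works: the inner dict is created on first access

print(plain_failed)   # True
print(dict(nested))   # {'is': {'a': 2}}
```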
The JSON file is opened and its parsed contents are stored in the variable rows.
with open("result.json", "r") as file:
    rows = json.load(file)
The rows are traversed and the language model is built.
for row in rows:
    word1 = row["word1"]
    word2 = row["word2"]
    count = row["count"]
    model[word1][word2] = count
The language model is converted into a plain dict, so that indexing it with an unknown word no longer adds an empty entry to the model.
model = dict(model)
A search word is chosen. The program will look up the words that follow it in the model.
search_word = "is"
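The variable next_words is set to the result of model.get for the search word. An empty dict is passed as the default, so the program does not crash if the search word is not in the model.
next_words = model.get(search_word, {})
The matches are printed out in alphabetical order.
for word in sorted(next_words):
    print(word)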
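The loop above prints the matches alphabetically and ignores the counts. As a sketch of an alternative, the matches could be ordered by count so that the most frequent follower comes first. The small model below is made up for illustration and only mimics the shape of the loaded one.

```python
# Hypothetical model in the same shape as the one loaded from the file.
model = {"is": {"a": 2, "the": 5, "not": 1}}

followers = model.get("is", {})

# Sort by count, highest first; ties are broken alphabetically.
ranked = sorted(followers.items(), key=lambda item: (-item[1], item[0]))

for word, count in ranked:
    print(word, count)
```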
This is what the whole source code looks like.
from collections import defaultdict
import json

# Build the model as a dict of dicts: model[word1][word2] = count.
model = defaultdict(dict)

with open("result.json", "r") as file:
    rows = json.load(file)

for row in rows:
    word1 = row["word1"]
    word2 = row["word2"]
    count = row["count"]
    model[word1][word2] = count

model = dict(model)

search_word = "is"
# Fall back to an empty dict if the search word is not in the model.
next_words = model.get(search_word, {})

for word in sorted(next_words):
    print(word)
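The introduction mentions that a language model can be used to generate random text. As a minimal sketch of how that might look with a model of this shape, the next word can be drawn at random with the counts as weights. The function generate, its parameters and the tiny model are hypothetical and not part of the program above.

```python
import random

def generate(model, start_word, max_words=10, rng=random):
    """Return a list of words sampled from a bigram model (illustrative)."""
    words = [start_word]
    current = start_word
    for _ in range(max_words - 1):
        followers = model.get(current)
        if not followers:
            break  # no known follower: stop early
        candidates = list(followers)
        weights = list(followers.values())
        # Pick one follower, with probability proportional to its count.
        current = rng.choices(candidates, weights=weights, k=1)[0]
        words.append(current)
    return words

# Hypothetical model in the same shape as the one loaded from the file.
model = {
    "this": {"is": 3},
    "is": {"a": 2, "the": 1},
    "a": {"test": 1},
    "the": {"test": 1},
}

print(" ".join(generate(model, "this")))
```

Passing a seeded random.Random instance as rng makes the output repeatable, which can help when testing.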
