dedtech.info

Information about computer technology.

Load A Bigram Language Model From A JSON File With Python

This blog post explains how to load a bigram language model from a JSON file with Python. A bigram language model records how often each word is followed by each other word, and those counts can be used to generate random text. A simple statistical model like this is easy to store in a JSON file.

The json module has to be imported so the program can read a JSON file. The defaultdict class from the collections module has to be imported so the program can store the language model as it is read from the file.

from collections import defaultdict
import json

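An empty defaultdict whose default value is an empty dict is declared to hold the language model that is being extracted from the JSON file. With this default, the program can assign model[word1][word2] without first checking whether word1 is already present.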

model = defaultdict(dict)

The json file is opened and its contents are stored in the variable rows.

with open("result.json", "r") as file:
    rows = json.load(file)
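
The layout of result.json is not shown in this post, but the loading code implies a list of objects with the keys word1, word2, and count. Here is a minimal sketch, assuming that layout, that writes such a file and reads it back. The file name sample_result.json and the counts are made up for illustration.

```python
import json

# Hypothetical sample data: each row is one bigram and how often it was seen.
sample_rows = [
    {"word1": "this", "word2": "is", "count": 3},
    {"word1": "is", "word2": "a", "count": 2},
    {"word1": "is", "word2": "the", "count": 1},
]

# Write the rows to a JSON file (file name chosen for this example only).
with open("sample_result.json", "w") as file:
    json.dump(sample_rows, file, indent=2)

# Read the file back the same way the post reads result.json.
with open("sample_result.json", "r") as file:
    rows = json.load(file)

print(rows[0]["word1"], rows[0]["word2"], rows[0]["count"])
```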

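Each row holds a first word, a second word, and the number of times that pair occurred. The rows are traversed and the language model is built as a nested mapping from the first word to the second word to the count.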

for row in rows:
    word1 = row["word1"]
    word2 = row["word2"]
    count = row["count"]
    model[word1][word2] = count

The language model is converted into a dict.

model = dict(model)
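
The conversion matters for lookups. A defaultdict creates an empty entry whenever a missing key is read with square brackets, while a plain dict does not. A short sketch with a made-up count shows the difference. The variable name plain is used only for this example.

```python
from collections import defaultdict

model = defaultdict(dict)
model["is"]["a"] = 2  # illustrative count

# Take a plain dict copy of the outer mapping, as the post does.
plain = dict(model)

# Reading a missing key on the defaultdict inserts an empty dict for it.
_ = model["unseen"]
print("unseen" in model)

# The plain dict is unaffected, and get returns None for a missing key.
print("unseen" in plain)
print(plain.get("unseen"))
```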

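A search word is chosen. The program will look up the words that follow it.

search_word = "is"

The variable next_words holds the result of the model.get function for the search word. An empty dict is passed as the default so the program does not fail when the search word is not in the model. The name next is avoided because it would shadow Python's built-in next function.

next_words = model.get(search_word, {})

The words that follow the search word are printed out in alphabetical order.

for word in sorted(next_words):
    print(word)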
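
The output is ordered alphabetically and ignores the counts. If the most frequent followers are wanted first, one option is to sort by count instead. Here is a sketch using a small hard-coded model with invented counts. The helper name top_followers is hypothetical.

```python
def top_followers(model, word, limit=3):
    """Return up to `limit` (next_word, count) pairs, most frequent first."""
    followers = model.get(word, {})
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(followers.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]

# Invented counts for illustration.
model = {"is": {"the": 1, "a": 2, "not": 1}}

print(top_followers(model, "is"))      # [('a', 2), ('not', 1), ('the', 1)]
print(top_followers(model, "unseen"))  # []
```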

This is what the whole source code looks like.

from collections import defaultdict
import json

model = defaultdict(dict)

with open("result.json", "r") as file:
    rows = json.load(file)
    
for row in rows:
    word1 = row["word1"]
    word2 = row["word2"]
    count = row["count"]
    model[word1][word2] = count

model = dict(model)

search_word = "is"

# Default to an empty dict so a missing search word prints nothing.
next_words = model.get(search_word, {})

for word in sorted(next_words):
    print(word)
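
Once loaded, the model can also be used to generate text, as mentioned at the start of this post. One common approach, which is not part of the program above, is to pick each next word at random, weighted by its count, using random.choices. Here is a minimal sketch with an invented model. The helper name generate and its parameters are hypothetical.

```python
import random

def generate(model, start, length=5, seed=None):
    """Build a string of up to `length` words by sampling bigram followers."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        followers = model.get(words[-1])
        if not followers:
            # Stop when the current word has no known followers.
            break
        candidates = list(followers)
        weights = [followers[w] for w in candidates]
        # Pick one follower, with probability proportional to its count.
        words.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(words)

# Invented counts for illustration.
model = {
    "this": {"is": 3},
    "is": {"a": 2, "the": 1},
    "a": {"model": 1},
    "the": {"model": 1},
}

print(generate(model, "this", length=4, seed=0))
```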
