mip_dmp.qt5.components.embedding_visualization_widget module

Module that defines the widget class used to visualize the initial automated mapping matches via embeddings.

class mip_dmp.qt5.components.embedding_visualization_widget.WordEmbeddingVisualizationWidget(parent=None)[source]

Bases: QWidget

Widget that visualizes the automated column / CDE code matches via embeddings.

adjustWindow()[source]

Adjust the window size, Qt Style Sheet, and title.

Parameters

mainWindow: QMainWindow

The main window of the application.

generate_embedding_figure()[source]

Generate a 3D scatter plot showing the dimensionality-reduced embedding vectors of the words.
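The reduction step can be illustrated with a minimal sketch. The documentation does not say which dimensionality reduction technique the widget uses, so the PCA-via-SVD projection below is an assumption, and `reduce_to_3d` is a hypothetical helper that is not part of `mip_dmp`. It expects at least three vectors with at least three dimensions each.

```python
import numpy as np


def reduce_to_3d(embeddings):
    """Project embedding vectors onto their first three principal components.

    Hypothetical helper for illustration; PCA is an assumed technique here.
    """
    X = np.asarray(embeddings, dtype=float)
    # Center each feature so the projection captures variance, not the mean offset.
    X_centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:3].T


# Toy data: five 8-dimensional vectors standing in for word embeddings.
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(5, 8))
points_3d = reduce_to_3d(toy_embeddings)
```

Each row of `points_3d` holds the x, y, z coordinates of one word and could be passed to any 3D scatter plotting routine.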

generate_embeddings(inputDatasetColumns: list, targetCDECodes: list, matchingMethod: str)[source]

Generate the embeddings of the columns and CDE codes.

Set the input dataset columns (self.inputDatasetColumns), the target CDE codes (self.targetCDECodes), the input dataset column embeddings (self.inputDatasetColumnEmbeddings) and the target CDE code embeddings (self.targetCDECodeEmbeddings).

The embeddings are generated using the specified matching method (matchingMethod). The matching method can be “glove” or “chars2vec”.

inputDatasetColumns: list

List of the input dataset columns.

targetCDECodes: list

List of the target CDE codes.

matchingMethod: str

Matching method. Can be “glove” or “chars2vec”.
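Since only two matching methods are documented, a caller may want to check the arguments before handing them to the widget. The sketch below is one way to do that; `validate_embedding_request` is a hypothetical helper, not part of `mip_dmp`, and the widget itself may treat invalid values differently.

```python
# The two values listed in the documentation above.
SUPPORTED_MATCHING_METHODS = ("glove", "chars2vec")


def validate_embedding_request(inputDatasetColumns, targetCDECodes, matchingMethod):
    """Raise early if the arguments do not fit the documented types and values.

    Hypothetical helper for illustration only.
    """
    if not isinstance(inputDatasetColumns, list):
        raise TypeError("inputDatasetColumns must be a list")
    if not isinstance(targetCDECodes, list):
        raise TypeError("targetCDECodes must be a list")
    if matchingMethod not in SUPPORTED_MATCHING_METHODS:
        raise ValueError(
            f"matchingMethod must be one of {SUPPORTED_MATCHING_METHODS}, "
            f"got {matchingMethod!r}"
        )
    return True


# Passes silently for a documented method name.
validate_embedding_request(["age", "sex"], ["cde_age", "cde_gender"], "glove")
```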

set_embeddings(inputDatasetColumnEmbeddings: list, inputDatasetColumns: list, targetCDECodeEmbeddings: list, targetCDECodes: list, matchedCdeCodes: dict, matchingMethod: str)[source]

Set the input dataset column and target CDE code embeddings.

inputDatasetColumnEmbeddings: list

List of the input dataset column embeddings.

inputDatasetColumns: list

List of the input dataset columns.

targetCDECodeEmbeddings: list

List of the target CDE code embeddings.

targetCDECodes: list

List of the target CDE codes.

matchedCdeCodes: dict

Dictionary of the matched CDE codes in the form:

{
    "input_dataset_column1": {
        "words": ["cde_code1", "cde_code2", ...],
        "embeddings": [embedding_vector1, embedding_vector2, ...],
        "distances": [distance1, distance2, ...]
    },
    "input_dataset_column2": {
        "words": ["cde_code1", "cde_code2", ...],
        "embeddings": [embedding_vector1, embedding_vector2, ...],
        "distances": [distance1, distance2, ...]
    },
    ...
}

matchingMethod: str

Matching method. Can be “glove” or “chars2vec”.
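To make the expected shape of `matchedCdeCodes` concrete, the sketch below builds such a dictionary from toy data. The helper name `build_matched_cde_codes`, the use of Euclidean distance, the nearest-first ordering, and the `n_matches` cutoff are all assumptions for illustration; the documentation does not specify how the application computes or orders the matches.

```python
import math


def build_matched_cde_codes(columns, column_embeddings, cde_codes,
                            cde_embeddings, n_matches=2):
    """Build a dictionary in the documented matchedCdeCodes layout.

    Hypothetical helper: Euclidean distance and nearest-first order are assumed.
    """
    matched = {}
    for column, column_vec in zip(columns, column_embeddings):
        # Rank every CDE code by its distance to this column's embedding
        # and keep the n_matches closest ones.
        ranked = sorted(
            zip(cde_codes, cde_embeddings),
            key=lambda pair: math.dist(column_vec, pair[1]),
        )[:n_matches]
        matched[column] = {
            "words": [code for code, _ in ranked],
            "embeddings": [vec for _, vec in ranked],
            "distances": [math.dist(column_vec, vec) for _, vec in ranked],
        }
    return matched


# Toy 2D embeddings; real embeddings would have many more dimensions.
matched = build_matched_cde_codes(
    columns=["age", "sex"],
    column_embeddings=[[0.0, 0.0], [1.0, 1.0]],
    cde_codes=["cde_age", "cde_gender", "cde_weight"],
    cde_embeddings=[[0.0, 1.0], [1.0, 1.0], [3.0, 4.0]],
)
```

The three lists under each column are index-aligned: the i-th word, the i-th embedding, and the i-th distance all describe the same candidate CDE code.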

set_matching_method(matchingMethod)[source]

Set the matching method.

matchingMethod: str

Matching method. Can be “glove” or “chars2vec”.

set_word_list(wordList)[source]

Set the list of words that can be visualized in the 3D scatter plot.

wordList: list

List of words to visualize in the 3D scatter plot.

set_wordcombobox_items(wordList)[source]

Set the items of the word combo box.

wordList: list

List of words to show in the widget's combo box, which selects the word to visualize in the 3D scatter plot.

staticMetaObject = <PySide2.QtCore.QMetaObject object>